[jira] [Commented] (BEAM-769) Spark streaming tests fail on "nothing processed" if runtime env. is slow because timeout is hit before processing is done.
    [ https://issues.apache.org/jira/browse/BEAM-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15597565#comment-15597565 ]

ASF GitHub Bot commented on BEAM-769:
-------------------------------------

GitHub user amitsela opened a pull request:

    https://github.com/apache/incubator-beam/pull/1161

    [BEAM-769] Spark streaming tests fail on "nothing processed" if runti…

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:

     - [ ] Make sure the PR title is formatted like:
       `[BEAM-] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
       Travis-CI on your fork and ensure the whole test matrix passes).
     - [ ] Replace `` in the title with the actual Jira issue
       number, if there is one.
     - [ ] If this contribution is large, please file an Apache
       [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

    ---

    …me env. is slow because timeout is hit before processing is done.

    Make graceful stop the default.
    Keep "pumping-in" the last batch in a mocked stream to handle overflowing
    batches in case of a graceful stop.
    Change tests accordingly.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/amitsela/incubator-beam BEAM-769

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-beam/pull/1161.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1161

----
commit 43c9e57ad8c7af022ba2f46ce4b8d20731f0766e
Author: Sela
Date:   2016-10-20T22:20:33Z

    [BEAM-769] Spark streaming tests fail on "nothing processed" if runtime
    env. is slow because timeout is hit before processing is done.

    Make graceful stop the default.
    Keep "pumping-in" the last batch in a mocked stream to handle overflowing
    batches in case of a graceful stop.
    Change tests accordingly.

----

> Spark streaming tests fail on "nothing processed" if runtime env. is slow
> because timeout is hit before processing is done.
> ---
>
>                 Key: BEAM-769
>                 URL: https://issues.apache.org/jira/browse/BEAM-769
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>    Affects Versions: Not applicable
>            Reporter: Daniel Halperin
>            Assignee: Amit Sela
>
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1586/
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1587/
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1588/
> {code}
> org.apache.beam.runners.spark.translation.streaming.FlattenStreamingTest.testFlattenUnbounded
> org.apache.beam.runners.spark.translation.streaming.KafkaStreamingTest.testRun
> org.apache.beam.runners.spark.translation.streaming.SimpleStreamingWordCountTest.testFixedWindows
> {code}
> The above tests use a hard-timeout (ungraceful stop) so if the runtime env.
> is slow enough so that the batch is not done, it'll stop anyway and assert
> and rightfully fail.
> It's difficult to create locally because I never had trouble on my laptop.
> Since Jenkins will be slow from time to time, it is reasonable enough to have
> a more robust solution here :
> # don't use checkpoint (Spark) if not necessary - only really necessary for
> one test in {{KafkaStreamingTest}} and {{ResumeFromCheckpointStreamingTest}}
> I think.
> # allow for graceful stop - will take longer for each test, but should allow
> the test to finish even if runtime env. is slow.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
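
The difference between the hard timeout and the graceful stop described in the issue can be illustrated with a small JDK-only analogy. This is a sketch under stated assumptions, not the Spark runner's code: a single-threaded `ExecutorService` stands in for the streaming context, one sleeping task stands in for a slow batch, and the class and method names (`StopModesSketch`, `processedAfterStop`) are hypothetical. In Spark itself the corresponding switch is the `stopGracefully` argument of `JavaStreamingContext.stop(boolean stopSparkContext, boolean stopGracefully)`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class StopModesSketch {

    /**
     * Runs one slow "batch", lets a short test timeout expire, then stops the
     * executor either abruptly or gracefully. Returns how many batches finished.
     */
    public static int processedAfterStop(boolean graceful) {
        AtomicInteger processed = new AtomicInteger();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.submit(() -> {
            try {
                Thread.sleep(500);            // the batch is slow (e.g. a loaded CI machine)
                processed.incrementAndGet();  // reached only if the batch is allowed to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // batch abandoned by a hard stop
            }
        });
        try {
            Thread.sleep(50);                 // the test's timeout fires before the batch is done
            if (graceful) {
                executor.shutdown();          // accept no new work, let the running batch finish
            } else {
                executor.shutdownNow();       // interrupt the running batch immediately
            }
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while stopping", e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println("hard stop processed:     " + processedAfterStop(false));
        System.out.println("graceful stop processed: " + processedAfterStop(true));
    }
}
```

With the hard stop the assertion that follows would see nothing processed, which matches the failure mode reported on Jenkins. The graceful variant takes longer but the batch completes before the count is read.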
[jira] [Commented] (BEAM-769) Spark streaming tests fail on "nothing processed" if runtime env. is slow because timeout is hit before processing is done.
    [ https://issues.apache.org/jira/browse/BEAM-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15597560#comment-15597560 ]

Amit Sela commented on BEAM-769:
--------------------------------

[~lcwik] I agree, I'll commit and update https://github.com/apache/incubator-beam/pull/1143 accordingly.
Since this is hard to replicate locally, it's best to give it a try on Jenkins ASAP.
[jira] [Commented] (BEAM-769) Spark streaming tests fail on "nothing processed" if runtime env. is slow because timeout is hit before processing is done.
    [ https://issues.apache.org/jira/browse/BEAM-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595461#comment-15595461 ]

Luke Cwik commented on BEAM-769:
--------------------------------

I would prefer a bigger timeout over having flaky tests if we couldn't make it deterministic in some way. If a test never flakes, people won't have to look at it.
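
One way to read the preference above is to treat the timeout as a generous upper bound rather than a fixed sleep: the test waits on a completion signal and returns as soon as processing is done, so a large bound costs nothing on a fast machine. The following is a minimal JDK-only sketch of that idea, not code from the Beam test suite. The names `BoundedWaitSketch` and `awaitBatch` are hypothetical, and a sleeping thread stands in for the streaming batch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class BoundedWaitSketch {

    /**
     * Starts a simulated batch that takes batchMillis and waits for it for at
     * most upperBoundMillis. Returns true if the batch finished within the bound.
     */
    public static boolean awaitBatch(long batchMillis, long upperBoundMillis) {
        CountDownLatch done = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(batchMillis); // simulated processing time
                done.countDown();          // signal completion to the waiting test
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);            // do not keep the JVM alive if the bound is hit
        worker.start();
        try {
            // Returns as soon as the latch reaches zero, or false once the bound elapses.
            return done.await(upperBoundMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("generous bound: " + awaitBatch(100, 10_000));
        System.out.println("tight bound:    " + awaitBatch(2_000, 50));
    }
}
```

The first call returns after roughly the batch duration even though the bound is ten seconds. The second call shows the flaky case from the issue, where the bound is shorter than the work.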