[ https://issues.apache.org/jira/browse/BEAM-747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15606574#comment-15606574 ]
ASF GitHub Bot commented on BEAM-747:
-------------------------------------

GitHub user markflyhigh opened a pull request:

    https://github.com/apache/incubator-beam/pull/1189

    [BEAM-747] Fix FileChecksumMatcher That Inconsistent With Filesystem

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:

     - [ ] Make sure the PR title is formatted like:
       `[BEAM-<Jira issue #>] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
       Travis-CI on your fork and ensure the whole test matrix passes).
     - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
       number, if there is one.
     - [ ] If this contribution is large, please file an Apache
       [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

    ---

    Add retries to FileChecksumMatcher when any of the following conditions occurs:
     - An IOException is raised by the filesystem.
     - No files are found in the output directory.
     - The number of files found on the filesystem does not equal the expected
       number, which is parsed from the shard name using a name template. The
       default template "SSS-of-NNN" is used when no template is specified.

    The default number of retries is 4. The default sleep duration between
    retries is 10s.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/markflyhigh/incubator-beam file-matcher-read-retry

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-beam/pull/1189.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1189

----
commit 95392c860d3889b1bee4d5f0ba793465f63a0df2
Author: Mark Liu <mark...@markliu0.mtv.corp.google.com>
Date:   2016-10-25T21:21:24Z

    [BEAM-747] Fix FileChecksumMatcher That Inconsistent With FS

commit bc6a89263e4da379abec6e3cdf5bd94ace209d12
Author: Mark Liu <mark...@markliu0.mtv.corp.google.com>
Date:   2016-10-25T21:45:46Z

    fixup! Improve Javadoc

----

> Text checksum verifier is not resilient to eventually consistent filesystems
> ----------------------------------------------------------------------------
>
>                 Key: BEAM-747
>                 URL: https://issues.apache.org/jira/browse/BEAM-747
>             Project: Beam
>          Issue Type: Bug
>          Components: testing
>   Affects Versions: Not applicable
>            Reporter: Daniel Halperin
>            Assignee: Mark Liu
>
> Example 1:
> https://builds.apache.org/job/beam_PreCommit_MavenVerify/3934/org.apache.beam$beam-examples-java/console
> Here it looks like we need to retry listing files, at least a little bit, if
> none are found. They did show up:
> {code}
> gsutil ls
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results\*
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results-00000-of-00003
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results-00001-of-00003
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results-00002-of-00003
> {code}
> Example 2:
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/org.apache.beam$beam-examples-java/1525/testReport/junit/org.apache.beam.examples/WordCountIT/testE2EWordCount/
> Here it looks like we need to fill in the shard template if the filesystem
> does not give us a consistent result:
> {code}
> Oct 14, 2016 12:31:16 AM org.apache.beam.sdk.testing.FileChecksumMatcher
> readLines
> INFO: [0 of 1] Read 162 lines from file:
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-14-00-25-55-609/output/results-00000-of-00003
> Oct 14, 2016 12:31:16 AM org.apache.beam.sdk.testing.FileChecksumMatcher
> readLines
> INFO: [1 of 1] Read 144 lines from file:
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-14-00-25-55-609/output/results-00002-of-00003
> Oct 14, 2016 12:31:16 AM org.apache.beam.sdk.testing.FileChecksumMatcher
> matchesSafely
> INFO: Generated checksum for output data:
> aec68948b2515e6ea35fd1ed7649c267a10a01e5
> {code}
> We missed shard 1-of-3 and hence got the wrong checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
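
For readers skimming the archive, the three retry conditions in the pull request description (an IOException, an empty listing, or a file count that disagrees with the count encoded in the shard names) can be sketched roughly as below. This is an illustration only, not the code from the pull request: the class `ShardRetrySketch`, the `FileLister` interface, and the methods `expectedShardCount`, `isComplete`, and `listWithRetry` are all hypothetical names, and the shard suffix is assumed to look like `00001-of-00003`, as in the log output quoted in the issue.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not the FileChecksumMatcher code from the pull request. */
final class ShardRetrySketch {

  /** Hypothetical stand-in for a filesystem listing call that may fail or lag. */
  interface FileLister {
    List<String> list() throws IOException;
  }

  // Assumed naming: names end in "<shard>-of-<total>", e.g. "results-00001-of-00003".
  private static final Pattern SHARD_SUFFIX = Pattern.compile("(\\d+)-of-(\\d+)$");

  /** Returns the total shard count parsed from the first matching name, or -1 if none match. */
  static int expectedShardCount(List<String> fileNames) {
    for (String name : fileNames) {
      Matcher m = SHARD_SUFFIX.matcher(name);
      if (m.find()) {
        return Integer.parseInt(m.group(2));
      }
    }
    return -1;
  }

  /** A listing is accepted only if it is non-empty and as large as the names announce. */
  static boolean isComplete(List<String> fileNames) {
    return !fileNames.isEmpty() && fileNames.size() == expectedShardCount(fileNames);
  }

  /**
   * Lists files, retrying on IOException, an empty result, or a shard-count mismatch.
   * Makes one initial attempt plus up to maxRetries retries, sleeping before each retry.
   */
  static List<String> listWithRetry(FileLister lister, int maxRetries, long sleepMillis)
      throws IOException, InterruptedException {
    IOException lastError = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      if (attempt > 0) {
        Thread.sleep(sleepMillis);
      }
      try {
        List<String> files = lister.list();
        if (isComplete(files)) {
          return files;
        }
        lastError = null; // the call succeeded, but the listing was incomplete
      } catch (IOException e) {
        lastError = e;
      }
    }
    throw new IOException(
        "No complete listing after " + (maxRetries + 1) + " attempts", lastError);
  }

  public static void main(String[] args) throws Exception {
    // Simulated eventually consistent listing: shard 1 appears only on the third call.
    final int[] calls = {0};
    FileLister flaky = () -> {
      calls[0]++;
      if (calls[0] == 1) {
        throw new IOException("simulated transient failure");
      }
      if (calls[0] == 2) {
        return Arrays.asList("results-00000-of-00003", "results-00002-of-00003");
      }
      return Arrays.asList(
          "results-00000-of-00003", "results-00001-of-00003", "results-00002-of-00003");
    };
    // 4 retries as in the description; a short sleep here instead of the 10s default.
    List<String> files = listWithRetry(flaky, 4, 10L);
    System.out.println(files.size() + " files after " + calls[0] + " listing calls");
  }
}
```

In this sketch, a listing whose names carry no shard suffix is never treated as complete; a real matcher would presumably need to decide how to handle that case.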