Mark Liu commented on BEAM-747:

Yes, it's worth adding retries to file path matching and reading, in order to 
handle IO failures from the filesystem and special cases such as no file being 
found.
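
A minimal sketch of the retry idea, not the actual FileChecksumMatcher code. 
The class RetryingLister, the method listWithRetry, and its parameters are 
hypothetical names; the lister argument stands in for whatever call expands 
the file pattern. An empty result is treated the same as an IO failure and 
retried with exponential backoff:

```java
import java.io.UncheckedIOException;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not part of the Beam SDK.
final class RetryingLister {
    /**
     * Calls lister until it returns a non-empty list or maxAttempts is reached.
     * An empty listing and an UncheckedIOException are both treated as retryable.
     */
    static List<String> listWithRetry(
            Supplier<List<String>> lister, int maxAttempts, long initialBackoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoff = initialBackoffMillis;
        RuntimeException lastError = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                List<String> files = lister.get();
                if (!files.isEmpty()) {
                    return files;
                }
                lastError = null; // listing succeeded but matched nothing yet
            } catch (UncheckedIOException e) {
                lastError = e; // remember the failure and try again
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoff);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry", e);
                }
                backoff *= 2; // exponential backoff between attempts
            }
        }
        throw new IllegalStateException(
                "No files matched after " + maxAttempts + " attempts", lastError);
    }
}
```

The attempt count and backoff would need tuning against how long the 
filesystem takes to show the files.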

As for Example 2, one place to add the shard name template is the output path 
argument passed to FileChecksumMatcher. Instead of using ".../result*", we can 
use ".../result*-of-*". This avoids reading irrelevant files, but it can't 
guarantee that all shards are read unless the total number of shards is given. 
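
As a rough sketch of what a completeness check could look like: the file 
names in the issue end in "-SSSSS-of-NNNNN", so the total can be read from any 
matched file name and compared with the shard indices actually seen. The class 
ShardCheck and the method missingShards are hypothetical names, and this 
assumes every output file follows that naming. If no matched file carries the 
suffix, the total is unknown and the sketch throws, which is the case where an 
explicitly supplied shard count would still be needed:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Beam SDK.
final class ShardCheck {
    // Matches a trailing "-<shard>-of-<total>" suffix, e.g. "-00001-of-00003".
    private static final Pattern SHARD = Pattern.compile("-(\\d+)-of-(\\d+)$");

    /** Returns the shard indices that are absent from the given file paths. */
    static List<Integer> missingShards(List<String> paths) {
        TreeSet<Integer> seen = new TreeSet<>();
        int total = -1;
        for (String path : paths) {
            Matcher m = SHARD.matcher(path);
            if (!m.find()) {
                continue; // skip files that do not follow the shard template
            }
            int declaredTotal = Integer.parseInt(m.group(2));
            if (total != -1 && declaredTotal != total) {
                throw new IllegalArgumentException("Inconsistent shard totals in file names");
            }
            total = declaredTotal;
            seen.add(Integer.parseInt(m.group(1)));
        }
        if (total == -1) {
            throw new IllegalArgumentException("No file name carries a shard suffix");
        }
        List<Integer> missing = new ArrayList<>();
        for (int shard = 0; shard < total; shard++) {
            if (!seen.contains(shard)) {
                missing.add(shard);
            }
        }
        return missing;
    }
}
```

For the two files read in Example 2 (shards 00000 and 00002 of 00003), this 
reports shard 1 as missing, so the verifier could retry the listing instead of 
computing a checksum over partial output.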

My current thinking is to pass the number of shards on the command line as an 
optional test option and then hand it to the verifier. I'm not sure whether 
there is a better way to do that, since previous test results show that the 
number of shards is runner-dependent.

> Text checksum verifier is not resilient to eventually consistent filesystems
> ----------------------------------------------------------------------------
>                 Key: BEAM-747
>                 URL: https://issues.apache.org/jira/browse/BEAM-747
>             Project: Beam
>          Issue Type: Bug
>          Components: testing
>    Affects Versions: Not applicable
>            Reporter: Daniel Halperin
>            Assignee: Mark Liu
> Example 1: 
> https://builds.apache.org/job/beam_PreCommit_MavenVerify/3934/org.apache.beam$beam-examples-java/console
> Here it looks like we need to retry listing files, at least a little bit, if 
> none are found. They did show up:
> {code}
> gsutil ls 
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results\*
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results-00000-of-00003
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results-00001-of-00003
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results-00002-of-00003
> {code}
> Example 2: 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/org.apache.beam$beam-examples-java/1525/testReport/junit/org.apache.beam.examples/WordCountIT/testE2EWordCount/
> Here it looks like we need to fill in the shard template if the filesystem 
> does not give us a consistent result:
> {code}
> Oct 14, 2016 12:31:16 AM org.apache.beam.sdk.testing.FileChecksumMatcher 
> readLines
> INFO: [0 of 1] Read 162 lines from file: 
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-14-00-25-55-609/output/results-00000-of-00003
> Oct 14, 2016 12:31:16 AM org.apache.beam.sdk.testing.FileChecksumMatcher 
> readLines
> INFO: [1 of 1] Read 144 lines from file: 
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-14-00-25-55-609/output/results-00002-of-00003
> Oct 14, 2016 12:31:16 AM org.apache.beam.sdk.testing.FileChecksumMatcher 
> matchesSafely
> INFO: Generated checksum for output data: 
> aec68948b2515e6ea35fd1ed7649c267a10a01e5
> {code}
> We missed shard 1-of-3 and hence got the wrong checksum.
