[
https://issues.apache.org/jira/browse/BEAM-1715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924792#comment-15924792
]
Mark Liu edited comment on BEAM-1715 at 3/14/17 6:53 PM:
---------------------------------------------------------
The file matcher already contains a function to describe a mismatch
(https://github.com/apache/beam/blob/master/sdks/python/apache_beam/tests/pipeline_verifiers.py#L116).
I think this is essentially a GCS consistency problem, and it also happens
in Java. Output verification was executed twice: it failed the first time and
succeeded the second. (The second run only retrieves the mismatch message; this
is an implementation detail of hamcrest.all_of.)
I'd like to add a ~20s wait before verification so that all output files are
ready on GCS.
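
A minimal sketch of the proposed wait, for illustration only. The names
compute_checksum and verify_after_wait are hypothetical and are not part of
the Beam API, and hashing the sorted lines with SHA-1 is an assumption about
how the checksum is produced. The sleep function is injectable so the helper
can be exercised without actually waiting.

```python
import hashlib
import time


def compute_checksum(lines):
    """Return a SHA-1 hex digest over the lines, independent of their order.

    Assumption: lines are sorted before hashing so that shard order does
    not affect the result.
    """
    sha = hashlib.sha1()
    for line in sorted(lines):
        sha.update(line.encode("utf-8"))
    return sha.hexdigest()


def verify_after_wait(read_lines, expected_checksum, wait_secs=20,
                      sleep=time.sleep):
    """Wait, then read the output once and compare checksums.

    read_lines is a hypothetical callable that returns all output lines
    (for example, by reading every file matching the output glob).
    Returns (matched, message) so the caller can report a clear mismatch.
    """
    # Give the storage backend time to expose every output file.
    sleep(wait_secs)
    lines = read_lines()
    actual = compute_checksum(lines)
    if actual == expected_checksum:
        return True, "checksum matched (%d lines)" % len(lines)
    return False, "expected checksum %s, but got %s (%d lines)" % (
        expected_checksum, actual, len(lines))
```

A fixed delay is the simplest option; it lowers the chance of reading a
partial file listing but does not rule it out.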
was (Author: markflyhigh):
The file matcher already contains a function to describe a mismatch
(https://github.com/apache/beam/blob/master/sdks/python/apache_beam/tests/pipeline_verifiers.py#L116).
I think this is essentially a GCS consistency problem, and it also happens
in Java. Output match was executed twice: it failed the first time and
succeeded the second. (The second run only retrieves the mismatch message; this
is an implementation detail of hamcrest.all_of.)
I'd like to add a ~20s wait before verification so that all output files are
ready on GCS.
> Wordcount on Dataflow checksum mismatch
> ---------------------------------------
>
> Key: BEAM-1715
> URL: https://issues.apache.org/jira/browse/BEAM-1715
> Project: Beam
> Issue Type: Bug
> Components: sdk-py
> Reporter: Ahmet Altay
> Assignee: Mark Liu
>
> This run failed:
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Python_Verify/1502/consoleFull
> Output is:
> root: INFO: Read from given path
> gs://temp-storage-for-end-to-end-tests/py-wordcount-cloud/output/py-wordcount-1489489356/results*-of-*,
> 3179 lines, checksum: a9bcb4acd65daf8f6a9ac5e026de7803cc09f662.
> root: INFO: Read from given path
> gs://temp-storage-for-end-to-end-tests/py-wordcount-cloud/output/py-wordcount-1489489356/results*-of-*,
> 4784 lines, checksum: 33535a832b7db6d78389759577d4ff495980b9c0.
> Dataflow job:
> https://pantheon.corp.google.com/dataflow/job/2017-03-14_05_14_52-13300760513112605405?pli=1&project=apache-beam-testing
> Mark, can you confirm that this is actually a checksum mismatch and not a
> failure in the test framework? And could you also make the error clearer,
> e.g. by adding a message like mismatch ....