[ https://issues.apache.org/jira/browse/BEAM-1715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924931#comment-15924931 ]
ASF GitHub Bot commented on BEAM-1715:
--------------------------------------

GitHub user markflyhigh opened a pull request:

    https://github.com/apache/beam/pull/2244

    [BEAM-1715] Fix Python WordCount on Dataflow Mismatch

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:

     - [ ] Make sure the PR title is formatted like:
       `[BEAM-<Jira issue #>] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
           Travis-CI on your fork and ensure the whole test matrix passes).
     - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
           number, if there is one.
     - [ ] If this contribution is large, please file an Apache
           [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

    ---

    - Add more logs
    - Add sleep before verification in order to give more time to have files
      ready on FS, especially GCS.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/markflyhigh/incubator-beam python-file-verifier-inconsistency

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2244.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2244

----
commit c9d9dc45de8e6ad915fdc9bc48e304cb0ac5121d
Author: Mark Liu <mark...@google.com>
Date:   2017-03-14T19:43:45Z

    [BEAM-1715] Fix Python WordCount on Dataflow Mismatch

commit 0a76a232a713dd35623d47e4dd40140b4b4f0f74
Author: Mark Liu <mark...@google.com>
Date:   2017-03-14T19:47:18Z

    fixup! Remove unused import

----


> Wordcount on Dataflow checksum mismatch
> ---------------------------------------
>
>                 Key: BEAM-1715
>                 URL: https://issues.apache.org/jira/browse/BEAM-1715
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py
>            Reporter: Ahmet Altay
>            Assignee: Mark Liu
>
> This run failed:
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Python_Verify/1502/consoleFull
> Output is:
> root: INFO: Read from given path
> gs://temp-storage-for-end-to-end-tests/py-wordcount-cloud/output/py-wordcount-1489489356/results*-of-*,
> 3179 lines, checksum: a9bcb4acd65daf8f6a9ac5e026de7803cc09f662.
> root: INFO: Read from given path
> gs://temp-storage-for-end-to-end-tests/py-wordcount-cloud/output/py-wordcount-1489489356/results*-of-*,
> 4784 lines, checksum: 33535a832b7db6d78389759577d4ff495980b9c0.
> Dataflow job:
> https://pantheon.corp.google.com/dataflow/job/2017-03-14_05_14_52-13300760513112605405?pli=1&project=apache-beam-testing
> Mark, can you confirm that this is actually a checksum mismatch and not a
> failure in the test framework? And could you also make the error more clear,
> add a message like mismatch ....



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
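
The pull request above mentions adding a sleep before verification and the issue asks for a clearer mismatch message. A minimal sketch of that idea follows; it is not the code from the pull request. It assumes local files rather than GCS, assumes the checksum is a SHA-1 over the sorted lines (the 40-hex-digit values in the log are consistent with SHA-1, but the source does not say so), and the names `compute_checksum`, `read_lines`, `verify_output`, and `sleep_secs` are hypothetical.

```python
import glob
import hashlib
import logging
import time


def compute_checksum(lines):
    """Return a SHA-1 hex digest over the sorted lines.

    Sorting makes the result independent of shard and line order.
    SHA-1 is an assumption made for this sketch.
    """
    sha1 = hashlib.sha1()
    for line in sorted(lines):
        sha1.update(line.encode("utf-8"))
    return sha1.hexdigest()


def read_lines(pattern):
    """Read all lines from every local file matching the glob pattern."""
    lines = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as f:
            lines.extend(f.readlines())
    return lines


def verify_output(pattern, expected_checksum, sleep_secs=0):
    """Optionally wait, then read the output shards and compare checksums."""
    if sleep_secs > 0:
        # Give the file system time to make all output files visible.
        logging.info("Waiting %d seconds before verification", sleep_secs)
        time.sleep(sleep_secs)
    lines = read_lines(pattern)
    actual = compute_checksum(lines)
    logging.info("Read from given path %s, %d lines, checksum: %s.",
                 pattern, len(lines), actual)
    if actual != expected_checksum:
        # Say explicitly that this is a mismatch, not a framework failure.
        raise AssertionError(
            "Checksum mismatch for %s: expected %s, got %s (%d lines)"
            % (pattern, expected_checksum, actual, len(lines)))
    return actual
```

A call such as `verify_output("/tmp/out/results*-of-*", expected, sleep_secs=10)` would wait ten seconds, then either return the checksum or raise an error that names both values and the line count. A fixed sleep only reduces the chance of reading an incomplete file listing; it does not guarantee that all files are visible.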