[
https://issues.apache.org/jira/browse/BEAM-7463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883391#comment-16883391
]
Valentyn Tymofieiev commented on BEAM-7463:
-------------------------------------------
[~pabloem] could you please take a look into BQ suites? your expertise in BQ
connector may help us track this down faster.
[~Juta], Do you remember why we removed sorting in Bigquery matcher? See:
https://github.com/apache/beam/pull/8621/files#diff-f1ec7e3a3e7e2e5082ddb7043954c108R134.
Does BQ guarantee some order of returned result? if not, then that may very
well cause the checksum errors, but there are also other errors with this test
like table not found and but "but" field is empty, perhaps we should
investigate these separately.
Also, to echo some points from conversation on
https://github.com/apache/beam/pull/8751 for visibility: class attributes
assigned in setup/teardown method are not shared between execution of
different tests cases defined in the same test class, so the race of table
cleanup is not obvious to me.
> Bigquery IO ITs are flaky: incorrect checksum
> ---------------------------------------------
>
> Key: BEAM-7463
> URL: https://issues.apache.org/jira/browse/BEAM-7463
> Project: Beam
> Issue Type: Bug
> Components: io-python-gcp
> Reporter: Valentyn Tymofieiev
> Assignee: Pablo Estrada
> Priority: Major
> Labels: currently-failing
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> {noformat}
> 15:03:38 FAIL: test_big_query_new_types
> (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT)
> 15:03:38
> ----------------------------------------------------------------------
> 15:03:38 Traceback (most recent call last):
> 15:03:38 File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py",
> line 211, in test_big_query_new_types
> 15:03:38 big_query_query_to_table_pipeline.run_bq_pipeline(options)
> 15:03:38 File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py",
> line 82, in run_bq_pipeline
> 15:03:38 result = p.run()
> 15:03:38 File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/testing/test_pipeline.py",
> line 107, in run
> 15:03:38 else test_runner_api))
> 15:03:38 File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py",
> line 406, in run
> 15:03:38 self._options).run(False)
> 15:03:38 File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py",
> line 419, in run
> 15:03:38 return self.runner.run_pipeline(self, self._options)
> 15:03:38 File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py",
> line 51, in run_pipeline
> 15:03:38 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 15:03:38 AssertionError:
> 15:03:38 Expected: (Test pipeline expected terminated in state: DONE and
> Expected checksum is 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214)
> 15:03:38 but: Expected checksum is
> 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214 Actual checksum is
> da39a3ee5e6b4b0d3255bfef95601890afd80709
> {noformat}
> [~Juta] could this be caused by changes to Bigquery matcher?
> https://github.com/apache/beam/pull/8621/files#diff-f1ec7e3a3e7e2e5082ddb7043954c108R134
>
> cc: [~pabloem] [~chamikara] [~apilloud]
> A recent postcommit run has BQ failures in other tests as well:
> https://builds.apache.org/job/beam_PostCommit_Python3_Verify/1000/consoleFull
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)