[ 
https://issues.apache.org/jira/browse/BEAM-7463?focusedWorklogId=254238&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254238
 ]

ASF GitHub Bot logged work on BEAM-7463:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Jun/19 07:55
            Start Date: 05/Jun/19 07:55
    Worklog Time Spent: 10m 
      Work Description: Juta commented on issue #8751: [BEAM-7463] parallelize 
BQ IT tests
URL: https://github.com/apache/beam/pull/8751#issuecomment-498978628
 
 
   I tested this on a small use case
   (documentation: 
https://nose.readthedocs.io/en/latest/doc_tests/test_multiprocess/multiprocess.html):
   
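   ```
   import unittest
   
   # Module-level list shared by every test that runs in the same process.
   a = []
   
   class ParallelTest(unittest.TestCase):
     _multiprocess_can_split_ = True
   
     def setUp(self):
       a.append('hello')
   
     def tearDown(self):
       # Note: this rebinds a local name; the module-level `a` is not cleared.
       a = []
   
     def test_a(self):
       self.assertEqual(a, ['hello'])
   
     def test_b(self):
       self.assertEqual(a, ['hello'])
   ```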
   and ran it with `nosetests --tests test.py --processes=2`.
   
   Without `_multiprocess_can_split_ = True`, both tests run in the same 
process, so `a` is appended to twice and one of the tests fails. With the 
flag set, the tests do not share the `a` variable and both tests 
succeed.
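
For comparison, a minimal sketch (not part of the PR) of the same test with
the shared list cleared in place, so that both tests pass even when they run
one after the other in a single process. It uses plain `unittest` instead of
nose; the names `shared`, `SharedStateTest` and `run_suite` are made up for
this illustration.

```python
import unittest

# Module-level state, visible to every test that runs in this process.
shared = []


class SharedStateTest(unittest.TestCase):

  def setUp(self):
    shared.append('hello')

  def tearDown(self):
    # Clear the list in place. Writing `shared = []` here would only
    # rebind a local name and leave the module-level list untouched.
    del shared[:]

  def test_a(self):
    self.assertEqual(shared, ['hello'])

  def test_b(self):
    self.assertEqual(shared, ['hello'])


def run_suite():
  """Run both tests sequentially in the current process."""
  suite = unittest.defaultTestLoader.loadTestsFromTestCase(SharedStateTest)
  return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` returns a `unittest.TestResult` that can be checked
with `wasSuccessful()`.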
   
   In the BigQuery tests 
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py#L79
 the shared state is the `dataset_id`, which causes the test failures in the 
same way as explained above.
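
One way to avoid sharing a dataset between tests is to derive a fresh
`dataset_id` in `setUp`. The following is only a sketch under assumptions:
the helper `unique_dataset_id`, its default prefix and the class
`DatasetPerTestExample` are hypothetical and are not taken from the Beam
test file.

```python
import time
import unittest
import uuid


def unique_dataset_id(prefix='example_dataset_'):
  """Return an id that is unlikely to collide across tests or processes.

  The prefix is a made-up placeholder. BigQuery dataset ids may contain
  letters, digits and underscores, so the UUID is used in hex form.
  """
  return '%s%d_%s' % (prefix, int(time.time()), uuid.uuid4().hex[:8])


class DatasetPerTestExample(unittest.TestCase):

  def setUp(self):
    # Each test method gets its own id, so tests running in parallel
    # never read from or write to the same dataset.
    self.dataset_id = unique_dataset_id()

  def test_id_is_set(self):
    self.assertTrue(self.dataset_id.startswith('example_dataset_'))
```

Creating and deleting the dataset itself is left out here; it would go next
to the id assignment in `setUp` and in a matching `tearDown`.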
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 254238)
    Time Spent: 1h  (was: 50m)

> Bigquery IO ITs are flaky: incorrect checksum
> ---------------------------------------------
>
>                 Key: BEAM-7463
>                 URL: https://issues.apache.org/jira/browse/BEAM-7463
>             Project: Beam
>          Issue Type: Bug
>          Components: io-python-gcp
>            Reporter: Valentyn Tymofieiev
>            Assignee: Juta Staes
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> {noformat}
> 15:03:38 FAIL: test_big_query_new_types 
> (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT)
> 15:03:38 
> ----------------------------------------------------------------------
> 15:03:38 Traceback (most recent call last):
> 15:03:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py",
>  line 211, in test_big_query_new_types
> 15:03:38     big_query_query_to_table_pipeline.run_bq_pipeline(options)
> 15:03:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py",
>  line 82, in run_bq_pipeline
> 15:03:38     result = p.run()
> 15:03:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/testing/test_pipeline.py",
>  line 107, in run
> 15:03:38     else test_runner_api))
> 15:03:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py",
>  line 406, in run
> 15:03:38     self._options).run(False)
> 15:03:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 15:03:38     return self.runner.run_pipeline(self, self._options)
> 15:03:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py",
>  line 51, in run_pipeline
> 15:03:38     hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 15:03:38 AssertionError: 
> 15:03:38 Expected: (Test pipeline expected terminated in state: DONE and 
> Expected checksum is 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214)
> 15:03:38      but: Expected checksum is 
> 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214 Actual checksum is 
> da39a3ee5e6b4b0d3255bfef95601890afd80709
> {noformat}
> [~Juta] could this be caused by changes to Bigquery matcher? 
> https://github.com/apache/beam/pull/8621/files#diff-f1ec7e3a3e7e2e5082ddb7043954c108R134
>  
> cc: [~pabloem] [~chamikara] [~apilloud]
> A recent postcommit run has BQ failures in other tests as well: 
> https://builds.apache.org/job/beam_PostCommit_Python3_Verify/1000/consoleFull



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
