[ 
https://issues.apache.org/jira/browse/BEAM-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895633#comment-16895633
 ] 

Mark Liu commented on BEAM-7527:
--------------------------------

I'm back from vacation and back on this.

https://github.com/apache/beam/pull/9052 is a draft that isolates the build 
directory for each test, but it causes other failures that need more effort to 
fix. Instead of continuing with that, I'd go with another approach: reuse the 
Python distribution (tar file) in all integration tests. This simplifies the 
build workflow and allows us to remove the source copy in the integration test 
job. By making the distribution file build non-parallel, hopefully we can fix 
this race condition.
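
To illustrate the idea of building the distribution once and reusing it, here 
is a minimal sketch. It is not the actual Gradle change. It assumes a Unix 
host (it relies on fcntl), and the names get_shared_sdist, build, 
apache-beam.tar.gz and .build.lock are hypothetical, chosen only for this 
example. Each test process calls the helper; the first one to take the lock 
builds the tarball, and the others wait and then reuse it.

```python
import fcntl
import os
from pathlib import Path


def get_shared_sdist(dist_dir, build):
    """Return the shared tarball path, building it at most once.

    `build` is a hypothetical callable that writes the tarball to the
    path it is given (for example, a wrapper around the real sdist step).
    """
    dist_dir = Path(dist_dir)
    dist_dir.mkdir(parents=True, exist_ok=True)
    tarball = dist_dir / "apache-beam.tar.gz"  # hypothetical file name
    lock_path = dist_dir / ".build.lock"       # hypothetical lock file

    with open(lock_path, "w") as lock_file:
        # Exclusive lock: only one caller at a time can be in the build step.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if not tarball.exists():
                # Build into a temporary name, then rename, so readers
                # never observe a half-written tarball.
                tmp = dist_dir / (tarball.name + ".tmp")
                build(tmp)
                os.replace(tmp, tarball)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
    return tarball
```

With this shape, test suites for different Python versions would install from 
the returned tarball instead of each regenerating files in the shared source 
tree.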

> Python 3 test parallelization causes test flakiness due to 
> ModuleNotFoundError.
> -------------------------------------------------------------------------------
>
>                 Key: BEAM-7527
>                 URL: https://issues.apache.org/jira/browse/BEAM-7527
>             Project: Beam
>          Issue Type: Sub-task
>          Components: test-failures
>            Reporter: Valentyn Tymofieiev
>            Assignee: Mark Liu
>            Priority: Blocker
>
> I am seeing several errors in Python SDK Integration test suites, such as 
> Dataflow ValidatesRunner and Python PostCommit that fail due to one of the 
> autogenerated files not being found.
> For example:
> {noformat}
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/__init__.py:84:
>  UserWarning: Running the Apache Beam SDK on Python 3 is not yet fully 
> supported. You may encounter buggy behavior or missing features.
>   'Running the Apache Beam SDK on Python 3 is not yet fully supported. '
> Failure: ModuleNotFoundError (No module named 'beam_runner_api_pb2') ... 
> ERROR
> ======================================================================
> ERROR: Failure: ModuleNotFoundError (No module named 'beam_runner_api_pb2')
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/failure.py",
>  line 39, in runTest
>     raise self.exc_val.with_traceback(self.tb)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/loader.py",
>  line 418, in loadTestsFromName
>     addr.filename, addr.module)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/importer.py",
>  line 47, in importFromPath
>     return self.importFromDir(dir_path, fqname)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/importer.py",
>  line 94, in importFromDir
>     mod = load_module(part_fqname, fh, filename, desc)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/imp.py",
>  line 245, in load_module
>     return load_package(name, filename)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/imp.py",
>  line 217, in load_package
>     return _load(spec)
>   File "<frozen importlib._bootstrap>", line 684, in _load
>   File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
>   File "<frozen importlib._bootstrap_external>", line 678, in exec_module
>   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/__init__.py",
>  line 97, in <module>
>     from apache_beam import coders
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/coders/__init__.py",
>  line 19, in <module>
>     from apache_beam.coders.coders import *
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/coders/coders.py",
>  line 32, in <module>
>     from apache_beam.coders import coder_impl
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/coders/coder_impl.py",
>  line 44, in <module>
>     from apache_beam.utils import windowed_value
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/windowed_value.py",
>  line 34, in <module>
>     from apache_beam.utils.timestamp import MAX_TIMESTAMP
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/timestamp.py",
>  line 34, in <module>
>     from apache_beam.portability import common_urns
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/portability/common_urns.py",
>  line 25, in <module>
>     from apache_beam.portability.api import metrics_pb2
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/portability/api/metrics_pb2.py",
>  line 16, in <module>
>     import beam_runner_api_pb2 as beam__runner__api__pb2
> ModuleNotFoundError: No module named 'beam_runner_api_pb2'
> {noformat}
> {noformat}
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/__init__.py:84:
>  UserWarning: Running the Apache Beam SDK on Python 3 is not yet fully 
> supported. You may encounter buggy behavior or missing features.
>   'Running the Apache Beam SDK on Python 3 is not yet fully supported. '
> Failure: ModuleNotFoundError (No module named 'endpoints_pb2') ... 
> ERROR
> ======================================================================
> ERROR: Failure: ModuleNotFoundError (No module named 'endpoints_pb2')
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/failure.py",
>  line 39, in runTest
>     raise self.exc_val.with_traceback(self.tb)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/loader.py",
>  line 418, in loadTestsFromName
>     addr.filename, addr.module)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/importer.py",
>  line 47, in importFromPath
>     return self.importFromDir(dir_path, fqname)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/importer.py",
>  line 94, in importFromDir
>     mod = load_module(part_fqname, fh, filename, desc)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/imp.py",
>  line 245, in load_module
>     return load_package(name, filename)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/imp.py",
>  line 217, in load_package
>     return _load(spec)
>   File "<frozen importlib._bootstrap>", line 684, in _load
>   File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
>   File "<frozen importlib._bootstrap_external>", line 678, in exec_module
>   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/__init__.py",
>  line 97, in <module>
>     from apache_beam import coders
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/coders/__init__.py",
>  line 19, in <module>
>     from apache_beam.coders.coders import *
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/coders/coders.py",
>  line 32, in <module>
>     from apache_beam.coders import coder_impl
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/coders/coder_impl.py",
>  line 44, in <module>
>     from apache_beam.utils import windowed_value
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/windowed_value.py",
>  line 34, in <module>
>     from apache_beam.utils.timestamp import MAX_TIMESTAMP
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/timestamp.py",
>  line 34, in <module>
>     from apache_beam.portability import common_urns
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/portability/common_urns.py",
>  line 24, in <module>
>     from apache_beam.portability.api import beam_runner_api_pb2
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/portability/api/beam_runner_api_pb2.py",
>  line 16, in <module>
>     import endpoints_pb2 as endpoints__pb2
> ModuleNotFoundError: No module named 'endpoints_pb2'
> {noformat}
> The root cause is not clear; I suspect that it may be related to the way we 
> parallelize execution of Python test suites for 2.7, 3.5, 3.6, and 3.7.
> cc: [~altay] [~markflyhigh] [~Juta] [~frederik]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
