[
https://issues.apache.org/jira/browse/BEAM-8651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976938#comment-16976938
]
Valentyn Tymofieiev commented on BEAM-8651:
-------------------------------------------
It is possible that the error is caused by https://bugs.python.org/issue35943,
which is fixed on the CPython master branch but not on the 3.7 branch. This
would explain why I can still reproduce the error on Python 3.7.5rc1. Also, as
per https://bugs.python.org/issue34572, pickling fixes will not be backported
to Python 3.5 or 3.6.
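For users on affected versions, the workaround suggested in the issue
description quoted below is to import all required modules before any
unpickling threads start. A minimal sketch of that idea follows. It is not
Beam code: the names MODULES_TO_PRELOAD, preload_modules and
unpickle_concurrently are hypothetical, the standard-library modules stand in
for user modules, and how to hook the preload step into SDK worker startup is
left open here.

```python
import importlib
import pickle
import threading

# Hypothetical list: in a real pipeline these would be the user modules whose
# classes appear in pickled DoFns. Standard-library modules are stand-ins.
MODULES_TO_PRELOAD = ["json", "fractions"]


def preload_modules(names):
    """Import every module on the calling thread, before unpickling begins."""
    return {name: importlib.import_module(name) for name in names}


def unpickle_concurrently(payload, num_threads=8):
    """Unpickle the same payload on several threads and collect the results."""
    results = [None] * num_threads

    def work(index):
        # Each thread resolves the pickled class via find_class(); the module
        # is already fully imported, so no import happens inside the unpickler.
        results[index] = pickle.loads(payload)

    threads = [threading.Thread(target=work, args=(i,)) for i in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Step 1: eager imports on the main thread (assumed to run at worker startup).
preloaded = preload_modules(MODULES_TO_PRELOAD)

# Step 2: only afterwards let worker threads unpickle concurrently.
payload = pickle.dumps(preloaded["fractions"].Fraction(3, 4))
results = unpickle_concurrently(payload)
```

The point of the ordering is that the unpickler threads never trigger an
import themselves, which is the only path the quoted description says is
affected.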
> Python 3 portable pipelines sometimes fail with errors in
> StockUnpickler.find_class()
> -------------------------------------------------------------------------------------
>
> Key: BEAM-8651
> URL: https://issues.apache.org/jira/browse/BEAM-8651
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Valentyn Tymofieiev
> Priority: Major
> Attachments: beam8651.py
>
>
> Several Beam users [1,2] reported an error which happens on Python 3 in
> StockUnpickler.find_class.
> So far I've seen reports of the error on Python 3.5, 3.6, and 3.7.1, on the
> Flink and Dataflow runners. On the Dataflow runner I have so far seen this
> only in streaming pipelines, which use the portable SDK worker.
> Typical stack trace:
> {noformat}
> File
> "python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
> line 1148, in _create_pardo_operation
> dofn_data = pickler.loads(serialized_fn)
>
> File "python3.5/site-packages/apache_beam/internal/pickler.py", line 265,
> in loads
> return dill.loads(s)
>
> File "python3.5/site-packages/dill/_dill.py", line 317, in loads
>
> return load(file, ignore)
>
> File "python3.5/site-packages/dill/_dill.py", line 305, in load
>
> obj = pik.load()
>
> File "python3.5/site-packages/dill/_dill.py", line 474, in find_class
>
> return StockUnpickler.find_class(self, module, name)
>
> AttributeError: Can't get attribute 'ClassName' on <module 'ModuleName' from
> 'python3.5/site-packages/filename.py'>
> {noformat}
> According to Guenther from [1]:
> {quote}
> This looks exactly like a race condition that we've encountered on Python
> 3.7.1: There's a bug in some older 3.7.x releases that breaks the
> thread-safety of the unpickler, as concurrent unpickle threads can access a
> module before it has been fully imported. See
> https://bugs.python.org/issue34572 for more information.
> The traceback shows a Python 3.6 venv so this could be a different issue
> (the unpickle bug was introduced in version 3.7). If it's the same bug then
> upgrading to Python 3.7.3 or higher should fix that issue. One potential
> workaround is to ensure that all of the modules get imported during the
> initialization of the sdk_worker, as this bug only affects imports done by
> the unpickler.
> {quote}
> Opening this for visibility. Current open questions are:
> 1. Find a minimal example to reproduce this issue.
> 2. Figure out whether users are still affected by this issue on Python 3.7.3.
> 3. Communicate workarounds to Python 3.5 and 3.6 users affected by this.
> [1]
> https://lists.apache.org/thread.html/5581ddfcf6d2ae10d25b834b8a61ebee265ffbcf650c6ec8d1e69408@%3Cdev.beam.apache.org%3E