[ 
https://issues.apache.org/jira/browse/BEAM-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902505#comment-16902505
 ] 

Valentyn Tymofieiev edited comment on BEAM-6158 at 8/7/19 10:23 PM:
--------------------------------------------------------------------

To clarify, this error is still happening. I updated the title and description 
to reflect this. 

https://github.com/apache/beam/pull/7710 removed this error from the wordcount 
example, but it still occurs in other examples and may affect users 
migrating to Python 3 with currently released Beam SDKs. 

Removing the superclass constructor call works around the issue. In particular, 
calling the superclass constructor for a DoFn class [1] is not critical in the 
current Beam SDK. Calling the DoFn constructor triggers object initialization 
in [2], but there is also lazy initialization in place [3]. 
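
To illustrate why skipping the constructor call can be harmless when lazy 
initialization exists, here is a minimal sketch. The classes below 
(HintsHolder, WithSuperCall, WithoutSuperCall) and the _hints attribute are 
hypothetical stand-ins, not Beam's actual code in [2] and [3]; they only show 
the general eager-plus-lazy pattern.

```python
class HintsHolder(object):
  """Hypothetical stand-in for a base class; NOT Beam's actual code."""

  def __init__(self):
    # Eager path: runs only if a subclass calls the superclass constructor.
    self._hints = {}

  def get_hints(self):
    # Lazy path: create the attribute on first access if __init__ was skipped.
    try:
      return self._hints
    except AttributeError:
      self._hints = {}
      return self._hints


class WithSuperCall(HintsHolder):
  def __init__(self):
    super(WithSuperCall, self).__init__()
    self.name = 'with'


class WithoutSuperCall(HintsHolder):
  def __init__(self):
    # Workaround shape: no superclass constructor call.
    self.name = 'without'


eager = WithSuperCall()
lazy = WithoutSuperCall()

# Record whether the attribute existed before the first accessor call.
eager_had_attr = '_hints' in vars(eager)
lazy_had_attr = '_hints' in vars(lazy)

# Both objects end up with usable state after the accessor runs.
eager_hints = eager.get_hints()
lazy_hints = lazy.get_hints()
```

In this sketch the subclass that omits the superclass call still gets its 
state on first access, which is the property the workaround relies on. 
Whether every code path in the SDK goes through such an accessor is not 
something this sketch establishes.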

Nevertheless, this issue may cause friction in other use cases, and we should 
address it in a future release.

[1] 
https://github.com/apache/beam/blob/0325c360bef17a6673e2d43051e59174b8e5ccc9/sdks/python/apache_beam/transforms/core.py#L422
[2] 
https://github.com/apache/beam/blob/0325c360bef17a6673e2d43051e59174b8e5ccc9/sdks/python/apache_beam/typehints/decorators.py#L201
[3] 
https://github.com/apache/beam/blob/0325c360bef17a6673e2d43051e59174b8e5ccc9/sdks/python/apache_beam/typehints/decorators.py#L204


> Using --save_main_session fails on Python 3 when main module has superclass 
> constructor calls.
> ----------------------------------------------------------------------------------------------
>
>                 Key: BEAM-6158
>                 URL: https://issues.apache.org/jira/browse/BEAM-6158
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-harness
>            Reporter: Mark Liu
>            Assignee: Valentyn Tymofieiev
>            Priority: Major
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> A typical manifestation of this failure, which can be observed on several 
> Beam examples:
> {noformat}
> Traceback (most recent call last):
>   File "/usr/lib/python3.5/runpy.py", line 193, in _run_module_as_main
>     "__main__", mod_spec)
>   File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
>     exec(code, run_globals)
>   File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py", line 164, in <module>
>     run()
>   File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py", line 158, in run
>     | 'WriteUserScoreSums' >> beam.io.WriteToText(args.output))
>   File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/pipeline.py", line 426, in __exit__
>     self.run().wait_until_finish()
>   File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1338, in wait_until_finish
>     (self.state, getattr(self._runner, 'last_error_msg', None)), self)
> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 773, in run
>     self._load_main_session(self.local_staging_directory)
>   File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
>     pickler.load_session(session_file)
>   File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", line 280, in load_session
>     return dill.load_session(file_path)
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 410, in load_session
>     module = unpickler.load()
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 474, in find_class
>     return StockUnpickler.find_class(self, module, name)
> AttributeError: Can't get attribute 'ParseGameEventFn' on <module 'dataflow_worker.start' from '/usr/local/lib/python3.5/site-packages/dataflow_worker/start.py'>
> {noformat}
>  
> Note that the example has the following code [1]:
> {code:python}
> class ParseGameEventFn(beam.DoFn):
>   def __init__(self):
>     super(ParseGameEventFn, self).__init__()
> {code}
> [1] https://github.com/apache/beam/blob/0325c360bef17a6673e2d43051e59174b8e5ccc9/sdks/python/apache_beam/examples/complete/game/user_score.py#L81
> +cc: [~tvalentyn] [~robertwb] [~altay]
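
The AttributeError in the traceback above is raised from the standard 
unpickler's find_class, which resolves a class by module name and attribute 
name. The sketch below reproduces only that general kind of failure with the 
standard pickle module. It is not a reproduction of this bug and says nothing 
about why dill records the reference the way it does; the module name 
fake_main and the class ParseFn are hypothetical.

```python
import pickle
import sys
import types

# Hypothetical module standing in for the pipeline's main module.
mod = types.ModuleType('fake_main')
sys.modules['fake_main'] = mod

# Build the class with type() so that it reports 'fake_main' as its module
# and has a plain top-level qualified name.
ParseFn = type('ParseFn', (object,), {'__module__': 'fake_main'})
mod.ParseFn = ParseFn

# pickle stores the instance's class by reference: (module name, class name).
payload = pickle.dumps(ParseFn())

# Simulate a process where the module is importable but lacks the class.
del mod.ParseFn

try:
  pickle.loads(payload)
  error_message = None
except AttributeError as e:
  # find_class could not resolve the class on the module it was told to use.
  error_message = str(e)

print(error_message)
```

The point of the sketch is that an instance pickled by reference can only be 
loaded where the named module actually defines the class, which matches the 
shape of the worker-side error reported here.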


