[ https://issues.apache.org/jira/browse/AIRFLOW-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095567#comment-17095567 ]

ASF GitHub Bot commented on AIRFLOW-6529:
-----------------------------------------

ashb commented on pull request #7128:
URL: https://github.com/apache/airflow/pull/7128#issuecomment-621300054


   @jhtimmins have you tried running with this PR? Is it closer to py3.8 
support with it?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


> Serialization error occurs when the scheduler tries to run on macOS.
> --------------------------------------------------------------------
>
>                 Key: AIRFLOW-6529
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6529
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.10.8
>         Environment: macOS
> Python 3.8
> multiprocessing with spawn mode
>            Reporter: Kousuke Saruta
>            Assignee: Kousuke Saruta
>            Priority: Major
>
> When we try to run the scheduler on macOS, we get a serialization error
> like the following.
> {code}
>   ____________       _____________
>  ____    |__( )_________  __/__  /________      __
> ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
> ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
>  _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
> [2020-01-10 19:54:41,974] {executor_loader.py:59} INFO - Using executor SequentialExecutor
> [2020-01-10 19:54:41,983] {scheduler_job.py:1462} INFO - Starting the scheduler
> [2020-01-10 19:54:41,984] {scheduler_job.py:1469} INFO - Processing each file at most -1 times
> [2020-01-10 19:54:41,984] {scheduler_job.py:1472} INFO - Searching for files in /Users/sarutak/airflow/dags
> [2020-01-10 19:54:42,025] {scheduler_job.py:1474} INFO - There are 27 files in /Users/sarutak/airflow/dags
> [2020-01-10 19:54:42,025] {scheduler_job.py:1527} INFO - Resetting orphaned tasks for active dag runs
> [2020-01-10 19:54:42,059] {scheduler_job.py:1500} ERROR - Exception when executing execute_helper
> Traceback (most recent call last):
>   File "/Users/sarutak/work/oss/airflow-env/master-python3.8.1/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1498, in _execute
>     self._execute_helper()
>   File "/Users/sarutak/work/oss/airflow-env/master-python3.8.1/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1531, in _execute_helper
>     self.processor_agent.start()
>   File "/Users/sarutak/work/oss/airflow-env/master-python3.8.1/lib/python3.8/site-packages/airflow/utils/dag_processing.py", line 348, in start
>     self._process.start()
>   File "/opt/python/3.8.1/lib/python3.8/multiprocessing/process.py", line 121, in start
>     self._popen = self._Popen(self)
>   File "/opt/python/3.8.1/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
>     return _default_context.get_context().Process._Popen(process_obj)
>   File "/opt/python/3.8.1/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
>     return Popen(process_obj)
>   File "/opt/python/3.8.1/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
>     super().__init__(process_obj)
>   File "/opt/python/3.8.1/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
>     self._launch(process_obj)
>   File "/opt/python/3.8.1/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
>     reduction.dump(process_obj, fp)
>   File "/opt/python/3.8.1/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
>     ForkingPickler(file, protocol).dump(obj)
> AttributeError: Can't pickle local object 'SchedulerJob._execute.<locals>.processor_factory'
> {code}
> The reason is that the scheduler starts its subprocesses via multiprocessing
> in spawn mode, and spawn mode pickles the objects it hands to the child
> process. In this case, the inner method `processor_factory` cannot be
> pickled. Note that as of Python 3.8, spawn is the default start method on macOS.
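The failure above can be reproduced without Airflow: pickle cannot serialize a function defined inside another function, which is exactly what spawn mode attempts when it dumps the process object. A minimal sketch, where `make_processor_factory` and `processor_factory` are hypothetical stand-ins for the inner method named in the traceback, not Airflow's actual code:

```python
import pickle

def make_processor_factory():
    # A nested (local) function, standing in for the processor_factory
    # defined inside SchedulerJob._execute.
    def processor_factory():
        return "processed"
    return processor_factory

local_fn = make_processor_factory()

# Spawn mode serializes the Process target with a pickler (see
# reduction.dump in the traceback), so a local object fails like this:
try:
    pickle.dumps(local_fn)
except AttributeError as exc:
    print(f"pickling failed: {exc}")

# A module-level function is pickled by reference, which is why hoisting
# the factory to module scope (or using the fork start method) avoids
# the error.
def module_level_factory():
    return "processed"

pickle.dumps(module_level_factory)  # succeeds
```

This also explains why the bug surfaces only on macOS with Python 3.8+: under fork (the old default) the child inherits the parent's memory, so nothing needs to be pickled at all.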



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
