[
https://issues.apache.org/jira/browse/AIRFLOW-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012711#comment-17012711
]
ASF GitHub Bot commented on AIRFLOW-6529:
-----------------------------------------
sarutak commented on pull request #7128: [AIRFLOW-6529] Serialization error occurs when the scheduler tries to run on macOS.
URL: https://github.com/apache/airflow/pull/7128
When we try to run the scheduler on macOS with Python 3.8, it fails with a
serialization error like the following.
```
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
[2020-01-10 19:54:41,974] {executor_loader.py:59} INFO - Using executor SequentialExecutor
[2020-01-10 19:54:41,983] {scheduler_job.py:1462} INFO - Starting the scheduler
[2020-01-10 19:54:41,984] {scheduler_job.py:1469} INFO - Processing each file at most -1 times
[2020-01-10 19:54:41,984] {scheduler_job.py:1472} INFO - Searching for files in /Users/sarutak/airflow/dags
[2020-01-10 19:54:42,025] {scheduler_job.py:1474} INFO - There are 27 files in /Users/sarutak/airflow/dags
[2020-01-10 19:54:42,025] {scheduler_job.py:1527} INFO - Resetting orphaned tasks for active dag runs
[2020-01-10 19:54:42,059] {scheduler_job.py:1500} ERROR - Exception when executing execute_helper
Traceback (most recent call last):
  File "/Users/sarutak/work/oss/airflow-env/master-python3.8.1/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1498, in _execute
    self._execute_helper()
  File "/Users/sarutak/work/oss/airflow-env/master-python3.8.1/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1531, in _execute_helper
    self.processor_agent.start()
  File "/Users/sarutak/work/oss/airflow-env/master-python3.8.1/lib/python3.8/site-packages/airflow/utils/dag_processing.py", line 348, in start
    self._process.start()
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/opt/python/3.8.1/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'SchedulerJob._execute.<locals>.processor_factory'
```
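The cause is that the scheduler starts its subprocesses using multiprocessing
in spawn mode. Spawn mode pickles the process object, and the locally defined
`processor_factory` function cannot be pickled.
As of Python 3.8, spawn is the default start method on macOS.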
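
The failure can be reproduced without Airflow. The following is a minimal sketch, not the change made in this PR: it only shows why a function defined inside another function cannot be pickled, and one common way around it (a `functools.partial` over a module-level function). The names `process_file`, `make_local_factory`, `make_picklable_factory` and `can_pickle` are hypothetical and do not come from the Airflow code base. Depending on the Python version, the pickling failure may surface as `AttributeError` or `pickle.PicklingError`, so the sketch catches both.

```python
import functools
import pickle


def process_file(file_path, pickle_dags):
    """Module-level function: pickle can reference it by its qualified name."""
    return (file_path, pickle_dags)


def make_local_factory(pickle_dags):
    # Mirrors the failing pattern: a function defined inside another function.
    # Its qualified name contains "<locals>", so pickle cannot look it up.
    def processor_factory(file_path):
        return process_file(file_path, pickle_dags)

    return processor_factory


def make_picklable_factory(pickle_dags):
    # A partial over a module-level function carries the bound argument
    # and can be pickled, which is what spawn mode requires.
    return functools.partial(process_file, pickle_dags=pickle_dags)


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


if __name__ == "__main__":
    print(can_pickle(make_local_factory(False)))      # local function
    print(can_pickle(make_picklable_factory(False)))  # partial
```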
---
Issue link: WILL BE INSERTED BY
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
- [x] Description above provides context of the change
- [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN =
JIRA ID<sup>*</sup>
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Commits follow "[How to write a good git commit
message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
<sup>*</sup> For document-only changes commit message can start with
`[AIRFLOW-XXXX]`.
---
In case of fundamental code change, Airflow Improvement Proposal
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party
License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
Read the [Pull Request
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
for more information.
> Serialization error occurs when the scheduler tries to run on macOS.
> --------------------------------------------------------------------
>
> Key: AIRFLOW-6529
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6529
> Project: Apache Airflow
> Issue Type: Bug
> Components: scheduler
> Affects Versions: 1.10.8
> Environment: macOS
> Python 3.8
> multiprocessing with spawn mode
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Major