[ 
https://issues.apache.org/jira/browse/AIRFLOW-5035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896028#comment-16896028
 ] 

ASF subversion and git services commented on AIRFLOW-5035:
----------------------------------------------------------

Commit f82f8998c72e28986052540c6318d4210390bd1d in airflow's branch 
refs/heads/v1-10-stable from Ash Berlin-Taylor
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=f82f899 ]

[AIRFLOW-5035] Replace multiprocessing.Manager with a golang-"channel" style 
(#5615)

Under heavy load (and exacerbated by having `--run-duration 600`) we
found that the multiprocessing.Manager processes could be left alive.
They would consume no CPU as they were just polling on a socket, but
they would consume memory.

Instead of trying to track down all the places we might have leaked a
process, I have removed Managers from the scheduler entirely and
rewritten the multiprocessing the way I would with golang channels:
passing objects/messages over a single channel and shutting it down
when done (so we don't need a "Done" signal), and we can .poll() on the
channel to see if there is anything to receive.
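
As a rough illustration of the channel-style pattern described above
(not the code from the commit itself), the sketch below uses a one-way
multiprocessing.Pipe. The names parse_files and collect_results and the
message shape are hypothetical, and the "fork" start method is assumed
(POSIX only) to keep the example short.

```python
import multiprocessing


def parse_files(conn, paths):
    """Child side: send one message per path, then close the channel.

    Closing the connection is the only "done" signal the parent needs.
    """
    try:
        for path in paths:
            # Hypothetical message shape, for illustration only.
            conn.send({"path": path, "ok": True})
    finally:
        conn.close()


def collect_results(paths):
    """Parent side: poll the channel and receive until the child closes it."""
    # Assumption: "fork" start method (POSIX only).
    ctx = multiprocessing.get_context("fork")
    # duplex=False gives a (receive-only, send-only) pair of connections.
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=parse_files, args=(child_conn, paths))
    proc.start()
    # Drop the parent's copy of the send end; otherwise the channel would
    # never report end-of-file after the child closes its own copy.
    child_conn.close()

    results = []
    while True:
        if not parent_conn.poll(0.1):
            continue  # nothing to receive yet; other work could go here
        try:
            results.append(parent_conn.recv())
        except EOFError:
            break  # the child closed the channel: no more messages
    proc.join()
    parent_conn.close()
    return results
```

No Manager process is involved here, so the only extra process that
exists is the worker, and it exits as soon as it has closed its end of
the channel.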

(cherry picked from commit e78cdefee364a4fdab1190f1e7491d5b28eb4417)


> Dag parsing process can leave orphan processes under heavy load
> ---------------------------------------------------------------
>
>                 Key: AIRFLOW-5035
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5035
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.10.4
>            Reporter: Ash Berlin-Taylor
>            Assignee: Ash Berlin-Taylor
>            Priority: Blocker
>             Fix For: 1.10.4
>
>
> As reported by James Meickle on the mailing list, under certain cases it is 
> possible that the multiprocessing.Manager process can end up orphaned when 
> the scheduler shuts down.
> This relates to a change newly merged in 1.10.4 (i.e. 1.10.3 wasn't affected).
> The "orphan" process problem is massively exacerbated by having a {{--run-duration 
> 600}}, but we should try and fix this if we can.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
