The scheduler_zombie_task_threshold setting is the number of seconds a job can
go without a heartbeat before it is killed as a zombie and rescheduled.

Reference -
https://github.com/apache/incubator-airflow/blob/48dab65adc69cd924fd918c6a2934006971fb25d/airflow/config_templates/default_airflow.cfg#L381-L384
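
For example, it lives under the [scheduler] section of airflow.cfg; the value
below is the default (300 seconds) and is shown only as an illustration:

    [scheduler]
    # Local task jobs periodically heartbeat to the DB. If a job has not
    # heartbeated in this many seconds, the scheduler marks the associated
    # task instance as failed and re-schedules the task.
    scheduler_zombie_task_threshold = 300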

*Taylor Edmiston*


On Mon, Mar 19, 2018 at 12:59 PM, Matthew Housley <[email protected]> wrote:

> Hi Twinkle,
> Airflow 1.7 reached end of life roughly a year ago.
> https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Release+Planning+and+Supported+Release+Lifetime
>
> Could you do some testing to see if you can reproduce this issue with
> Airflow 1.9?
> best,
> Matt
>
> On Fri, Mar 16, 2018 at 4:56 AM twinkle <[email protected]> wrote:
>
> > Hi,
> >
> > I am using Airflow v1.7.1.3. Some of the tasks in the pipeline get killed
> > as zombies.
> >
> > A pattern that has come out is that it happens in those jobs which are
> > downloading data from a MySQL DB.
> >
> > I am doing the following steps in those tasks (a minimal sketch follows
> > the list):
> >
> > 1. Get a connection from the DB, using a hook
> > 2. Execute the query
> > 3. Use a csv writer to write the results in CSV format
> > 4. Flush the file
> > 5. Close the query cursor and then the connection
> > 6. Run the GC (as otherwise we get memory issues)
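> >
> > A minimal sketch of those steps (the connection id, query, and output path
> > below are placeholders, and the hook import path may differ between Airflow
> > versions):
> >
> >     import csv
> >     import gc
> >
> >     from airflow.hooks.mysql_hook import MySqlHook
> >
> >     def dump_table_to_csv():
> >         # 1. Get a connection from the DB, using a hook
> >         hook = MySqlHook(mysql_conn_id='mysql_default')  # placeholder conn id
> >         conn = hook.get_conn()
> >         cursor = conn.cursor()
> >         try:
> >             # 2. Execute the query
> >             cursor.execute('SELECT * FROM my_table')  # placeholder query
> >
> >             # 3. / 4. Write the results out as CSV and flush the file
> >             with open('/tmp/my_table.csv', 'w') as f:  # placeholder path
> >                 writer = csv.writer(f)
> >                 for row in cursor:
> >                     writer.writerow(row)
> >                 f.flush()
> >         finally:
> >             # 5. Close the query cursor and then the connection
> >             cursor.close()
> >             conn.close()
> >
> >         # 6. Run the GC to release memory sooner
> >         gc.collect()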
> >
> >
> > Sometimes the task is successful and sometimes it is not. This behaviour
> > requires some manual monitoring, which is not desirable.
> >
> > What can I do to make sure that tasks do not get killed as zombies?
> >
> > Also, I found that there is a property, scheduler_zombie_task_threshold. If
> > I increase it, then what areas does that impact?
> >
> > Regards,
> > Twinkle
> >
>
