Hi,

I am using Airflow v1.7.1.3. Some of the tasks in the pipeline get killed
as zombies.

A pattern has emerged: it happens in the jobs that download data from a
MySQL DB.

These tasks perform the following steps (a rough sketch follows the list):

1. Get a connection to the DB, using a Hook.
2. Execute the query.
3. Use the csv writer to write the results in CSV format.
4. Flush the file.
5. Close the query cursor and then the connection.
6. Run the garbage collector (otherwise we run into memory issues).
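
In case it helps, this is roughly what each task body looks like (the
conn id 'mysql_default', the query and the output path are placeholders,
not our real values):

    # Rough sketch of the task body described in the steps above.
    import csv
    import gc

    from airflow.hooks.mysql_hook import MySqlHook

    def dump_query_to_csv(sql, output_path, conn_id='mysql_default'):
        hook = MySqlHook(mysql_conn_id=conn_id)   # 1. get connection via Hook
        conn = hook.get_conn()
        cursor = conn.cursor()
        try:
            cursor.execute(sql)                   # 2. execute the query
            with open(output_path, 'w') as f:
                writer = csv.writer(f)            # 3. write rows as CSV
                for row in cursor:
                    writer.writerow(row)
                f.flush()                         # 4. flush the file
        finally:
            cursor.close()                        # 5. close cursor, then connection
            conn.close()
            gc.collect()                          # 6. force a GC pass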


Sometimes the task succeeds and sometimes it does not. This behaviour
requires manual monitoring, which is not desirable.

What can I do to make sure that tasks do not get killed as zombies?

Also, I found that there is a property, scheduler_zombie_task_threshold.
If I increase it, what areas will be impacted?
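
For reference, this is the setting I found in the [scheduler] section of
airflow.cfg (300 seconds is, I believe, the default; please correct me if
I am wrong):

    [scheduler]
    # Seconds without a heartbeat after which the scheduler treats a
    # running task instance as a zombie (as I understand it).
    scheduler_zombie_task_threshold = 300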

Regards,
Twinkle
