André Pinto created AIRFLOW-2067:
------------------------------------
Summary: Scheduler abruptly failing tasks
Key: AIRFLOW-2067
URL: https://issues.apache.org/jira/browse/AIRFLOW-2067
Project: Apache Airflow
Issue Type: Bug
Components: scheduler, worker
Affects Versions: 1.9.0
Reporter: André Pinto
We have a massive DAG with hundreds of tasks (responsible for orchestrating the
daily conversion of all our data sets from JSON into Parquet). Since we
updated to the latest version (1.9.0), we have occasionally been getting
apparently random failures on some of these tasks.
When a task fails this way, its logs are not uploaded to S3, although I can
still find them on the local file system. They are not very useful, though. Example:
root@efe05e677183:~/airflow/logs/emr_json_to_parquet_rescheduled/conversion_delete_output_prod.consolidated_user_search_deduper.all_user_searches_dedup.user_search.Search/2018-01-25T06:00:00# cat 1.log
[2018-01-26 06:01:18,118] {cli.py:374} INFO - Running on host efe05e677183
[2018-01-26 06:01:18,253] {models.py:1197} INFO - Dependencies all met for <TaskInstance: emr_json_to_parquet_rescheduled.conversion_delete_output_prod.consolidated_user_search_deduper.all_user_searches_dedup.user_search.Search 2018-01-25 06:00:00 [queued]>
[2018-01-26 06:01:19,202] {models.py:1197} INFO - Dependencies all met for <TaskInstance: emr_json_to_parquet_rescheduled.conversion_delete_output_prod.consolidated_user_search_deduper.all_user_searches_dedup.user_search.Search 2018-01-25 06:00:00 [queued]>
[2018-01-26 06:01:19,202] {models.py:1407} INFO -
--------------------------------------------------------------------------------
Starting attempt 1 of 3
--------------------------------------------------------------------------------
All of the failed tasks' logs look like this, as if the process had been killed
right after starting, before it had time to upload the log file to S3.
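To gauge how many task attempts are affected, the local log directory can be scanned for files that stop right after the "Starting attempt" banner, as in the excerpt above. The following is only a minimal sketch, assuming Python 3 and the standard library. The helper names `ends_at_attempt_banner` and `find_truncated_logs` are hypothetical, not part of Airflow, and the banner pattern is taken from the excerpt and may differ in other versions.

```python
import os
import re

# Matches the "Starting attempt N of M" banner line seen in the excerpt above.
ATTEMPT_RE = re.compile(r"Starting attempt \d+ of \d+")
# Matches a line made up only of dashes (the ruler lines around the banner).
RULER_RE = re.compile(r"^-+$")


def ends_at_attempt_banner(text):
    """Return True if nothing but ruler lines follows the last attempt banner."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Ignore trailing ruler lines so the banner text itself is the last line.
    while lines and RULER_RE.match(lines[-1]):
        lines.pop()
    return bool(lines) and ATTEMPT_RE.search(lines[-1]) is not None


def find_truncated_logs(log_root):
    """Yield paths of *.log files under log_root that stop at the banner."""
    for dirpath, _dirnames, filenames in os.walk(log_root):
        for name in sorted(filenames):
            if not name.endswith(".log"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as handle:
                if ends_at_attempt_banner(handle.read()):
                    yield path


if __name__ == "__main__":
    # Assumed location, based on the ~/airflow/logs path in the excerpt above.
    for log_path in find_truncated_logs(os.path.expanduser("~/airflow/logs")):
        print(log_path)
```

A log that ends at the banner only shows that the attempt wrote nothing further to that file. It does not by itself prove the process was killed.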
Our Airflow instance is running in LocalExecutor mode.
Other smaller DAGs do not seem to experience this problem.