nathan warshauer created AIRFLOW-1868:
-----------------------------------------
Summary: Packaged Dags not added to dag table, unable to execute
tasks
Key: AIRFLOW-1868
URL: https://issues.apache.org/jira/browse/AIRFLOW-1868
Project: Apache Airflow
Issue Type: Bug
Environment: airflow 1.8.2, celery, rabbitMQ, mySQL, aws
Reporter: nathan warshauer
Attachments: Screen Shot 2017-11-29 at 2.31.02 PM.png, Screen Shot
2017-11-29 at 4.40.39 PM.png, Screen Shot 2017-11-29 at 4.42.39 PM.png
.zip files in the dag directory do not appear to be added to the dag table in
the airflow database. When a .zip file containing executable .py files is
placed in the dags folder, the dag_id should be added to the dag table, and
airflow should allow the dag to be unpaused and run through the web server.
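As a quick sanity check on the package itself, the following minimal sketch
lists the .py files stored at the root of a .zip archive, which is where
packaged dag definitions are expected to sit. The helper name
top_level_py_modules is hypothetical; this is not airflow's own loader, and it
only reads the archive listing without importing anything from it.

```python
import zipfile


def top_level_py_modules(zip_path):
    """Return sorted names of .py files stored at the root of the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist()
            # Entries inside sub-directories contain a "/" in their name.
            if name.endswith(".py") and "/" not in name
        )
```

If the list comes back empty for a given archive, the dag files are probably
nested inside a sub-directory of the zip rather than at its root.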
Running SELECT distinct dag.dag_id AS dag_dag_id FROM dag confirms that the dag
does not exist in the dag table. It still shows up on the UI, with the warning
message "This Dag seems to be existing only locally". However, the dag exists
in all 3 dag directories (master and two workers), and the airflow.cfg has
donot_pickle = True.
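The same check can be scripted. The sketch below runs the equivalent query and
reports which expected dag_ids have no row in the dag table. It uses an
in-memory sqlite3 database with a one-column dag table as a stand-in (an
assumption for illustration; the real table has more columns and lives in
mySQL in this setup), and the helper name missing_dag_ids and the id
some_regular_dag are hypothetical.

```python
import sqlite3


def missing_dag_ids(conn, expected_dag_ids):
    """Return the expected dag_ids that have no row in the dag table."""
    rows = conn.execute("SELECT DISTINCT dag_id FROM dag").fetchall()
    registered = {row[0] for row in rows}
    return sorted(set(expected_dag_ids) - registered)


# Stand-in database: a reduced dag table held in memory (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO dag (dag_id) VALUES ('some_regular_dag')")

# The packaged dag is expected but was never registered in the table.
print(missing_dag_ids(conn, ["some_regular_dag", "MY-DAG-NAME"]))
```

With the stand-in data above, only MY-DAG-NAME is reported as missing, which
mirrors the behavior described for the packaged dag.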
When the dag is triggered manually via airflow trigger_dag <dag_id>, the run
appears on the web server but does not execute any tasks. When I go to the task
and click start through the UI, the task executes successfully and shows the
attached state upon completion. If I do not do this, the tasks never enter the
queue and the run sits idle, as the 3rd attached image shows.
Basically, the dag CAN run manually from the zip, BUT the scheduler and the
underlying database tables do not appear to handle packaged dags correctly.
Please let me know if I can provide any additional information regarding this
issue, or if you all have any leads that I can check out for resolving this.
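from datetime import datetime, timedelta

from airflow import DAG

# default_args must be defined before the DAG that references it.
default_args = {
    'depends_on_past': False,
    'email': ['[email protected]'],
    'email_on_failure': True,
    'email_on_retry': False,
    'owner': 'airflow',
    'provide_context': True,
    'retries': 0,
    'retry_delay': timedelta(minutes=5),
    'start_date': datetime(2017, 11, 28)
}

# Runs every 5 minutes, one active run at a time, each run timing out
# after 4 minutes 30 seconds.
dag = DAG('MY-DAG-NAME',
          default_args=default_args,
          schedule_interval='*/5 * * * *',
          max_active_runs=1,
          dagrun_timeout=timedelta(minutes=4, seconds=30))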