[jira] [Commented] (AIRFLOW-3369) Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)
[ https://issues.apache.org/jira/browse/AIRFLOW-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711580#comment-16711580 ]

Andrew Harmon commented on AIRFLOW-3369:

FYI, tested in 1.10.2 and the issue still exists.

> Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)
>
> Key: AIRFLOW-3369
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3369
> Project: Apache Airflow
> Issue Type: Bug
> Affects Versions: 1.10.0
> Reporter: Andrew Harmon
> Priority: Major
> Attachments: image.png
>
> If you create a DAG with catchup=False, when it is un-paused, it creates 2
> dag runs. One for the most recent scheduled interval (expected) and one for
> the interval before that (unexpected).
> *Sample DAG*
> {code:java}
> from airflow import DAG
> from datetime import datetime
> from airflow.operators.dummy_operator import DummyOperator
> dag = DAG(
>     dag_id='DummyTest',
>     start_date=datetime(2018,1,1),
>     catchup=False
> )
> do = DummyOperator(
>     task_id='dummy_task',
>     dag=dag
> )
> {code}
> *Result:*
> 2 DAG runs are created: 2018-11-18 and 2018-11-17
> *Expected Result:*
> Only 1 DAG run should have been created (2018-11-18)

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-3369) Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)
[ https://issues.apache.org/jira/browse/AIRFLOW-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699168#comment-16699168 ]

Andrew Harmon commented on AIRFLOW-3369:

I left a DAG paused for a few days, then un-paused it today, 11/26. It did indeed create 2 DAG runs: 11/25 and 11/24. So the issue seems to appear any time you un-pause a DAG after 2+ schedule intervals have passed.
[jira] [Commented] (AIRFLOW-3369) Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)
[ https://issues.apache.org/jira/browse/AIRFLOW-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693165#comment-16693165 ]

Andrew Harmon commented on AIRFLOW-3369:

I’m not sure, haven’t tried it yet. I may try to set my schedule to a shorter interval so I can let a few pass and then un-pause again.
[jira] [Commented] (AIRFLOW-3369) Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)
[ https://issues.apache.org/jira/browse/AIRFLOW-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692131#comment-16692131 ]

Andrew Harmon commented on AIRFLOW-3369:

A workaround until this gets fixed: if you set the start_date equal to the most recent interval, it will only schedule 1 DAG run. For example, if you deploy your DAG on 11/19 and it should run daily, set your start_date to 11/18. It will not schedule the 11/17 DAG run in this scenario.
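The workaround in the comment above amounts to pinning start_date to the start of the most recent schedule interval. The following is a minimal sketch for a daily schedule, assuming intervals that begin at midnight. The helper name `most_recent_daily_start` is hypothetical and not part of Airflow, and the sketch only reproduces the date arithmetic from the comment; it does not verify scheduler behavior.

```python
from datetime import date, datetime, timedelta


def most_recent_daily_start(deploy_day: date) -> datetime:
    """Return midnight of the day before deploy_day.

    Hypothetical helper (not an Airflow API). Assumes a daily schedule
    whose intervals start at midnight.
    """
    previous_day = deploy_day - timedelta(days=1)
    return datetime(previous_day.year, previous_day.month, previous_day.day)


# Deploying on 11/19 gives a start_date of 11/18, as in the comment above.
start_date = most_recent_daily_start(date(2018, 11, 19))
print(start_date)  # 2018-11-18 00:00:00

# The value would then be used in the sample DAG from the report, e.g.:
# dag = DAG(dag_id='DummyTest', start_date=start_date, catchup=False)
```

In practice the resulting date would probably be hard-coded in the DAG file at deploy time rather than recomputed from the current date on every parse, since a start_date that moves between parses is generally discouraged.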
[jira] [Updated] (AIRFLOW-3369) Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)
[ https://issues.apache.org/jira/browse/AIRFLOW-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Harmon updated AIRFLOW-3369:
---
Description:
If you create a DAG with catchup=False, when it is un-paused, it creates 2 dag runs. One for the most recent scheduled interval (expected) and one for the interval before that (unexpected).

*Sample DAG*
{code:java}
from airflow import DAG
from datetime import datetime
from airflow.operators.dummy_operator import DummyOperator
dag = DAG(
    dag_id='DummyTest',
    start_date=datetime(2018,1,1),
    catchup=False
)
do = DummyOperator(
    task_id='dummy_task',
    dag=dag
)
{code}
*Result:*
2 DAG runs are created: 2018-11-18 and 2018-11-17

*Expected Result:*
Only 1 DAG run should have been created (2018-11-18)

was:
If you create a DAG with catchup=False, when it is un-paused, it creates 2 dag runs. One for the most recent scheduled interval (expected) and one for the interval before that (unexpected).

*Sample DAG*
{code:java}
from airflow import DAG
from datetime import datetime
from airflow.operators.dummy_operator import DummyOperator
dag = DAG(
    dag_id='DummyTest',
    start_date=datetime(2018,1,1),
    catchup=False
)
do = DummyOperator(
    task_id='dummy_task',
    dag=dag
)
{code}
*Result:*
.

*Expected Result:*
Only 1 DAG run should have been created (2018-11-18)
[jira] [Updated] (AIRFLOW-3369) Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)
[ https://issues.apache.org/jira/browse/AIRFLOW-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Harmon updated AIRFLOW-3369:
---
Attachment: image.png
Description:
If you create a DAG with catchup=False, when it is un-paused, it creates 2 dag runs. One for the most recent scheduled interval (expected) and one for the interval before that (unexpected).

*Sample DAG*
{code:java}
from airflow import DAG
from datetime import datetime
from airflow.operators.dummy_operator import DummyOperator
dag = DAG(
    dag_id='DummyTest',
    start_date=datetime(2018,1,1),
    catchup=False
)
do = DummyOperator(
    task_id='dummy_task',
    dag=dag
)
{code}
*Result:*
.

*Expected Result:*
Only 1 DAG run should have been created (2018-11-18)

was:
If you create a DAG with catchup=False, when it is un-paused, it creates 2 dag runs. One for the most recent scheduled interval (expected) and one for the interval before that (unexpected).

*Sample DAG*
{code:java}
from airflow import DAG
from datetime import datetime
from airflow.operators.dummy_operator import DummyOperator
dag = DAG(
    dag_id='DummyTest',
    start_date=datetime(2018,1,1),
    catchup=False
)
do = DummyOperator(
    task_id='dummy_task',
    dag=dag
)
{code}
*Result:*
!image-2018-11-19-13-41-49-961.png!

*Expected Result:*
Only 1 DAG run should have been created (2018-11-18)
[jira] [Created] (AIRFLOW-3369) Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)
Andrew Harmon created AIRFLOW-3369:
--

Summary: Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)
Key: AIRFLOW-3369
URL: https://issues.apache.org/jira/browse/AIRFLOW-3369
Project: Apache Airflow
Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Andrew Harmon

If you create a DAG with catchup=False, when it is un-paused, it creates 2 dag runs. One for the most recent scheduled interval (expected) and one for the interval before that (unexpected).

*Sample DAG*
{code:java}
from airflow import DAG
from datetime import datetime
from airflow.operators.dummy_operator import DummyOperator
dag = DAG(
    dag_id='DummyTest',
    start_date=datetime(2018,1,1),
    catchup=False
)
do = DummyOperator(
    task_id='dummy_task',
    dag=dag
)
{code}
*Result:*
!image-2018-11-19-13-41-49-961.png!

*Expected Result:*
Only 1 DAG run should have been created (2018-11-18)
[jira] [Commented] (AIRFLOW-2680) Don't automatically percolate skipped state
[ https://issues.apache.org/jira/browse/AIRFLOW-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688469#comment-16688469 ]

Andrew Harmon commented on AIRFLOW-2680:

I'm looking for a solution to the same issue. I don't think you always want the SKIP status to propagate downstream. It would be nice to have some more control over this.

> Don't automatically percolate skipped state
> ---
>
> Key: AIRFLOW-2680
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2680
> Project: Apache Airflow
> Issue Type: Improvement
> Reporter: Andrei-Alin Popescu
> Assignee: Andrei-Alin Popescu
> Priority: Major
>
> Dear Airflow Maintainers,
>
> As part of our workflow, we have cases where all the upstream tasks of a
> certain task A can be skipped. In this case, airflow seems to automatically
> mark A as skipped.
> However, this does not quite work for us, since there are changes external to
> the DAG which A needs to process, regardless of whether its upstream ran or
> not. Additionally, we require A to get into an "upstream_failed" state and
> not run if any of its upstream tasks failed.
> I don't see a trigger rule to cover this, so what would be the best way to
> achieve this? I was thinking we could attach a DummyOperator as an upstream
> to A, which in a way marks the fact that A depends on some external data and
> needs to run anyway, but this can get really ugly for big DAGs.
> I was also thinking we could have a new trigger_rule, e.g. "no_failure",
> which would only trigger tasks if no upstream has failed. It differs from
> "all_success" in that it will also trigger if all upstream tasks have been
> skipped, rather than percolating the skipped state on.
> I'd really appreciate your feedback on this, and I'd like to know if in fact
> there is already a good way of doing this with airflow that I don't know of.
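The "no_failure" rule proposed in the quoted request can be sketched as a small decision function. This is only an illustration of the semantics described there, not Airflow's implementation: the function name `no_failure_decision`, the state strings, and the three return values are assumptions made for this example.

```python
from typing import Iterable

# Illustrative state names; assumed for this sketch only.
SUCCESS = "success"
SKIPPED = "skipped"
FAILED = "failed"
UPSTREAM_FAILED = "upstream_failed"
RUNNING = "running"


def no_failure_decision(upstream_states: Iterable[str]) -> str:
    """Decide what task A should do under the proposed "no_failure" rule.

    Returns "upstream_failed" if any upstream failed, "run" if every
    upstream finished as success or skipped, and "wait" otherwise.
    """
    states = list(upstream_states)
    if any(s in (FAILED, UPSTREAM_FAILED) for s in states):
        return "upstream_failed"
    if all(s in (SUCCESS, SKIPPED) for s in states):
        return "run"
    return "wait"


# All upstream skipped: A still runs instead of inheriting the skip.
print(no_failure_decision([SKIPPED, SKIPPED]))   # run
# Any upstream failure blocks A.
print(no_failure_decision([SUCCESS, FAILED]))    # upstream_failed
# Upstream still in progress: no decision yet.
print(no_failure_decision([SUCCESS, RUNNING]))   # wait
```

The first case is the one that differs from "all_success" as described in the request: with every upstream skipped, the task is triggered rather than marked skipped.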