Ash,

If I omit start_date, I get the error "Task is missing the start_date parameter".
What should I set it to then?

On Thu, Apr 18, 2019 at 1:03 PM Ash Berlin-Taylor <[email protected]> wrote:

> Do not set start_date to now. That will _always_ be wrong.
> https://airflow.apache.org/faq.html#what-s-the-deal-with-start-date
>
> > On 18 Apr 2019, at 12:13, Pawel Bartoszek <[email protected]> wrote:
> >
> > Hi,
> >
> > When I set start_date to datetime.now(), i.e.
> >
> > DAG(
> >     dag_id="dag",
> >     start_date=datetime.now(),
> >     schedule_interval="0 2 * * *",
> >     default_view="graph",
> >     orientation="TB",
> >     concurrency=1,
> >     max_active_runs=1,
> >     catchup=False
> > )
> >
> > I get the following info in the task instance details:
> >
> > Dependency | Reason
> > Execution Date | The execution date is 2019-04-18T11:09:16.193396+00:00 but
> >   this is before the task's start date 2019-04-18T11:10:42.607861+00:00.
> > Execution Date | The execution date is 2019-04-18T11:09:16.193396+00:00 but
> >   this is before the task's DAG's start date 2019-04-18T11:10:42.607861+00:00.
> > Dagrun Running | Task instance's dagrun did not exist: Unknown reason.
> >
> > I thought the execution date should be set to 2019-04-19 02:00?
> >
> > On Wed, Apr 17, 2019 at 8:37 PM Chao-Han Tsai <[email protected]> wrote:
> >
> >> Hi Pawel,
> >>
> >> I think you can change the start_date to a later date to avoid the
> >> DagRun of 2019-04-16 02:00 being scheduled.
> >>
> >> Chao-Han
> >>
> >> On Wed, Apr 17, 2019 at 10:13 AM Pawel Bartoszek <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> Let's say I deploy the following DAG at 2019-04-17 5 PM:
> >>>
> >>> DAG(
> >>>     dag_id="dag",
> >>>     start_date=datetime(year=2018, month=1, day=1, hour=2, minute=0),
> >>>     schedule_interval="0 2 * * *",
> >>>     default_view="graph",
> >>>     orientation="TB",
> >>>     concurrency=1,
> >>>     max_active_runs=1,
> >>>     catchup=False)
> >>>
> >>> I noticed that the DAG will first be scheduled for yesterday, i.e.
> >>> 2019-04-16 2 AM. How can I avoid this? I want the DAG to be scheduled
> >>> in the future according to the cron expression, i.e. 2019-04-18 2 AM.
> >>>
> >>> Setting schedule_interval as
> >>>
> >>> schedule_interval=timedelta(hours=24),
> >>>
> >>> correct me if I am wrong, but Airflow seems to schedule the DAG 24 hours
> >>> in the past from the time the DAG was deployed.
> >>>
> >>> Thanks,
> >>> Pawel
> >>
> >> --
> >>
> >> Chao-Han Tsai
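
The scheduling behaviour described in this thread can be sketched in plain Python, without importing Airflow. This is only an illustrative model, not Airflow's scheduler code: `first_run` is a hypothetical helper, and it assumes a fixed interval aligned to start_date, catchup=False, and that the run for execution date D is triggered only once D + interval has passed.

```python
from datetime import datetime, timedelta


def first_run(start_date, deployed_at, interval=timedelta(days=1)):
    """Hypothetical model (not an Airflow API) of the first DAG run
    created with catchup=False.

    Returns (execution_date, earliest_trigger_time).
    """
    elapsed = deployed_at - start_date
    if elapsed < interval:
        # No full interval has completed since start_date yet,
        # so the first execution date is start_date itself.
        execution_date = start_date
    else:
        # Number of fully completed intervals since start_date.
        completed = elapsed // interval
        # Assumption: only the most recent completed interval is scheduled;
        # its execution date is the START of that interval.
        execution_date = start_date + (completed - 1) * interval
    return execution_date, execution_date + interval


deployed_at = datetime(2019, 4, 17, 17, 0)

# start_date far in the past: the first run covers "yesterday".
old_start = first_run(datetime(2018, 1, 1, 2, 0), deployed_at)

# Fixed start_date at the latest 02:00 boundary: the first run waits
# until the next 02:00.
recent_start = first_run(datetime(2019, 4, 17, 2, 0), deployed_at)

print(old_start)
print(recent_start)
```

Under these assumptions, a fixed start_date of 2019-04-17 02:00 gives a first run with execution date 2019-04-17 02:00 that is triggered at 2019-04-18 02:00, while the 2018 start_date yields the 2019-04-16 02:00 run reported above. A start_date of datetime.now() is re-evaluated each time the DAG file is parsed, so it keeps moving ahead of any execution date, which fits the dependency errors quoted in the thread.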
