Do not set start_date to datetime.now(). You could set it to a fixed date such as 2019-04-18, and it will start scheduling at 2019-04-18 2 AM.
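To illustrate why a fixed date works and datetime.now() does not, here is a rough sketch using only the standard library. It is a simplified model of how I understand the scheduler to behave with catchup=False and a daily 02:00 schedule, not Airflow's actual code. The names first_execution_date and INTERVAL are made up for this example.

```python
from datetime import datetime, timedelta

# Assumption: models the cron "0 2 * * *" as a fixed daily interval at 02:00.
INTERVAL = timedelta(days=1)


def first_execution_date(start_date, now):
    """Hypothetical helper, not an Airflow API.

    Returns the execution_date of the latest daily 02:00 interval that has
    fully elapsed by `now`, or None if no interval has completed since
    `start_date`. A run is only created once its interval has ended.
    """
    # First 02:00 tick at or after start_date.
    tick = start_date.replace(hour=2, minute=0, second=0, microsecond=0)
    if tick < start_date:
        tick += INTERVAL

    # The interval starting at `tick` has not finished yet: nothing to run.
    if tick + INTERVAL > now:
        return None

    # Jump to the most recent interval that has completed (no catchup).
    elapsed = (now - tick) // INTERVAL
    return tick + (elapsed - 1) * INTERVAL


# Old start_date, deployed 2019-04-17 5 PM: the last complete interval wins.
print(first_execution_date(datetime(2018, 1, 1, 2, 0), datetime(2019, 4, 17, 17, 0)))
# 2019-04-16 02:00:00

# Fixed start_date 2019-04-18: the first run covers 2019-04-18 02:00
# and is created once that interval ends on 2019-04-19 02:00.
print(first_execution_date(datetime(2019, 4, 18), datetime(2019, 4, 19, 2, 0, 1)))
# 2019-04-18 02:00:00

# start_date "now": re-evaluated on every parse, so no interval ever completes.
now = datetime(2019, 4, 18, 11, 10, 42)
print(first_execution_date(now, now))
# None
```

In this model a start_date equal to "now" always puts the first 02:00 tick in the future, so the first interval never completes. I believe that is why the run never gets scheduled.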
Chen

On Thu, Apr 18, 2019, 08:55 Pawel Bartoszek <[email protected]> wrote:

> Ash, if I omit start_date I get the error
>
>     Task is missing the start_date parameter
>
> What should I set it to then?
>
> On Thu, Apr 18, 2019 at 1:03 PM Ash Berlin-Taylor <[email protected]> wrote:
>
> > Do not set start_date to now. That will _always_ be wrong.
> > https://airflow.apache.org/faq.html#what-s-the-deal-with-start-date
> >
> > > On 18 Apr 2019, at 12:13, Pawel Bartoszek <[email protected]> wrote:
> > >
> > > Hi,
> > >
> > > When I set start_date to datetime.now(), i.e.
> > >
> > > DAG(
> > >     dag_id="dag",
> > >     start_date=datetime.now(),
> > >     schedule_interval="0 2 * * *",
> > >     default_view="graph",
> > >     orientation="TB",
> > >     concurrency=1,
> > >     max_active_runs=1,
> > >     catchup=False
> > > )
> > >
> > > I get the following info in the task instance details:
> > >
> > > Dependency / Reason
> > > Execution Date: The execution date is 2019-04-18T11:09:16.193396+00:00 but this is before the task's start date 2019-04-18T11:10:42.607861+00:00.
> > > Execution Date: The execution date is 2019-04-18T11:09:16.193396+00:00 but this is before the task's DAG's start date 2019-04-18T11:10:42.607861+00:00.
> > > Dagrun Running: Task instance's dagrun did not exist: Unknown reason.
> > >
> > > I thought the execution date should be set to 2019-04-19 02:00?
> > >
> > > On Wed, Apr 17, 2019 at 8:37 PM Chao-Han Tsai <[email protected]> wrote:
> > >
> > >> Hi Pawel,
> > >>
> > >> I think you can change the start_date to a later date to avoid the DagRun of 2019-04-16 02:00 being scheduled.
> > >>
> > >> Chao-Han
> > >>
> > >> On Wed, Apr 17, 2019 at 10:13 AM Pawel Bartoszek <[email protected]> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> Let's say I deploy the following DAG at 2019-04-17 5 PM:
> > >>>
> > >>> DAG(
> > >>>     dag_id="dag",
> > >>>     start_date=datetime(year=2018, month=1, day=1, hour=2, minute=0),
> > >>>     schedule_interval="0 2 * * *",
> > >>>     default_view="graph",
> > >>>     orientation="TB",
> > >>>     concurrency=1,
> > >>>     max_active_runs=1,
> > >>>     catchup=False)
> > >>>
> > >>> I noticed that the DAG will first be scheduled for yesterday, i.e. 2019-04-16 2 AM. How can I avoid this? I want the DAG to be scheduled in the future according to the cron expression, i.e. 2019-04-18 2 AM.
> > >>>
> > >>> Setting schedule_interval as
> > >>>
> > >>>     schedule_interval=timedelta(hours=24),
> > >>>
> > >>> correct me if I am wrong, but Airflow seems to schedule the DAG 24 hours in the past from the time the DAG was deployed.
> > >>>
> > >>> Thanks,
> > >>> Pawel
> > >>
> > >> --
> > >> Chao-Han Tsai
