Really happy to hear this moving forward. Thanks Bolke!

On Tue, Nov 14, 2017 at 7:44 AM Bolke de Bruin <[email protected]> wrote:
> See inline answers below.
>
> Sent from my iPad
>
> > On 14 Nov 2017, at 16:33, Heistermann, Till <[email protected]> wrote:
> >
> > Hi Bolke,
> >
> > This looks great.
> >
> > We have had the requirement to run DAGs in different local time zones
> > for a while; so far we have worked around the limitation at DAG level
> > to automate most of our DST switches.
> >
> > How would the approach behave in the DST-switch corner cases?
> >
> > For the regular case, I understand that if
> > start_date=datetime(2017, 1, 1, 8, 30, 0, tzinfo="Europe/Amsterdam")
> > and the schedule is "30 8 * * *", the DST switch would work as
> > expected, and the DAG would get scheduled at 7:30 am UTC in European
> > winter and 6:30 am UTC in European summer.
>
> Actually no. For cron-defined schedules we will always use local time,
> but naive. This means your 8.30 schedule will always happen at 8.30
> local time regardless.
>
> > However, if
> > start_date=datetime(2017, 1, 1, 2, 30, 0, tzinfo="Europe/Amsterdam")
> > and the schedule is "30 2 * * *", would we skip a nightly run in
> > March and have two nightly runs in October?
> > This seems like the correct thing to do from a time zone logic point
> > of view, although I can imagine that there are many operational use
> > cases where the user wants something different.
>
> I have to verify what happens. I think what will happen is that it will
> run at 3.30, as we convert to naive local time (DST unaware), add the
> interval, and convert back to UTC. UTC will then translate to 3.30 local
> time, which is btw equal to 2.30 local time.
>
> Execution_date will be in UTC. The DAG will store time zone information
> so you can decide yourself what you want to do with that.
>
> > If start_date=datetime(2017, 1, 1, 8, 30, 0, tzinfo="Europe/Amsterdam")
> > and the schedule is timedelta(days=14), would a DST switch actually
> > occur?
> > There is some ambiguity in this case, depending on the
> > timedelta(days=14) being understood as either “14 days in local
> > calendar” or 14*24*60*60 seconds on the system clock.
> > I’m not sure what the expected behaviour should be in this case.
>
> For timedeltas DST is in effect. It is assumed here that you want to run
> X hours later, not at a specific time. Obviously, if you want to keep the
> old behavior (and this is the default), keep your timezone at UTC.
>
> > Cheers,
> > Till
> >
> >
> > On 13.11.17, 19:47, "Ash Berlin-Taylor" <[email protected]> wrote:
> >
> > This sounds like an awesome change!
> >
> > I'm happy to review (will take a look tomorrow) but won't be a
> > suitable tester as all our DAGs operate in UTC.
> >
> > -ash
> >
> >
> >> On 13 Nov 2017, at 18:09, Bolke de Bruin <[email protected]> wrote:
> >>
> >> Hi All,
> >>
> >> I just want to make you aware that I am creating patches that make
> >> Airflow timezone aware. The gist of the idea is that Airflow
> >> internally will use and store UTC everywhere. This allows you to have
> >> start_date = datetime(2017, 1, 1, tzinfo="Europe/Amsterdam") and
> >> Airflow will properly take care of daylight saving time. If you are
> >> using cron we will make sure to always run at the exact time (end of
> >> interval of course) which you specify even when DST is in effect,
> >> e.g. 8.00am is always 8.00am regardless of whether a daylight saving
> >> time switch has happened. DAGs that don’t have a timezone associated
> >> get a default timezone that is configurable.
> >>
> >> In AIRFLOW-288 I am tracking what needs to be done, but I am 80%
> >> there. As the patches are invasive, particularly in tests (everything
> >> needs a timezone basically) and less so in other areas, I would like
> >> to draw special attention to a couple of places where this has impact.
> >>
> >> 1. All database DateTime fields are converted to timezone aware
> >> Timestamp fields.
> >> This impacts MySQL deployments particularly, as MySQL was storing
> >> DateTime fields, which cannot be made timezone aware. Also, to make
> >> sure conversion happens properly we set the connection time zone to
> >> UTC. This is supported by Postgres and MySQL. However, it is not
> >> supported by SQLServer. So if you are running outside of UTC you need
> >> to take special care when upgrading.
> >>
> >> 2. Thou shalt not use datetime.now() and datetime.utcnow() when
> >> writing code for core Airflow (operators, sensors, scheduler etc.);
> >> in DAGs you can still use them. Both create naive datetimes (yes,
> >> even utcnow()). You can use airflow.utils.timezone utcnow() for this.
> >> As you will not be able to store naive datetime fields anymore, you
> >> will notice soon enough.
> >>
> >> Finally, and that is the main reason for this email, I am looking for
> >> feedback and testers. The PR can be found here:
> >> https://github.com/apache/incubator-airflow/pull/2781
> >> It doesn’t pass the tests yet, but you can see that I am working hard
> >> on that ;-).
> >>
> >> Cheers
> >> Bolke
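
The two scheduling behaviours discussed in this thread (cron schedules that follow the local wall clock, and timedelta schedules that count elapsed time) can be illustrated with a minimal sketch. This is not Airflow's implementation: the helper names next_run_wall_clock and next_run_elapsed are hypothetical, and the sketch assumes Python 3.9+ with the standard-library zoneinfo module and an available time zone database.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

AMS = ZoneInfo("Europe/Amsterdam")
UTC = timezone.utc


def next_run_wall_clock(prev_utc, step):
    """Cron-like behaviour (hypothetical helper): convert to naive local
    time, add the interval, then convert back to UTC."""
    local_naive = prev_utc.astimezone(AMS).replace(tzinfo=None)
    next_local = (local_naive + step).replace(tzinfo=AMS)
    return next_local.astimezone(UTC)


def next_run_elapsed(prev_utc, step):
    """timedelta-like behaviour (hypothetical helper): add the interval to
    the UTC instant, so the local wall-clock time may shift across DST."""
    return prev_utc + step


if __name__ == "__main__":
    # Last run before the 2017 spring switch in Europe/Amsterdam.
    prev = datetime(2017, 3, 25, 8, 30, tzinfo=AMS).astimezone(UTC)
    day = timedelta(days=1)
    for name, fn in (("wall clock", next_run_wall_clock),
                     ("elapsed", next_run_elapsed)):
        nxt = fn(prev, day)
        print(name, nxt.isoformat(), nxt.astimezone(AMS).isoformat())
```

With the wall-clock helper the run stays at 8:30 local time and its UTC time moves by an hour; with the elapsed helper the UTC time stays fixed and the local time moves instead.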

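
Point 2 of the original announcement (naive versus timezone-aware datetimes) can be checked with the standard library alone. The sketch below does not import Airflow; utcnow_aware is a hypothetical stand-in for the airflow.utils.timezone utcnow() mentioned in the thread, and is_naive is likewise an illustrative helper.

```python
from datetime import datetime, timezone


def utcnow_aware():
    """Hypothetical stand-in: current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def is_naive(dt):
    """True if dt carries no usable time zone information."""
    return dt.tzinfo is None or dt.tzinfo.utcoffset(dt) is None


if __name__ == "__main__":
    # datetime.now() without arguments returns a naive value;
    # datetime.utcnow() is naive as well, even though it holds UTC time.
    print(is_naive(datetime.now()))
    print(is_naive(utcnow_aware()))
```

A check like is_naive could be used to reject naive values before they are stored, which matches the thread's remark that naive datetime fields can no longer be saved.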