This is the issue:
[2018-08-05 22:08:21,952] {jobs.py:906} INFO - NEXT RUN DATE: 2018-08-03
00:00:00+00:00 tzinfo: <Timezone [UTC]>
[2018-08-05 22:08:22,007] {jobs.py:1425} INFO - Created <DagRun
example_http_operator @ 2018-08-03 02:00:00+02:00:
scheduled__2018-08-03T00:00:00+00:00, externally triggered: False>
[2018-08-05 22:08:24,651] {jobs.py:906} INFO - NEXT RUN DATE: 2018-08-04
02:00:00+02:00 tzinfo: psycopg2.tz.FixedOffsetTimezone(offset=120, name=None)
[2018-08-05 22:08:24,696] {jobs.py:1425} INFO - Created <DagRun
example_http_operator @ 2018-08-04 02:00:00+02:00:
scheduled__2018-08-04T02:00:00+02:00, externally triggered: False>
Notice in the first NEXT RUN DATE entry that the next run date is correctly
in UTC, but once it has been through the DB it comes back with a +02. In the
second NEXT RUN DATE entry we get a psycopg2.tz.FixedOffsetTimezone, which
should be set to UTC according to the specs of
https://github.com/spoqa/sqlalchemy-utc, but it isn’t.
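
A quick way to see what the driver hands back, assuming a psycopg2
connection to the Airflow DB (the DSN here is hypothetical):

import psycopg2

conn = psycopg2.connect('dbname=airflow')  # hypothetical DSN
cur = conn.cursor()
cur.execute('SELECT execution_date FROM task_instance LIMIT 1')
(dt,) = cur.fetchone()
print(repr(dt.tzinfo))
# With a server timezone of Europe/Amsterdam this prints something like
# psycopg2.tz.FixedOffsetTimezone(offset=120, name=None): psycopg2 attaches
# the session offset to timestamptz values. If I read sqlalchemy-utc
# correctly, its process_result_value only attaches UTC to *naive* values
# and leaves aware ones alone, which would explain the log above.
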
So changing your DB's timezone setting to UTC fixes the symptom, but not the
cause.
B.
> On 5 Aug 2018, at 22:03, Ash Berlin-Taylor <[email protected]>
> wrote:
>
> Sorry for being terse before.
>
> So the issue is that the ts loaded from the DB is not in UTC, it's in GB/+01
> (the default of the DB server)
>
> For me, on a currently running 1.9 (no TZ) db:
>
> airflow=# select * from task_instance;
> get_op | example_http_operator | 2018-07-23 00:00:00
>
> This date time appears in the log url, and the path it looks at on S3 is
>
> .../example_http_operator/2018-07-23T00:00:00/1.log
>
> If my postgres server has a default timezone of GB (which the one running
> on my laptop does) and I then apply the migration, the value is rendered in
> that local time.
>
> airflow=# select * from task_instance;
> get_op | example_http_operator | 2018-07-23 01:00:00+01
>
> airflow=# set timezone=UTC;
> airflow=# select * from task_instance;
> get_op | example_http_operator | 2018-07-23 00:00:00+00
>
>
> This is all okay so far. The migration has kept the column at the same moment
> in time.
>
> The issue comes when the UI tries to display logs for this old task: because
> the timezone of the connection is not UTC, PG returns a date with a +01 TZ.
> Thus after the migration this old task tries to look for a log file of
>
> .../example_http_operator/2018-07-23T01:00:00/1.log
>
> which doesn't exist - the time it renders has changed from midnight (in
> v1.9) to 1am (in v1.10).
>
> (This is with my change to log_filename_template from UPDATING.md in my other
> branch)
>
> Setting the timezone to UTC per connection means the behaviour of Airflow
> doesn't change depending on how the server is configured.
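>
> A minimal sketch of how that could look with a SQLAlchemy connect event
> (the engine URL is just an example):
>
> import sqlalchemy
> from sqlalchemy import event
>
> engine = sqlalchemy.create_engine(
>     'postgresql+psycopg2://airflow@localhost/airflow')  # example URL
>
> @event.listens_for(engine, 'connect')
> def set_utc_on_connect(dbapi_connection, connection_record):
>     # Force every new connection to hand back timestamps in UTC,
>     # regardless of the server's default timezone setting.
>     cursor = dbapi_connection.cursor()
>     cursor.execute("SET timezone TO 'UTC'")
>     cursor.close()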
>
> -ash
>
>> On 5 Aug 2018, at 20:58, Bolke de Bruin <[email protected]> wrote:
>>
>> Digging in a bit further.
>>
>> {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log
>>
>> is the format
>>
>> ts = execution_date.isoformat() and should be in UTC afaik.
>>
>> something is weird tbh.
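>>
>> To illustrate with plain stdlib datetimes - same instant, two renderings:
>>
>> from datetime import datetime, timedelta, timezone
>>
>> utc_dt = datetime(2018, 7, 23, tzinfo=timezone.utc)
>> bst = timezone(timedelta(hours=1))  # stand-in for a +01 session timezone
>>
>> print(utc_dt.isoformat())                  # 2018-07-23T00:00:00+00:00
>> print(utc_dt.astimezone(bst).isoformat())  # 2018-07-23T01:00:00+01:00
>>
>> Same moment in time, but {{ ts }} would render two different log paths.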
>>
>> B.
>>
>>
>>> On 5 Aug 2018, at 21:32, Bolke de Bruin <[email protected]> wrote:
>>>
>>> Ash,
>>>
>>> Reading your proposed changes on your “set-timezone-to-utc” branch and
>>> the analysis below, I am not sure what you perceive as the issue.
>>>
>>> For conversion we assume everything is stored in UTC and in a naive format.
>>> Conversion then adds the timezone information. This results in the following:
>>>
>>> postgres timezone = “Europe/Amsterdam”
>>>
>>>
>>> airflow=# select * from task_instance;
>>> get_op | example_http_operator | 2018-07-27 02:00:00+02
>>>
>>> airflow=# set timezone=UTC;
>>> airflow=# select * from task_instance;
>>> get_op | example_http_operator | 2018-07-27 00:00:00+00
>>>
>>> If we don’t set the timezone on the connection, postgres assumes the
>>> server timezone (in my case “Europe/Amsterdam”). So every datetime
>>> Airflow receives will be in “Europe/Amsterdam” time. However, as we
>>> defined the model to use UTCDateTime, it will always convert the returned
>>> DateTime to UTC.
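>>>
>>> That contract, roughly, as a SQLAlchemy TypeDecorator - a sketch of the
>>> idea, not the exact sqlalchemy-utc code:
>>>
>>> from datetime import timezone
>>> from sqlalchemy import DateTime
>>> from sqlalchemy.types import TypeDecorator
>>>
>>> class UtcDateTime(TypeDecorator):
>>>     impl = DateTime(timezone=True)
>>>
>>>     def process_bind_param(self, value, dialect):
>>>         # Writes: require tz-aware datetimes and store them as UTC.
>>>         if value is not None:
>>>             if value.tzinfo is None:
>>>                 raise ValueError('naive datetime is disallowed')
>>>             return value.astimezone(timezone.utc)
>>>
>>>     def process_result_value(self, value, dialect):
>>>         # Reads: normalise whatever the driver returns to UTC.
>>>         if value is not None and value.tzinfo is not None:
>>>             value = value.astimezone(timezone.utc)
>>>         return value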
>>>
>>> If we have configured Airflow with a default timezone other than UTC, or
>>> a DAG has an associated timezone, we only convert to that timezone when
>>> calculating the next runtime (not for cron, btw). Nowhere else, and thus
>>> we are UTC everywhere.
>>>
>>> What do you think is inconsistent?
>>>
>>> Bolke
>>>
>>>
>>>
>>>
>>>
>>>
>>>> On 5 Aug 2018, at 18:13, Ash Berlin-Taylor
>>>> <[email protected]> wrote:
>>>>
>>>> Relating to 2): I'm not sure that the upgrade from timezoneless to
>>>> timezone-aware columns in the task instance is right, or at least it's
>>>> not what I expected.
>>>>
>>>> Before, weren't all TZs for schedule dates etc. in UTC? For the same
>>>> task instance (these outputs are from psql directly):
>>>>
>>>> before: execution_date=2017-09-04 00:00:00
>>>> after: execution_date=2017-09-04 01:00:00+01
>>>>
>>>> **Okay, the migration is fine.** It appears that the migration has done
>>>> the right thing, but the local DB I'm testing with has a timezone of GB
>>>> set, so Postgres converts it to that TZ when returning an object.
>>>>
>>>> 3) Do we need to set the TZ of the connection to UTC in SQLAlchemy to have
>>>> consistent behaviour? Is this possible somehow? I don't know SQLAlchemy
>>>> that well.
>>>>
>>>>
>>>> -ash
>>>>
>>>>> On 5 Aug 2018, at 16:01, Ash Berlin-Taylor
>>>>> <[email protected]> wrote:
>>>>>
>>>>> 1) Missing UPDATING note about the change of task_log_reader to now
>>>>> always being "task" (it was "s3.task" before). Logging config is much
>>>>> simpler now, though. This may be particular to my logging config, but
>>>>> given how much of a pain it was to set up S3 logging in 1.9, I have
>>>>> shared my config with some people in the Gitter chat, so it's not just
>>>>> me.
>>>>>
>>>>> 2) The path that log files are written to in S3 has changed (again -
>>>>> this happened from 1.8 to 1.9). I'd like to avoid having to move all of
>>>>> my log files again to continue viewing them. The change is that the path
>>>>> now (in 1.10) has a timezone in it, and the date is in local time, where
>>>>> before it was in UTC:
>>>>>
>>>>> before: 2018-07-23T00:00:00/1.log
>>>>> after: 2018-07-23T01:00:00+01:00/1.log
>>>>>
>>>>> We can possibly get away with an UPDATING note about this telling
>>>>> people to set a custom log_filename_template. Testing this now.
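>>>>>
>>>>> One guess at a custom template that reproduces the old path (untested,
>>>>> and it assumes execution_date comes back from the DB in UTC):
>>>>>
>>>>> log_filename_template = {{ ti.dag_id }}/{{ ti.task_id }}/{{ execution_date.strftime('%Y-%m-%dT%H:%M:%S') }}/{{ try_number }}.log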
>>>>>
>>>>>> On 5 Aug 2018, at 15:00, Ash Berlin-Taylor <[email protected]> wrote:
>>>>>>
>>>>>> -1(binding) from me.
>>>>>>
>>>>>> Installed with:
>>>>>>
>>>>>> AIRFLOW_GPL_UNIDECODE=yes pip install
>>>>>> 'https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0rc3/apache-airflow-1.10.0rc3+incubating-bin.tar.gz#egg=apache-airflow[emr, s3, crypto]>=1.10'
>>>>>>
>>>>>> Install went fine.
>>>>>>
>>>>>> Our DAGs that use SparkSubmitOperator are now failing: there is a hard
>>>>>> dependency on the Kubernetes client libs, but the `emr` extras group
>>>>>> doesn't pull them in.
>>>>>>
>>>>>> Introduced in https://github.com/apache/incubator-airflow/pull/3112
>>>>>>
>>>>>> I see two options for this - either conditionally enable k8s://
>>>>>> support if the import works (sketched below), or add kube-client to
>>>>>> the emr deps (which I like less).
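>>>>>>
>>>>>> Roughly what I mean by the first option - names here are illustrative,
>>>>>> not the actual operator code:
>>>>>>
>>>>>> try:
>>>>>>     import kubernetes  # noqa: F401
>>>>>>     _has_kubernetes = True
>>>>>> except ImportError:
>>>>>>     _has_kubernetes = False
>>>>>>
>>>>>> def validate_master(master):
>>>>>>     # Fail with a clear message rather than an ImportError at
>>>>>>     # submit time when the client libs are missing.
>>>>>>     if master.startswith('k8s://') and not _has_kubernetes:
>>>>>>         raise RuntimeError(
>>>>>>             'a k8s:// master requires the kubernetes client library')
>>>>>>     return master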
>>>>>>
>>>>>> Sorry - this is the first time I've been able to test it.
>>>>>>
>>>>>> I will install this dep manually and continue testing.
>>>>>>
>>>>>> -ash
>>>>>>
>>>>>> (Normally no time at home due to new baby, but I got a standing desk,
>>>>>> and a carrier meaning she can sleep on me and I can use my laptop. Win!)
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On 4 Aug 2018, at 22:32, Bolke de Bruin <[email protected]> wrote:
>>>>>>>
>>>>>>> Bump.
>>>>>>>
>>>>>>> Committers please cast your vote.
>>>>>>>
>>>>>>> B.
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>>> On 3 Aug 2018, at 13:23, Driesprong, Fokko <[email protected]> wrote:
>>>>>>>>
>>>>>>>> +1 Binding
>>>>>>>>
>>>>>>>> Installed it using: SLUGIFY_USES_TEXT_UNIDECODE=yes pip install
>>>>>>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0rc3/apache-airflow-1.10.0rc3+incubating-bin.tar.gz
>>>>>>>>
>>>>>>>> Cheers, Fokko
>>>>>>>>
>>>>>>>> 2018-08-03 9:47 GMT+02:00 Bolke de Bruin <[email protected]>:
>>>>>>>>
>>>>>>>>> Hey all,
>>>>>>>>>
>>>>>>>>> I have cut Airflow 1.10.0 RC3. This email is calling a vote on the
>>>>>>>>> release,
>>>>>>>>> which will last for 72 hours. Consider this my (binding) +1.
>>>>>>>>>
>>>>>>>>> Airflow 1.10.0 RC 3 is available at:
>>>>>>>>>
>>>>>>>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0rc3/
>>>>>>>>>
>>>>>>>>> apache-airflow-1.10.0rc3+incubating-source.tar.gz is a source release
>>>>>>>>> that
>>>>>>>>> comes with INSTALL instructions.
>>>>>>>>> apache-airflow-1.10.0rc3+incubating-bin.tar.gz is the binary Python
>>>>>>>>> "sdist"
>>>>>>>>> release.
>>>>>>>>>
>>>>>>>>> Public keys are available at:
>>>>>>>>>
>>>>>>>>> https://dist.apache.org/repos/dist/release/incubator/airflow/
>>>>>>>>>
>>>>>>>>> The number of JIRAs fixed is over 700. Please have a look at the
>>>>>>>>> changelog. Since RC2 the following have been fixed:
>>>>>>>>>
>>>>>>>>> * [AIRFLOW-2817] Force explicit choice on GPL dependency
>>>>>>>>> * [AIRFLOW-2716] Replace async and await py3.7 keywords
>>>>>>>>> * [AIRFLOW-2810] Fix typo in Xcom model timestamp
>>>>>>>>>
>>>>>>>>> Please note that the version number excludes the `rcX` string as well
>>>>>>>>> as the "+incubating" string, so it's now simply 1.10.0. This will
>>>>>>>>> allow us
>>>>>>>>> to rename the artifact without modifying the artifact checksums when
>>>>>>>>> we
>>>>>>>>> actually release.
>>>>>>>>>
>>>>>>>>> WARNING: Due to licensing requirements you will need to set
>>>>>>>>> SLUGIFY_USES_TEXT_UNIDECODE=yes in your environment when
>>>>>>>>> installing or upgrading. We will try to remove this requirement for
>>>>>>>>> the
>>>>>>>>> next release.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Bolke
>>>>>>
>>>>>
>>>>
>>>
>>
>