[openstack-dev] [nova] Timestamp formats in the REST API

Mark McLoughlin Tue, 29 Apr 2014 06:56:46 -0700

Hey

In this patch:


  https://review.openstack.org/83681

by Ghanshyam Mann, we encountered an unusual situation where a timestamp
in the returned XML looked like this:

  2014-04-08 09:00:14.399708+00:00

What appeared to be unusual was that the timestamp had both sub-second
time resolution and timezone information. It was felt that this wasn't a
valid timestamp format, and there was then some debate about how to 'fix'
it:

  https://review.openstack.org/87563

Anyway, this led me down a bit of a rabbit hole, so I'm going to
attempt to document some findings.

Firstly, some definitions:

  - Python's datetime module talks about datetime objects being 'naive'
    or 'aware'

        https://docs.python.org/2.7/library/datetime.html

       "A datetime object d is aware if d.tzinfo is not None and
        d.tzinfo.utcoffset(d) does not return None. If d.tzinfo is None,
        or if d.tzinfo is not None but d.tzinfo.utcoffset(d) returns
        None, d is naive."

    (Most people will have encountered this already, but I'm including 
    it for completeness)

  - The ISO8601 time and date format specifies timestamps like this:

        2014-04-29T11:37:00Z

    with many variations. One distinguishing aspect of the ISO8601 
    format is the 'T' separating date and time. RFC3339 is very closely
    related and serves as easily accessible documentation of the format:

       http://www.ietf.org/rfc/rfc3339.txt

  - The Python iso8601 library allows parsing this time format, but 
    also allows subtle variations that don't conform to the standard 
    like omitting the 'T' separator:

      >>> import iso8601
      >>> iso8601.parse_date('2014-04-29 11:37:00Z')
      datetime.datetime(2014, 4, 29, 11, 37, tzinfo=<iso8601.iso8601.Utc object at 0x214b050>)

    Presumably this is for the pragmatic reason that when you stringify 
    a datetime object, the resulting string uses ' ' as a separator:

      >>> import datetime
      >>> str(datetime.datetime(2014, 4, 29, 11, 37))
      '2014-04-29 11:37:00'
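
To make the definitions above concrete, here is a minimal sketch using
only the standard library. It assumes Python 3, where
datetime.timezone.utc stands in for the iso8601 library's Utc object;
the is_aware() helper is a hypothetical name that simply restates the
rule quoted from the datetime docs.

```python
import datetime

# A naive datetime: no tzinfo attached.
naive = datetime.datetime(2014, 4, 29, 11, 37)

# The same wall-clock time, made aware by attaching UTC.
aware = naive.replace(tzinfo=datetime.timezone.utc)


def is_aware(d):
    """Return True if d is 'aware' per the datetime documentation."""
    return d.tzinfo is not None and d.tzinfo.utcoffset(d) is not None


naive_str = str(naive)          # ' ' separator, no offset
aware_str = str(aware)          # ' ' separator, plus the UTC offset
aware_iso = aware.isoformat()   # 'T' separator, plus the UTC offset
```

Note that str() never emits the 'T' separator, whereas isoformat() does
by default, which is why a lenient parser is convenient in practice.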

And now some observations on what's going on in Nova:

  - We don't store timezone information in the database, but all our 
    timestamps are relative to UTC nonetheless.

  - The objects code automatically adds the UTC timezone to naive
    datetime objects:

        if value.utcoffset() is None:
            value = value.replace(tzinfo=iso8601.iso8601.Utc())

    so code that is ported to objects may now be using aware datetime
    objects where it was previously using naive objects.

  - Whether we store sub-second resolution timestamps in the database 
    appears to be database specific. In my quick tests, we store that 
    information in sqlite but not MySQL.

  - However, timestamps added by SQLAlchemy when you do e.g. save() do 
    include sub-second information, so some DB API calls may return 
    sub-second timestamps even when that information isn't stored in 
    the database.
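
As a rough illustration of how the observations above combine, the
sketch below shows how a timestamp like the one in the patch can arise.
It is not Nova code: normalize_to_utc() is a hypothetical helper that
mirrors the snippet above with datetime.timezone.utc (Python 3) in place
of iso8601.iso8601.Utc(), and the two sample values are made up to
represent a value read back from a database without sub-second
resolution and a value that still carries microseconds.

```python
import datetime

UTC = datetime.timezone.utc


def normalize_to_utc(value):
    """Attach UTC to a naive datetime; leave aware datetimes alone."""
    if value.utcoffset() is None:
        value = value.replace(tzinfo=UTC)
    return value


# Hypothetical value read back from a DB that drops sub-second resolution.
from_db = datetime.datetime(2014, 4, 8, 9, 0, 14)

# Hypothetical value that still carries microseconds.
from_save = datetime.datetime(2014, 4, 8, 9, 0, 14, 399708)

# Once both are made aware, str() yields two different shapes.
db_str = str(normalize_to_utc(from_db))
save_str = str(normalize_to_utc(from_save))
```

The second string has both microseconds and an offset, which is the
shape that looked unusual in the returned XML.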

In our REST APIs, you'll essentially see one of three time formats. I'm
calling them 'isotime', 'strtime' and 'xmltime':

  - 'isotime' - this is the result from timeutils.isotime(). It
    includes timezone information (i.e. a 'Z' suffix) but not
    microseconds. You'll see this in places where we stringify the
    datetime objects in the API layer using isotime() before passing
    them to the JSON/XML serializers.

  - 'strtime' - this is the result from timeutils.strtime(). It doesn't 
    include timezone information but does include decimal seconds. This 
    is what jsonutils.dumps() uses when we're serializing API responses.

  - 'xmltime' or 'str(datetime)' format - this is just what you get 
    when you stringify a datetime using str(). If the datetime is tz 
    aware or includes non-zero microseconds, then that information will 
    be included in the result. This is a significant difference versus
    the other two formats, where it is always clear whether tz and
    microsecond information is included in the string.
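
For comparison, here is a sketch of the three formats side by side,
written against the standard library only. The *_like function names
and the two format strings are my approximations for illustration; they
are not the actual timeutils implementations, and isotime_like() simply
assumes its input is already in UTC.

```python
import datetime

ISO_FORMAT = '%Y-%m-%dT%H:%M:%S'      # no microseconds
STR_FORMAT = '%Y-%m-%dT%H:%M:%S.%f'   # always microseconds, never tz


def isotime_like(dt):
    """Approximate 'isotime': whole seconds plus a 'Z' suffix.

    Assumes dt is in UTC, whether naive or aware.
    """
    return dt.strftime(ISO_FORMAT) + 'Z'


def strtime_like(dt):
    """Approximate 'strtime': decimal seconds, no timezone."""
    return dt.strftime(STR_FORMAT)


def xmltime_like(dt):
    """'xmltime': whatever str() produces for this particular value."""
    return str(dt)


aware = datetime.datetime(2014, 4, 8, 9, 0, 14, 399708,
                          tzinfo=datetime.timezone.utc)
naive = datetime.datetime(2014, 4, 29, 11, 37)
```

The first two functions give the same shape for both sample values,
while xmltime_like() changes shape depending on the value it is given.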

but there are some caveats:

  - I don't know how significant it is these days, but timestamps are
    serialized to strtime format when going over RPC and aren't
    de-serialized on the remote end. This could lead to a situation
    where the API layer tries to stringify a strtime formatted string
    using timeutils.isotime(). (see above for a description of those
    formats)

  - In at least one place - e.g. the 'updated' timestamp for v2
    extensions - we hardcode the timestamp as strings in the code and 
    don't currently use one of the formats above.
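
The RPC caveat can be sketched as follows. This is only an illustration
under assumptions: ensure_datetime() is a hypothetical defensive helper,
not something that exists in Nova, and STR_FORMAT is my approximation of
the strtime layout.

```python
import datetime

STR_FORMAT = '%Y-%m-%dT%H:%M:%S.%f'


def ensure_datetime(value):
    """Parse a strtime-style string back into a naive datetime.

    Values that are not strings are returned unchanged.
    """
    if isinstance(value, str):
        return datetime.datetime.strptime(value, STR_FORMAT)
    return value


original = datetime.datetime(2014, 4, 8, 9, 0, 14, 399708)

# What the remote end receives if nothing de-serializes the value.
over_the_wire = original.strftime(STR_FORMAT)

# Formatting code that expects a datetime fails on the plain string.
try:
    over_the_wire.strftime('%Y-%m-%dT%H:%M:%S')
    failed = False
except AttributeError:
    failed = True

restored = ensure_datetime(over_the_wire)
```

Parsing this way gives back a naive datetime, so any timezone handling
still has to happen afterwards.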


My conclusions from all that:

  1) This sucks

  2) At the very least, we should be clear in our API samples tests
     which of the three formats we expect - we should only change the
     format used in a given part of the API after weighing any
     compatibility implications

  3) We should unify on a single format in the v3 API - IMHO, we should 
     be explicit about use of the UTC timezone and we should avoid 
     including microseconds unless there's a clear use case. In other 
     words, we should use the 'isotime' format.

  4) The 'xmltime' format is just a dumb historical mistake and since 
     XML support is now firmly out of favor, let's not waste time 
     improving the timestamp situation in XML.

  5) We should at least consider moving to a single format in the v2 
     (JSON) API. IMHO, moving from strtime to isotime for fields like 
     created_at and updated_at would be highly unlikely to cause any 
     real issues for API users.

(Following up this email with some patches that I'll link to, but I want
to link to this email from the patches themselves)

Mark.


