> I'm slightly leaning towards %:z because changing the semantics of %z
> could be construed as a backwards-incompatible change (albeit a minor one).
> I know some people have been asking for a "strict" version of the dateutil
> parser, and people do tend to use parsers for string validation. Adding the
> %:z option has the advantage that it's unambiguously backwards compatible,
> and it can be added to strftime if that is deemed desirable.

I think the issue in dateutil is a different one, as its parser is fully
flexible. Here, even if this can be construed as a backwards-incompatible
change (the same could be said of glibc), it seems quite fragile to rely on
isoparse with %z to check that your offset does not contain a ':'. In
dateutil, by contrast, it is true that the parser will sometimes happily
parse things that "don't seem to be a date" (although they can actually be
interpreted as one). Moreover, this would (ideally) land in a new Python
version (3.7), not in a random patch release.

Last but not least: as a user who has not even read the docs, would you not
expect %z to be able to parse ISO standard offsets? I actually found it
surprising that it could not.

I'd just keep it simple. I strongly prefer:

"%z parses RFC-822/ISO 8601 standard UTC offsets" (what you usually work
with)

over:

if your offsets have a colon, use "%:z"; if they don't, use "%z"; if they
can use Zulu, remember to check for "Z" as well.

BUT! As said, no authority here :)

On 21 October 2017 at 17:12, Paul G <[email protected]> wrote:

> Back to the subject of how to handle +-HH:MM, I think the only really
> viable candidates are %z and %:z, so I think the question boils down to
> whether, with strptime, we care more about consistency with GNU / glibc's
> strptime (which apparently does implement %z to cover both HHMM and HH:MM) or
> whether we care more about users being able to specify *exactly* the
> string they want to match (e.g. allowing users to specify that a colon
> found in a time zone offset is an error condition).
>
> I'm slightly leaning towards %:z because changing the semantics of %z
> could be construed as a backwards-incompatible change (albeit a minor one).
> I know some people have been asking for a "strict" version of the dateutil
> parser, and people do tend to use parsers for string validation.
> Adding the %:z option has the advantage that it's unambiguously backwards
> compatible, and it can be added to strftime if that is deemed desirable.
>
> Best,
>
> Paul
>
> On 10/21/2017 09:07 AM, Mario Corchero wrote:
> > Sorry, hit send by mistake on the previous message.
> >
> > That is fine for parsing, but my issue with this is symmetry with
> > strftime.
> >
> >
> > I can agree with having a %:z for support in strftime but I think that is a
> > separate change. The issue I opened with the attached PR focused only on
> > strptime to facilitate the discussion.
> >
> > Again, what is the alternative?
> >
> >
> > Making %z accept time-offset rfc3339 compatible.
> >
> > I have a working strptime:
> >
> >
> > Ouch, except for the fractional seconds (which was not part of the issue
> > raised) I had also a patch for the colon and another for supporting 'Z' as
> > reported in the bug tracker. I was mentioning working with Paul in the
> > implementation of isoparse, as even if it might look simple it has caused
> > many long-standing discussions in the past.
> >
> > On 21 October 2017 at 13:55, Mario Corchero <[email protected]> wrote:
> >
> >>
> >>
> >> On 21 October 2017 at 13:18, Oren Tirosh <[email protected]> wrote:
> >>
> >>>
> >>> On Sat, 21 Oct 2017 at 13:24, Mario Corchero <[email protected]> wrote:
> >>>
> >>>> My opinion (as a user, I have no authority here whatsoever)
> >>>>
> >>>> *1) About parsing colons in offsets with strptime*
> >>>>
> >>>> I think having %z support both +-HH:MM and +-HHMM would be the best
> >>>> choice, as it seems the simplest for me as a user.
> >>>> I'd go even further, making %z support ':' and 'Z', *a la glibc*.
> >>>> This effectively means that %z can now parse: Z, ±hh:mm, ±hhmm, or ±hh
> >>>>
> >>>
> >>> That is fine for parsing, but my issue with this is symmetry with
> >>> strftime.
> >>> If the same extensions are also implemented for formatting (I
> >>> have a prototype) then you need some way to specify whether you want a :
> >>> separator or not. The %z will have to remain without colon on formatting
> >>> for backward compatibility.
> >>>
> >>> So I agree that the parser can be safely made more liberal in what it
> >>> accepts, but the formatter must be strict and specific in what it produces.
> >>>
> >>>> I think this gives the best experience to the strptime user. It
> >>>> basically makes the time-offset rfc3339
> >>>> <https://tools.ietf.org/html/rfc3339> compatible.
> >>>>
> >>>
> >>> Yes, that's the goal.
> >>>
> >>>> *2) Adding a handy function to build a datetime from a string serialized
> >>>> with isoformat*
> >>>> Absolutely agree on having an isoparse. That would be amazing, we can
> >>>> even build it on top of 1).
> >>>>
> >>>
> >>> ...and building it on top of 1 requires several extensions and variants.
> >>> People here seem to be a bit taken aback by the scope of these extensions.
> >>> I understand this reaction, but I maintain that most or all this complexity
> >>> is necessary if you want to implement this on top of strptime rather than a
> >>> custom isoparse().
> >>>
> >>>> *Side note:*
> >>>> I am not totally in favour with "%?:z" (probably because I am leaning
> >>>> on %z doing the parsing for both and ?z will have no place on strftime).
> >>>> I think this starts to add way too much complexity to just say "parse a
> >>>> time-offset".
> >>>>
> >>>
> >>> Again, what is the alternative? If you want a parser that accepts the
> >>> output of isoformat() for all possible datetime values (except custom
> >>> tzinfo) then it needs to support a missing tz offset as indicating a naive
> >>> timestamp.
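To illustrate the formatting side of this argument, here is a minimal sketch. The helper `format_offset` is hypothetical (it is not part of `datetime`, nor of any patch mentioned in this thread) and assumes whole-minute offsets. It shows what `strftime('%z')` produces today next to what a colon-separated variant would emit.

```python
from datetime import datetime, timedelta, timezone


def format_offset(dt, colon=False):
    """Hypothetical helper: render dt's UTC offset as +HHMM or +HH:MM.

    Returns '' for a naive datetime, mirroring what strftime('%z') does.
    Assumes the offset is a whole number of minutes.
    """
    offset = dt.utcoffset()
    if offset is None:
        return ""
    total_minutes = int(offset.total_seconds() // 60)
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    separator = ":" if colon else ""
    return f"{sign}{hours:02d}{separator}{minutes:02d}"


aware = datetime(2017, 10, 21, 13, 18, tzinfo=timezone(timedelta(hours=5, minutes=30)))
naive = datetime(2017, 10, 21, 13, 18)

print(aware.strftime("%z"))              # existing behaviour: no colon
print(format_offset(aware, colon=True))  # what a "%:z"-style directive would emit
print(repr(naive.strftime("%z")))        # naive datetimes format to an empty string
```

Keeping the colon behind an explicit switch is what lets the formatter stay "strict and specific" while the parser is free to accept both spellings.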
> >>>
> >>> You can say that the real source of the asymmetry here is not with my
> >>> proposal but rather in the underlying strftime/strptime: on formatting, %z
> >>> yields an empty string for a naive timestamp rather than producing an
> >>> error. But on parsing, it refuses to parse a timestamp with no offset. A
> >>> truly symmetric implementation would have accepted it as a naive
> >>> timestamp.
> >>>
> >>> Too late for %z because it must remain backward compatible, but perhaps
> >>> %:z can be made to accept a missing offset as a naive timestamp. The user
> >>> can then check for naive timestamp and reject them if they are unacceptable
> >>> in that context, rather than specifying whether a missing timestamp is
> >>> acceptable or not in the format string. I have no problem with either
> >>> solution.
> >>>
> >>>>
> >>>> *Implementation:*
> >>>> I am happy to work with PaulG in the isoparse implementation if we
> >>>> decide to go with it and if he wants to get involved :)
> >>>>
> >>>
> >>> I have a working strptime:
> >>> https://github.com/orent/cpython/tree/strptime_extensions
> >>>
> >>> isoparse() on top of this strptime is a trivial one-liner.
> >>>
> >>> Oren
> >>>
> >>>>
> >>>>
> >>>> *Thanks:*
> >>>> Thanks for dedicating time to this, I think that even if minor this
> >>>> would be a killer addition to 3.7 if we manage to get it through.
> >>>>
> >>>> On 21 October 2017 at 07:34, Oren Tirosh <[email protected]> wrote:
> >>>>
> >>>>> ok, let's try to separate the issues and choices on each one:
> >>>>>
> >>>>> 1. Extending strptime to support time zone offset with : separator:
> >>>>> Should a single directive accept either hhmm or hh:mm or use two
> >>>>> separate directives?
> >>>>>
> >>>>> 2. Round tripping of isoformat() back to datetime value:
> >>>>> Implement custom isoparse() function or extend strptime so isoparse
> >>>>> simply calls strptime with a default format?
> >>>>> Support all variations produced by isoformat or just a subset?
> >>>>> (Variations include with/without fraction, with/without tz and separator
> >>>>> choice)
> >>>>>
> >>>>> I suggest: 1. separate directives, 2a. extend strptime, and 2b. support
> >>>>> all variations. Do you have different preferences on any of these
> >>>>> questions?
> >>>>>
> >>>>> I understand that the number of extensions to support this seems
> >>>>> excessive to you.
> >>>>>
> >>>>> Technically, my proposed "%.f" is not really necessary. I added it for
> >>>>> completeness. We can keep using ".%f" for non-optional fraction and define
> >>>>> "%?f" to implicitly include the dot.
> >>>>>
> >>>>> The distinction between "%z", "%:z" and "%?:z" can also be narrowed
> >>>>> down. This can be done, for example, by making "%z" and "%?s" always accept
> >>>>> hhmm with or without the : separator.
> >>>>>
> >>>>> On Fri, 20 Oct 2017 at 17:16, Paul G <[email protected]> wrote:
> >>>>>
> >>>>>> I think this would be a much bigger change to the strptime interface
> >>>>>> than is actually warranted, and probably would add in additional,
> >>>>>> unnecessary complexity by introducing the concept of optional matches.
> >>>>>> Adding the capability to match HH:MM offsets is a reasonable extension
> >>>>>> partially because that is a standard representation that is currently *not*
> >>>>>> covered by strptime, and the fact that that's how isoformat() represents
> >>>>>> the offset just makes this lack all the more acute.
> >>>>>>
> >>>>>> I think it should be uncontroversial to add *one* of these two %z
> >>>>>> extensions to Python 3 without getting bogged down in allowing a single
> >>>>>> strptime string to match any output from `.isoformat`.
> >>>>>>
> >>>>>> That said, I'm also very much in favor of a `.isoparse` or
> >>>>>> `.fromisoformat` constructor that *is* the inverse of `isoformat`, which
> >>>>>> should solve the issue without sweeping changes to how `strptime` works.
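A rough sketch of what such an inverse could look like without touching `strptime` at all. The name `isoparse_sketch` and the regular expression are illustrative assumptions, not the implementation proposed in this thread. It covers only the variations listed above (' ' or 'T' separator, optional fraction, optional +HH:MM or -HH:MM offset) and does not handle custom tzinfo output.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the shapes produced by datetime.isoformat()/str() for fixed offsets:
# date, ' ' or 'T', time, optional .ffffff, optional +HH:MM / -HH:MM.
_ISO_RE = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})[ T](\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d{1,6}))?"
    r"(?:([+-])(\d{2}):(\d{2}))?"
)


def isoparse_sketch(text):
    """Hypothetical inverse of isoformat() for the common variations."""
    match = _ISO_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not an isoformat() string: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    fraction, sign, off_h, off_m = match.groups()[6:]
    # Right-pad the fraction so '.5' means 500000 microseconds.
    microsecond = int(fraction.ljust(6, "0")) if fraction else 0
    tzinfo = None  # a missing offset means a naive datetime
    if sign is not None:
        delta = timedelta(hours=int(off_h), minutes=int(off_m))
        tzinfo = timezone(-delta if sign == "-" else delta)
    return datetime(year, month, day, hour, minute, second, microsecond, tzinfo=tzinfo)


for dt in (
    datetime(2017, 10, 21, 13, 18),
    datetime(2017, 10, 21, 13, 18, 0, 123),
    datetime(2017, 10, 21, 13, 18, tzinfo=timezone(timedelta(hours=2))),
):
    # Both str(dt) and dt.isoformat() should parse back to the same value.
    print(isoparse_sketch(str(dt)) == dt, isoparse_sketch(dt.isoformat()) == dt)
```

The point of the sketch is only that the optional pieces (fraction, offset, separator) live in one dedicated function instead of in new optional-match directives.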
> >>>>>>
> >>>>>> On 10/19/2017 04:07 PM, Oren Tirosh wrote:
> >>>>>>> https://github.com/orent/cpython/tree/strptime_extensions
> >>>>>>>
> >>>>>>> %:z - matches +HH:MM
> >>>>>>> %?:z - optional %:z
> >>>>>>> %.f - equivalent to .%f
> >>>>>>> %?.f - optional %.f
> >>>>>>> %?t - matches ' ' or 'T'
> >>>>>>>
> >>>>>>> What they all have in common is that together they make it possible to
> >>>>>>> write a strptime format that matches all possible output variations of
> >>>>>>> datetime.__str__/datetime.isoformat.
> >>>>>>>
> >>>>>>> The time zone not only supports the : separator but also allows making the
> >>>>>>> entire component optional, as isoformat() will add it only for aware
> >>>>>>> datetime objects. The seconds fraction is dropped from the default string
> >>>>>>> representation if the datetime represents a whole second. Since it is
> >>>>>>> dropped along with the decimal dot, I first made "%.f" that includes the
> >>>>>>> dot and then created the optional variant. Finally, "%?t" can be used to
> >>>>>>> accept a timestamp with either of the separators defined in iso8601.
> >>>>>>>
> >>>>>>> It is quite absurd that datetime cannot parse its own string
> >>>>>>> representation. Using these extensions an .isoparse() method may be added
> >>>>>>> that calls strptime('%Y-%m-%d%?t%H:%M:%S%?.f%?:z') and supports full
> >>>>>>> round-tripping of all possible datetime values that do not use a custom
> >>>>>>> tzinfo.
> >>>>>>>
> >>>>>>> Oren
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, 19 Oct 2017 at 17:06, Paul G <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>> There is a new issue about the %z directive in strptime on the issue
> >>>>>>>> tracker: https://bugs.python.org/issue31800 (linked to a few related
> >>>>>>>> issues), and a linked PR expanding the definition of %z to match HH:MM:
> >>>>>>>> https://github.com/python/cpython/pull/4015
> >>>>>>>>
> >>>>>>>> I think either adding a %:z directive or expanding the definition of %z
> >>>>>>>> would be pretty important, and I think there's a good case to be made for
> >>>>>>>> either one. To summarize the arguments for people on the mailing list:
> >>>>>>>>
> >>>>>>>> The argument for expanding the definition of %z that I find strongest is
> >>>>>>>> that according to the linux man pages (
> >>>>>>>> http://man7.org/linux/man-pages/man3/strptime.3.html ), while %z generates
> >>>>>>>> +-HHMM in strftime, strptime is supposed to match "An RFC-822/ISO 8601
> >>>>>>>> standard timezone specification", and ISO 8601 uses +-HH:MM, so if we're
> >>>>>>>> following those linux pages, we should be accepting the version with the
> >>>>>>>> colon.
> >>>>>>>>
> >>>>>>>> The arguments that I find most compelling for adding a %:z directive are:
> >>>>>>>>
> >>>>>>>> 1. maintains the symmetry between strftime and strptime
> >>>>>>>> 2. allows users to be stricter about their datetime format
> >>>>>>>> 3. has precedent in that GNU's `date` command accepts %z, %:z and
> >>>>>>>> %::z formats
> >>>>>>>>
> >>>>>>>> Can we establish some consensus on which should be done so that it can be
> >>>>>>>> implemented?
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>>
> >>>>>>>> Paul
> >>>>>>>>
> >>>>>>>> _______________________________________________
> >>>>>>>> Datetime-SIG mailing list
> >>>>>>>> [email protected]
> >>>>>>>> https://mail.python.org/mailman/listinfo/datetime-sig
> >>>>>>>> The PSF Code of Conduct applies to this mailing list:
> >>>>>>>> https://www.python.org/psf/codeofconduct/
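
For reference, the two behaviours debated in this thread can be sketched for the offset field alone. This is a minimal illustration under stated assumptions: `parse_offset` and its `lenient` switch are hypothetical names, not code from the PR or from glibc. The lenient mode accepts Z, ±hh:mm, ±hhmm and ±hh, while the strict mode accepts only ±hhmm.

```python
import re
from datetime import timedelta, timezone

# Lenient form: 'Z', or a sign, two-digit hours, and optional minutes with
# or without a ':' separator.
_LENIENT_RE = re.compile(r"Z|([+-])(\d{2})(?::?(\d{2}))?")
# Strict form: exactly +HHMM / -HHMM, no colon, no 'Z'.
_STRICT_RE = re.compile(r"([+-])(\d{2})(\d{2})")


def parse_offset(text, lenient=True):
    """Hypothetical helper: turn an offset string into a timezone."""
    pattern = _LENIENT_RE if lenient else _STRICT_RE
    match = pattern.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported UTC offset: {text!r}")
    sign, hours, minutes = match.groups()
    if sign is None:  # only the bare 'Z' alternative leaves the sign unset
        return timezone.utc
    delta = timedelta(hours=int(hours), minutes=int(minutes or 0))
    return timezone(-delta if sign == "-" else delta)


for candidate in ("Z", "+05:30", "+0530", "-03"):
    print(candidate, parse_offset(candidate))
```

With `lenient=False` the same call rejects "+05:30" and "Z", which is the validation use case behind a separate strict directive.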
