Sorry, hit send by mistake on the previous message.

> That is fine for parsing, but my issue with this is symmetry with strftime.

I can agree with having a %:z for support in strftime, but I think that is a
separate change. The issue I opened with the attached PR focused only on
strptime to facilitate the discussion.

> Again, what is the alternative?

Making %z accept an RFC 3339 compatible time-offset.

> I have a working strptime:

Ouch. Except for the fractional seconds (which were not part of the issue
raised), I also had a patch for the colon and another for supporting 'Z', as
reported in the bug tracker.

I was mentioning working with Paul on the implementation of isoparse because,
even if it might look simple, it has caused many long-standing discussions in
the past.

On 21 October 2017 at 13:55, Mario Corchero <[email protected]> wrote:

>
> On 21 October 2017 at 13:18, Oren Tirosh <[email protected]> wrote:
>
>>
>> On Sat, 21 Oct 2017 at 13:24, Mario Corchero <[email protected]> wrote:
>>
>>> My opinion (as a user, I have no authority here whatsoever)
>>>
>>> *1) About parsing colons in offsets with strptime*
>>>
>>> I think having %z support both +-HH:MM and +-HHMM would be the best
>>> choice, as it seems the simplest for me as a user.
>>> I'd go even further, making %z support ':' and 'Z', *a la glibc*.
>>> This effectively means that %z can now parse: Z, ±hh:mm, ±hhmm, or ±hh
>>>
>>
>> That is fine for parsing, but my issue with this is symmetry with
>> strftime. If the same extensions are also implemented for formatting (I
>> have a prototype) then you need some way to specify whether you want a :
>> separator or not. The %z will have to remain without colon on formatting
>> for backward compatibility.
>>
>> So I agree that the parser can be safely made more liberal in what it
>> accepts, but the formatter must be strict and specific in what it produces.
>>
>>> I think this gives the best experience to the strptime user. It
>>> basically makes the time-offset rfc3339
>>> <https://tools.ietf.org/html/rfc3339> compatible.
>>>
>>
>> Yes, that's the goal.
>>
>>> *2) Adding a handy function to build a datetime from a string serialized
>>> with isoformat*
>>> Absolutely agree on having an isoparse. That would be amazing, we can
>>> even build it on top of 1).
>>>
>>
>> ...and building it on top of 1 requires several extensions and variants.
>> People here seem to be a bit taken aback by the scope of these extensions.
>> I understand this reaction, but I maintain that most or all of this
>> complexity is necessary if you want to implement this on top of strptime
>> rather than a custom isoparse().
>>
>>> *Side note:*
>>> I am not totally in favour of "%?:z" (probably because I am leaning
>>> on %z doing the parsing for both and ?z will have no place on strftime).
>>> I think this starts to add way too much complexity to just say "parse a
>>> time-offset".
>>>
>>
>> Again, what is the alternative? If you want a parser that accepts the
>> output of isoformat() for all possible datetime values (except custom
>> tzinfo) then it needs to support a missing tz offset as indicating a naive
>> timestamp.
>>
>> You can say that the real source of the asymmetry here is not with my
>> proposal but rather in the underlying strftime/strptime: on formatting, %z
>> yields an empty string for a naive timestamp rather than producing an
>> error. But on parsing, it refuses to parse a timestamp with no offset. A
>> truly symmetric implementation would have accepted it as a naive
>> timestamp.
>>
>> Too late for %z because it must remain backward compatible, but perhaps
>> %:z can be made to accept a missing offset as a naive timestamp. The user
>> can then check for naive timestamps and reject them if they are
>> unacceptable in that context, rather than specifying whether a missing
>> timestamp is acceptable or not in the format string. I have no problem
>> with either solution.
>>
>>>
>>> *Implementation:*
>>> I am happy to work with PaulG in the isoparse implementation if we
>>> decide to go with it and if he wants to get involved :)
>>>
>>
>> I have a working strptime:
>> https://github.com/orent/cpython/tree/strptime_extensions
>>
>> isoparse() on top of this strptime is a trivial one-liner.
>>
>> Oren
>>
>>>
>>> *Thanks:*
>>> Thanks for dedicating time to this, I think that even if minor this
>>> would be a killer addition to 3.7 if we manage to get it through.
>>>
>>> On 21 October 2017 at 07:34, Oren Tirosh <[email protected]> wrote:
>>>
>>>> ok, let's try to separate the issues and choices on each one:
>>>>
>>>> 1. Extending strptime to support time zone offset with : separator:
>>>> Should a single directive accept either hhmm or hh:mm or use two
>>>> separate directives?
>>>>
>>>> 2. Round tripping of isoformat() back to datetime value:
>>>> Implement custom isoparse() function or extend strptime so isoparse
>>>> simply calls strptime with a default format?
>>>> Support all variations produced by isoformat or just a subset?
>>>> (Variations include with/without fraction, with/without tz and separator
>>>> choice)
>>>>
>>>> I suggest: 1. separate directives, 2a. extend strptime, and 2b. support
>>>> all variations. Do you have different preferences on any of these
>>>> questions?
>>>>
>>>> I understand that the number of extensions to support this seems
>>>> excessive to you.
>>>>
>>>> Technically, my proposed "%.f" is not really necessary. I added it for
>>>> completeness. We can keep using ".%f" for non-optional fraction and define
>>>> "%?f" to implicitly include the dot.
>>>>
>>>> The distinction between "%z", "%:z" and "%?:z" can also be narrowed
>>>> down. This can be done, for example, by making "%z" and "%?z" always
>>>> accept hhmm with or without the : separator.
>>>>
>>>> On Fri, 20 Oct 2017 at 17:16, Paul G <[email protected]> wrote:
>>>>
>>>>> I think this would be a much bigger change to the strptime interface
>>>>> than is actually warranted, and probably would add in additional,
>>>>> unnecessary complexity by introducing the concept of optional matches.
>>>>> Adding the capability to match HH:MM offsets is a reasonable extension
>>>>> partially because that is a standard representation that is currently
>>>>> *not* covered by strptime, and the fact that that's how isoformat()
>>>>> represents the offset just makes this lack all the more acute.
>>>>>
>>>>> I think it should be uncontroversial to add *one* of these two %z
>>>>> extensions to Python 3 without getting bogged down in allowing a single
>>>>> strptime string to match any output from `.isoformat`.
>>>>>
>>>>> That said, I'm also very much in favor of a `.isoparse` or
>>>>> `.fromisoformat` constructor that *is* the inverse of `isoformat`, which
>>>>> should solve the issue without sweeping changes to how `strptime` works.
>>>>>
>>>>> On 10/19/2017 04:07 PM, Oren Tirosh wrote:
>>>>> > https://github.com/orent/cpython/tree/strptime_extensions
>>>>> >
>>>>> > %:z - matches +HH:MM
>>>>> > %?:z - optional %:z
>>>>> > %.f - equivalent to .%f
>>>>> > %?.f - optional %.f
>>>>> > %?t - matches ' ' or 'T'
>>>>> >
>>>>> > What they all have in common is that together they make it possible
>>>>> > to write a strptime format that matches all possible output
>>>>> > variations of datetime.__str__/ datetime.isoformat.
>>>>> >
>>>>> > The time zone not only supports the : separator but also allows
>>>>> > making the entire component optional, as isoformat() will add it only
>>>>> > for aware datetime objects. The seconds fraction is dropped from the
>>>>> > default string representation if the datetime represents a whole
>>>>> > second. Since it is dropped along with the decimal dot, I first made
>>>>> > "%.f" that includes the dot and then created the optional variant.
>>>>> > Finally, "%?t" can be used to accept a timestamp with either of the
>>>>> > separators defined in iso8601.
>>>>> >
>>>>> > It is quite absurd that datetime cannot parse its own string
>>>>> > representation. Using these extensions an .isoparse() method may be
>>>>> > added that calls strptime('%Y-%m-%d%?t%H:%M:%S%?.f%?:z') and supports
>>>>> > full round-tripping of all possible datetime values that do not use a
>>>>> > custom tzinfo.
>>>>> >
>>>>> > Oren
>>>>> >
>>>>> > On Thu, 19 Oct 2017 at 17:06, Paul G <[email protected]> wrote:
>>>>> >>
>>>>> >> There is a new issue about the %z directive in strptime on the issue
>>>>> >> tracker: https://bugs.python.org/issue31800 (linked to a few related
>>>>> >> issues), and a linked PR expanding the definition of %z to match
>>>>> >> HH:MM: https://github.com/python/cpython/pull/4015
>>>>> >>
>>>>> >> I think either adding a %:z directive or expanding the definition of
>>>>> >> %z would be pretty important, and I think there's a good case to be
>>>>> >> made for either one. To summarize the arguments for people on the
>>>>> >> mailing list:
>>>>> >>
>>>>> >> The argument for expanding the definition of %z that I find
>>>>> >> strongest is that according to the linux man pages (
>>>>> >> http://man7.org/linux/man-pages/man3/strptime.3.html ), while %z
>>>>> >> generates +-HHMM in strftime, strptime is supposed to match "An
>>>>> >> RFC-822/ISO 8601 standard timezone specification", and ISO 8601 uses
>>>>> >> +-HH:MM, so if we're following those linux pages, we should be
>>>>> >> accepting the version with the colon.
>>>>> >>
>>>>> >> The arguments that I find most compelling for adding a %:z directive
>>>>> >> are:
>>>>> >>
>>>>> >> 1. maintains the symmetry between strftime and strptime
>>>>> >> 2. allows users to be stricter about their datetime format
>>>>> >> 3. has precedent in that GNU's `date` command accepts %z, %:z and
>>>>> >> %::z formats
>>>>> >>
>>>>> >> Can we establish some consensus on which should be done so that it
>>>>> >> can be implemented?
>>>>> >>
>>>>> >> Best,
>>>>> >>
>>>>> >> Paul
>>>>> >>
>>>>> >> _______________________________________________
>>>>> >> Datetime-SIG mailing list
>>>>> >> [email protected]
>>>>> >> https://mail.python.org/mailman/listinfo/datetime-sig
>>>>> >> The PSF Code of Conduct applies to this mailing list:
>>>>> >> https://www.python.org/psf/codeofconduct/
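
To make the parsing/formatting asymmetry discussed in the quoted thread
concrete, here is a minimal sketch. `parse_offset` is a hypothetical helper
written only for illustration; it is not part of the datetime module, of PR
4015, or of the strptime_extensions branch. It accepts the liberal set of
inputs proposed for %z (Z, ±hh:mm, ±hhmm, ±hh) and, as an assumption, treats
an empty string as a naive timestamp, while formatting still goes through the
existing strict strftime %z.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical helper for illustration; not the implementation in either patch.
_OFFSET_RE = re.compile(r"([+-])(\d{2})(?::?(\d{2}))?")


def parse_offset(text):
    """Liberal parser: accepts '', 'Z', +-hh, +-hhmm and +-hh:mm."""
    if text == "":
        return None  # missing offset: treat the timestamp as naive
    if text == "Z":
        return timezone.utc
    match = _OFFSET_RE.fullmatch(text)
    if match is None:
        raise ValueError("invalid UTC offset: %r" % text)
    sign, hours, minutes = match.groups()
    delta = timedelta(hours=int(hours), minutes=int(minutes or 0))
    return timezone(-delta if sign == "-" else delta)


naive = datetime(2017, 10, 21, 13, 55)
aware = naive.replace(tzinfo=parse_offset("+05:30"))

# Strict formatter: the existing %z never emits a colon, and it yields an
# empty string (not an error) for a naive datetime.
print(repr(naive.strftime("%z")))  # ''
print(repr(aware.strftime("%z")))  # '+0530'

# Existing parser: %z refuses input that has no offset at all.
try:
    datetime.strptime("2017-10-21T13:55:00", "%Y-%m-%dT%H:%M:%S%z")
except ValueError as exc:
    print("strptime rejected the naive timestamp:", exc)
```

The point of the sketch is only that a parser can accept several spellings of
the same offset without the formatter having to change what it emits.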
