On Sat, 21 Oct 2017 at 13:24, Mario Corchero <[email protected]> wrote:
> My opinion (as a user, I have no authority here whatsoever)
>
> *1) About parsing colons in offsets with strptime*
>
> I think having %z support both +-HH:MM and +-HHMM would be the best
> choice, as it seems the simplest for me as a user.
> I'd go even further, making %z support ':' and 'Z', *a la glibc*.
> This effectively means that %z can now parse: Z, ±hh:mm, ±hhmm, or ±hh
>

That is fine for parsing, but my issue with this is symmetry with
strftime. If the same extensions are also implemented for formatting (I
have a prototype) then you need some way to specify whether you want a :
separator or not. %z will have to remain without a colon on formatting
for backward compatibility. So I agree that the parser can be safely made
more liberal in what it accepts, but the formatter must be strict and
specific in what it produces.

> I think this gives the best experience to the strptime user. It basically
> makes the time-offset rfc3339 <https://tools.ietf.org/html/rfc3339>
> compatible.
>

Yes, that's the goal.

> *2) Adding a handy function to build a datetime from a string serialized
> with isoformat*
> Absolutely agree on having an isoparse. That would be amazing, we can even
> build it on top of 1).
>

...and building it on top of 1 requires several extensions and variants.
People here seem to be a bit taken aback by the scope of these extensions.
I understand this reaction, but I maintain that most or all of this
complexity is necessary if you want to implement this on top of strptime
rather than a custom isoparse().

> *Side note:*
> I am not totally in favour with "%?:z" (probably because I am leaning on
> %z doing the parsing for both and ?z will have no place on strftime).
> I think this starts to add way too much complexity to just say "parse a
> time-offset".
>

Again, what is the alternative?
If you want a parser that accepts the output of isoformat() for all
possible datetime values (except custom tzinfo) then it needs to support a
missing tz offset as indicating a naive timestamp. You can say that the
real source of the asymmetry here is not my proposal but the underlying
strftime/strptime: on formatting, %z yields an empty string for a naive
timestamp rather than producing an error. But on parsing, it refuses to
parse a timestamp with no offset. A truly symmetric implementation would
have accepted it as a naive timestamp. It is too late for %z because it
must remain backward compatible, but perhaps %:z can be made to accept a
missing offset as a naive timestamp. The user can then check for a naive
timestamp and reject it if it is unacceptable in that context, rather than
specifying in the format string whether a missing offset is acceptable.
I have no problem with either solution.

>
> *Implementation:*
> I am happy to work with PaulG in the isoparse implementation if we decide
> to go with it and if he wants to get involved :)
>

I have a working strptime:
https://github.com/orent/cpython/tree/strptime_extensions

isoparse() on top of this strptime is a trivial one-liner.

Oren

>
> *Thanks:*
> Thanks for dedicating time to this, I think that even if minor this would
> be a killer addition to 3.7 if we manage to get it through.
>
> On 21 October 2017 at 07:34, Oren Tirosh <[email protected]> wrote:
>
>> ok, let's try to separate the issues and choices on each one:
>>
>> 1. Extending strptime to support time zone offset with : separator:
>> Should a single directive accept either hhmm or hh:mm or use two
>> separate directives?
>>
>> 2. Round tripping of isoformat() back to datetime value:
>> Implement custom isoparse() function or extend strptime so isoparse
>> simply calls strptime with a default format?
>> Support all variations produced by isoformat or just a subset?
>> (Variations include with/without fraction, with/without tz and separator
>> choice)
>>
>> I suggest: 1. separate directives, 2a. extend strptime, and 2b. support
>> all variations. Do you have different preferences on any of these
>> questions?
>>
>> I understand that the number of extensions to support this seems
>> excessive to you.
>>
>> Technically, my proposed "%.f" is not really necessary. I added it for
>> completeness. We can keep using ".%f" for non-optional fraction and define
>> "%?f" to implicitly include the dot.
>>
>> The distinction between "%z", "%:z" and "%?:z" can also be narrowed
>> down. This can be done, for example, by making "%z" and "%?s" always accept
>> hhmm with or without the : separator.
>>
>> On Fri, 20 Oct 2017 at 17:16, Paul G <[email protected]> wrote:
>>
>>> I think this would be a much bigger change to the strptime interface
>>> than is actually warranted, and probably would add in additional,
>>> unnecessary complexity by introducing the concept of optional matches.
>>> Adding the capability to match HH:MM offsets is a reasonable extension
>>> partially because that is a standard representation that is currently
>>> *not* covered by strptime, and the fact that that's how isoformat()
>>> represents the offset just makes this lack all the more acute.
>>>
>>> I think it should be uncontroversial to add *one* of these two %z
>>> extensions to Python 3 without getting bogged down in allowing a single
>>> strptime string to match any output from `.isoformat`.
>>>
>>> That said, I'm also very much in favor of a `.isoparse` or
>>> `.fromisoformat` constructor that *is* the inverse of `isoformat`, which
>>> should solve the issue without sweeping changes to how `strptime` works.
>>>
>>> On 10/19/2017 04:07 PM, Oren Tirosh wrote:
>>> > https://github.com/orent/cpython/tree/strptime_extensions
>>> >
>>> > %:z - matches +HH:MM
>>> > %?:z - optional %:z
>>> > %.f - equivalent to .%f
>>> > %?.f - optional %.f
>>> > %?t - matches ' ' or 'T'
>>> >
>>> > What they all have in common is that together they make it possible to
>>> > write a strptime format that matches all possible output variations of
>>> > datetime.__str__/datetime.isoformat.
>>> >
>>> > The time zone not only supports the : separator but also allows making
>>> > the entire component optional, as isoformat() will add it only for aware
>>> > datetime objects. The seconds fraction is dropped from the default
>>> > string representation if the datetime represents a whole second. Since
>>> > it is dropped along with the decimal dot, I first made "%.f" that
>>> > includes the dot and then created the optional variant. Finally, "%?t"
>>> > can be used to accept a timestamp with either of the separators defined
>>> > in iso8601.
>>> >
>>> > It is quite absurd that datetime cannot parse its own string
>>> > representation. Using these extensions an .isoparse() method may be
>>> > added that calls strptime('%Y-%m-%d%?t%H:%M:%S%?.f%?:z') and supports
>>> > full round-tripping of all possible datetime values that do not use a
>>> > custom tzinfo.
>>> >
>>> > Oren
>>> >
>>> > On Thu, 19 Oct 2017 at 17:06, Paul G <[email protected]> wrote:
>>> >>
>>> >> There is a new issue about the %z directive in strptime on the issue
>>> >> tracker: https://bugs.python.org/issue31800 (linked to a few related
>>> >> issues), and a linked PR expanding the definition of %z to match HH:MM:
>>> >> https://github.com/python/cpython/pull/4015
>>> >>
>>> >> I think either adding a %:z directive or expanding the definition of %z
>>> >> would be pretty important, and I think there's a good case to be made
>>> >> for either one.
>>> >> To summarize the arguments for people on the mailing list:
>>> >>
>>> >> The argument for expanding the definition of %z that I find strongest
>>> >> is that according to the linux man pages (
>>> >> http://man7.org/linux/man-pages/man3/strptime.3.html ), while %z
>>> >> generates +-HHMM in strftime, strptime is supposed to match "An
>>> >> RFC-822/ISO 8601 standard timezone specification", and ISO 8601 uses
>>> >> +-HH:MM, so if we're following those linux pages, we should be
>>> >> accepting the version with the colon.
>>> >>
>>> >> The arguments that I find most compelling for adding a %:z directive
>>> >> are:
>>> >>
>>> >> 1. maintains the symmetry between strftime and strptime
>>> >> 2. allows users to be stricter about their datetime format
>>> >> 3. has precedent in that GNU's `date` command accepts %z, %:z and
>>> >> %::z formats
>>> >>
>>> >> Can we establish some consensus on which should be done so that it can
>>> >> be implemented?
>>> >>
>>> >> Best,
>>> >>
>>> >> Paul
>>> >>
>>> >> _______________________________________________
>>> >> Datetime-SIG mailing list
>>> >> [email protected]
>>> >> https://mail.python.org/mailman/listinfo/datetime-sig
>>> >> The PSF Code of Conduct applies to this mailing list:
>>> >> https://www.python.org/psf/codeofconduct/
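
For anyone following the thread, here is a minimal sketch of the %z
asymmetry described in the reply above (an empty string on formatting, a
refusal on parsing) and of what the "custom isoparse()" alternative could
look like using only the existing directives. The name isoparse_sketch and
the regular expression are hypothetical and are not taken from the linked
strptime_extensions branch. The sketch rewrites the colon in the offset
itself, so it does not depend on whether %z accepts +HH:MM.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: offsets are whole minutes written as +HH:MM or -HH:MM.
_OFFSET_RE = re.compile(r'([+-]\d{2}):(\d{2})$')


def isoparse_sketch(text):
    """Hypothetical helper: parse the output of str(dt) or dt.isoformat()."""
    # Rewrite a trailing "+HH:MM" as "+HHMM", the form %z itself produces.
    text, has_offset = _OFFSET_RE.subn(r'\1\2', text)
    if len(text) < 11 or text[10] not in ('T', ' '):
        raise ValueError('expected T or space after the date: %r' % text)
    fmt = '%Y-%m-%d' + text[10] + '%H:%M:%S'
    if '.' in text:
        fmt += '.%f'   # fraction is present only for non-zero microseconds
    if has_offset:
        fmt += '%z'    # offset is present only for aware datetimes
    return datetime.strptime(text, fmt)


naive = datetime(2017, 10, 21, 13, 24)
aware = naive.replace(microsecond=123, tzinfo=timezone(timedelta(hours=2)))

# Formatting: %z silently yields an empty string for a naive datetime.
naive_offset = naive.strftime('%z')

# Parsing: %z refuses a string that has no offset at all.
try:
    datetime.strptime(naive.isoformat(), '%Y-%m-%dT%H:%M:%S%z')
    missing_offset_rejected = False
except ValueError:
    missing_offset_rejected = True

# Round trip both separators, with and without fraction and offset.
round_trips = all(
    isoparse_sketch(s) == d
    for d in (naive, aware)
    for s in (str(d), d.isoformat())
)
```

The format string here is assembled by inspecting the input, which is
exactly the work the proposed optional directives would move into strptime.
The sketch does not attempt offsets with a seconds component or custom
tzinfo output.
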
_______________________________________________
Datetime-SIG mailing list
[email protected]
https://mail.python.org/mailman/listinfo/datetime-sig
The PSF Code of Conduct applies to this mailing list:
https://www.python.org/psf/codeofconduct/
