Maybe I should just reject PEP 495 in disgust. :-)

I think #2 is the only reasonable solution (of these three). Of all the existing semantics we're trying to preserve, I find interzone comparison the unholiest. (With the possible exception of the cases where both zones are known to be forever-fixed-offset, such as datetime.timezone instances and pytz.utc, and possibly even the fixed-offset zones that pytz returns from localize(). How exactly we're going to recognize those is a different question, though I have an opinion there too.)
On Mon, Sep 7, 2015 at 6:57 PM, Alexander Belopolsky <[email protected]> wrote:
> The good news is that, other than a few editorial changes, there is only
> one issue which keeps me from declaring PEP 495 complete. The bad news is
> that the remaining issue is subtle and, while several solutions have been
> proposed, none stands out as obviously right.
>
> The Problem
> -----------
>
> PEP 495 requires that the value of the fold attribute is ignored when two
> aware datetime objects that share tzinfo are compared. This is motivated
> by backward compatibility: we want the value of fold to matter only in
> conversions from one zone to another and not in arithmetic within a
> single timezone.
>
> As Tim pointed out, this rule is in conflict with the only requirement
> that a hash function must satisfy: if two objects compare as equal, their
> hashes should be equal as well.
>
> Let t0 and t1 be two times in the fold that differ only by the value of
> their fold attribute: t0.fold == 0, t1.fold == 1. Let u0 =
> t0.astimezone(utc) and u1 = t1.astimezone(utc). PEP 495 requires that
> u0 < u1. (In fact, this is the main purpose of the PEP: to disambiguate
> between t0 and t1 so that conversion to UTC is well defined.) However, by
> the current PEP 495 rules, t0 == t1 is True, and by the pre-PEP rule (and
> the PEP rule that fold is ignored in comparisons) we also have t0 == u0
> and t1 == u1. So we have (a) a violation of the transitivity of ==:
> u0 == t0 == t1 == u1 does not imply u0 == u1, which is bad enough by
> itself, and (b) since hash(u0) can be equal to hash(u1) only by a lucky
> coincidence, the rule "equality of objects implies equality of hashes"
> leads to a contradiction: applying it to the chain u0 == t0 == t1 == u1,
> we get hash(u0) == hash(t0) == hash(t1) == hash(u1), which is now a chain
> of equalities of integers, and on integers == is transitive, so we have
> hash(u0) == hash(u1), which as we said can happen only by a lucky
> coincidence.
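The chain u0 == t0 == t1 == u1 described above can be sketched in a few lines. This is only an illustration under stated assumptions: ToyZone is a hypothetical tzinfo with a single one-hour fold, and utc_wall() and old_interzone_eq() are hypothetical helpers that spell out the pre-PEP interzone rule by hand instead of relying on what == returns on a particular interpreter. It needs a Python with the fold attribute (3.6 or later).

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)


class ToyZone(tzinfo):
    """Hypothetical zone: UTC-4 until 02:00 local on 2015-11-01, then UTC-5.

    The local hour 01:00-02:00 on that day occurs twice (the fold).
    """

    FOLD_START = datetime(2015, 11, 1, 1)
    FOLD_END = datetime(2015, 11, 1, 2)

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if wall < self.FOLD_START:
            return -4 * HOUR
        if wall >= self.FOLD_END:
            return -5 * HOUR
        # Inside the fold, the fold attribute selects the offset.
        return -5 * HOUR if dt.fold else -4 * HOUR

    def dst(self, dt):
        return self.utcoffset(dt) + 5 * HOUR

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def utc_wall(dt):
    """Naive UTC wall time of an aware datetime (hypothetical helper)."""
    return (dt - dt.utcoffset()).replace(tzinfo=None)


def old_interzone_eq(a, b):
    """Pre-PEP interzone rule: equal when the UTC wall times are equal."""
    return utc_wall(a) == utc_wall(b)


tz = ToyZone()
t0 = datetime(2015, 11, 1, 1, 30, tzinfo=tz)  # first 01:30, fold == 0
t1 = t0.replace(fold=1)                        # second 01:30, fold == 1
u0 = t0.astimezone(timezone.utc)
u1 = t1.astimezone(timezone.utc)

print("t0 == t1 (same tzinfo, fold ignored):", t0 == t1)
print("hash(t0) == hash(t1):", hash(t0) == hash(t1))
print("u0 ~ t0 by the old interzone rule:", old_interzone_eq(u0, t0))
print("t1 ~ u1 by the old interzone rule:", old_interzone_eq(t1, u1))
print("u0 == u1:", u0 == u1, "| u1 - u0 =", u1 - u0)
```

The first four comparisons hold while u0 and u1 are an hour apart, so no hash function can agree with all of them at once.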
>
>
> The Root of the Problem
> -----------------------
>
> The rules of arithmetic on aware datetime objects already cause some basic
> mathematical identities to break. The problem described above is currently
> avoided by not having a way to represent u1 in the timezone where u0 and
> u1 map to the same local time. We still have the surprising u0 < u1
> together with u0.astimezone(local) == u1.astimezone(local), but this does
> not rise to the level of a hash invariant violation because
> u0.astimezone(local) and u1.astimezone(local) are not only equal: they are
> identical in all other ways, and if we convert them back to UTC, they both
> convert to u0.
>
> The root of the hash problem is not in the t0 == t1 is True rule. It is
> in u0 == t0. The latter equality is just too fragile: if you add
> timedelta(hours=1) to both sides of this equation, then (assuming an
> ordinary 1 hour fall-back fold) you will get two datetime objects that are
> no longer equal. (Indeed, local to utc equality t == u is defined as t -
> t.utcoffset() == u.replace(tzinfo=t.tzinfo), but when you add 1 hour to t0,
> utcoffset() changes, so the equality that held for t0 and u0 will no longer
> hold for t0 + timedelta(hours=1) and u0 + timedelta(hours=1).)
>
> PEP 495 gives us a way to break the u0 == t0 equality by replacing t0 with
> an "equal" object t1 and simultaneously have u0 == t0, t0 == t1 and t1 !=
> u0.
>
>
> The Solutions
> -------------
>
> Tim suggested several solutions to this problem, but by his own admission
> none is more than "grudgingly acceptable." For completeness, I will
> also present my "non-solution."
>
> Solution 0: Ignore the problem. Since PEP 495 does not by itself
> introduce any tzinfo implementations with variable utcoffset(), it does not
> create a hash invariant violation. I call this a non-solution because it
> would once again punt an unsolvable problem to tzinfo implementors.
> It is unsolvable for *them* because without some variant of the rejected
> PEP 500, they will have no control over datetime comparisons or hashing.
>
> Solution 1: Make t1 > t0.
>
> Solution 2: Leave t1 == t0, but make t1 != u1.
>
>
> Request for Comments
> --------------------
>
> I will not discuss pros and cons of the two solutions because my goal here
> was only to state the problem, identify the root cause and indicate the
> possible solutions. Those interested in details can read Tim's excellent
> explanations in the "Another round on error-checking" [1] and "Another
> approach to 495's glitches" [2] threads.
>
> I "bcc" python-dev in the hope that someone in the expanded forum will
> either say "of course solution N is the right one and here is why" or "here
> is an obviously right solution - how could you guys miss it."
>
>
> [1]:
> https://mail.python.org/pipermail/datetime-sig/2015-September/000622.html
> [2]:
> https://mail.python.org/pipermail/datetime-sig/2015-September/000716.html
>
>
> _______________________________________________
> Datetime-SIG mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/datetime-sig
> The PSF Code of Conduct applies to this mailing list:
> https://www.python.org/psf/codeofconduct/
>

--
--Guido van Rossum (python.org/~guido)
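The fragility of u0 == t0 described under "The Root of the Problem" can be sketched as follows. This is a rough illustration, not anyone's proposed implementation: ToyZone is a hypothetical tzinfo reduced to utcoffset() with a single one-hour fall-back fold, and utc_wall() is a hypothetical helper that applies the quoted definition of local-to-UTC equality by hand.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)


class ToyZone(tzinfo):
    """Hypothetical zone: UTC-4 until 02:00 local on 2015-11-01, then UTC-5.

    Only utcoffset() is provided; that is all this sketch calls.
    """

    FOLD_START = datetime(2015, 11, 1, 1)
    FOLD_END = datetime(2015, 11, 1, 2)

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if wall < self.FOLD_START:
            return -4 * HOUR
        if wall >= self.FOLD_END:
            return -5 * HOUR
        # Inside the fold, the fold attribute selects the offset.
        return -5 * HOUR if dt.fold else -4 * HOUR


def utc_wall(dt):
    """Naive UTC wall time of an aware datetime (hypothetical helper)."""
    return (dt - dt.utcoffset()).replace(tzinfo=None)


tz = ToyZone()
t0 = datetime(2015, 11, 1, 1, 30, tzinfo=tz)  # first 01:30, fold == 0
u0 = t0.astimezone(timezone.utc)

# Before adding anything, t0 and u0 name the same instant.
print("same instant before:", utc_wall(t0) == utc_wall(u0))

# Add one hour to both sides. Arithmetic on t0 is done on the local
# wall clock, so it lands on 02:30 local, where the offset is now UTC-5.
t_later = t0 + HOUR
u_later = u0 + HOUR

print("same instant after: ", utc_wall(t_later) == utc_wall(u_later))
print("drift:", utc_wall(t_later) - utc_wall(u_later))
```

Adding the same timedelta to both sides moves the local value across the offset change, so the two results end up an hour apart in UTC.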
