[ijs]
> I stop following for the week and the world goes mad. I've
> lost count of the number of times I've thought, "Are you
> out of your *mind*!?" while reading this thread. You actually
> considered breaking the __hash__ invariant?

It went unnoticed for some time that the original PEP 495 _did_ break it. Not intentionally. "Unintended consequence." Alex resisted accepting that it was a fatal problem at first, but was converted to One Of Us after a single night's intense torture ;-)

...
> I'm assuming that the moment of temporary insanity has
> passed and you consider the __hash__ invariant to be sacrosanct.

Of course!

> The problem here is that someone (Alexander, I think?)
> demonstrated a method of producing a tzinfo class and b
> and c to make this true, *given arbitrary a and d*. Equality
> may not be transitive, but equality of hashes is, which
> means that __hash__ must be constant over equivalence
> classes in the transitive closure of the relation defined by
> __eq__. In this case, this boils down to "if __hash__ ignores
> fold, all datetime objects must have the same hash".

Alex also sketched an approach to constructing a far higher-quality hash (than a constant function), but it required having, in advance (of the first hash() call), all tzinfos that could possibly be used across a program's run. For example, if we knew in advance there was only one possible non-fixed-offset zone Z, hash(x) could convert x to zone Z, then convert the result of that (ignoring its `fold`) to a timestamp (as a timedelta object) relative to 0001-01-01 00:00:00 in Z, then hash the timestamp. Then all spellings in all zones of one of the times in a Z fold would have the same hash.

It's clever, but I can't see a way to make it practical. There's nothing, e.g., to stop code from building a brand new tzinfo as a big string containing Python code, and compiling the string at runtime.

> I imagine the performance implications of this are not acceptable.

Heh. We could try a constant hash function and see whether anyone noticed. That would be fun :-)

> There is no satisfactory way of weaseling out of this;

_Something_ has to give, yes. "Satisfactory" is Guido's call. Weaseling is our job.

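The quoted transitivity argument can be seen in miniature with a toy class that has nothing to do with datetime. This is only a sketch: `Fuzzy` is a hypothetical type invented for illustration, whose `__eq__` calls two numbers equal when they differ by at most 1. Chains of such equalities link every integer to every other, so the only hash that respects the invariant is a constant.

```python
class Fuzzy:
    """Hypothetical value type with non-transitive equality: |a - b| <= 1."""

    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        return isinstance(other, Fuzzy) and abs(self.n - other.n) <= 1

    def __hash__(self):
        # a == b requires hash(a) == hash(b). Chaining 0 == 1, 1 == 2, ...
        # forces every instance to share one hash, so return a constant.
        return 0


a, b, c = Fuzzy(0), Fuzzy(1), Fuzzy(2)
assert a == b and b == c and a != c    # equality is not transitive
assert hash(a) == hash(b) == hash(c)   # but equality of hashes has to be

# Dict lookups stay consistent with __eq__, at the price of every key
# landing in the same hash bucket.
table = {a: "stored under Fuzzy(0)"}
assert table[b] == "stored under Fuzzy(0)"
```

A hash such as `hash(self.n)` would be cheaper, but then `a == b` would hold while `hash(a) != hash(b)`, which is exactly the broken invariant under discussion.
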
I already did a small test to convince myself people _would_ notice if we removed dicts from the language. They're the real source of this problem ;-)

> datetime equality is timeline equality now and forever, unless
> you're willing to give up one of backward compatibility, the
> __hash__ invariant, or the ability to implement new tzinfo classes.
> (The tzinfo in the example was contrived but not buggy.)

No tzinfo contrivance is necessary. The hash problem in the original PEP could be provoked using any zone whatsoever in which there's a fold (like, say, US/Eastern). I think you have in mind part of Alex's sketch of a better-than-constant hash, where zones were indeed contrived just to illustrate how nasty it _could_ get.

Guido is least fond of by-magic interzone comparison, and that's what we've been picking on. All worm-arounds so far would sacrifice trichotomy in some (or all) cases of "problem times", by declaring that some problem times wouldn't compare equal to any datetime in any other zone.

In the latest version of that, there would be no change to comparison results so long as pre-495 tzinfos were used. If you started to use post-495 tzinfos, that's your choice: then you get by-magic `fold` set correctly in all cases, correct zone conversions in all cases, and correct by-magic interzone subtraction in all cases - at the cost of living with the fact that all problem times (whether in a gap or a fold) would compare "not equal" to all datetimes in all other zones. My own code couldn't care less (I've never used an interzone comparison outside of lines in datetime's test suite).

You _could_ still compare them, but you'd either have to convert to a zone in which they were not problem times (timezone.utc would always work for this) first, or use by-magic interzone subtraction and check the sign of the result.

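A rough sketch of those two ways of still comparing a problem time, assuming a Python where datetime has the `fold` attribute (3.6 or later). `ToyEastern` is a hypothetical, deliberately minimal tzinfo invented for this example: it models a single fold on 2015-11-01 and implements only `utcoffset()`, so it is not a usable zone.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyEastern(tzinfo):
    """Hypothetical zone: UTC-4 before the 2015-11-01 fold, UTC-5 after it."""

    FOLD_START = datetime(2015, 11, 1, 1)  # naive local wall-clock time
    FOLD_END = datetime(2015, 11, 1, 2)

    def utcoffset(self, dt):
        naive = dt.replace(tzinfo=None)
        if naive < self.FOLD_START:
            return timedelta(hours=-4)
        if naive >= self.FOLD_END:
            return timedelta(hours=-5)
        # Inside the fold, `fold` picks between the two readings of the clock.
        return timedelta(hours=-5 if dt.fold else -4)


tz = ToyEastern()
first = datetime(2015, 11, 1, 1, 30, tzinfo=tz)           # 05:30 UTC
second = datetime(2015, 11, 1, 1, 30, fold=1, tzinfo=tz)  # 06:30 UTC
utc_same = datetime(2015, 11, 1, 6, 30, tzinfo=timezone.utc)

# Direct interzone == on a problem time reports "not equal".
print(second == utc_same)

# Option 1: convert to UTC first, where nothing is a problem time.
print(second.astimezone(timezone.utc) == utc_same)

# Option 2: interzone subtraction, then look at the sign of the result.
print(second - utc_same == timedelta(0))
print(first - utc_same < timedelta(0))
```

On a datetime that implements the rule described above, the first print shows False and the other three show True.
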
So, given that a user would have to "do something" to have even the possibility of suffering a surprise that will probably never happen in their life, "not satisfactory" isn't a slam dunk. Luckily, PEP 20 is crystal clear about the right decision in this case.
