Hi Paul,

The change is basically an optimization. I'm uncomfortable designing
it "only" for dateutil. What if tomorrow someone has to store an
arbitrary Python object, rather than just an integer (in range [0;
254]), into a datetime for a different optimization?

Moreover, I dislike adding a *public* method for an *internal* cache.

Right now, it is not possible to create a weak reference to a
datetime. If we allowed it, an external cache could be implemented
with weakref.WeakSet to clear old entries when a datetime object is
destroyed.

What do you think of adding a private "_cache" attribute which would
be an arbitrary Python object? (None by default)
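
A rough sketch of how a tzinfo could use such an attribute, emulated
here with a hypothetical subclass since datetime has no "_cache" today.
CachingDatetime, TwoOffsetZone and the month-based rule are invented
for illustration and are not real DST logic.

```python
from datetime import datetime, timedelta, tzinfo


class CachingDatetime(datetime):
    # Hypothetical stand-in for the proposed private attribute.
    _cache = None


class TwoOffsetZone(tzinfo):
    """Toy zone that picks one of two offsets from a fixed table."""

    _offsets = (timedelta(hours=-5), timedelta(hours=-4))
    _names = ("STD", "DST")
    lookups = 0  # counts how often the "expensive" lookup runs

    def _find_index(self, dt):
        TwoOffsetZone.lookups += 1
        # Illustrative rule only: April through October uses entry 1.
        return 1 if 4 <= dt.month <= 10 else 0

    def _index(self, dt):
        idx = getattr(dt, "_cache", None)
        if idx is None:
            idx = self._find_index(dt)
            try:
                dt._cache = idx  # store on the datetime itself
            except AttributeError:
                pass  # plain datetime: nowhere to store the value
        return idx

    def utcoffset(self, dt):
        return self._offsets[self._index(dt)]

    def dst(self, dt):
        return self._offsets[self._index(dt)] - self._offsets[0]

    def tzname(self, dt):
        return self._names[self._index(dt)]
```

With this, calling utcoffset(), tzname() and dst() on the same
CachingDatetime runs the lookup once; on a plain datetime it still
works, but recomputes each time.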

Victor

On Tue, May 7, 2019 at 21:46, Paul Ganssle <p...@ganssle.io> wrote:
>
> Greetings all,
>
> I have one last feature request that I'd like added to datetime for Python 
> 3.8, and this one I think could use some more discussion, the addition of a 
> "time zone index cache" to the datetime object. The rationale is laid out in 
> detail in bpo-35723. The general problem is that currently, every invocation 
> of utcoffset, tzname and dst needs to do full, independent calculations of 
> the time zone offsets, even for time zones where the mapping is guaranteed to 
> be stable because datetimes are immutable. I have a proof of concept 
> implementation: PR #11529.
>
> I'm envisioning that the `datetime` class will add a private `_tzidx` 
> single-byte member (it seems that this does not increase the size of the 
> datetime object, because it's just using an unused alignment byte). 
> `datetime` will also add a `tzidx()` method, which will return `_tzidx` if 
> it's been set and otherwise it will call `self.tzinfo.tzidx()`.  If 
> `self.tzinfo.tzidx()` returns a number between 0 and 254 (inclusive), it sets 
> `_tzidx` to this value. tzidx() then returns whatever self.tzinfo.tzidx() 
> returned.
>
> The value of this is that as far as I can tell, nearly all non-trivial tzinfo 
> implementations construct a list of possible offsets, and implement 
> utcoffset(), tzname() and dst() by calculating an index into that list and 
> returning it. There are almost always fewer than 255 distinct offsets. By 
> adding this cache on the datetime, we're using a small amount of 
> currently-unused memory to prevent unnecessary calculations about a given 
> datetime. The feature is entirely opt-in, and has no downsides if it goes 
> unused, and it makes it possible to write tzinfo implementations that are 
> both lazy and as fast as the "eager calculation" mode that pytz uses (and 
> that causes many problems for pytz's users).
>
> I have explored the idea of using an lru cache of some sort on the tzinfo 
> object itself, but there are two problems with this:
>
> 1. Calculating the hash of a datetime calls .utcoffset(), which means that it 
> is necessary to, at minimum, do a `replace` on the datetime (and constructing 
> a new datetime is a pretty considerable speed hit)
>
> 2. It will be a much bigger memory cost, since my current proposal uses 
> approximately zero additional memory (not sure if the alignment stuff is 
> platform-dependent or something, but it doesn't use additional memory on my 
> linux computer).
>
> I realize this proposal is somewhat difficult to wrap your head around, so if 
> anyone would like to chat with me about it in person, I'll be at PyCon 
> sprints until Thursday morning.
>
> Best,
> Paul
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com



-- 
Night gathers, and now my watch begins. It shall not end until my death.