On Tue, Jan 12, 2021 at 04:32:14PM +0200, Serhiy Storchaka wrote:
> 12.01.21 12:02, Steven D'Aprano wrote:

> > I propose a method:
> > 
> >     @functools.lru_cache()
> >     def function(arg):
> >         ...
> > 
> >     function.cache_export()
> > 
> > that returns a dictionary {arg: value} representing the cache. It 
> > wouldn't be the cache itself, just a shallow copy of the cache data.
> 
> What if the function supports multiple arguments (including passed by
> keyword)? Note that internal representation of the key is an
> implementation detail, so you need to invent and specify some new
> representation. For example return a list of tuples (args, kwargs, result).

Sure. That's a detail that can be worked out once we agree that this is 
a useful feature.
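For concreteness, here is a rough sketch of what Serhiy's (args, kwargs, 
result) representation could look like. It uses a hypothetical plain-dict 
memo decorator, not functools.lru_cache itself, since lru_cache's key 
format is an implementation detail:

```python
import functools

def debug_cache(func):
    # Hypothetical sketch only: a plain dict memo, not functools.lru_cache.
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    def cache_export():
        # A shallow snapshot in (args, kwargs, result) form; mutating
        # the returned list leaves the real cache untouched.
        return [(args, dict(kwitems), result)
                for (args, kwitems), result in cache.items()]

    wrapper.cache_export = cache_export
    return wrapper

@debug_cache
def add(a, b=0):
    return a + b

add(1, b=2)
add(3)
print(add.cache_export())  # [((1,), {'b': 2}, 3), ((3,), {}, 3)]
```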


> Depending on the implementation, getting the list of all arguments can
> have larger than linear complexity.

I don't mind. Efficiency is not a priority for this. This is an 
introspection feature for development and debugging, not a feature for 
production. I don't expect it to be called in tight loops. I expect to 
use it from the REPL while I am debugging my code.

I might have to rethink if it was exponentially slow, but O(n log n) 
like sorting would be fine; I'd even consider O(n**2) acceptable, with a 
documentation note that exporting large caches may be slow.


> Other cache implementations can contain additional information: the
> number of hits for every value, times. Are you interested in getting
> that information too, or will you ignore it?

No, ignore it.


> Currently the cache is thread-safe in CPython, but getting all arguments
> and values may not be (or we will need to add a synchronization overhead
> for every call of the cached function).

Can you explain further why the cached function needs additional 
synchronisation overhead?

I am quite happy for exporting to be thread-unsafe, so long as it 
doesn't crash. Don't export the cache if it is being updated from 
another thread, or you might get inaccurate results.

To be clear:

- If you export the cache from one thread while another thread is 
  reading the cache, I expect that would be safe.

- If you export the cache from one thread while another thread is 
  *modifying* the cache, I expect that the only promise we make is
  that there shouldn't be a segfault.



> And finally, what is your use case? Is it important enough to justify
> the cost?

I think so or I wouldn't have asked :-)

There shouldn't be much (or any?) runtime cost on the cache except for 
the presence of an additional method. The exported data is just a 
snapshot, it doesn't have to be a view of the cache. Changing the 
exported snapshot will not change the cache.
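There is already precedent for snapshot-style introspection: lru_cache's 
existing cache_info() method returns a named tuple of counters, not a 
live view of the cache:

```python
import functools

@functools.lru_cache(maxsize=None)
def square(x):
    return x * x

square(2)
square(2)  # second call is a cache hit
square(3)
print(square.cache_info())
# CacheInfo(hits=1, misses=2, maxsize=None, currsize=2)
```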

My use-case is debugging functions that are using an LRU cache, 
specifically complex recursive functions. I have some functions where:

    f(N)

ends up calling itself many times, but not in any obvious pattern:

    f(N-1), f(N-2), f(N-5), f(N-7), f(N-12), f(N-15), f(N-22), ...

for example. So each call to f() could make dozens of recursive calls, 
if N is big enough, and there are gaps in the calls.

I was having trouble with the function, and couldn't tell if the right 
arguments were going into the cache. What I wanted to do was peek at 
the cache and see which keys were ending up in the cache and compare 
that to what I expected.

I did end up getting the function working, but I think it would have been 
much easier if I could have seen what was inside the cache and how the 
cache was changing from one call to the next.
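For illustration, the workaround I ended up approximating by hand was to 
memoise with a plain dict so the cached keys are inspectable; Fibonacci 
here is just a stand-in for my actual function:

```python
def f(n, _cache={}):
    # Plain-dict memo instead of lru_cache, purely so the cached
    # keys can be inspected while debugging.
    if n in _cache:
        return _cache[n]
    result = n if n < 2 else f(n - 1) + f(n - 2)
    _cache[n] = result
    return result

f(10)
# Peek at which arguments ended up in the cache:
print(sorted(f.__defaults__[0]))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```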

So this is why I don't care about performance (within reason). My use 
case is interactive debugging.


-- 
Steve
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org