Brian Lee <brian.kihoon....@gmail.com> added the comment:

Thanks for clarifying - I see now that the docs specifically call out the lack 
of guarantees here with "usually but not always regard them as equivalent".

I did want to explain the context of my bug:

1. NumPy strings behave unexpectedly because NumPy has both fixed-length 
strings (represented inline) and var-length strings (which are pointers to 
Python strings). Various arcana dictate which version you get, and wrappers 
like pandas.read_csv can also throw a wrench in the works. It is quite easy for 
the nominal "string type" to change from under you, which is how I stumbled on 
this bug.

2. I was using functools.cache as a way to intern objects and short-circuit 
otherwise very expensive equality calculations by reducing them to pointer 
comparisons - hence my desire for exact cache hits when typed=False.
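
A minimal sketch of the interning pattern from point 2, under stated
assumptions: `Expensive`, `intern` and `StrSubclass` are hypothetical names,
and `StrSubclass` stands in for np.str_ (which is also a subclass of str) so
the example runs without NumPy. The last result reflects CPython behavior as I
understand it and is not a documented guarantee.

```python
import functools


class Expensive:
    """Hypothetical object whose equality check is costly."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Imagine this comparison being very slow.
        return isinstance(other, Expensive) and self.name == other.name

    def __hash__(self):
        return hash(self.name)


class StrSubclass(str):
    """Stand-in for np.str_; inherits __eq__ and __hash__ from str."""


@functools.cache
def intern(name):
    # One canonical Expensive per cache key, so callers can compare with `is`.
    return Expensive(name)


a = intern("hello world")
b = intern("hello world")
c = intern(StrSubclass("hello world"))

print(a is b)  # True: same key, so the cached object is returned
print(a == c)  # True: the slow equality check still agrees
print(a is c)  # False on CPython: the subclass argument missed the cache
```

When the argument type silently changes to a str subclass, the pointer
comparison stops working even though the values compare equal.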

While I agree this is Working As Documented, it does not Work As Expected in my 
opinion. I would expect the stdlib optimized implementation to follow the same 
behavior as this naive implementation, which does consider "hello world" and 
np.str_("hello world") to be equivalent.

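import functools

def cache(func):
  _cache = {}  # maps call arguments to previously computed results
  @functools.wraps(func)
  def wrapped(*args, **kwargs):
    # Key on argument equality and hash only, never on argument type.
    cache_key = tuple(args) + tuple(kwargs.items())
    if cache_key not in _cache:
      _cache[cache_key] = func(*args, **kwargs)
    return _cache[cache_key]
  return wrapped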
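
To make the difference concrete, here is a hedged sketch that counts how often
the wrapped function actually runs under each cache. `naive_cache`,
`make_counted` and `StrSubclass` are illustrative names, and `StrSubclass` is
an assumed stand-in for np.str_ (any str subclass seems to behave the same
way). The stdlib count is what I would expect on CPython, where a single
argument of exact type str is stored without the enclosing args tuple.

```python
import functools


class StrSubclass(str):
    """Stand-in for np.str_; equal to and hashes like the plain str."""


def naive_cache(func):
    _cache = {}

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        cache_key = tuple(args) + tuple(kwargs.items())
        if cache_key not in _cache:
            _cache[cache_key] = func(*args, **kwargs)
        return _cache[cache_key]

    return wrapped


def make_counted():
    """Return a function plus a list recording each real invocation."""
    calls = []

    def f(x):
        calls.append(x)
        return len(x)

    return f, calls


# Naive cache: both calls build equal tuple keys, so f runs once.
f1, naive_calls = make_counted()
naive_f = naive_cache(f1)
naive_f("hello world")
naive_f(StrSubclass("hello world"))
print(len(naive_calls))  # 1

# Stdlib cache: the exact-str call and the subclass call use different keys.
f2, std_calls = make_counted()
std_f = functools.lru_cache(maxsize=None)(f2)
std_f("hello world")
std_f(StrSubclass("hello world"))
print(len(std_calls))  # 2 on CPython
print(std_f.cache_info().hits)  # 0 on CPython
```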

----------
title: functools.lru_cache does not consider strings and numpy strings as 
equivalent -> functools.lru_cache does not guarantee cache hits for equal values

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue44992>
_______________________________________