Ma Lin wrote on 31.12.18 at 14:02:
> On 18-12-31 19:47, Antoine Pitrou wrote:
>> The complaint is that the global cache is still too costly.
>> See measurements in https://bugs.python.org/issue35559
>
> In this issue, using a global variable `_has_non_base16_digits` [1]
> gives a 30% speedup.
> Is the re module's internal cache [2] so bad?
>
> If we rewrite the re module's cache in C and use a custom data
> structure, maybe we will get a small speedup.
>
> [1] `_has_non_base16_digits` in PR11287
> [1] https://github.com/python/cpython/pull/11287/files
>
> [2] re module's internal cache code:
> [2] https://github.com/python/cpython/blob/master/Lib/re.py#L268-L295
>
> _cache = {}  # ordered!
> _MAXCACHE = 512
> def _compile(pattern, flags):
>     # internal: compile pattern
>     if isinstance(flags, RegexFlag):
>         flags = flags.value
>     try:
>         return _cache[type(pattern), pattern, flags]
>     except KeyError:
>         pass
>     ...

I wouldn't be surprised if the slowest part here was the isinstance()
check. Maybe the RegexFlag class could implement "__hash__()" as
"return hash(self.value)"?

Stefan