[issue37723] important performance regression on regular expression parsing

yannvgn Wed, 31 Jul 2019 09:28:26 -0700


yannvgn <[email protected]> added the comment:


> Indeed, it was not expected that the character set would contain hundreds of 
> thousands of items. What is its size in your real code?

> Could you please show benchmarking results for different implementations and 
> different sizes?

I can't give a precise answer to that, but sacremoses (a tokenization 
package), for example, is strongly affected. See 
https://github.com/alvations/sacremoses/issues/61#issuecomment-516401853
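As a starting point for the benchmark requested above, here is a minimal sketch that times re.compile() on a single character set of increasing size. It is not taken from sacremoses or from any patch on this issue: the helper names build_char_class and time_compile, the choice of the CJK block starting at U+4E00, and the sizes are all assumptions made for illustration.

```python
import re
import time


def build_char_class(size):
    """Return a pattern '[...]' holding `size` distinct characters (hypothetical helper)."""
    # Assumption: start at U+4E00 so that no character is a regex metacharacter.
    start = 0x4E00
    return "[" + "".join(chr(start + i) for i in range(size)) + "]"


def time_compile(size, repeats=3):
    """Return the best wall-clock time, in seconds, to compile the character set."""
    pattern = build_char_class(size)
    best = float("inf")
    for _ in range(repeats):
        re.purge()  # clear the compiled-pattern cache so each run parses again
        t0 = time.perf_counter()
        re.compile(pattern)
        best = min(best, time.perf_counter() - t0)
    return best


if __name__ == "__main__":
    # Sizes are arbitrary; raise them to approach the real-world case.
    for size in (100, 1_000, 10_000):
        print(f"{size:>6} chars: {time_compile(size) * 1000:.2f} ms")
```

Running the same script under each interpreter version being compared should show how compile time grows with the set size.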

----------

_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue37723>
_______________________________________