[issue22410] Locale dependent regexps on different locales

2017-04-29 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Opened issue30215 for a more comprehensive solution. -- stage: -> resolved status: open -> closed ___ Python tracker

[issue22410] Locale dependent regexps on different locales

2014-10-30 Thread Antoine Pitrou
Antoine Pitrou added the comment: Patch looks good to me. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22410 ___ ___ Python-bugs-list mailing

[issue22410] Locale dependent regexps on different locales

2014-10-30 Thread Roundup Robot
Roundup Robot added the comment: New changeset 6d2788f9b20a by Serhiy Storchaka in branch '2.7': Issue #22410: Module level functions in the re module now cache compiled https://hg.python.org/cpython/rev/6d2788f9b20a New changeset cbdc658b7797 by Serhiy Storchaka in branch '3.4': Issue #22410:

[issue22410] Locale dependent regexps on different locales

2014-10-30 Thread Roundup Robot
Roundup Robot added the comment: New changeset d565dbf576f9 by Serhiy Storchaka in branch '2.7': Fixed compile error in issue #22410. The _locale module is optional. https://hg.python.org/cpython/rev/d565dbf576f9 New changeset 0c016fa378db by Serhiy Storchaka in branch '3.4': Fixed compile

[issue22410] Locale dependent regexps on different locales

2014-10-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Thank you for your review, Antoine. The committed patch has fixed only part of the problem. It doesn't fix the problem of explicitly compiled patterns. A better solution requires changes to the _sre module. -- resolution: -> fixed stage: patch review ->

[issue22410] Locale dependent regexps on different locales

2014-10-24 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: If there are no objections I'll commit the patch. -- assignee: -> serhiy.storchaka

[issue22410] Locale dependent regexps on different locales

2014-09-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Moved the import to the top level as Antoine suggested. -- Added file: http://bugs.python.org/file36659/re_locale_caching3.patch

[issue22410] Locale dependent regexps on different locales

2014-09-19 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Removed file: http://bugs.python.org/file36653/re_locale_caching2.patch

[issue22410] Locale dependent regexps on different locales

2014-09-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Yes, it is possible to build a full property table for bytes regexps at regexp compile time. But it is impossible for unicode regexps (issue22407). And in any case this doesn't solve the original problem: re.match(pattern, string, re.L|re.I) can return unexpected

[issue22410] Locale dependent regexps on different locales

2014-09-18 Thread Matthew Barnett
Matthew Barnett added the comment: When you look up the pattern in the cache, include the current locale as part of the key if the pattern is locale-sensitive (you can let it be None if the pattern is not locale-sensitive).

[issue22410] Locale dependent regexps on different locales

2014-09-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch which implements Matthew's suggestion. It significantly slows down the use of locale-sensitive regular expressions, there is a possibility of a race condition between compiling and matching, and it doesn't solve the issue for explicitly cached

[issue22410] Locale dependent regexps on different locales

2014-09-18 Thread Antoine Pitrou
Antoine Pitrou added the comment: Rather than introduce a perf regression in 2.7 and 3.4, I would suggest simply fixing the issue in 3.5.

[issue22410] Locale dependent regexps on different locales

2014-09-18 Thread Matthew Barnett
Matthew Barnett added the comment: @Serhiy: You're overlooking that the LOCALE flag could be inline, e.g. r'(?L)\w+'. Basically, if you've seen the pattern before, you know whether it has an inline LOCALE flag; if you haven't seen the pattern before, you'll need to parse it anyway, and then
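A small sketch of the point about inline flags: checking the `flags` argument alone misses `(?L)`, but the compiled pattern's `flags` attribute reflects both sources. The helper name `is_locale_sensitive` is hypothetical. The example uses bytes patterns because, on Python 3.6 and later, re.LOCALE is accepted only with bytes patterns; the str pattern quoted in the comment dates from the Python 2 era.

```python
import re


def is_locale_sensitive(pattern, flags=0):
    """Return True if LOCALE is in effect, passed as a flag or written inline."""
    # The compiled pattern's flags include any inline flags found by the parser.
    return bool(re.compile(pattern, flags).flags & re.LOCALE)
```

For example, `is_locale_sensitive(b'(?L)\\w+')` is true even though the `flags` argument is 0.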

[issue22410] Locale dependent regexps on different locales

2014-09-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Good catch, Matthew! After fixing this and one more bug (LC_CTYPE should be used instead of LC_ALL), and adding more optimizations, performance has improved. Now the result of the above microbenchmark is 18.5 usec per loop. -- Added file:
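A short sketch of why LC_CTYPE is the relevant category, assuming the cache key should track only character classification and case conversion. The helper name `ctype_locale` is hypothetical. Querying LC_CTYPE keeps the key stable when an unrelated category such as LC_NUMERIC changes.

```python
import locale


def ctype_locale():
    # Calling setlocale() without a second argument queries the category
    # and leaves the process locale unchanged.
    return locale.setlocale(locale.LC_CTYPE)


before = ctype_locale()
# Changing an unrelated category does not affect the LC_CTYPE setting.
locale.setlocale(locale.LC_NUMERIC, "C")
after = ctype_locale()
```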

[issue22410] Locale dependent regexps on different locales

2014-09-18 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Removed file: http://bugs.python.org/file36651/re_locale_caching.patch

[issue22410] Locale dependent regexps on different locales

2014-09-14 Thread Serhiy Storchaka
New submission from Serhiy Storchaka: Locale-specific case-insensitive regular expression matching works only when the pattern was compiled under the same locale as the one used for matching. Due to caching this can cause unexpected results. The attached script demonstrates this (it requires two locales:

[issue22410] Locale dependent regexps on different locales

2014-09-14 Thread Matthew Barnett
Matthew Barnett added the comment: The support for locales in the re module is limited to those with 1 byte per character, and only for a few properties (those provided by the underlying C library), so maybe it could do the following: If the LOCALE flag is set, then read the current locale