durin42 created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Per https://bugs.python.org/issue29995, re.escape() used to over-escape
  regular expression strings. That was fixed in Python 3.7, which also
  improved the performance of re.escape(). Since it's both an output
  change for us *and* a performance win, let's effectively backport the
  new behavior to hg on all Python versions.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D3839

AFFECTED FILES
  mercurial/utils/stringutil.py

CHANGE DETAILS

diff --git a/mercurial/utils/stringutil.py b/mercurial/utils/stringutil.py
--- a/mercurial/utils/stringutil.py
+++ b/mercurial/utils/stringutil.py
@@ -23,6 +23,25 @@
     pycompat,
 )
 
+# regex special chars pulled from https://bugs.python.org/issue29995
+# which was part of Python 3.7.
+_respecial = pycompat.bytestr(b'()[]{}?*+-|^$\\.# \t\n\r\v\f')
+_regexescapemap = {ord(i): (b'\\' + i).decode('latin1') for i in _respecial}
+
+def reescape(pat):
+    """Drop-in replacement for re.escape."""
+    # NOTE: it is intentional that this works on unicodes and not
+    # bytes, as it's only possible to do the escaping with
+    # unicode.translate, not bytes.translate. Sigh.
+    wantuni = True
+    if isinstance(pat, bytes):
+        wantuni = False
+        pat = pat.decode('latin1')
+    pat = pat.translate(_regexescapemap)
+    if wantuni:
+        return pat
+    return pat.encode('latin1')
+
 def pprint(o, bprefix=False):
     """Pretty print an object."""
     if isinstance(o, bytes):



To: durin42, #hg-reviewers
Cc: mercurial-devel
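
ILLUSTRATION
  For readers who want to try the escaping behavior outside of Mercurial,
  here is a minimal standalone sketch of the same translate-based approach.
  It is not the code in this revision: the names reescape_sketch,
  _RESPECIAL and _ESCAPE_MAP are hypothetical, and the pycompat.bytestr
  wrapper is replaced by a plain bytes literal on the assumption that
  Python 3 is in use.

```python
import re

# Same character set as _respecial in the patch, written as a plain bytes
# literal because pycompat is not available outside Mercurial (assumption:
# Python 3 only).
_RESPECIAL = b'()[]{}?*+-|^$\\.# \t\n\r\v\f'

# Map each special code point to its backslash-escaped form, in the shape
# that str.translate expects (int -> str).
_ESCAPE_MAP = {c: '\\' + chr(c) for c in bytearray(_RESPECIAL)}


def reescape_sketch(pat):
    """Escape regex special characters in a str or bytes pattern."""
    if isinstance(pat, bytes):
        # latin-1 maps every byte to exactly one code point, so the
        # decode/translate/encode round trip is lossless.
        return pat.decode('latin1').translate(_ESCAPE_MAP).encode('latin1')
    return pat.translate(_ESCAPE_MAP)


# Only '.' is special here; '/' is left alone.
print(reescape_sketch(b'foo/bar.py'))  # b'foo/bar\\.py'

# The escaped pattern matches the original text literally.
print(bool(re.fullmatch(reescape_sketch('a.b*c (d)'), 'a.b*c (d)')))  # True
```

  The point of the sketch is the design choice noted in the patch comment:
  the per-character mapping is done on text, so bytes input is decoded as
  latin-1 first and encoded back afterwards.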