https://github.com/python/cpython/commit/3001464248edfba76fc23d4a8107dc24f2807d46
commit: 3001464248edfba76fc23d4a8107dc24f2807d46
branch: main
author: Cody Maloney <[email protected]>
committer: vstinner <[email protected]>
date: 2025-11-28T17:46:10Z
summary:

gh-141968: Use take_bytes in re._compiler (#141995)

Removes a copy going from bytearray to bytes.

Co-authored-by: Victor Stinner <[email protected]>
Co-authored-by: Bénédikt Tran <[email protected]>

files:
A Misc/NEWS.d/next/Library/2025-11-26-14-20-10.gh-issue-141968.W139Pv.rst
M Lib/re/_compiler.py

diff --git a/Lib/re/_compiler.py b/Lib/re/_compiler.py
index 20dd561d1c1520..c2ca8e25abe34d 100644
--- a/Lib/re/_compiler.py
+++ b/Lib/re/_compiler.py
@@ -375,7 +375,7 @@ def _optimize_charset(charset, iscased=None, fixup=None, fixes=None):
     # less significant byte is a bit index in the chunk (just like the
     # CHARSET matching).
 
-    charmap = bytes(charmap) # should be hashable
+    charmap = charmap.take_bytes() # should be hashable
     comps = {}
     mapping = bytearray(256)
     block = 0
diff --git a/Misc/NEWS.d/next/Library/2025-11-26-14-20-10.gh-issue-141968.W139Pv.rst b/Misc/NEWS.d/next/Library/2025-11-26-14-20-10.gh-issue-141968.W139Pv.rst
new file mode 100644
index 00000000000000..c5375707814ff5
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2025-11-26-14-20-10.gh-issue-141968.W139Pv.rst
@@ -0,0 +1,2 @@
+Remove data copy from :mod:`re` compilation of regexes with large charsets
+by using :meth:`bytearray.take_bytes`.
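
For readers unfamiliar with the change, here is a minimal sketch of the pattern, not the code in Lib/re/_compiler.py. The helper name take_all_bytes and the 256-byte buffer are hypothetical, chosen only for illustration. The sketch assumes that bytearray.take_bytes(), when called without arguments, returns the whole buffer as bytes and leaves the bytearray empty. On interpreters that lack the method it falls back to an explicit copy followed by clear(), so both paths leave the buffer in the same state.

```python
def take_all_bytes(buf: bytearray) -> bytes:
    """Return the contents of buf as bytes and leave buf empty.

    Hypothetical helper for illustration only.
    """
    take = getattr(buf, "take_bytes", None)
    if take is not None:
        # Interpreters that provide bytearray.take_bytes(): the data is
        # handed over as bytes, which can avoid the extra copy.
        return take()
    # Fallback for interpreters without take_bytes(): copy, then clear,
    # so the caller observes the same end state either way.
    data = bytes(buf)
    buf.clear()
    return data


# Stand-in for a charset bitmap; the size is an assumption for the demo.
charmap = bytearray(256)
charmap[ord("a")] = 1

key = take_all_bytes(charmap)

# bytes is hashable and can be a dict key; a bytearray cannot.
comps = {key: 0}
```

The trade-off is that the source bytearray is emptied, so this only suits call sites that no longer need the mutable buffer afterwards, as with the rebinding of charmap in the hunk above.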
