Your message dated Sat, 15 Mar 2025 10:35:58 +0000
with message-id <[email protected]>
and subject line Bug#1096009: fixed in ply 3.11-8
has caused the Debian Bug report #1096009,
regarding python3-ply: ply parser signatures change between Python 3.12 and
3.13 causing severe performance issues
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)
--
1096009: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1096009
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: python3-ply
Version: 3.11-7
Severity: important
Tags: patch
X-Debbugs-Cc: [email protected]
Dear Maintainer,
ply uses the __doc__ of each method to calculate a signature for its
parsers (this is all within yacc.py). Between Python 3.12 and 3.13, the
way __doc__ is extracted has changed, with common whitespace being
removed from the docstring.
https://docs.python.org/3/whatsnew/3.13.html#other-language-changes
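As a minimal illustration of that change, the snippet below prints how much indentation survives on the second line of a docstring. The function name and grammar rule are made up for this example and are not taken from ply or phply. Under Python 3.12 and earlier the source indentation should be kept; under 3.13 and later it should be stripped:

```python
import sys


def p_expression(p):
    '''expression : expression PLUS term
                  | term'''


doc = p_expression.__doc__
second_line = doc.split("\n")[1]

# Count the leading spaces left on the continuation line of the docstring.
# Python <= 3.12 keeps the indentation from the source file, while
# Python >= 3.13 strips the common leading whitespace at compile time.
indent = len(second_line) - len(second_line.lstrip(" "))

print(sys.version_info[:2], repr(doc), indent)
```

Because ply folds each docstring into the parser signature, any difference in that indentation is enough to change the signature.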
The result is that all packaged parsers (such as the one in
python3-phply) are invalid when run with Python 3.13, and the parser is
then regenerated. This takes a substantial amount of time - enough to
slow down the test suite of translate-toolkit from a few seconds to a
few minutes. The time is all spent repeatedly regenerating the parser
rather than just reading it off the disk.
Additionally, any packaged parsers will differ between Python 3.12 and
3.13, and so dh_python3 can't correctly collapse them into
dist-packages.
(See #1095792 for a bit more - this was from an initial look at just
python3-phply while this really is a bigger issue with the python3-ply
package.)
The attached patch undertakes whitespace normalisation on the signature:
- the signature inside a cached model is normalised
- the signature calculated from the source module is normalised
With this normalisation:
- both Python 3.12 and 3.13 generate the same signatures on package
rebuilds, so our own packaged parsers will be OK
- the normalised signature from old parsers generated by Python 3.12
actually matches the normalised new signature when read in, meaning
that it's not a cache miss
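A rough sketch of why the normalisation makes both interpreters agree. The two strings are hand-written stand-ins for what __doc__ would contain under each Python version; they are assumptions for illustration, not output captured from a real parser. normalize_whitespace is a hypothetical name that mirrors the two substitutions as they appear in the patch:

```python
import re


def normalize_whitespace(s):
    # Hypothetical helper mirroring the two substitutions in the patch.
    if s:
        s = re.sub(" +", " ", s)    # collapse each run of spaces to one space
        s = re.sub("\n ", "\n", s)  # drop the one space left after a newline
    return s


# Assumed docstring contents: 3.12 keeps the source indentation on the
# continuation line, 3.13 has already stripped it.
doc_312 = "expression : expression PLUS term\n              | term"
doc_313 = "expression : expression PLUS term\n| term"

# The raw strings differ, so a signature built from them would differ too.
assert doc_312 != doc_313

# After normalisation the two forms compare equal.
print(normalize_whitespace(doc_312) == normalize_whitespace(doc_313))
```

A docstring of None (a function without one) passes through unchanged, so callers do not need a separate check.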
I've tested this patch by:
a) all new packages
- first building the python3-ply package with this patch
- then building a python3-phply package with this patch
- then testing with the translate-toolkit test suite to check for
performance issues
b) existing packages
- first building the python3-ply package with this patch
- keeping the current python3-phply from sid
- then testing with the translate-toolkit test suite to check for
performance issues
Regards
Stuart
From d93be36fddd970aedb1c0da345d255cef1028e1e Mon Sep 17 00:00:00 2001
From: Stuart Prescott <[email protected]>
Date: Sat, 15 Feb 2025 15:42:55 +1100
Subject: [PATCH 1/2] Add patch to normalise whitespace in signature
Addresses performance issues seen with phply and translate-toolkit.
---
debian/patches/series | 1 +
.../signature-whitespace-normalisation.patch | 73 +++++++++++++++++++
2 files changed, 74 insertions(+)
create mode 100644 debian/patches/signature-whitespace-normalisation.patch
diff --git a/debian/patches/series b/debian/patches/series
index c008b1c..9ddb106 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -1,2 +1,3 @@
replace-removed-assert_-with-assertTrue.patch
relax-lex-tabversion-check.patch
+signature-whitespace-normalisation.patch
diff --git a/debian/patches/signature-whitespace-normalisation.patch b/debian/patches/signature-whitespace-normalisation.patch
new file mode 100644
index 0000000..b204fe1
--- /dev/null
+++ b/debian/patches/signature-whitespace-normalisation.patch
@@ -0,0 +1,73 @@
+Description: Normalise the whitespace in the docstring for signature
+ The docstring is used in the calculation of the signature of a parser, but
+ the whitespace in the docstring can change between Python interpreter
+ versions, most notably with Python 3.13 that strips common whitespace from
+ the front of the docstring.
+ .
+ Without normalisation of the docstring, loading the parser is a cache miss
+ every time, which is observed as a significant performance overhead. (See the
+ translate-toolkit test performance and #1095792 for example)
+ .
+ With this normalisation patch, every parsetab.py needs to be rebuilt; it is
+ impossible to make a patch that turns the Python 3.13 __doc__ back into the
+ Python 3.12 __doc__ for backwards compatibility.
+Author: Stuart Prescott <[email protected]>
+--- a/ply/yacc.py
++++ b/ply/yacc.py
+@@ -1995,7 +1995,7 @@
+ self.lr_productions.append(MiniProduction(*p))
+
+ self.lr_method = parsetab._lr_method
+- return parsetab._lr_signature
++ return _normalize(parsetab._lr_signature)
+
+ def read_pickle(self, filename):
+ try:
+@@ -2022,14 +2022,13 @@
+ self.lr_productions.append(MiniProduction(*p))
+
+ in_f.close()
+- return signature
++ return _normalize(signature)
+
+ # Bind all production function names to callable objects in pdict
+ def bind_callables(self, pdict):
+ for p in self.lr_productions:
+ p.bind(pdict)
+
+-
+ # -----------------------------------------------------------------------------
+ # === LR Generator ===
+ #
+@@ -2983,7 +2982,7 @@
+ parts.append(f[3])
+ except (TypeError, ValueError):
+ pass
+- return ''.join(parts)
++ return _normalize(''.join(parts))
+
+ # -----------------------------------------------------------------------------
+ # validate_modules()
+@@ -3134,7 +3133,7 @@
+ if isinstance(item, (types.FunctionType, types.MethodType)):
+ line = getattr(item, 'co_firstlineno', item.__code__.co_firstlineno)
+ module = inspect.getmodule(item)
+- p_functions.append((line, module, name, item.__doc__))
++ p_functions.append((line, module, name, _normalize(item.__doc__)))
+
+ # Sort all of the actions by line number; make sure to stringify
+ # modules to make them sortable, since `line` may not uniquely sort all
+@@ -3500,3 +3499,13 @@
+
+ parse = parser.parse
+ return parser
++
++
++def _normalize(s):
++ # Normalize the whitespace in the docstring - this can vary between
++ # Python versions, with changes in Python 3.13
++ # https://docs.python.org/3/whatsnew/3.13.html#other-language-changes
++ if s:
++ s = re.sub(" +", " ", s)
++ s = re.sub("\n ", "\n", s)
++ return s
--
2.39.5
From afed2a2b953d715a89918b7616a5372a52a5193f Mon Sep 17 00:00:00 2001
From: Stuart Prescott <[email protected]>
Date: Sat, 15 Feb 2025 15:43:03 +1100
Subject: [PATCH 2/2] Add WIP changelog
---
debian/changelog | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/debian/changelog b/debian/changelog
index 71b226e..d606d8b 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,10 @@
+ply (3.11-7.1) UNRELEASED; urgency=medium
+
+ * Add patch to normalise signature across whitespace changes (and therefore
+ across Python interpreter versions).
+
+ -- Stuart Prescott <[email protected]> Sat, 15 Feb 2025 15:41:25 +1100
+
ply (3.11-7) unstable; urgency=medium
* Control: remove team from uploaders.
--
2.39.5
--- End Message ---
--- Begin Message ---
Source: ply
Source-Version: 3.11-8
Done: Jeroen Ploemen <[email protected]>
We believe that the bug you reported is fixed in the latest version of
ply, which is due to be installed in the Debian FTP archive.
A summary of the changes between this version and the previous one is
attached.
Thank you for reporting the bug, which will now be closed. If you
have further comments please address them to [email protected],
and the maintainer will reopen the bug report if appropriate.
Debian distribution maintenance software
pp.
Jeroen Ploemen <[email protected]> (supplier of updated ply package)
(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing [email protected])
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Format: 1.8
Date: Sat, 15 Mar 2025 08:04:40 +0000
Source: ply
Built-For-Profiles: noudeb
Architecture: source
Version: 3.11-8
Distribution: experimental
Urgency: medium
Maintainer: Jeroen Ploemen <[email protected]>
Changed-By: Jeroen Ploemen <[email protected]>
Closes: 1096009
Changes:
ply (3.11-8) experimental; urgency=medium
.
[ Stuart Prescott ]
* Add patch to normalise signature across whitespace changes (and therefore
across Python interpreter versions). (Closes: #1096009)
.
[ Jeroen Ploemen ]
* Copyright: bump packaging year.
* Bump Standards-Version to 4.7.2 (from 4.7.0; no further changes).
Checksums-Sha1:
aa6604e279d70a3546e4818ee4370dd4ad85e0be 1977 ply_3.11-8.dsc
ee7b45ae374a4e492d88ca2d8bcee3f4c205c4e8 12832 ply_3.11-8.debian.tar.xz
a231b54b098ee57bd14b9e1ab22d80d8cbfc3d0f 14875 ply_3.11-8_source.buildinfo
Checksums-Sha256:
a00392819d83692f3459a19c566c974dfbabba5a9cffa003a19bc0d68e0dc598 1977 ply_3.11-8.dsc
c94d3d202e041fa048144697cf0293734ea7dd581849f6142c2b684769ace528 12832 ply_3.11-8.debian.tar.xz
d0f7c10d44e6fc5432f674799133c999dfff97dab9e0a352c100ba1d3278539d 14875 ply_3.11-8_source.buildinfo
Files:
2391f709354948655c198529bbb062af 1977 python optional ply_3.11-8.dsc
6c67c712e200fe92140f09300a64a71c 12832 python optional ply_3.11-8.debian.tar.xz
2040fa6e298f830cafca953de3255024 14875 python optional ply_3.11-8_source.buildinfo
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCgAdFiEEd8lhnEnWos3N8v+qQoMEoXSNzHoFAmfVNMMACgkQQoMEoXSN
zHqyGhAAmu2mWu8t/iS6vPUbHywC8hRlYCOfmH/beqpLJ3ii7tNYzenu/2r5WJ6H
mYlt/Q7en1rQbOBrgGq2troJY5jwc1LXvCMMNH877+gbpTfcfnmJ3e2Ybl7yFasX
w4cXMzk/V/Mm1Xo/y0i5iDlB4aghmYziBYEQsBoeABtVdOjPCHgSLygKLcXN3iEY
A5jDeyxSmFZB0jEsFRUtbm6Z7o4dglwRHeRglYOr4LXasQswrcGm7L3z75B+D4JF
cbfll2yeJhSLfH8N9e5cJzOEcxHpwJOk+wmU8v2S/JoJYHdOvMXG4UTXSbEniJ1q
aUN0djxj9+kR2JDtuus/h9N/KgjXYSk1ikj7aPWL6EJKXAZR85T+Dbs8bv/X/cQA
qaxX4fkVy/hqF/p8zcxqEgEj+ZKKbd8j/XVYwScQLwXPvPmOvHqzO9N15RVGqD2A
EGvsrxq8P1JvQqYcv3cHAjRMM3VoCa6glUuKnmNQr8bBM3zBZa6sHz8JP2VyTpdc
0eb1LDnntFKrFZtuSbF2iZ6mm6+nOeTzS3Z6V7K4m2+ZZeFOQSO845JwDnQSJ6cQ
bheIKbFm8jzlOinPVyiwjVRnNE7YW23zjhQWpyx4gvEo6WngY0Fjb8E16AsLyHbh
wiHxtfzGJ3QoRXdI0tk0fxTdtBmI+xGxMZiFYd0YF50Yz/Oae3M=
=0De2
-----END PGP SIGNATURE-----
--- End Message ---