[issue46750] some code paths in ssl and _socket still import idna unconditionally
Shivaram Lingamneni added the comment:

I wanted to check in about the status of this patch. Here's the case for the patch, as I understand it:

1. It is not a novel optimization; it just consistently applies design decisions that were made previously (RFE #1472176 and bpo-22127).
2. The performance impact of the initial import of encodings.idna and its transitive dependencies is in fact macroscopic relative to the baseline costs of the interpreter: 5 milliseconds to import the modules and 500 KB in increased RSS, relative to baselines of approximately 50 milliseconds to set up and tear down an interpreter and 10 MB in RSS.

Here are the relevant benchmarks, first for time:

```python
import time
start = time.time()
'a'.encode('idna')
print(time.time() - start)
```

and for memory:

```python
import os
def rss():
    os.system('grep VmRSS /proc/' + str(os.getpid()) + '/status')
rss()
'a'.encode('idna')
rss()
```

Are there potential changes to this patch that would mitigate your concerns?

--
___
Python tracker <https://bugs.python.org/issue46750>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46750] some code paths in ssl and _socket still import idna unconditionally
Shivaram Lingamneni added the comment:

(Looks like it was 15 milliseconds when measuring inside `python -i`; I'm not sure what the root cause of the difference is, but clearly the 5 millisecond measurement from regular `python` is more accurate.)
[issue46750] some code paths in ssl and _socket still import idna unconditionally
Shivaram Lingamneni added the comment:

Sorry, I should have been more clear: I am including the initial costs of importing stringprep and unicodedata. On my system:

$ python3 -c "import time; start = time.time(); r = 'a'.encode('idna'); elapsed = time.time() - start; print(elapsed)"
0.0053806304931640625

So the earlier measurement of 15 milliseconds was excessive (I'm not sure what happened) but it's the right order of magnitude: I can reproduce 5 milliseconds reliably.
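Since the measurement above varied between runs (15 ms versus 5 ms), one way to make it more repeatable is to run the same one-liner in several fresh interpreters and look at the spread. The following is only a sketch under assumptions: the helper name `time_first_idna_encode` is hypothetical, and it swaps `time.time()` for `time.perf_counter()`, which is not what the original one-liner used.

```python
import statistics
import subprocess
import sys

# Same idea as the one-liner above, but timed with perf_counter (an assumption;
# the original message used time.time()).
SNIPPET = (
    "import time; start = time.perf_counter(); "
    "'a'.encode('idna'); "
    "print(time.perf_counter() - start)"
)


def time_first_idna_encode(runs=5):
    """Return the cost in seconds of the first idna encode, once per fresh interpreter."""
    timings = []
    for _ in range(runs):
        # A new process guarantees encodings.idna has not been imported yet.
        result = subprocess.run(
            [sys.executable, "-c", SNIPPET],
            capture_output=True,
            text=True,
            check=True,
        )
        timings.append(float(result.stdout))
    return timings


if __name__ == "__main__":
    timings = time_first_idna_encode()
    print(f"median: {statistics.median(timings) * 1000:.2f} ms")
    print(f"min/max: {min(timings) * 1000:.2f} / {max(timings) * 1000:.2f} ms")
```

Each run starts from a cold import state, so the numbers include the cost of importing stringprep and unicodedata, as in the measurement above.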
[issue46750] some code paths in ssl and _socket still import idna unconditionally
Shivaram Lingamneni added the comment:

Thanks for the prompt response. As evidence that this was of concern to the development team in the past, here's an issue where the unnecessary import of idna was treated as a regression:

https://bugs.python.org/issue22127

The discussion there also examines the semantic change produced by the optimization (some invalid labels making it to a DNS lookup instead of being rejected) and doesn't consider it to be a breaking change (albeit a reason not to backport).

(I also see references in documentation to a discussion labeled "RFE #1472176", but am unable to find the actual bug tracker or database entry this refers to.)

A time cost of 15 milliseconds seems accurate to me. The RAM cost on my release build of Python 3.8.10 is about 600 KB in RSS (this is approximately 5% of the baseline interpreter usage).

I cannot reproduce the claim that `urllib.parse` imports stringprep or unicodedata:

python3 -c "import sys, urllib.parse; assert 'stringprep' not in sys.modules"
python3 -c "import sys, urllib.parse; assert 'unicodedata' not in sys.modules"

I am developing a new lightweight http library that does use urllib.parse; on my system, these patches allow it to function without importing stringprep, idna, or unicodedata:

https://github.com/slingamn/mureq
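The two `sys.modules` checks above can be generalized into a small helper that reports which of the modules in question a statement pulls in. This is a rough sketch, not part of the patch: the function name `modules_loaded_by` and the `WATCHED` tuple are hypothetical, and the check runs in a fresh interpreter so that earlier imports in the calling process don't affect the result.

```python
import json
import subprocess
import sys

# Modules named in this thread; the tuple itself is an illustrative assumption.
WATCHED = ("encodings.idna", "stringprep", "unicodedata")


def modules_loaded_by(statement, watched=WATCHED):
    """Run `statement` in a fresh interpreter and list the watched modules it imported."""
    code = (
        "import sys, json\n"
        f"{statement}\n"
        f"print(json.dumps([m for m in {list(watched)!r} if m in sys.modules]))"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    # The statement itself may print; the JSON report is always the last line.
    return json.loads(result.stdout.splitlines()[-1])


if __name__ == "__main__":
    print(modules_loaded_by("import urllib.parse"))
    print(modules_loaded_by("'a'.encode('idna')"))
```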
[issue46750] some code paths in ssl and _socket still import idna unconditionally
Change by Shivaram Lingamneni:

--
keywords: +patch
pull_requests: +29484
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/31328
[issue46750] some code paths in ssl and _socket still import idna unconditionally
New submission from Shivaram Lingamneni:

Importing the idna encoding has a significant time and memory cost. Therefore, the standard library tries to avoid importing it when it's not needed (i.e. when the domain name is already pure ASCII), e.g. in Lib/http/client.py and Modules/socketmodule.c with `idna_converter`.

However, there are code paths that still attempt to encode or decode as idna unconditionally, in particular Lib/ssl.py and _socket.getaddrinfo. Here's a one-line test case:

python3 -c "import sys, urllib.request; urllib.request.urlopen('https://www.google.com'); assert 'encodings.idna' not in sys.modules"

These code paths can be converted using existing code to do the import conditionally (I'll send a PR).

--
assignee: christian.heimes
components: Interpreter Core, Library (Lib), SSL
messages: 413229
nosy: christian.heimes, slingamn
priority: normal
severity: normal
status: open
title: some code paths in ssl and _socket still import idna unconditionally
type: resource usage
versions: Python 3.11
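For illustration, the "encode as idna only when the name is not pure ASCII" idea described above might look like the following in Python. This is a minimal sketch, not the code from the PR: the helper name `encode_hostname` is hypothetical, and it deliberately skips idna validation for ASCII names, which is the semantic change discussed in bpo-22127.

```python
def encode_hostname(host: str) -> bytes:
    """Encode a hostname, touching the idna codec only for non-ASCII names.

    Hypothetical helper for illustration; not taken from the CPython patch.
    """
    try:
        # Fast path: pure-ASCII names never trigger the encodings.idna import.
        return host.encode("ascii")
    except UnicodeEncodeError:
        # Slow path: the first use here imports encodings.idna and its
        # dependencies (stringprep, unicodedata).
        return host.encode("idna")


if __name__ == "__main__":
    print(encode_hostname("www.python.org"))
    print(encode_hostname("bücher.example"))
```

Because ASCII names bypass the codec entirely, an invalid ASCII label is passed through unchanged rather than rejected at encoding time.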
[issue31711] ssl.SSLSocket.send(b"") fails
Shivaram Lingamneni added the comment:

Are there any possible next steps on this? This issue is very counterintuitive and challenging to debug: it commonly presents as a nondeterministic edge case, and it appears to be a failed system call but doesn't show up in strace.

Thanks for your time.

--
nosy: +slingamn
___
Python tracker <https://bugs.python.org/issue31711>
[issue16298] httplib.HTTPResponse.read could potentially leave the socket opened forever
Changes by Shivaram Lingamneni sling...@cs.stanford.edu:

--
nosy: +slingamn
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue16298