[issue46750] some code paths in ssl and _socket still import idna unconditionally

2022-02-16 Thread Shivaram Lingamneni


Shivaram Lingamneni  added the comment:

I wanted to check in about the status of this patch. Here's the case for the 
patch, as I understand it:

1. It is not a novel optimization; it just consistently applies design 
decisions that were made previously (RFE #1472176 and bpo-22127).
2. The performance impact of the initial import of encodings.idna and its 
transitive dependencies is in fact macroscopic relative to the baseline costs 
of the interpreter: 5 milliseconds to import the modules and 500 KB in 
increased RSS, relative to baselines of approximately 50 milliseconds to set up 
and tear down an interpreter and 10 MB in RSS.

Here are the relevant benchmarks, first for time:


```python
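import time

# The first use of the codec imports encodings.idna, stringprep, and
# unicodedata, so this measures the one-time import cost.
start = time.time()
'a'.encode('idna')
print(time.time() - start)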
```

and for memory:

```python
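import os

def rss():
    # Print this process's resident set size (Linux only, read from /proc).
    os.system('grep VmRSS /proc/' + str(os.getpid()) + '/status')

rss()                # baseline
'a'.encode('idna')   # imports encodings.idna and its dependencies
rss()                # after the import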
```
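
Because the import cost is paid only once per process, the timing above cannot be repeated in place. A minimal sketch of one way to repeat it, assuming a hypothetical helper named `first_encode_seconds` (not part of the patch) that runs the same measurement in several fresh interpreters and reports the median:

```python
import statistics
import subprocess
import sys

# Code run in each child interpreter: time the first 'idna' encode,
# which is the call that triggers the imports being measured.
CHILD = (
    "import time; start = time.perf_counter(); 'a'.encode('idna'); "
    "print(time.perf_counter() - start)"
)

def first_encode_seconds(runs=5):
    """Median time of the first 'idna' encode across fresh interpreters."""
    samples = []
    for _ in range(runs):
        out = subprocess.run(
            [sys.executable, "-c", CHILD],
            capture_output=True, text=True, check=True,
        ).stdout
        samples.append(float(out))
    return statistics.median(samples)

if __name__ == "__main__":
    print(first_encode_seconds())
```

Taking the median over separate processes should reduce the influence of a single outlier, such as a cold disk cache on the first run.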

Are there potential changes to this patch that would mitigate your concerns?

--

___
Python tracker 
<https://bugs.python.org/issue46750>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46750] some code paths in ssl and _socket still import idna unconditionally

2022-02-14 Thread Shivaram Lingamneni


Shivaram Lingamneni  added the comment:

(Looks like it was 15 milliseconds when measuring inside `python -i`; I'm not 
sure what the root cause of the difference is, but clearly the 5 millisecond 
measurement from regular `python` is more accurate.)

--




[issue46750] some code paths in ssl and _socket still import idna unconditionally

2022-02-14 Thread Shivaram Lingamneni


Shivaram Lingamneni  added the comment:

Sorry, I should have been more clear: I am including the initial costs of 
importing stringprep and unicodedata. On my system:

$ python3 -c "import time; start = time.time(); r = 'a'.encode('idna'); elapsed = time.time() - start; print(elapsed)"
0.0053806304931640625

So the earlier measurement of 15 milliseconds was excessive (I'm not sure what 
happened) but it's the right order of magnitude: I can reproduce 5 milliseconds 
reliably.

--




[issue46750] some code paths in ssl and _socket still import idna unconditionally

2022-02-14 Thread Shivaram Lingamneni


Shivaram Lingamneni  added the comment:

Thanks for the prompt response. As evidence that this was of concern to the 
development team in the past, here's an issue where the unnecessary import of 
idna was treated as a regression:

https://bugs.python.org/issue22127

The discussion there also examines the semantic change produced by the 
optimization (some invalid labels making it to a DNS lookup instead of being 
rejected) and doesn't consider it to be a breaking change (albeit a reason not 
to backport).

(I also see references in documentation to a discussion labeled "RFE #1472176", 
but am unable to find the actual bug tracker or database entry this refers to.)

A time cost of 15 milliseconds seems accurate to me. The RAM cost on my release 
build of Python 3.8.10 is about 600 KB in RSS (this is approximately 5% of the 
baseline interpreter usage).

I cannot reproduce the claim that `urllib.parse` imports stringprep or 
unicodedata:

python3 -c "import sys, urllib.parse; assert 'stringprep' not in sys.modules"

python3 -c "import sys, urllib.parse; assert 'unicodedata' not in sys.modules"

I am developing a new lightweight http library that does use urllib.parse; on 
my system, these patches allow it to function without importing stringprep, 
idna, or unicodedata:

https://github.com/slingamn/mureq

--




[issue46750] some code paths in ssl and _socket still import idna unconditionally

2022-02-14 Thread Shivaram Lingamneni


Change by Shivaram Lingamneni :


--
keywords: +patch
pull_requests: +29484
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/31328




[issue46750] some code paths in ssl and _socket still import idna unconditionally

2022-02-14 Thread Shivaram Lingamneni


New submission from Shivaram Lingamneni :

Importing the idna encoding has a significant time and memory cost. Therefore, 
the standard library tries to avoid importing it when it's not needed (i.e. 
when the domain name is already pure ASCII), e.g. in Lib/http/client.py and 
Modules/socketmodule.c with `idna_converter`.
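
To illustrate the idea, here is a minimal sketch of the "ASCII first, idna only if needed" pattern. The function name `encode_host` is hypothetical and this is not the code from the standard library or from the patch; it only shows the general shape of the approach:

```python
def encode_host(host: str) -> bytes:
    """Encode a hostname, touching the idna codec only for non-ASCII input."""
    try:
        # Fast path: pure-ASCII names never trigger the import of
        # encodings.idna (and, through it, stringprep and unicodedata).
        return host.encode("ascii")
    except UnicodeEncodeError:
        # Slow path: only internationalized names pay the import cost.
        return host.encode("idna")

print(encode_host("www.google.com"))
print(encode_host("bücher.example"))
```

Note that the fast path skips the validation the idna codec would perform, so some invalid ASCII labels would reach the DNS lookup instead of being rejected up front.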

However, there are code paths that still attempt to encode or decode as idna 
unconditionally, in particular Lib/ssl.py and _socket.getaddrinfo. Here's a 
one-line test case:

python3 -c "import sys, urllib.request; urllib.request.urlopen('https://www.google.com'); assert 'encodings.idna' not in sys.modules"

These code paths can be converted using existing code to do the import 
conditionally (I'll send a PR).
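
The test case above needs network access. As a rough offline variant, the following sketch checks whether a given statement causes a module to be imported in a fresh interpreter. The helper name `loads_module` is hypothetical and is not part of the PR:

```python
import subprocess
import sys

def loads_module(statement: str, module: str) -> bool:
    """Run `statement` in a fresh interpreter; report whether `module` was imported."""
    code = f"import sys\n{statement}\nprint({module!r} in sys.modules)"
    out = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.strip() == "True"

if __name__ == "__main__":
    print(loads_module("'a'.encode('idna')", "encodings.idna"))
    print(loads_module("'a'.encode('ascii')", "encodings.idna"))
```

Running each check in a child process matters because `sys.modules` in the current process may already contain the module from earlier, unrelated code.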

--
assignee: christian.heimes
components: Interpreter Core, Library (Lib), SSL
messages: 413229
nosy: christian.heimes, slingamn
priority: normal
severity: normal
status: open
title: some code paths in ssl and _socket still import idna unconditionally
type: resource usage
versions: Python 3.11




[issue31711] ssl.SSLSocket.send(b"") fails

2019-06-25 Thread Shivaram Lingamneni


Shivaram Lingamneni  added the comment:

Are there any possible next steps on this?

This issue is very counterintuitive and challenging to debug --- it commonly 
presents as a nondeterministic edge case, and it appears to be a failed system 
call but doesn't show up in strace.

Thanks for your time.

--
nosy: +slingamn

___
Python tracker 
<https://bugs.python.org/issue31711>
___



[issue16298] httplib.HTTPResponse.read could potentially leave the socket opened forever

2012-11-13 Thread Shivaram Lingamneni

Changes by Shivaram Lingamneni sling...@cs.stanford.edu:


--
nosy: +slingamn

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue16298
___