[issue22233] http.client splits headers on non-\r\n characters

2016-09-07 Thread Martin Panter

Martin Panter added the comment:

Your modifications look sensible David; thanks for handling this.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22233] http.client splits headers on non-\r\n characters

2016-09-07 Thread R. David Murray

R. David Murray added the comment:

I want to stack another patch on top of this, so I committed it.  If you see 
anything I screwed up, Martin, please let me know.

--
resolution:  -> fixed
stage: commit review -> resolved
status: open -> closed




[issue22233] http.client splits headers on non-\r\n characters

2016-09-07 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 69900c5992c5 by R David Murray in branch '3.5':
#22233: Only split headers on \r and/or \n, per email RFCs.
https://hg.python.org/cpython/rev/69900c5992c5

New changeset 4d2369b901be by R David Murray in branch 'default':
Merge: #22233: Only split headers on \r and/or \n, per email RFCs.
https://hg.python.org/cpython/rev/4d2369b901be

--
nosy: +python-dev




[issue22233] http.client splits headers on non-\r\n characters

2016-09-07 Thread R. David Murray

Changes by R. David Murray :


Removed file: http://bugs.python.org/file1/crlf-headers2.patch




[issue22233] http.client splits headers on non-\r\n characters

2016-09-07 Thread R. David Murray

Changes by R. David Murray :


Added file: http://bugs.python.org/file2/crlf-headers2.patch




[issue22233] http.client splits headers on non-\r\n characters

2016-09-07 Thread R. David Murray

Changes by R. David Murray :


--
stage: patch review -> commit review




[issue22233] http.client splits headers on non-\r\n characters

2016-09-07 Thread R. David Murray

R. David Murray added the comment:

This looks good to me.  However, although it is by no means obvious, the tests 
in test_parser are supposed to be for the new policies.  When I changed the 
test to exercise them, another place that needed to be fixed was revealed.  I've 
updated the patch accordingly.

--
Added file: http://bugs.python.org/file1/crlf-headers2.patch




[issue22233] http.client splits headers on non-\r\n characters

2016-08-31 Thread R. David Murray

R. David Murray added the comment:

I'm hoping to take a look at all of these at the core sprint next week.

--




[issue22233] http.client splits headers on non-\r\n characters

2016-08-30 Thread Doug Hellmann

Changes by Doug Hellmann :


--
nosy: +doughellmann




[issue22233] http.client splits headers on non-\r\n characters

2016-08-30 Thread Martin Panter

Martin Panter added the comment:

If someone reviews my patch and thinks it is fine, I might commit it. Maybe I 
can just re-review it myself, now that I have forgotten all the details :)

If messing with the email package is a problem (performance, or compatibility), 
another option is to keep the changes to the HTTP module (which I would be more 
confident in changing on my own). I have another patch for review at Issue 
24363 which apparently also fixes this splitlines() bug.

--
versions:  -Python 3.4




[issue22233] http.client splits headers on non-\r\n characters

2016-08-30 Thread Clay Gerrard

Clay Gerrard added the comment:

BUMP. ;)

This issue was recently raised as one blocker to OpenStack Object Storage 
(Swift) finishing our port to python3 (we're hoping to finish adding support 
for >=3.5 by Spring '17 - /me crosses fingers).

I wonder if someone might confirm or deny the attached patch is likely to be 
included in the 3.6 timeframe (circa 12/16?) and/or back-ported to the 3.5 
series?

FWIW, I would echo others' sentiment that I would much prefer the 
implementation to be correct even if there was some worry we might have to 
choose between further optimization and getting a fix ASAP :D

Warm Regards,
-Clay

--
nosy: +Clay Gerrard




[issue22233] http.client splits headers on non-\r\n characters

2015-11-27 Thread R. David Murray

R. David Murray added the comment:

I agree.  Can you update the email issue with this suggestion and/or a patch?

The problem with this, of course, is backward compatibility, but since it is 
more correct per the RFCs, our past policy has been to fix it anyway.

--




[issue22233] http.client splits headers on non-\r\n characters

2015-11-27 Thread Martin Panter

Martin Panter added the comment:

David: what is the email issue you mentioned? In the meantime, I am uploading 
a patch to this issue.

It seems using StringIO is a bit slower than str.splitlines(). I found a way to 
optimize building long lines, which compensated for much of the loss, but this 
optimization would apply even without using StringIO. My patch makes 
test.test_email 0.3% slower (the optimization alone would make it 4.4% faster), 
and test_email.TestFeedParsers.test_long_lines() is 3% slower (optimization 
alone: 12% faster).

I also tried two other alternatives to str.splitlines(), but they were both 
slower than the StringIO technique:
* _partial is a list of UTF-8 bytes; join and use bytes.splitlines()
* _partial is a UTF-8 bytearray; use bytearray.splitlines()
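
For readers unfamiliar with the bytes-based alternative listed above, here is a minimal sketch of the idea. It is not the code from the patch: the helper name split_via_bytes and the use of the surrogateescape error handler are assumptions made for illustration. It relies on bytes.splitlines() breaking only at ASCII line boundaries, whereas str.splitlines() also breaks at characters such as \x85 and \u2028.

```python
def split_via_bytes(text):
    """Split text into lines (endings kept) by way of its UTF-8 encoding.

    Hypothetical helper, not part of the email package. bytes.splitlines()
    only treats \\r, \\n and \\r\\n as line boundaries, and UTF-8 never
    encodes a non-ASCII character using the bytes 0x0D or 0x0A.
    """
    data = text.encode("utf-8", "surrogateescape")
    return [line.decode("utf-8", "surrogateescape")
            for line in data.splitlines(keepends=True)]


sample = "a\x85b\r\nc\u2028d\n"
print(sample.splitlines(keepends=True))   # str method: four pieces
print(split_via_bytes(sample))            # bytes round trip: two lines
```

The extra encode and decode steps are one plausible reason this approach measured slower than the StringIO technique, though the thread does not break the timings down further.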

--
keywords: +patch
stage: needs patch -> patch review
versions: +Python 3.6
Added file: http://bugs.python.org/file41177/crlf-headers.patch




[issue22233] http.client splits headers on non-\r\n characters

2015-11-26 Thread Martin Panter

Martin Panter added the comment:

For the record, this is a simplified version of the original scenario, showing 
the low-level HTTP protocol:

>>> import ssl
>>> from socket import create_connection
>>> from http.client import HTTPS_PORT
>>> from pprint import pprint
>>> request = (
...     b"GET /%C4%85 HTTP/1.1\r\n"
...     b"Host: graph.facebook.com\r\n"
...     b"\r\n"
... )
>>> s = create_connection(("graph.facebook.com", HTTPS_PORT))
>>> with ssl.wrap_socket(s) as s:
...     s.sendall(request)
...     response = s.recv(3000)
... 
50
>>> pprint(response.splitlines(keepends=True))
[b'HTTP/1.1 404 Not Found\r\n',
 b'WWW-Authenticate: OAuth "Facebook Platform" "not_found" "(#803) Some of the '
 b'aliases you requested do not exist: \xc4\x85"\r\n',
 b'Access-Control-Allow-Origin: *\r\n',
 b'Content-Type: text/javascript; charset=UTF-8\r\n',
 b'X-FB-Trace-ID: H9yxnVcQFuA\r\n',
 b'X-FB-Rev: 2063232\r\n',
 b'Pragma: no-cache\r\n',
 b'Cache-Control: no-store\r\n',
 b'Facebook-API-Version: v2.0\r\n',
 b'Expires: Sat, 01 Jan 2000 00:00:00 GMT\r\n',
 b'X-FB-Debug: 07ouxMl1Z439Ke/YzHSjXx3o9PcpGeZBFS7yrGwTzaaudrZWe5Ef8Z96oSo2dINp'
 b'3GR4q78+1oHDX2pUF2ky1A==\r\n',
 b'Date: Thu, 26 Nov 2015 23:03:47 GMT\r\n',
 b'Connection: keep-alive\r\n',
 b'Content-Length: 147\r\n',
 b'\r\n',
 b'{"error":{"message":"(#803) Some of the aliases you requested do not exist: '
 b'\\u0105","type":"OAuthException","code":803,"fbtrace_id":"H9yxnVcQFuA"}}']
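
The bytes output above splits cleanly because bytes.splitlines() only recognises \r and \n. The sketch below illustrates where the trouble starts, assuming the header block is decoded as ISO-8859-1 before being handed to a parser that calls str.splitlines() (my understanding of what http.client does, not something shown in this session). The variable names are illustrative only.

```python
# One header line from the response above, as raw bytes.
raw = (
    b'WWW-Authenticate: OAuth "Facebook Platform" "not_found" '
    b'"(#803) Some of the aliases you requested do not exist: \xc4\x85"\r\n'
)

# As bytes, this is a single line.
byte_lines = raw.splitlines(keepends=True)

# Assumption: headers are decoded as ISO-8859-1 before parsing. The byte
# 0x85 then becomes U+0085 (NEL), which str.splitlines() treats as a
# line boundary, so the header is cut in two.
text = raw.decode("iso-8859-1")
str_lines = text.splitlines(keepends=True)

print(len(byte_lines))   # 1
print(len(str_lines))    # 2
print(repr(str_lines[1]))
```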

In my mind, the simplest way forward would be to change the “email” module to 
only parse lines using the “universal newlines” algorithm. The 
/Lib/email/feedparser.py module should use StringIO(s, newline="").readlines() 
rather than s.splitlines(keepends=True). That would mean all email parsing 
behaviour would change; for instance, given the following message:

>>> import email
>>> m = email.message_from_string(
...     "WWW-Authenticate: abc\x85\r\n"
...     "\r\n"
... )

instead of the current behaviour:

>>> m.items()
[('WWW-Authenticate', 'abc\x85')]
>>> m.get_payload()
'\r\n\r\n'

it would change to:

>>> m.items()
[('WWW-Authenticate', 'abc\x85')]
>>> m.get_payload()
''
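
A small standalone sketch of the difference between the two line-splitting calls named above, applied to the same message text. It only exercises str.splitlines() and io.StringIO directly and does not touch the email package; the variable names are illustrative.

```python
from io import StringIO

text = "WWW-Authenticate: abc\x85\r\n\r\n"

# str.splitlines() treats U+0085 as a line boundary, so the header line
# is cut short and its "\r\n" becomes a separate line.
by_splitlines = text.splitlines(keepends=True)

# StringIO with newline="" uses universal newlines without translation:
# only \r, \n and \r\n end a line, and the endings are kept as-is.
by_stringio = StringIO(text, newline="").readlines()

print(by_splitlines)   # three pieces
print(by_stringio)     # two lines: the header and the blank separator
```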

--




[issue22233] http.client splits headers on non-\r\n characters

2015-07-10 Thread Gregory P. Smith

Gregory P. Smith added the comment:

The obvious fix seems to be to not use splitlines but explicitly split on the 
allowed characters for ASCII based protocols and formats that only want \r and 
\n to be considered.

I don't think we can rightfully change the unicode splitlines behavior.
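
As a rough illustration of "explicitly split on the allowed characters", here is one possible sketch using a regular expression. The names _LINE_RE and split_ascii_lines are hypothetical and are not taken from any patch on this issue.

```python
import re

# A line is any run of non-CR/LF characters followed by \r\n, \r or \n,
# or a final run of characters with no terminator at all.
_LINE_RE = re.compile(r"[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+")


def split_ascii_lines(text):
    """Split text into lines, keeping line endings, on \\r\\n, \\r or \\n only.

    Hypothetical helper for illustration; unlike str.splitlines(), it
    ignores \\x85, \\x0b, \\x0c, \\x1c-\\x1e, \\u2028 and \\u2029.
    """
    return _LINE_RE.findall(text)


header = "X-Note: abc\x85def\r\nHost: example.com\r\n"
print(header.splitlines(keepends=True))   # breaks inside the X-Note value
print(split_ascii_lines(header))          # two intact header lines
```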

--
nosy: +gregory.p.smith
title: http.client splits headers on none-\r\n characters -> http.client splits headers on non-\r\n characters

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22233
___