[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-10-10 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 57fc950298bb by Martin Panter in branch '2.7':
Issue #22413: Document newline effect on StringIO initializer and getvalue
https://hg.python.org/cpython/rev/57fc950298bb

New changeset cba4bf2a1721 by Martin Panter in branch '3.4':
Issue #22413: Document newline effect on StringIO initializer and getvalue
https://hg.python.org/cpython/rev/cba4bf2a1721

New changeset 451da3327f68 by Martin Panter in branch '3.5':
Issue #22413: Merge StringIO doc from 3.4 into 3.5
https://hg.python.org/cpython/rev/451da3327f68

New changeset 46df76819b79 by Martin Panter in branch '3.5':
Issue #22413: Remove comment made out of date by Argument Clinic
https://hg.python.org/cpython/rev/46df76819b79

New changeset c12d3f941731 by Martin Panter in branch 'default':
Issue #22413: Merge StringIO doc from 3.5
https://hg.python.org/cpython/rev/c12d3f941731

--
nosy: +python-dev

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-10-10 Thread Martin Panter

Changes by Martin Panter:


--
resolution: wont fix -> fixed
stage: commit review -> resolved
status: open -> closed




[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-10-09 Thread Martin Panter

Martin Panter added the comment:

Thanks for the feedback. Yeah, 2.7 is an independent branch, but I will try 
porting the changes there.

--
assignee: docs@python -> martin.panter
nosy: +berker.peksag
stage: patch review -> commit review




[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-10-05 Thread Guido van Rossum

Guido van Rossum added the comment:

The patch fails to apply in the 2.7 branch. It works in 3.4. Could you look 
into the 2.7 issue?

--




[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-10-05 Thread Guido van Rossum

Guido van Rossum added the comment:

It looks like we don't merge 2.7 into 3.4 any more, so that will have to be a 
separate patch anyway.

So you can commit the patch to 3.4, merge into 3.5 and 3.6. Good luck! And 
thanks for your perseverance.

--




[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-10-03 Thread Martin Panter

Martin Panter added the comment:

Here is a suggested patch. I did include details of the initializer and 
getvalue(); this is the heart of the problem IMO. In a limited sense the 
newline flag _is_ similar to TextIOWrapper, but more broadly this implied to me 
that newlines should be encoded in the buffer, just like in TextIOWrapper’s 
wrapped “buffer” and on disk.

My patch also adds to a comment in the C code and removes another comment made 
out of date by Argument Clinic.

In the documentation I didn’t mention the problem with split CRLFs; I think 
that is a separate bug.

--
assignee:  -> docs@python
components: +Documentation
nosy: +docs@python
stage:  -> patch review
status: closed -> open
versions: +Python 2.7, Python 3.6
Added file: http://bugs.python.org/file40665/newline-doc.patch




[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-10-01 Thread Guido van Rossum

Guido van Rossum added the comment:

I don't see a reason to deprecate anything. Can you write up in one paragraph 
how StringIO's newline flag differs from the one to TextIOWrapper? (What 
happens to the initial value is a separate issue AFAIC.)

--




[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-09-30 Thread Martin Panter

Martin Panter added the comment:

I understand it may not be worth changing the behaviour. Would you instead 
accept a change to the documentation to point out that “newline” does _not_ 
actually work like TextIOWrapper? Or perhaps even deprecating or recommending 
against using “newline”?

--




[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-09-27 Thread Guido van Rossum

Guido van Rossum added the comment:

Agreed we shouldn't change this. It looks like the behavior is consistent if 
you consider `a = StringIO(stuff, newline=...)` merely a shorthand for `a = 
StringIO(newline=...); a.write(stuff)`.
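
As a quick check of that equivalence, here is a minimal sketch, assuming CPython 3's io.StringIO; the sample string and variable names are only illustrative:

```python
import io

stuff = "NL\nCRLF\r\nCR\rEOF"

results = {}
for newline in (None, "", "\n", "\r", "\r\n"):
    # Initial value passed to the constructor
    a = io.StringIO(stuff, newline=newline)

    # Same newline setting, but the text is written afterwards
    b = io.StringIO(newline=newline)
    b.write(stuff)

    # Both objects should end up holding the same buffer contents
    assert a.getvalue() == b.getvalue()
    results[newline] = a.getvalue()
```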

I understand you would like to have a way to set the internal buffer directly, 
without newline translation; to support that we'd have to add a new argument. 
Although really, it's probably better to just do the \r\n translation in your 
app anyway.
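
One way to do that translation in the application, as a rough sketch; to_crlf is a hypothetical helper, and treating a lone "\r" as a line ending is an assumption that may not suit every protocol:

```python
def to_crlf(text):
    """Return text with every line ending written as CRLF."""
    # Collapse existing CRLF and lone CR to "\n" first, so that a "\r\n"
    # already present in the input is not expanded to "\r\r\n".
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", "\r\n")


wire_text = to_crlf("NL\nCRLF\r\nCR\rEOF")
```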

You're very unlikely to ever need \r translation -- that was last seen on MacOS 
9, which has been dead for 14 years now. Fighting \r\n or pretending it's a 
Windows-only thing is pretty hopeless -- most text-based internet protocols 
(like HTTP) require it.

--




[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-09-27 Thread Antoine Pitrou

Antoine Pitrou added the comment:

At this point in time, I don't think it's a good idea to change the semantics 
at all. Some people might unknowingly rely on the current semantics, and the 
consequences of a change in 3.6 might be hard to debug.

The larger issue here is that the newline translation layer is meant as an 
adaptation layer between Python (where a newline is always "\n") and the 
outside world (where newlines are system-dependent or even file-dependent). But 
what is the "outside world" with StringIO? The data always comes from and goes 
to Python. So there is no obviously right decision except, perhaps, the 
decision not to have newline translation at all.
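
For comparison, this is roughly what the adaptation layer looks like when there is an outside world, sketched with a TextIOWrapper over an in-memory BytesIO standing in for a file (variable names are illustrative):

```python
import io

# Writing: Python-side "\n" is encoded to the chosen line ending on the way out.
raw = io.BytesIO()
writer = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
writer.write("one\ntwo\n")
writer.flush()
outside = raw.getvalue()  # bytes as they would appear on disk

# Reading: with newline=None, any line ending is decoded back to "\n".
reader = io.TextIOWrapper(io.BytesIO(outside), encoding="ascii", newline=None)
inside = reader.read()
```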

So I'd just recommend closing this as won't fix.

--
nosy: +gvanrossum




[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-09-27 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Ok, I'm closing, then.

--
resolution:  -> wont fix
status: open -> closed




[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-09-24 Thread Martin Panter

Martin Panter added the comment:

According to  Serhiy and Antoine 
both seem to agree this behaviour is a bug. The reason why newline="\r\n" and 
newline="\r" cause these funny translations is that __init__() internally 
passes the initial buffer value through write(). So I propose to set the buffer 
directly without using write().

However, doing this would have another effect. In the C implementation, write() 
also does universal newline decoding. The Python implementation had similar 
decoding added in getvalue() to compensate (Issue 20435). I propose to revert 
the getvalue() decoding, and to move the universal decoding from write() to 
read() in the C implementation.

I anticipate the combined effect of these changes would be:

1. StringIO("\n", newline="\r\n") no longer encodes the newline to CRLF in the 
internal buffer, so reading or getvalue() will return "\n" unencoded

2. StringIO("\r\n", newline=None).getvalue() returns "\r\n" undecoded, rather 
than universal decoding changing it to "\n"

3. s = StringIO(newline=None); s.write("\r\n"); s.getvalue() also returns 
"\r\n" undecoded. It is undocumented, but StringIO intentionally does not 
encode to os.linesep (yet another bug IMO).

4. StringIO.newlines would only get updated during reading, rather than during 
construction and writing, since newline decoding will only take place during 
reading.
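
For reference, a small sketch of the current behaviour that items 1 to 3 would change, as I understand it from the cases above (CPython 3, C implementation assumed; variable names are illustrative):

```python
import io

# 1. "\n" in the initial value is currently encoded to CRLF in the buffer.
encoded = io.StringIO("\n", newline="\r\n").getvalue()

# 2. With newline=None, CRLF in the initial value is currently decoded.
decoded_init = io.StringIO("\r\n", newline=None).getvalue()

# 3. The same universal newline decoding currently happens in write().
s = io.StringIO(newline=None)
s.write("\r\n")
decoded_write = s.getvalue()
```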

There is another bug where the universal newline decoding does not anticipate 
split CRLF sequences. This would hopefully be fixed at the same time.

>>> import io
>>> s = io.StringIO(newline=None)
>>> s.write("\r\n" "\r")
3
>>> s.write("\n")  # Complete second CRLF
1
>>> s.getvalue()  # Should be "\r\n\r\n", at least when os.linesep == "\n"
'\n\n\n'

--




[issue22413] Bizarre StringIO(newline="\r\n") translation

2015-09-24 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I agree with you, but I think that Antoine disagrees.

My half-baked patch for issue20435 did half of the work: it initialized the 
buffer directly in __init__. Here is the rebased version. There is a difference 
between the C and Python implementations for universal newlines, and 
test_newline_none fails.

--
keywords: +patch
Added file: http://bugs.python.org/file40571/stringio_newline_2.patch




[issue22413] Bizarre StringIO(newline="\r\n") translation

2014-09-15 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

See issue20435.

--
nosy: +pitrou, serhiy.storchaka
versions: +Python 3.5




[issue22413] Bizarre StringIO(newline="\r\n") translation

2014-09-14 Thread Martin Panter

New submission from Martin Panter:

I noticed that the newline translation in the io.StringIO class does not behave 
as I would expect:

>>> from io import BytesIO, StringIO, TextIOWrapper
>>> text = "NL\n" "CRLF\r\n" "CR\r" "EOF"
>>> s = StringIO(text, newline="\r\n")
>>> s.getvalue()
'NL\r\nCRLF\r\r\nCR\rEOF'  # Why is this not just equal to “text”?
>>> tuple(s)
('NL\r\n', 'CRLF\r\r\n', 'CR\rEOF')  # Too many lines, butchered EOL sequence
>>> tuple(TextIOWrapper(BytesIO(text.encode("ascii")), "ascii", newline="\r\n"))
('NL\nCRLF\r\n', 'CR\rEOF')  # This seems more reasonable

Although I have never had a use for newline="\r", it also seems broken:

>>> tuple(StringIO(text, newline="\r"))
('NL\r', 'CRLF\r', '\r', 'CR\r', 'EOF')  # Way too many lines
>>> tuple(TextIOWrapper(BytesIO(text.encode("ascii")), "ascii", newline="\r"))
('NL\nCRLF\r', '\nCR\r', 'EOF')

The other newline options ("\n", "", and None) seem to behave correctly though. 
There seem to be quite a few bug reports to do with newline translation in 
StringIO, but I couldn’t see anything specifically about this one. However the 
issue was mentioned at https://bugs.python.org/issue20423#msg209581.

I noticed there are test cases which appear to bless the current behaviour, as 
seen in the patch for Issue 20498. IMO these tests are wrong.

--
components: IO
messages: 226895
nosy: vadmium
priority: normal
severity: normal
status: open
title: Bizarre StringIO(newline="\r\n") translation
type: behavior
versions: Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22413
___