[issue31590] CSV module incorrectly treats escaped newlines as new records if unquoted

2020-06-27 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

Yes, the status quo won ;-).

Sebastian, if you think a doc fix is still needed for current versions,  please 
open a new issue with a specific suggestion and explanation for changing the 
3.9 doc.

--
resolution:  -> wont fix
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue31590] CSV module incorrectly treats escaped newlines as new records if unquoted

2020-06-27 Thread Zackery Spytz


Zackery Spytz  added the comment:

Python 2 is EOL, so I think this issue should be closed.

--
nosy: +ZackerySpytz




[issue31590] CSV module incorrectly treats escaped newlines as new records if unquoted

2018-01-11 Thread Sebastian Bank

Sebastian Bank  added the comment:

https://bugs.python.org/issue15927#msg309811 gives some code examples 
illustrating why I think this should be backported (and also why the 
documentation should be changed for both Python 2 and 3).

--
nosy: +xflr6




[issue31590] CSV module incorrectly treats escaped newlines as new records if unquoted

2017-09-30 Thread Vaibhav Mallya

Vaibhav Mallya  added the comment:

Hello R. David & Terry!

Appreciate your prompt responses. While experimenting with different test cases 
I realized that escaped slashes and newlines are intrinsically annoying to 
reason about as stringy-one-liners, so I threw together a small tarball test 
case - attached - to make sure we're on the same page. 

To be clear, I was referring *solely* to reading with csv.DictReader (we're not 
using the writing part).

The assertion for the multi_line_csv_unquoted fails, and I believe it should 
succeed.
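
For readers without the attached tarball, the failing case can be approximated 
as below. This is a minimal sketch, not the contents of csv_test.tar: the 
sample data and the column names `id` and `comment` are made up for 
illustration, and the result described assumes Python 3.4 or later, where the 
#15927 fix was applied.

```python
import csv
import io

# Hypothetical sample data (not taken from csv_test.tar): the "comment" field
# of the only data record contains a newline escaped with a backslash, and no
# field is quoted.
data = "id,comment\n1,first line\\\nsecond line\n"

reader = csv.DictReader(
    io.StringIO(data, newline=""),
    escapechar="\\",         # backslash escapes the following character
    quoting=csv.QUOTE_NONE,  # fields are never wrapped in quotes
)
records = [dict(row) for row in reader]

# On 3.4+ this is a single record whose comment spans two lines.
for record in records:
    print(record)
```

On 2.7, going by this report, the escaped newline ends the record instead, so 
the same input is split into more than one record.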

I hadn't considered the design-bug vs code-bug angle. I also think that 
documenting this somehow - explicitly - would help others, since there's no 
mention of the interaction here, with what should be a fairly common use-case. 
It might even make sense to make a "strong recommendation" that everything is 
quoted + escaped (much as redshift makes a strong recommendation to escape).

Our data pipeline is doing fine now that the right parameters are set on both 
sides; this is more about improving Python for the rest of the community. 
Thanks for your help, I will of course respect any decision you make.

--
Added file: https://bugs.python.org/file47181/csv_test.tar




[issue31590] CSV module incorrectly treats escaped newlines as new records if unquoted

2017-09-29 Thread Terry J. Reedy

Terry J. Reedy  added the comment:

I explained on #15927 why I currently see it as an enhancement issue, and 
therefore not appropriate to be backported.  In fact, based on the doc, I am 
puzzled why the line terminator was being escaped.




[issue31590] CSV module incorrectly treats escaped newlines as new records if unquoted

2017-09-29 Thread Vaibhav Mallya (mallyvai)

Vaibhav Mallya (mallyvai)  added the comment:

If there's any way this can be documented, that would be a big help, at
least. Other folks have run into this, and the current behavior is
implicit.

On Sep 29, 2017 5:44 PM, "R. David Murray"  wrote:

R. David Murray  added the comment:

I'm pretty hesitant to make this kind of change in python2.  I'm going to
punt, and let someone else make the decision.  Which means if no one does,
the status quo will win.  Sorry about that.

--
nosy: +Vaibhav Mallya (mallyvai)




[issue31590] CSV module incorrectly treats escaped newlines as new records if unquoted

2017-09-29 Thread R. David Murray

R. David Murray  added the comment:

I'm pretty hesitant to make this kind of change in python2.  I'm going to punt, 
and let someone else make the decision.  Which means if no one does, the status 
quo will win.  Sorry about that.




[issue31590] CSV module incorrectly treats escaped newlines as new records if unquoted

2017-09-29 Thread Terry J. Reedy

Terry J. Reedy  added the comment:

In closing #15927, R. David Murray said "Although this is clearly a bug fix, it 
also represents a behavior change that could cause a working program to fail.  
I have therefore only applied it to 3.4, but I'm open to arguments that it 
should be backported." 

David, I'll leave you to evaluate the argument presented.

Vaibhav: in the meantime, consider moving your pipeline to 3.x or patching 
your copy of the csv module.  You can put it in site-packages as csv27.  Or if 
you are distributing code anyway, include your patched copy with it.

--
nosy: +r.david.murray, terry.reedy




[issue31590] CSV module incorrectly treats escaped newlines as new records if unquoted

2017-09-26 Thread Vaibhav Mallya

New submission from Vaibhav Mallya:

I'm writing Python `csv`-based parsers as part of a data processing pipeline 
that includes Redshift and other data stores upstream and down. It's easy and 
expected in all of these data stores 
(http://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html) that CSV-style 
data can be generated with ESCAPE'd newlines, and with or without quotes on the 
columns.

However, the 2.x csv module has a bug where ESCAPE'd newlines in unquoted CSVs 
are not actually treated as escaped newlines, but as entirely new record 
entries. This is at odds with expected behavior in most common data warehouses 
(see, for example, the Redshift docs linked above) and is a subtle source of 
bugs for data processing pipelines. After some debugging, we changed our 
Redshift parameters to ADDQUOTES so we could get around this bug.
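
The quoted workaround can be sketched as follows. The sample text is an 
assumption that imitates quoted, backslash-escaped output; it is not real 
Redshift UNLOAD output, and the column names are hypothetical.

```python
import csv
import io

# Hypothetical sample data: every field is wrapped in double quotes and the
# newline inside the "comment" field is additionally escaped with a backslash.
data = '"id","comment"\n"1","first line\\\nsecond line"\n'

reader = csv.reader(
    io.StringIO(data, newline=""),
    escapechar="\\",  # drop the backslash and keep the escaped character
)
rows = list(reader)

# Two rows are expected: the header and a single data record whose comment
# field keeps its embedded newline.
for row in rows:
    print(row)
```

Because the newline sits inside a quoted field, the reader keeps it as part of 
the field rather than treating it as a record boundary.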

Note - This seems to be a continuation of https://bugs.python.org/issue15927 
which was closed as WONTFIX for 2.x. I think this is a legitimate bug, and 
should be fixed in 2.x. If someone is relying on the old, bad behavior, that 
might mean something else is wrong. In my view, the current behavior 
effectively adds an implicit, undocumented dialect to the CSV module.

--
components: Library (Lib)
messages: 303025
nosy: mallyvai
priority: normal
severity: normal
status: open
title: CSV module incorrectly treats escaped newlines as new records if unquoted
type: behavior
versions: Python 2.7
