[issue40762] Writing bytes using CSV module results in b prefixed strings

Terry J. Reedy Fri, 29 May 2020 18:57:49 -0700

Terry J. Reedy <[email protected]> added the comment:

I make 5 core developers who agree that csv should definitely *not* assume that 
bytes given to it represent encoded text, reverting to the confusion of Python 
1 and 2.  (And even if it did, it should not assume that the encoding of the 
bytes given to it and the encoding of the file are the same, or even that all 
bytes given to it have the same encoding!)


If a user does not like the current default wonky mixed ascii-hex string 
representation of bytes, the user should explicitly convert bytes to the 
representation they want.  Here are just 3 examples, 2 with possible variations.

>>> b'\xc2a9'.hex()  # One might want to add prefix '0x' or r'\x'.
'c26139'             # Or add a separator.
>>> str(list(b'\xc2a9'))  # One might want to change or strip brackets, 
'[194, 97, 57]'           # change separator, or strip spaces.
>>> b'\xc2a9'.decode('latin-1')
'Âa9'
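
To show how such an explicit conversion might sit in front of csv.writer, here 
is a minimal sketch.  The helper names bytes_to_text and write_row and the 
'style' parameter are illustrative assumptions, not part of the csv module; 
only the three conversions shown above are used.

```python
import csv
import io


def bytes_to_text(value, style="hex"):
    """Return a str for bytes-like values; pass anything else through.

    Hypothetical helper: 'style' picks one of the three representations
    from the examples above.
    """
    if not isinstance(value, (bytes, bytearray)):
        return value
    data = bytes(value)
    if style == "hex":
        return data.hex()
    if style == "list":
        return str(list(data))
    if style == "latin-1":
        return data.decode("latin-1")
    raise ValueError(f"unknown style: {style!r}")


def write_row(writer, row, style="hex"):
    """Convert any bytes in 'row' explicitly, then hand the row to csv."""
    writer.writerow([bytes_to_text(item, style) for item in row])


buf = io.StringIO(newline="")
writer = csv.writer(buf)

# Default behavior: csv calls str() on the bytes object, giving the b'' repr.
writer.writerow(["raw", b"\xc2a9"])

# Explicit conversion: the caller decides what the reader will see.
write_row(writer, ["hex", b"\xc2a9"])

output = buf.getvalue()
print(output)
```

The first row keeps the b-prefixed repr that this issue complains about; the 
second contains whatever representation the caller chose.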

What is best depends on the expected reader of the output.  Pandas users who 
don't like Pandas' csv output should talk to its authors.  They are welcome to 
ask advice on python-list.

Eric: I agree that adding 'strict=False' might be a good idea (in a new issue).

----------
nosy: +terry.reedy
resolution:  -> rejected
stage: patch review -> resolved
status: open -> closed
type: behavior -> enhancement

_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue40762>
_______________________________________