New submission from Yhojann Aguilera <[email protected]>:
Unable to parse a CSV file with a Latin ISO charset.
import csv

with open('./exported.csv', newline='') as csvFileHandler:
    csvHandler = csv.reader(csvFileHandler, delimiter=';', quotechar='"')
    for line in csvHandler:
        ...  # loop body not shown in the report
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 1032:
invalid continuation byte
I tried binary mode in open(), but it says: binary mode doesn't take a
newline argument. OK, I changed newline to a bytes value, newline=b'', but it
says: open() argument 6 must be str or None, not bytes. OK, I removed the
newline argument:
_csv.Error: iterator should return strings, not bytes (did you open the file in
text mode?).
OK, so the csv module does not support binary read mode. I then tried Latin ISO:
with open('./exported.csv', mode='r', encoding='ISO-8859', newline='') as csvFileHandler:
UnicodeDecodeError: 'charmap' codec can't decode byte 0xd1 in position 1032:
character maps to <undefined>
But the charset is Latin ISO:
$ file exported.csv
exported.csv: ISO-8859 text, with very long lines, with CRLF line terminators
OK, I changed the encoding to ISO-8859-8:
UnicodeDecodeError: 'charmap' codec can't decode byte 0xd1 in position 1032:
character maps to <undefined>
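The "ISO-8859 text" label printed by file names a family of encodings, not one specific part, so the byte from the traceback can be checked against a few parts directly. This is only a sketch: the helper try_decode and the list of encodings are illustrative assumptions, not taken from the report.

```python
# Byte 0xd1 is the one named in the traceback above.
raw = b'\xd1'

def try_decode(data, encoding):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# Compare a few ISO-8859 parts (hypothetical selection for illustration).
results = {enc: try_decode(raw, enc)
           for enc in ('iso-8859-1', 'iso-8859-15', 'iso-8859-8')}
print(results)
```

Under these assumptions, ISO-8859-8 leaves 0xd1 unassigned, which would be consistent with the 'charmap' error, while ISO-8859-1 maps every byte value to a character.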
I am unable to load the file. Why not provide an option to work in binary
mode? The delimiters can be represented as byte values.
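As a possible workaround while csv.reader only accepts text, a binary stream can be wrapped in a text layer with an explicit single-byte encoding. The sketch below assumes the data is ISO-8859-1; the sample bytes and the helper read_rows are hypothetical stand-ins for exported.csv and are not part of the report.

```python
import csv
import io

# Hypothetical sample standing in for exported.csv:
# Latin-1 bytes, ';' delimiter, CRLF line terminators.
sample = 'id;name\r\n1;"PE\xd1A; Juan"\r\n'.encode('iso-8859-1')

def read_rows(binary_stream, encoding='iso-8859-1'):
    """Decode a binary stream on the fly and parse it as ';'-delimited CSV."""
    # newline='' lets the csv module handle line terminators itself.
    text_stream = io.TextIOWrapper(binary_stream, encoding=encoding, newline='')
    return list(csv.reader(text_stream, delimiter=';', quotechar='"'))

rows = read_rows(io.BytesIO(sample))
print(rows)
```

Because ISO-8859-1 assigns a character to every byte value, decoding with it does not raise UnicodeDecodeError, and encoding the parsed fields back with the same codec recovers the original bytes if byte-level access is needed.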
----------
components: Unicode
messages: 350836
nosy: Yhojann Aguilera, ezio.melotti, vstinner
priority: normal
severity: normal
status: open
title: Unable parse csv on latin iso or binary mode
type: behavior
versions: Python 3.7
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue37984>
_______________________________________