While inelegant, I've "solved" this with a wrapper/generator:

    f = open(fname, ...)
    g = (line.replace('\0', '') for line in f)
    reader = csv.reader(g, ...)
    for row in reader:
        process(row)
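A fuller, runnable sketch of the same idea, with io.StringIO standing in for the corrupted file (the sample data and the bare list() in place of process() are illustrative):

```python
import csv
import io

# Stand-in for a corrupted file: the second record contains a stray
# NUL byte that would otherwise make csv.reader raise
# "_csv.Error: line contains NULL byte".
corrupted = io.StringIO('a,b,c\r\nd,\x00e,f\r\n')

# Generator that strips NUL bytes from each line before the csv
# module ever sees them.
cleaned = (line.replace('\0', '') for line in corrupted)

rows = list(csv.reader(cleaned))
print(rows)  # → [['a', 'b', 'c'], ['d', 'e', 'f']]
```

The csv module only requires an iterable of strings, so any generator that pre-filters lines can sit between the file and the reader.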
My actual use at $DAYJOB cleans out a few other things too,
particularly non-breaking spaces coming from client data that
.strip() doesn't catch in Py2.x ("hello\xa0".strip()).

-tkc

On 2018-02-28 23:40, John Pote wrote:
> I have a csv data file that may become corrupted (already happened),
> resulting in a NULL byte appearing in the file. The NULL byte
> causes a _csv.Error exception.
>
> I'd rather like the csv reader to return csv lines as best it can,
> and have subsequent processing of each comma-separated field deal
> with illegal bytes. That way as many lines from the file as possible
> can be processed and the corrupted ones simply dumped.
>
> Is there a way of getting the csv reader to accept all 256 possible
> bytes (with \r, \n and ',' bytes delimiting lines and fields)?
>
> My test code is:
>
>     with open(fname, 'rt', encoding='iso-8859-1') as csvfile:
>         csvreader = csv.reader(csvfile, delimiter=',',
>                                quoting=csv.QUOTE_NONE, strict=False)
>         data = list(csvreader)
>         for ln in data:
>             print(ln)
>
> Result:
>
>     >python36 csvTest.py
>     Traceback (most recent call last):
>       File "csvTest.py", line 22, in <module>
>         data = list(csvreader)
>     _csv.Error: line contains NULL byte
>
> strict=False or True makes no difference.
>
> Help appreciated,
>
> John

--
https://mail.python.org/mailman/listinfo/python-list
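For reference, a Python 3 sketch of the extra cleanup tkc describes; the sample data and the translation table are illustrative, not the actual $DAYJOB code:

```python
import csv
import io

# Illustrative stand-in for messy client data: a non-breaking space
# (U+00A0) and a NUL byte embedded in the fields.
dirty = io.StringIO('name,qty\r\nwidget\xa0,\x003\r\n')

# Translate both troublesome characters away before csv sees the
# lines: drop NULs, turn NBSPs into ordinary spaces. (In Python 3,
# str.strip() does remove a trailing '\xa0' -- the gotcha quoted
# above is specific to Py2 byte strings -- but translating also
# catches NBSPs in the middle of a field.)
clean_table = {ord('\0'): None, ord('\xa0'): ' '}
cleaned = (line.translate(clean_table) for line in dirty)

rows = list(csv.reader(cleaned))
print(rows)  # → [['name', 'qty'], ['widget ', '3']]
```

str.translate with a dict mapping code points to None or replacements handles any number of such substitutions in one pass per line.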