Re: Getting Unicode decode error using lxml.iterparse

2018-05-23 Thread Peter Otten
digi...@gmail.com wrote:
> I'm trying to read my iTunes library in Python using iterparse. My current
> stub is:
> parser.add_argument('infile', nargs='?',
> type=argparse.FileType('r'), default=sys.stdin)
> I'm getting an error on one part of the XML:
>
>
> File
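A minimal sketch of one likely fix, assuming the decode error comes from the text-mode file that `argparse.FileType('r')` opens (it is decoded with the locale's encoding before the XML parser ever sees it). Opening the input in binary mode lets the parser apply the encoding declared in the XML itself. The helper names `build_arg_parser` and `collect_text` and the sample document are hypothetical, not taken from the original post.

```python
import argparse
import io
import sys
import xml.etree.ElementTree as ET


def build_arg_parser():
    """Same shape as the quoted stub, but the input is opened in binary mode."""
    parser = argparse.ArgumentParser()
    parser.add_argument('infile', nargs='?',
                        type=argparse.FileType('rb'),   # bytes, not str
                        default=sys.stdin.buffer)       # binary view of stdin
    return parser


def collect_text(stream):
    """Return the non-blank text of each element, in the order elements end."""
    texts = []
    for _event, elem in ET.iterparse(stream):  # default events: "end" only
        if elem.text and elem.text.strip():
            texts.append(elem.text)
    return texts


# Hypothetical sample: UTF-8 bytes containing a non-ASCII character.
sample = ('<?xml version="1.0" encoding="UTF-8"?>'
          '<dict><key>Name</key><string>P\u00e4rt</string></dict>').encode('utf-8')
sample_texts = collect_text(io.BytesIO(sample))
```

With a real file, `collect_text(build_arg_parser().parse_args().infile)` would feed the binary stream straight to `iterparse`.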

Re: Getting Unicode decode error using lxml.iterparse

2018-05-23 Thread Stefan Behnel
digi...@gmail.com wrote on 23.05.2018 at 00:56:
> I'm trying to read my iTunes library in Python using iterparse. My current
> stub is:
>
> Snip
>
> import sys
> import datetime
> import xml.etree.ElementTree as ET
> import argparse
> import re
>
> class Library:
>
>

Re: Getting Unicode decode error using lxml.iterparse

2018-05-23 Thread Stefan Behnel
dieter wrote on 23.05.2018 at 08:25:
> If the encoding is not specified, "lxml" will try to determine it
> and finally defaults to "utf-8" (which seems to be the correct encoding
> for your case).

Being an XML parser, it does not do that. XML parsers are designed to reject non-wellformed
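A small sketch of the rejection behaviour described above, using the standard library's `xml.etree.ElementTree` (the module imported in the original post) rather than lxml. The helper `try_parse` and both sample documents are hypothetical. The same bytes parse when the document declares ISO-8859-1 and are rejected when no encoding is declared, because they are not valid UTF-8.

```python
import xml.etree.ElementTree as ET


def try_parse(data):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


# Byte 0xE4 is "a-umlaut" in ISO-8859-1 but an incomplete sequence in UTF-8.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?><name>P\xe4rt</name>'
undeclared = b'<name>P\xe4rt</name>'

declared_result = try_parse(declared)      # decoded using the declared encoding
undeclared_result = try_parse(undeclared)  # rejected: invalid under the UTF-8 default
```

The parser does not guess at a fallback encoding for the second document; it reports a parse error instead.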

Re: Getting Unicode decode error using lxml.iterparse

2018-05-23 Thread dieter
digi...@gmail.com writes: > I'm trying to read my iTunes library in Python using iterparse. My current > stub is: > ... > My input file (reduced to home in on the error) is: > > snip - > > > > > > 15078 > > NamePart 2. The

Getting Unicode decode error using lxml.iterparse

2018-05-22 Thread digitig
I'm trying to read my iTunes library in Python using iterparse. My current stub is:

Snip

import sys
import datetime
import xml.etree.ElementTree as ET
import argparse
import re

class Library:
    unmarshallers = {
        # collections
        "array": lambda x: [v.text for v in