On 7/9/07, Henning Thielemann <[EMAIL PROTECTED]> wrote:
> HXT returns a list of warnings for invalid UTF-8 byte sequences:
>
> http://www.fh-wedel.de/~si/HXmlToolbox/hdoc_arrow/Text-XML-HXT-DOM-Unicode.html#v%3Autf8ToUnicode
>
> Is your decoder lazy?


Yes, the decoder is lazy.

Regarding error handling, I noticed that Python has three modes for
decoding UTF-8: strict, replace, and ignore.

strict: error "bad encoding"
replace: ('\xfffd' :)
ignore: id

I could add these if there is interest.
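
As a rough illustration of the three modes listed above, here is a minimal sketch of a lazy decoder parameterised by an error handler. This is not the decoder discussed in this thread; the names decodeWith and OnError are hypothetical, and the choice to resume decoding at the byte after an offending lead byte is an assumption, so the output may differ from Python's in corner cases.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- | Hypothetical handler type: receives the (lazily) decoded remainder
-- of the input and decides what the output looks like at a bad byte.
type OnError = String -> String

strict, replace, ignore :: OnError
strict  = error "bad encoding"   -- fail once this point is demanded
replace = ('\xfffd' :)           -- emit U+FFFD, then continue
ignore  = id                     -- drop the bad byte, then continue

-- | Decode UTF-8 bytes lazily, calling the handler on malformed input.
-- Assumption: after an error, decoding resumes at the next byte.
decodeWith :: OnError -> [Word8] -> String
decodeWith onErr = go
  where
    go [] = []
    go (b:bs)
      | b < 0x80              = chr (fromIntegral b) : go bs
      | b >= 0xC2 && b < 0xE0 = multi 1 (fromIntegral b .&. 0x1F) 0x80 bs
      | b >= 0xE0 && b < 0xF0 = multi 2 (fromIntegral b .&. 0x0F) 0x800 bs
      | b >= 0xF0 && b < 0xF5 = multi 3 (fromIntegral b .&. 0x07) 0x10000 bs
      | otherwise             = onErr (go bs)

    -- Read n continuation bytes; reject truncated sequences, overlong
    -- forms (below minCp), surrogates, and values above U+10FFFF.
    multi :: Int -> Int -> Int -> [Word8] -> String
    multi n acc minCp bs =
      case splitAt n bs of
        (cs, rest)
          | length cs == n, all isCont cs
          , let cp = foldl step acc cs
          , cp >= minCp, cp <= 0x10FFFF, not (isSurrogate cp)
          -> chr cp : go rest
        _ -> onErr (go bs)

    isCont c = c .&. 0xC0 == 0x80
    step a c = (a `shiftL` 6) .|. (fromIntegral c .&. 0x3F)
    isSurrogate cp = cp >= 0xD800 && cp <= 0xDFFF
```

Because the output list is produced incrementally, take 2 (decodeWith strict [0x68, 0x69, 0xFF]) yields "hi" without raising the error; the failure only surfaces if the third element is demanded.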

--
Eric Mertens
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe
