Re: [Ecls-list] UTF-8 sequence decoding errors [Was: Upcoming changes]

2011-02-13 Thread Juan Jose Garcia-Ripoll
2011/2/13 Matthew Mondor mm_li...@pulsar-zone.net:
> I also did a test relating to my previous suggestions about a way to preserve invalid input intact at output, later referred to as UTF-8B by Andy Hefner, and it seems possible. There seems to be scarce support around for these
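
For readers unfamiliar with the UTF-8B idea mentioned above, here is a minimal sketch of it. It is not ECL code and not the test attached to the thread. It uses Python's `surrogateescape` error handler (PEP 383), which follows a comparable scheme: each undecodable byte 0xNN is decoded to the lone surrogate U+DCNN and is restored to the original byte on output. The helper names `to_strict_utf8` and `to_latin1_fallback` are hypothetical, chosen only for this illustration.

```python
# Sketch of UTF-8B-style round-tripping via Python's "surrogateescape"
# handler. This only illustrates the idea; it is not ECL's implementation.

# 0xE9 on its own is invalid UTF-8; 0xC3 0xA9 is a valid encoding of U+00E9.
raw = b"caf\xe9 ok \xc3\xa9"

# Decoding: the invalid byte 0xE9 becomes the lone surrogate U+DCE9.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes unchanged.
restored = text.encode("utf-8", errors="surrogateescape")


def is_escaped_byte(ch: str) -> bool:
    """True if ch is a surrogate produced by surrogateescape (U+DC80..U+DCFF)."""
    return 0xDC80 <= ord(ch) <= 0xDCFF


def to_strict_utf8(s: str) -> bytes:
    """Emit conventional UTF-8, replacing each escaped byte with U+FFFD."""
    cleaned = "".join("\ufffd" if is_escaped_byte(ch) else ch for ch in s)
    return cleaned.encode("utf-8")


def to_latin1_fallback(s: str) -> str:
    """Reinterpret each escaped byte as the LATIN-1 character of that value."""
    return "".join(chr(ord(ch) - 0xDC00) if is_escaped_byte(ch) else ch for ch in s)
```

With this input, `restored` equals `raw`, so the invalid byte survives a decode and encode cycle. `to_strict_utf8` shows one way to produce a conventional UTF-8 stream when a consumer cannot accept the escaped form, and `to_latin1_fallback` shows the alternative of treating stray bytes as LATIN-1.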

Re: [Ecls-list] UTF-8 sequence decoding errors [Was: Upcoming changes]

2011-02-13 Thread Matthew Mondor
On Sun, 13 Feb 2011 09:59:37 +0100 Juan Jose Garcia-Ripoll juanjose.garciarip...@googlemail.com wrote: Yes I think that supporting that encoding would be very easy too. The only possibly tricky part is for users of that encoding to output, as necessary, a more conventional UTF-8 stream to some

Re: [Ecls-list] UTF-8 sequence decoding errors [Was: Upcoming changes]

2011-02-12 Thread Juan Jose Garcia-Ripoll
Thanks for the detailed report. I made some changes.
* The exported symbols come from the EXT package. They are
    character-coding-error
    character-coding-error-external-format
    character-decoding-error
    character-decoding-error-octets
    character-encoding-error
    character-encoding-error-code

Re: [Ecls-list] UTF-8 sequence decoding errors [Was: Upcoming changes]

2011-02-12 Thread Matthew Mondor
On Sat, 12 Feb 2011 19:07:43 +0100 Juan Jose Garcia-Ripoll juanjose.garciarip...@googlemail.com wrote:
> Thanks for the detailed report. I made some changes.
> * The exported symbols come from the EXT package. They are
Indeed, SI and EXT appear to be aliases; however when a condition type is

Re: [Ecls-list] UTF-8 sequence decoding errors [Was: Upcoming changes]

2011-02-12 Thread Juan Jose Garcia-Ripoll
2011/2/12 Matthew Mondor mm_li...@pulsar-zone.net:
> a character is still getting lost after the CONTINUE restart, even if I consume all bytes from the invalid octets supplied. New test code attached.
I did not realize that from your previous email. This is fixed now (trivial typo in

Re: [Ecls-list] UTF-8 sequence decoding errors [Was: Upcoming changes]

2011-02-12 Thread Matthew Mondor
On Sat, 12 Feb 2011 23:49:14 +0100 Juan Jose Garcia-Ripoll juanjose.garciarip...@googlemail.com wrote:
> I did not realize that from your previous email. This is fixed now (trivial typo in utf_8_decoder)
I tested and it works fine for invalid UTF-8 bytes to LATIN-1 conversion. I also did a test