On Tue, 2003-01-28 at 06:26, A Rafael D Teixeira wrote:
> UTF8 is an encoding with very strict rules; it was made so to allow you to
> detect if you are trying to read text that perhaps is in another encoding,
> like the ISO8859-* or Windows125* families.
>
> I think the exception may be too harsh a measure, but surely you have to at
> least ignore those characters. To pass them along is to surely transfer the
> problem to the client code in a clueless way.
>
I kind of got that from the code :) The only issue I had was that it didn't
do this under .NET; the same code there seems to just ignore the extra
character. From the look of it, it could also happen in InternalGetCharCount.

> In summary:
>
> If you have characters (bytes in truth) in your text that are greater than
> 0x7F and aren't valid start codes (the start code tells the count of bytes
> that will follow) followed by their proper number of complementary bytes,
> either these bytes ARE garbage (generated by a bad application) or the byte
> stream IS ENCODED with another encoding.
>

If it is done with another encoding, is there a better way to get at it so
that this problem goes away?

--
btouchet <[EMAIL PROTECTED]>
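
For reference, the rule quoted above (a lead byte announces how many continuation bytes must follow) can be sketched roughly as below. This is only an illustration in Python, not the Mono or .NET implementation. The names expected_length, first_invalid_offset and decode_with_fallback are made up for this sketch, and the ISO-8859-1 fallback is just a guess at what the "other encoding" might be. The check is structural only: it does not reject overlong forms or surrogate code points, which a strict decoder also refuses.

```python
def expected_length(lead: int) -> int:
    """Return the sequence length announced by a lead byte, or 0 if the
    byte cannot start a UTF-8 sequence."""
    if lead < 0x80:
        return 1                      # plain ASCII, stands alone
    if 0xC2 <= lead <= 0xDF:
        return 2                      # lead byte + 1 continuation byte
    if 0xE0 <= lead <= 0xEF:
        return 3                      # lead byte + 2 continuation bytes
    if 0xF0 <= lead <= 0xF4:
        return 4                      # lead byte + 3 continuation bytes
    return 0                          # stray continuation byte or invalid lead


def first_invalid_offset(data: bytes) -> int:
    """Return the offset of the first structurally malformed sequence,
    or -1 if every lead byte has its continuation bytes."""
    i = 0
    while i < len(data):
        n = expected_length(data[i])
        if n == 0:
            return i
        tail = data[i + 1:i + n]
        # Every continuation byte must look like 10xxxxxx.
        if len(tail) != n - 1 or any((b & 0xC0) != 0x80 for b in tail):
            return i
        i += n
    return -1


def decode_with_fallback(data: bytes, fallback: str = "iso-8859-1") -> str:
    """Try strict UTF-8 first; if that fails, assume the stream is in the
    (hypothetical) fallback encoding instead of passing garbage along."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode(fallback, errors="replace")
```

With something like this, the caller decides what to do when the text is not UTF-8: report the offset returned by first_invalid_offset, or decode again with the encoding the data was really written in, if that is known.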
