On Tue, 2003-01-28 at 06:26, A Rafael D Teixeira wrote:
> UTF8 is an encoding with very strict rules; it was designed that way so you can 
> detect when you are trying to read text that is actually in another encoding, 
> like the ISO8859-* or Windows125* families.
> 
> I think the exception may be too harsh a measure, but surely you have to at 
> least ignore those characters. Passing them along just transfers the 
> problem to the client code in a clueless way.
> 

I kind of got that from the code :) The only issue I had was that it
doesn't do this under .NET; the same code seems to just ignore the extra
character. From the look of it, it could also happen in
InternalGetCharCount.
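
To make the three behaviours being discussed concrete (throw, silently
drop, or pass something along), here is a small sketch in Python rather
than C#. It only illustrates the policies; it is not what Mono or .NET
do internally, and the helper name decode_policies is made up for this
example.

```python
def decode_policies(data: bytes) -> dict:
    """Decode the same bytes as UTF-8 under three error policies."""
    results = {}
    try:
        # Strict: any malformed sequence raises an exception.
        results["strict"] = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        results["strict"] = f"error at byte {exc.start}"
    # Ignore: malformed bytes are dropped without a trace.
    results["ignore"] = data.decode("utf-8", errors="ignore")
    # Replace: malformed bytes become U+FFFD so the caller can see them.
    results["replace"] = data.decode("utf-8", errors="replace")
    return results


if __name__ == "__main__":
    # 0xE9 is "e acute" in ISO-8859-1, but a truncated sequence in UTF-8.
    for policy, outcome in decode_policies(b"ab\xe9cd").items():
        print(policy, repr(outcome))
```

The "replace" policy is a middle ground: nothing is thrown, but the
client code can still tell that something was wrong with the input.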

> In summary:
> 
> If you have characters (bytes in truth) in your text, that are greater than 
> 0x7F and aren't valid start codes (the start code tells the count of bytes 
> that will follow) followed by their proper number of complementary bytes, 
> either these bytes ARE garbage (generated by a bad application) or the byte 
> stream IS ENCODED with another encoding.
> 
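
The start-code rule described above can be sketched roughly as follows.
This is Python, not the Mono source, and the function names are
hypothetical. It checks only the lead-byte/continuation-byte structure;
a complete validator would also have to reject overlong forms and
surrogate code points, which this sketch does not attempt.

```python
def expected_length(lead: int) -> int:
    """Sequence length announced by a lead byte, or 0 if it cannot start one."""
    if lead <= 0x7F:
        return 1              # plain ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0                  # 0x80-0xC1 and 0xF5-0xFF never start a sequence


def first_bad_offset(data: bytes) -> int:
    """Offset of the first byte that breaks the structure, or -1 if none."""
    i = 0
    while i < len(data):
        n = expected_length(data[i])
        if n == 0:
            return i
        tail = data[i + 1:i + n]
        # Every following byte must look like 10xxxxxx, and all must be present.
        if len(tail) != n - 1 or any((b & 0xC0) != 0x80 for b in tail):
            return i
        i += n
    return -1


if __name__ == "__main__":
    print(first_bad_offset("caf\u00e9".encode("utf-8")))    # well-formed
    print(first_bad_offset("caf\u00e9".encode("latin-1")))  # 0xE9 with no tail
```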

If it is encoded with another encoding, is there a better way to read
it so that this problem goes away?
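
One approach I can think of, sketched in Python just to show the idea:
try strict UTF-8 first and only fall back to a legacy code page when
that fails. The choice of cp1252 as the fallback is purely an
assumption here (the real source encoding has to be known or guessed),
and decode_with_fallback is a made-up helper name.

```python
def decode_with_fallback(data: bytes, fallback: str = "cp1252") -> str:
    """Decode as strict UTF-8, falling back to an assumed legacy encoding."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # cp1252 leaves a few byte values undefined, so use "replace"
        # to avoid raising a second time on those.
        return data.decode(fallback, errors="replace")


if __name__ == "__main__":
    print(decode_with_fallback("caf\u00e9".encode("utf-8")))
    print(decode_with_fallback("caf\u00e9".encode("cp1252")))
```

This works because non-ASCII text in a single-byte encoding rarely
happens to form valid UTF-8, so the strict attempt is a decent
detector, though not a guarantee.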


-- 
btouchet <[EMAIL PROTECTED]>
