UTF-8 is an encoding with very strict rules. It was designed that way so you can detect when you are trying to read text that is actually in another encoding, such as the ISO-8859-* or Windows-125* families.

I think the exception may be too harsh a measure, but you surely have to at least ignore those characters. To pass them along is simply to hand the problem to the client code, with no clue about what went wrong.

In summary:

If your text contains characters (bytes, in truth) greater than 0x7F that are not part of a well-formed sequence (a valid start byte, which tells how many bytes will follow, and then exactly that many continuation bytes), then either those bytes ARE garbage (generated by a bad application) or the byte stream IS ENCODED with another encoding.
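
To make the rule above concrete, here is a minimal sketch in Python (not Mono's decoder, just an illustration). The function names utf8_sequence_length and first_invalid_offset are made up for this example. It checks only the structure described above, a start byte followed by the right number of continuation bytes; a full validator would also reject overlong forms, surrogates, and code points above U+10FFFF.

```python
def utf8_sequence_length(lead: int) -> int:
    """Total length announced by a start byte, or 0 if it cannot start a sequence."""
    if lead <= 0x7F:
        return 1              # plain ASCII, stands alone
    if 0xC2 <= lead <= 0xDF:
        return 2              # one continuation byte follows
    if 0xE0 <= lead <= 0xEF:
        return 3              # two continuation bytes follow
    if 0xF0 <= lead <= 0xF4:
        return 4              # three continuation bytes follow
    return 0                  # 0x80-0xBF (continuation), 0xC0, 0xC1, 0xF5 and up


def first_invalid_offset(data: bytes) -> int:
    """Offset of the first byte that breaks the structure, or -1 if none does."""
    i = 0
    while i < len(data):
        length = utf8_sequence_length(data[i])
        if length == 0:
            return i          # not a valid start byte
        tail = data[i + 1 : i + length]
        # Every continuation byte must look like 10xxxxxx.
        if len(tail) != length - 1 or any((b & 0xC0) != 0x80 for b in tail):
            return i          # truncated, or followed by the wrong kind of byte
        i += length
    return -1
```

Feeding it "café" encoded as ISO-8859-1 reports offset 3: the byte 0xE9 looks like a start byte announcing two more bytes, and they never come. That is exactly the "encoded with another encoding" case.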

Happy hackings,

Rafael Teixeira
Brazilian Polymath
Mono, MonoQLE Hacker



