> Also, if you're converting to, say, UTF-16, then non-character sequences
> like \xEF\xBF\xBE and \xEF\xBF\xBF should probably be converted to the
> corresponding UTF-16 non-characters (\uFFFE and \uFFFF), rather than being
> rejected. (Note: Unicode 3.1 and ISO/IEC 10646-1:2000 differ on this point;
> 10646 requires them to be rejected.)

This discrepancy has been noted by the relevant committees, and is the subject of ballot comment in the current amendment of 10646. It should be fixed soon.

--Ken
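
For anyone who wants to see the two behaviours side by side, here is a rough Python sketch. The function name `utf8_to_utf16be` and the `reject_noncharacters` flag are made up for illustration and do not come from either standard. It only checks the two code points named in the quoted text. By default it passes U+FFFE and U+FFFF through to UTF-16, as the quoted message suggests; with the flag set it rejects them, which is the stricter reading attributed to 10646 above.

```python
# Illustrative only: covers just the two noncharacters named in the thread.
NONCHARACTERS = {0xFFFE, 0xFFFF}


def utf8_to_utf16be(data: bytes, reject_noncharacters: bool = False) -> bytes:
    """Convert UTF-8 bytes to big-endian UTF-16 without a byte order mark."""
    # Ill-formed UTF-8 raises UnicodeDecodeError here; \xEF\xBF\xBE and
    # \xEF\xBF\xBF are well-formed, so Python's decoder accepts them.
    text = data.decode("utf-8")

    if reject_noncharacters:
        # Hypothetical strict mode: refuse the noncharacters outright.
        for ch in text:
            if ord(ch) in NONCHARACTERS:
                raise ValueError(f"noncharacter U+{ord(ch):04X} in input")

    # Permissive mode: noncharacters are carried over unchanged.
    return text.encode("utf-16-be")


if __name__ == "__main__":
    print(utf8_to_utf16be(b"\xef\xbf\xbe").hex())
    print(utf8_to_utf16be(b"\xef\xbf\xbf").hex())
```

Whether the permissive or the strict mode is the right default depends on which specification the converter is meant to conform to, so treat the flag as a placeholder for that decision and not as a recommendation.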

