This discussion has centered on UTF-8, but I hope the corresponding rules also apply to UTF-16 and UTF-32 in Unicode 4.0:

. for UTF-32: occurrences of 'surrogates' are ill-formed.



How about a UTF-32 sequence in which the 4 bytes represent a value > U+10FFFF? Is it considered ill-formed? Should it be?
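
To make the question concrete, here is a minimal sketch of the check I have in mind. It assumes the rule would treat values above U+10FFFF the same way as surrogates, i.e. both ill-formed; that is the assumption being asked about, not something settled in this thread. The function name and the byte-order parameter are made up for illustration.

```python
def utf32_ill_formed_units(data: bytes, byteorder: str = "big") -> list[int]:
    """Return the indexes of 32-bit code units that are not scalar values.

    Assumption: a code unit is ill-formed if it is a surrogate
    (U+D800..U+DFFF) or if it is greater than U+10FFFF.
    """
    if len(data) % 4 != 0:
        raise ValueError("UTF-32 data must be a multiple of 4 bytes")
    bad = []
    for offset in range(0, len(data), 4):
        # Decode one 4-byte code unit in the requested byte order.
        value = int.from_bytes(data[offset:offset + 4], byteorder)
        if 0xD800 <= value <= 0xDFFF or value > 0x10FFFF:
            bad.append(offset // 4)
    return bad
```

For example, the big-endian bytes 00 11 00 00 (the value 0x110000) would be reported at index 0, and so would 00 00 D8 00, while 00 10 FF FF would pass.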



