How common is it to see any of the following in real-world Unicode text, as opposed to code charts and test suites and the like?
1. Unpaired surrogates
2. Noncharacters (besides CLDR data)
3. U+FEFF at the beginning of a stream (note: not "packet" or arbitrary cutoff point)

I'm not asking whether any of these are recommended or "prohibited" or whether they are a good idea. I'm asking about actual usage.

--
Doug Ewell | Thornton, CO, USA
http://ewellic.org | @DougEwell

_______________________________________________
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode