On Fri, Nov 5, 2010 at 1:56 PM, Doug Ewell <d...@ewellic.org> wrote:
> Right, but as I said, those downstream tasks shouldn't be consumers of
> UTF-16 code units anyway. They should be consumers of Unicode code
> points, which by definition excludes loose surrogates.

Code points include surrogates. Maybe you mean "UTF-32 code units" or "Unicode scalar values".

markus
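
A minimal sketch of the distinction, assuming Python. The function names are hypothetical and chosen only for illustration. The ranges follow the Unicode Standard's definitions: code points run from 0 to 0x10FFFF, surrogate code points occupy 0xD800 to 0xDFFF, and scalar values are the code points outside the surrogate range.

```python
# Illustrative helpers (hypothetical names); ranges per the Unicode Standard.
MAX_CODE_POINT = 0x10FFFF
SURROGATE_FIRST = 0xD800
SURROGATE_LAST = 0xDFFF


def is_code_point(n: int) -> bool:
    """Any integer in the Unicode codespace, surrogates included."""
    return 0 <= n <= MAX_CODE_POINT


def is_surrogate(n: int) -> bool:
    """A code point reserved for UTF-16 surrogate code units."""
    return SURROGATE_FIRST <= n <= SURROGATE_LAST


def is_scalar_value(n: int) -> bool:
    """A code point that is not a surrogate."""
    return is_code_point(n) and not is_surrogate(n)
```

With these definitions, 0xD800 is a code point but not a scalar value, so a consumer that must never see loose surrogates would have to be specified in terms of scalar values.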