Re: VLERange: a range in between BidirectionalRange and RandomAccessRange

spir Thu, 13 Jan 2011 16:45:21 -0800

On 01/13/2011 11:00 PM, Nick Sabalausky wrote:

"Andrei Alexandrescu"<[email protected]>  wrote in message
news:[email protected]...


This may sometimes not be what the user expected; most of the time they'd
care about the code points.


I dunno, spir has successfully convinced me that most of the time it's
graphemes the user cares about, not code points. Using code points is just
as misleading as using UTF-16 code units.

You are right that those two issues are really analogous. In practice,
once universal text is truly and commonly used, I guess the
codes-do-not-represent-characters problem may become far more obvious,
and also far more serious, because (logical) errors can easily pass
unseen. [In fact, how can a programmer even know, for instance, that a
search routine missed its target or returned a false positive when
dealing with characters from unknown languages? Indeed, there are test
data sets, but they are useless if the tools one uses just ignore the
issue.]

The problem of using a 16-bit representation, and thus ignoring a fair
number of code points, is maybe less serious, because there is rather
little chance of randomly meeting characters outside the BMP (Basic
Multilingual Plane, the part of UCS whose code points are < 0x10000).
Outside the BMP are the scripts of less commonly studied archaeological
languages, and various sets of images such as alchemical symbols,
playing cards or domino tiles. I doubt they'll ever be commonly used,
except in specialised apps where the programmer knows perfectly well
what they are dealing with.
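To make the search problem concrete, here is a minimal sketch, in Python
rather than D purely for brevity. The helper names naive_find and
normalized_find are made up for illustration; only unicodedata.normalize
is a real standard-library call. It shows a code-point-level search both
missing its target and returning a false positive on the same text:

```python
import unicodedata

composed = "\u00e9"      # 'é' as a single code point (U+00E9)
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT (U+0301)

# The text stores 'é' in decomposed form; the user cannot see the difference.
text = "caf" + decomposed


def naive_find(haystack, needle):
    """Hypothetical helper: plain code-point search, no normalization."""
    return haystack.find(needle)


def normalized_find(haystack, needle):
    """Hypothetical helper: normalize both sides to NFC before searching.

    The returned index refers to the normalized haystack, not the original.
    """
    def nfc(s):
        return unicodedata.normalize("NFC", s)
    return nfc(haystack).find(nfc(needle))


missed = naive_find(text, composed)      # target is visibly there, yet not found
false_positive = naive_find(text, "e")   # matches the base letter inside 'é'
found = normalized_find(text, composed)  # found once both sides use one form

print(missed, false_positive, found)
```

Normalizing to NFC only helps when a precomposed form exists; a real
grapheme-level search would need proper segmentation, which this sketch
does not attempt.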


A list of UCS blocks with pointers to detailed content can be found here:
http://www.fileformat.info/info/unicode/block/index.htm
Blocks beyond the BMP start at the line:
Linear B Syllabary      U+10000         U+1007F         (88)
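
For what it's worth, a small sketch (again Python, with a made-up helper
name utf16_units) of what happens at that boundary: the first Linear B
code point, U+10000, is a single code point but needs two 16-bit code
units, while anything inside the BMP fits in one.

```python
def utf16_units(ch):
    """Hypothetical helper: list the UTF-16 code units encoding a string."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


linear_b = "\U00010000"  # first code point of the Linear B Syllabary block
bmp_char = "\u00e9"      # 'é', well inside the BMP

# One code point each, but a different number of 16-bit code units.
print(len(linear_b), [hex(u) for u in utf16_units(linear_b)])
print(len(bmp_char), [hex(u) for u in utf16_units(bmp_char)])
```

Code that treats each 16-bit unit as a character would see the first
string as two items, neither of which is a valid character on its own.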

Denis
_________________
vita es estrany
spir.wikidot.com
