Re: [fpc-devel] Unicode support in RTL - Roadmap

Michael Schnell Fri, 21 Nov 2008 07:16:43 -0800

So UTF8ElementLength('Ü') would be 2 and UTF8PointLength('Ü') would be 1.
Or 2, depending on whether it's precomposed or decomposed.
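
To make the two counts concrete, here is a minimal sketch. Python is used only because the numbers are easy to check there; utf8_element_length and utf8_point_length are hypothetical stand-ins for the proposed UTF8ElementLength and UTF8PointLength, not existing RTL routines.

```python
import unicodedata

def utf8_element_length(s: str) -> int:
    """Number of UTF-8 code units (bytes). Hypothetical stand-in."""
    return len(s.encode("utf-8"))

def utf8_point_length(s: str) -> int:
    """Number of Unicode code points. Hypothetical stand-in."""
    return len(s)  # a Python str is a sequence of code points

# The same visible character 'Ü' in its two canonical forms.
precomposed = unicodedata.normalize("NFC", "\u00dc")  # U+00DC
decomposed = unicodedata.normalize("NFD", "\u00dc")   # U+0055 U+0308

for name, text in (("precomposed", precomposed), ("decomposed", decomposed)):
    print(name, utf8_element_length(text), utf8_point_length(text))
```

Note that in the decomposed form the element (byte) length changes as well, not only the point length, because the combining diaeresis U+0308 takes two bytes of its own.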

I seem to remember that we discussed this some time ago, and the result was that the composed (Mac-style?) characters are in fact a single code point (Unicode character) that consists of two (maybe more?) complete code points tied together by some special coding, so IMHO it can be considered a single Unicode character in both cases. If this would result in a huge table of possible composed characters, I think we should stick to the concept of providing decent functionality and restrict ourselves to those that are currently used by the "customers" we normally address (Mac in Europe and America). A way to supply an extended composition table should be provided, so that those who really need it can help themselves.
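
In Unicode terms the decomposed form is two code points (a base letter plus a combining mark) that together form one user-perceived character. One way to count both forms as a single character, sketched here under the assumption that a table of combining classes is available, is to skip combining marks while counting. perceived_length is a hypothetical helper; it only approximates the grapheme cluster rules of UAX #29 and relies on the table that ships with Python's unicodedata module.

```python
import unicodedata

def perceived_length(s: str) -> int:
    """Count code points that are not combining marks.

    Hypothetical helper: a base letter followed by any number of
    combining marks is counted once. This is an approximation, not
    full grapheme cluster segmentation.
    """
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)

precomposed = "\u00dc"   # 'Ü' as one code point
decomposed = "U\u0308"   # 'U' followed by COMBINING DIAERESIS

print(perceived_length(precomposed), perceived_length(decomposed))
```

A restricted table, as suggested above, would correspond to knowing the combining class for only a subset of code points.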

which does not make sense if UTF8PointLength(utfstring_1) is smaller than UTF8PointLength(utfstring_2).
It does not make any sense under any circumstances, because there is no way for "UTF8PointSetLength" to know how many bytes it has to allocate when you pass a value (any value, regardless of where it comes from) to it.
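
The objection can be illustrated with a small sketch (Python for illustration only; utf8_bytes_for is a hypothetical helper): strings with the same number of code points need different numbers of bytes, so a code point count alone does not determine the allocation size.

```python
def utf8_bytes_for(s: str) -> int:
    """Bytes needed to store s as UTF-8. Hypothetical helper."""
    return len(s.encode("utf-8"))

# Four strings, each exactly three code points long.
samples = [
    "abc",                               # ASCII: 1 byte per code point
    "\u00e4\u00f6\u00fc",                # Latin-1 range: 2 bytes each
    "\u20ac\u20ac\u20ac",                # EURO SIGN: 3 bytes each
    "\U0001d11e\U0001d11e\U0001d11e",    # outside the BMP: 4 bytes each
]

sizes = [utf8_bytes_for(s) for s in samples]
for s, n in zip(samples, sizes):
    print(len(s), "code points ->", n, "bytes")
```

Growing a string to a given code point count would therefore need either a worst-case allocation of 4 bytes per code point or knowledge of the fill character.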

If UTF8PointLength(utfstring_1) is greater than UTF8PointLength(utfstring_2), no new bytes need to be allocated, and the function is just equivalent to


utfstring_1 := UTF8PointCopy(utfstring_1, 1, UTF8PointLength(utfstring_2));

To me this does not seem to pose any problem.
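
A minimal sketch of the shrinking case, assuming valid UTF-8 input and working on raw bytes as a Pascal string would. Python is used for illustration only; utf8_point_length and utf8_point_copy are hypothetical stand-ins for the proposed UTF8PointLength and UTF8PointCopy.

```python
def utf8_point_length(data: bytes) -> int:
    """Count code points by counting bytes that are not continuation bytes."""
    return sum(1 for b in data if (b & 0xC0) != 0x80)

def utf8_point_copy(data: bytes, start: int, count: int) -> bytes:
    """Copy `count` code points beginning at 1-based code point `start`.

    Hypothetical sketch; assumes `data` is valid UTF-8.
    """
    # Byte offsets at which a code point begins (continuation bytes
    # have the bit pattern 10xxxxxx and are skipped).
    starts = [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]
    starts.append(len(data))  # sentinel marking the end of the buffer
    first = min(max(start, 1) - 1, len(starts) - 1)
    last = min(first + max(count, 0), len(starts) - 1)
    return data[starts[first]:starts[last]]

utfstring_1 = "Grüße aus Köln".encode("utf-8")
utfstring_2 = "Tür".encode("utf-8")

# Truncate utfstring_1 to the code point length of utfstring_2.
truncated = utf8_point_copy(utfstring_1, 1, utf8_point_length(utfstring_2))
print(truncated.decode("utf-8"), len(truncated))
```

The cut always falls on a code point boundary, so no byte sequence is split. A decomposed character could still lose its combining mark if the cut lands between the base letter and the mark.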

-Michael
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
