Re: [Lazarus] Unicode branch

Michael Schnell Thu, 13 Jun 2013 01:20:49 -0700

On 06/12/2013 05:31 PM, Marco van de Voort wrote:


No. This is part of that, but only the most initial level. Much is not
yet decided.

... including how compatible it will be. (and more importantly, how
portable the compatibility will be)

As I followed (and took part in) several discussions on the move to (quasi-)dynamically encoded Strings, I perfectly understand this. I always stated that it is not a good idea to just do "some" implementation before decently agreed definitions have been nailed down.

The final product needs to fulfill some contradicting needs such as

 - "easy to use even for beginners", i.e. providing automatic conversions when necessary,

 - "architecture independent",

 - "decent performance", at least when used thoughtfully: avoiding unnecessary automatic conversions by not mixing different subtypes,

 - "backwards compatibility" not breaking legacy fpc / Lazarus user code

 - "Delphi compatibility", at least when an appropriate mode is set. This includes Delphi XE and pre-Unicode Delphi versions.

 - Unicode details like handling of ambiguous code points and code-point combinations in "=" compare, "Upcase", case-insensitive compare, and "<" / ">" compare, which seems to be language dependent even for Unicode.

 - "versatility", maybe extended vs. Delphi. Here I would like to see String subtypes like non-encoded (never auto-converted / "RAW") Byte, Word, DWord and QWord Strings and fully dynamically encoded (not forcing a conversion when assigned to) Strings.

 - "extensibility": it should be doable, even for the end user, and appropriately documented, to create an additional (auto-converting) String subtype for proprietary encoding schemes (e.g. HTML entities) by providing appropriate conversion functions.

 - ...
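
To illustrate the "=" compare and case-insensitive compare item in the list above, here is a minimal sketch. It is written in Python rather than Pascal only so that it can be run as-is; the helper names (naive_equal, canonical_equal, caseless_equal) are made up for this example, and NFC normalization is just one possible policy, not something that has been decided for FPC.

```python
import unicodedata

# The same visible character "é", encoded in two different ways.
composed = "\u00e9"      # single code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def naive_equal(a, b):
    """Compare code point by code point, with no normalization."""
    return a == b


def canonical_equal(a, b):
    """Compare after bringing both sides to NFC (an assumed policy)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def caseless_equal(a, b):
    """Case-insensitive compare via case folding, then NFC (simplified)."""
    def fold(s):
        return unicodedata.normalize("NFC", s.casefold())
    return fold(a) == fold(b)


print(naive_equal(composed, decomposed))      # False
print(canonical_equal(composed, decomposed))  # True
print(caseless_equal("Straße", "STRASSE"))    # True
print(len("ß"), len("ß".upper()))             # 1 2
```

The last line shows why even "Upcase" is not trivial: the uppercase form of a one-character string can be two characters long. Language-dependent ordering for "<" / ">" is not covered by this sketch.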

For me, a big question still is what to do with the ambiguous MyString[n] notation. I am sure that Mr Wirth meant it as "take the nth printable character from the string", which, with the Unicode-driven support for non-western languages, does not make too much sense any more. (Maybe someone should ask him?!) But as (western) beginners will never accept that MyString[n] works in terms of sub-codes, i.e. code units (which works rather well for them with UTF-16, but usually not with UTF-8), I vote for dropping it altogether, unless enabled by a "take care" $mode setting.
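
A small sketch of why MyString[n] is ambiguous, again in Python only so that it can be run directly. The helper code_units is hypothetical and simply splits the encoded bytes into fixed-width units, which is roughly what indexing a UTF-8 or UTF-16 string by code unit would hand back.

```python
# "naïve 😀": seven code points, one outside the Basic Multilingual Plane.
text = "na\u00efve \U0001F600"


def code_units(s, encoding):
    """Split s into the code units of the given encoding (hypothetical helper)."""
    width = {"utf-8": 1, "utf-16-le": 2}[encoding]
    data = s.encode(encoding)
    return [data[i:i + width] for i in range(0, len(data), width)]


utf8_units = code_units(text, "utf-8")
utf16_units = code_units(text, "utf-16-le")

print(len(text))         # 7 code points
print(len(utf8_units))   # 11 code units
print(len(utf16_units))  # 8 code units

# Pascal-style MyString[3] corresponds to 0-based index 2 here.
print(utf8_units[2])                       # b'\xc3', only the lead byte of "ï"
print(utf16_units[2].decode("utf-16-le"))  # ï, a complete character
print(utf16_units[6])                      # b'=\xd8', a lone high surrogate
```

With UTF-16 the third unit happens to be the whole "ï", which is why western text mostly appears to work; with UTF-8 the same index lands in the middle of a character, and even UTF-16 breaks once a character needs a surrogate pair.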


-Michael

--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
