On Thu, Sep 15 2016 at 16:36 CEST, john.w.kenn...@gmail.com writes:

[...]

> In the new Swift programming language, which is white-hot in the Apple
> community, Apple is moving toward a model of a transparent, generic
> Unicode that can be “viewed” as UTF-8, UTF-16, or UTF-32 if necessary,
> but in which a “character” contains however many code points it needs
> (“e” with a stacked macron, acute accent, and dieresis is
> algorithmically one “character” in Swift). Moreover,
> e-with-an-acute-accent and e followed by a combining acute accent, for
> example, compare as equal. At present, the underlying code is still
> UTF-16LE.

For several years I have used the name "textel" (text element, in
Polish "tekstel") for such objects. I use it mostly orally in
presentations for my students, but I have also used it in writing,
e.g. in http://bc.klf.uw.edu.pl/118/, unfortunately without a proper
definition.

I gave a rudimentary definition only in my recent paper in Polish:
http://bc.klf.uw.edu.pl/480/. It states simply (on p. 69): "an
elementary text element independently of its Unicode representation"
(meaning in particular composed vs precomposed).

I still hope to formulate a more satisfactory definition sooner or
later :-)

I think Swift confirms that such a notion is really needed.

Best regards

Janusz

-- 
                           ,
Prof. dr hab. Janusz S. Bien - Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
jsb...@uw.edu.pl, jsb...@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/
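
A minimal sketch of the equality described in the quoted passage, written in Python rather than Swift and therefore only an approximation: it assumes that "independently of its Unicode representation" can be modelled by comparing strings after NFC normalisation. The helper name `same_textel` is hypothetical and is not taken from the cited papers or from Swift.

```python
import unicodedata


def same_textel(a: str, b: str) -> bool:
    """Hypothetical helper: equality up to canonical equivalence (via NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "\u00e9"   # e-with-acute as a single code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

# e + combining macron + combining acute + combining diaeresis
stacked = "e\u0304\u0301\u0308"

print(precomposed == decomposed)             # False: different code point sequences
print(same_textel(precomposed, decomposed))  # True: canonically equivalent
print(len(stacked))                          # 4 code points for one text element
```

Python's `str` compares code point sequences, so the plain `==` fails here; the comparison only succeeds once both sides are normalised. Grouping the four code points of `stacked` into one element would additionally require grapheme cluster segmentation, which this sketch does not attempt.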