On Mon, Mar 28 2016 at 13:59 CEST, [email protected] writes: [...]
> But subheads are not Unicode Character Properties.

As Doug already said, nobody claims this.

> And repeating the caveats expressed earlier,

There has been a lot of repetition in this thread...

> the Nameslist data is designed for chart production, not as a reliable
> source of machine-readable data.

I guess you understand "machine-readable data" (and in consequence
"data mining") in a specific, very narrow way.

> While it may be in some cases useful to look at, the subheads are not
> designed to be a consistent source of data.

Can we agree that Nameslist is a reliable source of machine-readable
data about the Unicode *charts*?

On Sun, Mar 27 2016 at 6:38 CEST, [email protected] writes:

[...]

> 3 The information is purely editorial, and as such, changed by the
> editors as needed, not assigned as result of a vote in the Unicode
> Technical Committee.

Changes are not a problem if properly documented, but this is another
topic.

Let's now be more specific:

On Sun, Mar 27 2016 at 5:00 CEST, [email protected] writes:
> Janusz Bień wrote:
>
>> Am I right that this information is available only in NamesList.txt?
>
> It probably comes from what Ken referred to as "a very long list of
> annotational material, including names list subhead material, etc.,
> maintained in other sources."
>
> If you don't have access to those "other sources,"

See below.

> then as far as I
> can tell, yes, it's available only in NamesList.txt.
>
> --
> Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸

On Sun, Mar 27 2016 at 6:38 CEST, [email protected] writes:
> On 3/26/2016 2:10 AM, Janusz S. "Bień" wrote:

[...]

> I've just noticed that NamesList.txt is in a sense data mined by the
> Unicode consortium itself. I mean the "Unicode Utilities: Character
> Properties", which e.g. for LATIN SMALL LETTER P WITH FLOURISH
> (http://unicode.org/cldr/utility/character.jsp?a=A753) display in
> particular
>
> subhead: Medievalist addition

[...]
>
> If you seriously wanted to present "all that is known about a
> character" you would need to excerpt all mentions of it in the core
> specification, as well as (potentially) any additional details
> presented in the version of the proposal document that was approved by
> the UTC as part of encoding the character.

Exactly. The essential information for LATIN SMALL LETTER P WITH
FLOURISH is that in medieval manuscripts it is used for "pro" or
"por". This information is available only in

http://www.unicode.org/L2/L2006/06027-n3027-medieval.pdf

Is this a static and permanent link?

What is the copyright status of the document? For example:
Can it be redistributed and replicated on other sites?
Can it be quoted literally in a Wikipedia entry?

In general, what can be done to make access to such information easier?

Best regards

Janusz

--
                           ,
Prof. dr hab. Janusz S. Bien - Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
[email protected], [email protected], http://fleksem.klf.uw.edu.pl/~jsbien/

