Re: NamesList.txt as data source

2016-03-30 Thread Janusz S. Bien
Quote/Cytat - Andrew West (Tue 29 Mar 2016 06:15:15 PM CEST): On 29 March 2016 at 16:19, Janusz S. Bień wrote: > All documents submitted to WG2 and to L2 by individuals are copyright > of the author(s) of the document. Documents do not need to

Re: NamesList.txt as data source

2016-03-29 Thread Andrew West
On 29 March 2016 at 16:19, Janusz S. Bień wrote: > > > All documents submitted to WG2 and to L2 by individuals are copyright > > of the author(s) of the document. Documents do not need to carry a > > copyright notice to have copyright, and submitting the documents to > >

Re: NamesList.txt as data source

2016-03-29 Thread Janusz S. Bień
On Tue, Mar 29 2016 at 10:40 CEST, andrewcw...@gmail.com writes: > On 29 March 2016 at 06:15, Asmus Freytag (c) wrote: >> >> What is the copyright status of the >> document? >> [...] > All documents submitted to WG2 and to L2 by individuals are copyright > of the author(s)

Re: NamesList.txt as data source

2016-03-29 Thread Andrew West
On 29 March 2016 at 06:15, Asmus Freytag (c) wrote: > > What is the copyright status of the > document? > > The terms of use (ostensibly for the entire site) are defined here: > > http://www.unicode.org/copyright.html That refers to the Unicode Standard and data files and

Re: NamesList.txt as data source

2016-03-29 Thread Janusz S. Bień
On Tue, Mar 29 2016 at 9:54 CEST, asm...@ix.netcom.com writes: > On 3/29/2016 12:16 AM, Janusz S. "Bień" wrote: > > The document I refer to is an ISO/IEC document. As far as I know, ISO is > quite crazy about copyright. Does the Unicode Consortium policy apply to > this document? If so, then

Re: NamesList.txt as data source

2016-03-29 Thread Asmus Freytag (c)
On 3/29/2016 12:16 AM, Janusz S. "Bień" wrote: The document I refer to is an ISO/IEC document. As far as I know, ISO is quite crazy about copyright. Does the Unicode Consortium policy apply to this document? If so, then on which principle? An explicit agreement with

Re: NamesList.txt as data source

2016-03-29 Thread Janusz S. Bień
On Tue, Mar 29 2016 at 7:15 CEST, asm...@ix.netcom.com writes: > On 3/28/2016 9:40 PM, Janusz S. "Bień" wrote: [...] > The terms of use (ostensibly for the entire site) are defined here: > > http://www.unicode.org/copyright.html > > The document archive has not been designated with anything

Re: NamesList.txt as data source

2016-03-28 Thread Asmus Freytag (c)
On 3/28/2016 9:40 PM, Janusz S. "Bień" wrote: If you seriously wanted to present "all that is known about a > character" you would need to excerpt all mentions of it in the core > specification, as well as (potentially) any additional details > presented

Re: NamesList.txt as data source

2016-03-28 Thread Janusz S. Bień
On Mon, Mar 28 2016 at 13:59 CEST, m...@macchiato.com writes: [...] > But subheads are not Unicode Character Properties. As Doug already said, nobody claims this. > And repeating the caveats expressed earlier, There were a lot of repetitions in this thread... > the Nameslist data is

Re: NamesList.txt as data source

2016-03-28 Thread Doug Ewell
Asmus Freytag wrote: Now, if the utilities were able to search the core spec (and all UAXs) and look up under what headers (in what sections) the character is described or discussed, that would make it clearer that this is a search. It would also be beyond nifty. It would indeed! But again,

Re: NamesList.txt as data source

2016-03-28 Thread Asmus Freytag
On 3/28/2016 4:59 AM, Mark Davis ☕️ wrote: The listing has both the block name and the Nameslist subhead label in listing characters. One can also use the subhead labels in filtering, eg http://unicode.org/cldr/utility/list-unicodeset.jsp?a=\p{subhead=Archaic%20letters}

Re: NamesList.txt as data source

2016-03-28 Thread Doug Ewell
Mark Davis wrote: > I think there is a misunderstanding because of the online utilities > which have been, for convenience, hosted with the same server as the > CLDR survey tool. So one sees "cldr" in the following URL, but that > doesn't mean a particular association with CLDR. Yes, that was my

Re: NamesList.txt as data source

2016-03-28 Thread Mark Davis ☕️
> I'm very curious about where CLDR data depends on these subheaders or other annotations in NamesList.txt You're right. CLDR data doesn't. I think there is a misunderstanding because of the online utilities which have been, for convenience, hosted with the same server as the CLDR survey tool.

Re: NamesList.txt as data source

2016-03-27 Thread Philippe Verdy
On 27 March 2016 at 20:47, "Doug Ewell" wrote: > > Asmus Freytag wrote: > >> Nobody disputes that subheaders are informative. However, subheaders >> do not define a character property. > > > Janusz was making a point that the CLDR data sometimes treats them as such, or at least

Re: NamesList.txt as data source

2016-03-27 Thread Doug Ewell
I do understand there are some folks, particularly in the media, who don't understand the difference between normative and informative, and treat any information from (or submitted to) the Unicode Consortium as gospel and dictum. IMHO the explicit health warnings I suggested would be an

Re: NamesList.txt as data source

2016-03-27 Thread Doug Ewell
Asmus Freytag wrote: Nobody disputes that subheaders are informative. However, subheaders do not define a character property. Janusz was making a point that the CLDR data sometimes treats them as such, or at least as a kind of supplementary property. There are several good reasons: 1.

Re: NamesList.txt as data source

2016-03-26 Thread Asmus Freytag (t)
On 3/26/2016 2:10 AM, Janusz S. "Bień" wrote: On Thu, Mar 10 2016 at 22:40 CET, kenwhist...@att.net writes: [...] The *reason* that NamesList.txt exists at all is to drive the tool, unibook, that formats the full Unicode code charts for posting.

Re: NamesList.txt as data source

2016-03-26 Thread Doug Ewell
Janusz Bień wrote: Am I right that this information is available only in NamesList.txt? It probably comes from what Ken referred to as "a very long list of annotational material, including names list subhead material, etc., maintained in other sources." If you don't have access to those

Re: NamesList.txt as data source

2016-03-26 Thread Janusz S. Bień
On Thu, Mar 10 2016 at 22:40 CET, kenwhist...@att.net writes: [...] > The *reason* that NamesList.txt exists at all is to drive the tool, unibook, > that formats the full Unicode code charts for posting. It is only > posted in the Unicode Character Database at all as a matter of > convenience,

Re: annotations (was: NamesList.txt as data source)

2016-03-14 Thread Philippe Verdy
Is the term "exponentially" really appropriate? The NamesList file is not so large, and the growth would remain linear. Anyway, this file (current CSV format or XML format) does not need to be part of the core UCD files; it can be in a separate download for people needing it. One benefit I

Re: annotations (was: NamesList.txt as data source)

2016-03-13 Thread Marcel Schneider
On Sun, 13 Mar 2016 13:03:20 -0600, Doug Ewell wrote: > My point is that of J.S. Choi and Janusz Bień: the problem with > declaring NamesList off-limits is that it does contain information that > is either: > > • not available in any other UCD file, or > • available, but only in comments (like

Re: annotations (was: NamesList.txt as data source)

2016-03-13 Thread Doug Ewell
My point is that of J.S. Choi and Janusz Bień: the problem with declaring NamesList off-limits is that it does contain information that is either: • not available in any other UCD file, or • available, but only in comments (like the MAS mappings), which aren't supposed to be parsed either.

Re: annotations (was: NamesList.txt as data source)

2016-03-13 Thread Marcel Schneider
On Sun, 13 Mar 2016 07:55:24 +0100, Janusz S. Bień wrote: > For this purpose he wrote also a converter from NamesList format to XML That goes straight in the direction I suggested last year as a beta feedback item[1], but I never thought that it could be so simple. > I understand there is

annotations (was: NamesList.txt as data source)

2016-03-12 Thread Janusz S. Bień
On Thu, Mar 10 2016 at 22:40 CET, kenwhist...@att.net writes: > The *reason* that NamesList.txt exists at all is to drive the tool, > unibook, that formats the full Unicode code charts for posting. [...] On Fri, Mar 11 2016 at 3:13 CET, asm...@ix.netcom.com writes: > On 3/10/2016 5:49 PM, "J.

Re: NamesList.txt as data source

2016-03-12 Thread Marcel Schneider
On Thu, 10 Mar 2016 15:14:09 -0700, Doug Ewell wrote: > Ken Whistler wrote: > > > NamesList.txt should *not* be data mined. > > And yet it was the only Unicode data file utilized by MSKLC. > > There are many possible reasons for this approach, which we will > probably never know. Sadly it

Re: NamesList.txt as data source

2016-03-11 Thread Ken Whistler
On 3/11/2016 9:37 AM, Oren Watson wrote: Ok, so let me see if I understand this correctly. Suppose I'm writing an editor for math equations, and I want the user to be able to press a "Doublestruck" button and then type a C or D to get a ℂ or 𝔻 respectively. There is apparently no official

Re: NamesList.txt as data source

2016-03-11 Thread Oren Watson
Ok, so let me see if I understand this correctly. Suppose I'm writing an editor for math equations, and I want the user to be able to press a "Doublestruck" button and then type a C or D to get a ℂ or 𝔻 respectively. There is apparently no official source containing a machine-readable table of the
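
The "Doublestruck" button scenario in this message can be sketched without touching NamesList.txt, by going through character names instead. This is only an illustration under stated assumptions: the helper name `double_struck` is hypothetical, and the lookup relies on the naming pattern of the double-struck capitals (some sit in Letterlike Symbols as "DOUBLE-STRUCK CAPITAL X", the rest in Mathematical Alphanumeric Symbols as "MATHEMATICAL DOUBLE-STRUCK CAPITAL X"), not on any official mapping table.

```python
import unicodedata


def double_struck(letter: str) -> str:
    """Return the double-struck form of a basic Latin capital, else the input.

    Hypothetical helper: it guesses the target by character name, which is a
    heuristic and not a mapping published by the Unicode Consortium.
    """
    name = unicodedata.name(letter, "")
    prefix = "LATIN CAPITAL LETTER "
    if not name.startswith(prefix):
        return letter  # not a Latin capital: leave unchanged
    base = name[len(prefix):]
    # Try the Letterlike Symbols name first, then the Mathematical
    # Alphanumeric Symbols name.
    for candidate in (
        f"DOUBLE-STRUCK CAPITAL {base}",
        f"MATHEMATICAL DOUBLE-STRUCK CAPITAL {base}",
    ):
        try:
            return unicodedata.lookup(candidate)
        except KeyError:
            continue
    return letter  # e.g. accented capitals have no double-struck form


if __name__ == "__main__":
    for ch in "CD":
        result = double_struck(ch)
        print(ch, "->", result, f"U+{ord(result):04X}")
```

Going through names keeps the sketch inside the Python standard library, but it inherits whatever Unicode version the interpreter's `unicodedata` module ships with.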

Re: NamesList.txt as data source

2016-03-10 Thread Asmus Freytag
On 3/10/2016 5:49 PM, "J. S. Choi" wrote: One thing about NamesList.txt is that, as far as I have been able to tell, it’s the only machine-readable, parseable source of those annotations and cross-references. There are explanations about character use that are only maintained in the PDF of

Re: NamesList.txt as data source (was: Re: Gaps in Mathematical Alphanumeric Symbols)

2016-03-10 Thread J. S. Choi
> On Mar 10, 2016, at 3:40 PM, Ken Whistler wrote: > > On 3/10/2016 1:00 PM, Andrew West wrote: >> It (http://www.unicode.org/Public/UNIDATA/NamesList.txt) is >> machine-readable, although the file specifically warns that "this file >> should not be parsed for

Re: NamesList.txt as data source (was: Re: Gaps in Mathematical Alphanumeric Symbols)

2016-03-10 Thread Asmus Freytag (t)
On 3/10/2016 2:14 PM, Doug Ewell wrote: Ken Whistler wrote: NamesList.txt should *not* be data mined. And yet it was the only Unicode data file utilized by MSKLC. There are many possible reasons for this approach, which we will probably

Re: NamesList.txt as data source (was: Re: Gaps in Mathematical Alphanumeric Symbols)

2016-03-10 Thread Doug Ewell
Ken Whistler wrote: > NamesList.txt should *not* be data mined. And yet it was the only Unicode data file utilized by MSKLC. There are many possible reasons for this approach, which we will probably never know. -- Doug Ewell | http://ewellic.org | Thornton, CO 

NamesList.txt as data source (was: Re: Gaps in Mathematical Alphanumeric Symbols)

2016-03-10 Thread Ken Whistler
On 3/10/2016 1:00 PM, Andrew West wrote: It (http://www.unicode.org/Public/UNIDATA/NamesList.txt) is machine-readable, although the file specifically warns that "this file should not be parsed for machine-readable information". NamesList.txt is just a structured text file, so of course it
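
To make "just a structured text file" concrete, here is a minimal sketch of reading subheads from text in the NamesList layout. Everything in it is an assumption for illustration: the function name `subheads_by_code_point` is hypothetical, the sample is a hand-written excerpt rather than the real file, and the layout assumed is block headers starting with `@@`, subheads starting with `@` plus a tab, character entries starting with a hex code point, and annotation lines starting with a tab. The file's own warning against parsing it for machine-readable information still applies.

```python
# Hand-written excerpt in the assumed layout; NOT copied from the real file.
SAMPLE = (
    "@@\t0000\tC0 Controls and Basic Latin (Basic Latin)\t007F\n"
    "@\t\tUppercase Latin alphabet\n"
    "0041\tLATIN CAPITAL LETTER A\n"
    "0042\tLATIN CAPITAL LETTER B\n"
    "@\t\tLowercase Latin alphabet\n"
    "0061\tLATIN SMALL LETTER A\n"
    "\t* illustrative annotation line\n"
)

HEX_DIGITS = "0123456789ABCDEF"


def subheads_by_code_point(text: str) -> dict:
    """Map each code point to (name, subhead) for text in the assumed layout."""
    result = {}
    current_subhead = None
    for line in text.splitlines():
        if line.startswith("@@"):
            # New block header: the previous subhead no longer applies.
            current_subhead = None
        elif line.startswith("@\t"):
            # Subhead line: the label is the last tab-separated field.
            current_subhead = line.split("\t")[-1].strip()
        elif line and line[0] in HEX_DIGITS:
            # Character entry: hex code point, tab, character name.
            code, _, name = line.partition("\t")
            result[int(code, 16)] = (name, current_subhead)
        # Any other line (annotations, comments) is ignored in this sketch.
    return result


if __name__ == "__main__":
    for cp, (name, subhead) in subheads_by_code_point(SAMPLE).items():
        print(f"U+{cp:04X}\t{name}\t[{subhead}]")
```

The sketch deliberately ignores annotation lines and every other `@` directive, so it says nothing about how aliases or cross-references would be handled.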