mail attribution (was: A sign/abbreviation for "magister")

2018-11-01 Thread Janusz S. Bień via Unicode
On Thu, Nov 01 2018 at 6:43 -0700, Asmus Freytag via Unicode wrote: > On 11/1/2018 12:52 AM, Richard Wordingham via Unicode wrote: > > On Wed, 31 Oct 2018 11:35:19 -0700 > Asmus Freytag via Unicode wrote: [...] > Unfortunately, your emails are extremely hard to read in plain text. > It is even

Re: A sign/abbreviation for "magister"

2018-11-01 Thread Janusz S. Bień via Unicode
On Thu, Nov 01 2018 at 13:34 -0700, Asmus Freytag via Unicode wrote: > On 11/1/2018 10:23 AM, Janusz S. Bień via Unicode wrote: [...] > Looks like you completely missed my point. Nobody ever claimed that > reproducing all variations in manuscripts is in scope of Unicode, so > whom do you want to

Re: A sign/abbreviation for "magister"

2018-11-01 Thread Asmus Freytag via Unicode
On 11/1/2018 7:59 PM, James Kass via Unicode wrote: Alphabetic script users write things the way they are spelled and spell things the way they are written.  The abbreviation in question as written consists of three recognizable symbols.  An

Re: A sign/abbreviation for "magister"

2018-11-01 Thread James Kass via Unicode
Alphabetic script users write things the way they are spelled and spell things the way they are written.  The abbreviation in question as written consists of three recognizable symbols.  An "M", a superscript "r", and an equal sign (= two lines).  It can be printed, handwritten, or in fraktu

Re: A sign/abbreviation for "magister"

2018-11-01 Thread James Kass via Unicode
Richard Wordingham responded to Janusz S. Bień, >> ... Nobody ever claimed that reproducing all variations >> in manuscripts is in scope of Unicode, so whom do you want >> to convince that it is not? > > I think the counter-claim is that one will never be able > to encode all the meaning-convey

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Philippe Verdy via Unicode
As well the step 2 of the algorithm speaks about a single "array" of collation elements. Actually it's best to create one separate array per level, and append weights for each level in the relevant array for that level. The steps S2.2 to S2.4 can do this, including for derived collation elements in
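A minimal Python sketch of the per-level approach described in this message. The function name `form_sort_key` and the tuple layout are assumptions made for illustration; the weights for "cab" are the ones published in UTS #10, Figure 3. Zero weights are skipped, and a 0000 separator is still appended between levels, as step S3.2 specifies.

```python
LEVEL_SEPARATOR = 0x0000

def form_sort_key(collation_elements, levels=3):
    """Build a sort key from (primary, secondary, tertiary) weight tuples."""
    # One array per level, filled in a single pass over the elements.
    per_level = [[] for _ in range(levels)]
    for element in collation_elements:
        for level in range(levels):
            weight = element[level]
            if weight != 0:  # zero marks "ignorable at this level"
                per_level[level].append(weight)
    # Concatenate the per-level arrays, separated by 0000.
    key = []
    for level, weights in enumerate(per_level):
        if level > 0:
            key.append(LEVEL_SEPARATOR)
        key.extend(weights)
    return key

# Collation elements for "c", "a", "b" (weights as in UTS #10, Figure 3).
CAB = [(0x0706, 0x0020, 0x0002),
       (0x06D9, 0x0020, 0x0002),
       (0x06EE, 0x0020, 0x0002)]

print(" ".join(f"{w:04X}" for w in form_sort_key(CAB)))
# prints: 0706 06D9 06EE 0000 0020 0020 0020 0000 0002 0002 0002
```

Keeping one array per level avoids walking the collation element array once for each level; whether that pays off in practice depends on the implementation.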

Re: A sign/abbreviation for "magister"

2018-11-01 Thread Richard Wordingham via Unicode
On Thu, 01 Nov 2018 18:23:05 +0100 "Janusz S. Bień via Unicode" wrote: > On Thu, Nov 01 2018 at 8:43 -0700, Asmus Freytag via Unicode wrote: > > I don't think it's a joke to recognize that there is a continuum > > here and that there is no line that can be drawn which is based on > > straightfo

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Richard Wordingham via Unicode
On Thu, 1 Nov 2018 18:39:16 +0100 Philippe Verdy via Unicode wrote: > What this means is that we can safely implement UCA using basic > substitutions (e.g. with a function like "string:gsub(map)" in Lua > which uses a "map" to map source (binary) strings or regexps, into > target (binary) strings: >

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Richard Wordingham via Unicode
On Thu, 1 Nov 2018 21:13:46 +0100 Philippe Verdy via Unicode wrote: > I'm not speaking just about how collation keys will finally be stored > (as uint16 or bytes, or sequences of bits with variable length); I'm > just referring to the sequence of weights you generate. > You absolutely NEVER nee

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Richard Wordingham via Unicode
On Thu, 1 Nov 2018 22:04:40 +0100 Philippe Verdy via Unicode wrote: > The DUCET could have as well used the notation ".none", or > just dropped every ".0000" in its file (provided it contains a data > entry specifying what is the minimum weight used for each level). > This notation is only intend

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Philippe Verdy via Unicode
So it should be clear in the UCA algorithm and in the DUCET data table that "0000" is NOT a valid weight. It is just a notational placeholder used as ".0000", only indicating in the DUCET format that there's NO weight assigned at the indicated level, because the collation element is ALWAYS ignorable

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Philippe Verdy via Unicode
In summary, this step given in the algorithm is completely unneeded and can be dropped completely: *S3.2* If L is not 1, append a *level separator*. *Note:* The level separator is zero (0000), which is guaranteed to be lower than any weight in the resulting s

Re: A sign/abbreviation for "magister"

2018-11-01 Thread Marcel Schneider via Unicode
On 01/11/2018 01:21, Asmus Freytag via Unicode wrote: On 10/31/2018 3:37 PM, Marcel Schneider via Unicode wrote: On 31/10/2018 19:42, Asmus Freytag via Unicode wrote: […] It is a fallacy that all text output on a computer should match the convention of "fine typography". Much that is written

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Philippe Verdy via Unicode
The 0000 is there in the UCA only because the DUCET is published in a format that uses it, but here also this format is useless: you never need any [.0000], or [.0000.0000] in the DUCET table as well. Instead the DUCET just needs to indicate what is the minimum weight assigned for every level (exce

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Philippe Verdy via Unicode
On Thu, 1 Nov 2018 at 21:31, Philippe Verdy wrote: > so you can use these two last functions to write the first one: > > bool isIgnorable(int level, string element) { > return getLevel(getWeightAt(element, 0)) > getMinWeight(level); > } > correction: return getWeightAt(element, 0)

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Philippe Verdy via Unicode
On Thu, 1 Nov 2018 at 21:08, Markus Scherer wrote: > When you want fast string comparison, the zero weights are useful for >> processing -- and you don't actually assemble a sort key. >> > And no, I see absolutely no case where any 0000 weight is useful during processing, it does not distinguish a

Re: A sign/abbreviation for "magister"

2018-11-01 Thread Asmus Freytag via Unicode
On 11/1/2018 10:23 AM, Janusz S. Bień via Unicode wrote: On Thu, Nov 01 2018 at 8:43 -0700, Asmus Freytag via Unicode wrote: On 11/1/2018 12:33 AM, Janusz S. Bień via Unicode wrote: On Wed, Oct 31 2018 at 12:14 -0700, Ken Whistler via Unicode wrot

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Philippe Verdy via Unicode
I'm not speaking just about how collation keys will finally be stored (as uint16 or bytes, or sequences of bits with variable length); I'm just referring to the sequence of weights you generate. You absolutely NEVER need ANYWHERE in the UCA algorithm any 0000 weight, not even during processing, or

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Philippe Verdy via Unicode
For example, Figure 3 in the UTR#10 contains: Figure 3. Comparison of Sort Keys
    String  Sort Key
  1 cab     *0706* 06D9 06EE *0000* 0020 0020 *0020* *0000* *0002* 0002 0002
  2 Cab     *0706* 06D9 06EE *0000* 0020 0020 *0020* *0000* *0008* 0002
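The comparison of these two rows can be reproduced with a short Python sketch. The names `CAB_LOWER`, `CAB_UPPER`, `first_difference` and `strip_separators` are assumptions made for illustration; the weights, including the 0000 level separators and the final tertiary weight of row 2, are taken from the published Figure 3 in UTS #10. Python compares lists element by element, which mirrors a binary comparison of sort keys.

```python
# Sort keys for rows 1 and 2 of UTS #10, Figure 3.
CAB_LOWER = [0x0706, 0x06D9, 0x06EE, 0x0000,
             0x0020, 0x0020, 0x0020, 0x0000,
             0x0002, 0x0002, 0x0002]
CAB_UPPER = [0x0706, 0x06D9, 0x06EE, 0x0000,
             0x0020, 0x0020, 0x0020, 0x0000,
             0x0008, 0x0002, 0x0002]

def first_difference(key_a, key_b):
    """Return the index of the first differing weight, or None if equal."""
    for index, (a, b) in enumerate(zip(key_a, key_b)):
        if a != b:
            return index
    if len(key_a) == len(key_b):
        return None
    return min(len(key_a), len(key_b))  # one key is a prefix of the other

def strip_separators(key):
    """Drop every 0000 weight, as the thread proposes."""
    return [w for w in key if w != 0x0000]

print(CAB_LOWER < CAB_UPPER)                                      # True
print(first_difference(CAB_LOWER, CAB_UPPER))                     # 8
print(strip_separators(CAB_LOWER) < strip_separators(CAB_UPPER))  # True
```

For this pair the order is the same with and without the separators, since the keys first differ at the tertiary level (0002 versus 0008). That one pair does not settle the general question debated in the thread.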

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Markus Scherer via Unicode
There are lots of ways to implement the UCA. When you want fast string comparison, the zero weights are useful for processing -- and you don't actually assemble a sort key. People who want sort keys usually want them to be short, so you spend time on compression. You probably also build sort keys

UCA unnecessary collation weight 0000

2018-11-01 Thread Philippe Verdy via Unicode
I just noticed that there's absolutely NO utility of the collation weight 0000 anywhere in the algorithm. For example in UTR #10, section 3.3.1 gives a collation element: [.0000.0021.0002] for COMBINING GRAVE ACCENT. However it can also be simply: [.0021.0002] for a simple reason: the secon

Re: A sign/abbreviation for "magister"

2018-11-01 Thread Janusz S. Bień via Unicode
On Thu, Nov 01 2018 at 8:43 -0700, Asmus Freytag via Unicode wrote: > On 11/1/2018 12:33 AM, Janusz S. Bień via Unicode wrote: > > On Wed, Oct 31 2018 at 12:14 -0700, Ken Whistler via Unicode wrote: > > On 10/31/2018 11:27 AM, Asmus Freytag via Unicode wrote: > > > but we don't have an agreem

Re: A sign/abbreviation for "magister"

2018-11-01 Thread Asmus Freytag via Unicode
On 11/1/2018 12:33 AM, Janusz S. Bień via Unicode wrote: On Wed, Oct 31 2018 at 12:14 -0700, Ken Whistler via Unicode wrote: On 10/31/2018 11:27 AM, Asmus Freytag via Unicode wrote: but we don't have an agreement that reproducin

Re: A sign/abbreviation for "magister"

2018-11-01 Thread Asmus Freytag via Unicode
On 11/1/2018 12:52 AM, Richard Wordingham via Unicode wrote: On Wed, 31 Oct 2018 11:35:19 -0700 Asmus Freytag via Unicode wrote: On the other hand, I'm a firm believer in applying certain styling attributes to things like e-mail or discussion paper

Re: A sign/abbreviation for "magister"

2018-11-01 Thread Richard Wordingham via Unicode
On Wed, 31 Oct 2018 11:35:19 -0700 Asmus Freytag via Unicode wrote: > On the other hand, I'm a firm believer in applying certain styling > attributes to things like e-mail or discussion papers. Well-placed > emphasis can make such texts more readable (without requiring that > they pay attention t

Re: use vs mention (was: second attempt)

2018-11-01 Thread Richard Wordingham via Unicode
On Wed, 31 Oct 2018 23:35:06 +0100 Piotr Karocki via Unicode wrote: > These are only examples of changes in meaning with or , > not all of these examples can really exist - but, then, another > question: can we know what author means? And as carbon and iodine > cannot exist, then of course CI sh

Re: A sign/abbreviation for "magister"

2018-11-01 Thread Janusz S. Bień via Unicode
On Wed, Oct 31 2018 at 12:14 -0700, Ken Whistler via Unicode wrote: > On 10/31/2018 11:27 AM, Asmus Freytag via Unicode wrote: >> >> but we don't have an agreement that reproducing all variations in >> manuscripts is in scope. > > In fact, I would say that in the UTC, at least, we have an agreeme

Re: A sign/abbreviation for "magister"

2018-11-01 Thread Richard Wordingham via Unicode
On Wed, 31 Oct 2018 14:57:37 -0700 Asmus Freytag via Unicode wrote: > On 10/31/2018 10:18 AM, Marcel Schneider via Unicode wrote: >> Sad that Arabic ² and ³ are still missing. > How about all the other sets of native digits? They might not be in natural use this way! Also, there is the possibi