Re: Combining Marks and Variation Selectors

2020-02-02 Thread Asmus Freytag via Unicode

  
  
On 2/2/2020 5:22 PM, Richard Wordingham via Unicode wrote:

> On Sun, 2 Feb 2020 16:20:07 -0800
> Eric Muller via Unicode wrote:
>
> > That would imply some coordination among variation sequences on
> > different code points, right?
> >
> > E.g. <0B48> ≡ <0B47, 0B56>, so a variation sequence on 0B56 (Mn,
> > ccc=0) would imply the existence of a variation sequence on 0B48 with
> > the same variation selector, and the same effect.
>
> That particular case oughtn't to be impossible, as in NFD everything in
> sight has ccc=0.  However TUS 12.0 Section 23.4 does contain an
> additional prohibition against meaningfully applying a variation
> selector to a 'canonical decomposable character'. (Scare quotes because
> 'ly' seems to be missing from the phrase.)
>
> Richard.

So, let's look at what that would look like with some variation
selector:

<0B48, Fxxx> ≡ <0B47, 0B56, Fxxx>

If the variant in the shape of 0B48 is well described by a variation
on the contribution due to 0B56 in the decomposed sequence, then this
might make sense. But if the variant would be better described as a
variation in the 0B47 component, then it would be a prime example of
poor "pseudo encoding", where some random sequence is assigned to a
shape (in this case) without being properly analyzable into
constituent characters with their own identity.

Which would it be in this example?

And this example only works, of course, because with ccc=0, 0B56
cannot be reordered.

The prohibition as worded may perhaps be slightly broader than
necessary, but I can understand that the UTC didn't want to parse it
more finely in the absence of any good examples that could be used to
better understand what the actual limitations should be. Better safe
than sorry, and all that.
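
The equivalence above can be checked mechanically. What follows is a
minimal sketch, assuming Python's standard unicodedata module and using
U+FE00 (VARIATION SELECTOR-1) as a stand-in for the "Fxxx" placeholder.
No such variation sequence is actually defined, so the choice of
selector is purely an assumption for illustration.

```python
import unicodedata

# Hypothetical stand-in for the "Fxxx" placeholder; no variation sequence
# on U+0B48 or U+0B56 is actually defined.
VS = "\uFE00"

composed = "\u0B48" + VS            # <0B48, Fxxx>
decomposed = "\u0B47\u0B56" + VS    # <0B47, 0B56, Fxxx>

nfd_of_composed = unicodedata.normalize("NFD", composed)
nfc_of_decomposed = unicodedata.normalize("NFC", decomposed)

# Two strings are canonically equivalent if their NFD forms are identical.
equivalent = nfd_of_composed == unicodedata.normalize("NFD", decomposed)

# In the decomposed form, find the character the selector directly follows.
selector_follows = nfd_of_composed[nfd_of_composed.index(VS) - 1]

print(equivalent, f"U+{ord(selector_follows):04X}")
```

In the decomposed form the selector directly follows 0B56 rather than
0B47, which is why the variant would have to be describable as a
variation on the 0B56 component.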

A./

> On 2/2/2020 11:43 AM, Mark Davis ☕️ via Unicode wrote:
> I don't think there is a technical reason for disallowing variation
> selectors after any starters (ccc=000); the normalization algorithm
> doesn't care about the general category of characters.
>
> Mark


Re: Combining Marks and Variation Selectors

2020-02-02 Thread Richard Wordingham via Unicode
On Sun, 2 Feb 2020 16:20:07 -0800
Eric Muller via Unicode wrote:

> That would imply some coordination among variation sequences on
> different code points, right?
> 
> E.g. <0B48> ≡ <0B47, 0B56>, so a variation sequence on 0B56 (Mn,
> ccc=0) would imply the existence of a variation sequence on 0B48 with
> the same variation selector, and the same effect.

That particular case oughtn't to be impossible, as in NFD everything in
sight has ccc=0.  However TUS 12.0 Section 23.4 does contain an
additional prohibition against meaningfully applying a variation
selector to a 'canonical decomposable character'. (Scare quotes because
'ly' seems to be missing from the phrase.)

Richard.

> On 2/2/2020 11:43 AM, Mark Davis ☕️ via Unicode wrote:
> I don't think there is a technical reason for disallowing variation
> selectors after any starters (ccc=000); the normalization algorithm
> doesn't care about the general category of characters.
> 
> Mark



Re: Combining Marks and Variation Selectors

2020-02-02 Thread Eric Muller via Unicode

  
  
That would imply some coordination among variation sequences on
different code points, right?

E.g. <0B48> ≡ <0B47, 0B56>, so a variation sequence on 0B56 (Mn,
ccc=0) would imply the existence of a variation sequence on 0B48 with
the same variation selector, and the same effect.
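
The properties cited in this example can be read straight from the
character database. This is a small sketch, assuming Python's standard
unicodedata module; the reported values reflect whichever Unicode
version the interpreter ships with, and the variable name `parts` is
only illustrative.

```python
import unicodedata

# Print name, general category, ccc and decomposition for the three
# Oriya characters in the example.
for cp in (0x0B47, 0x0B56, 0x0B48):
    ch = chr(cp)
    print(
        f"U+{cp:04X}",
        unicodedata.name(ch),
        "gc=" + unicodedata.category(ch),
        "ccc=" + str(unicodedata.combining(ch)),
        "decomp=" + (unicodedata.decomposition(ch) or "-"),
    )

# Canonical decomposition of U+0B48 as a list of code points.
parts = [int(h, 16) for h in unicodedata.decomposition("\u0B48").split()]
```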
  
  Eric.
  
On 2/2/2020 11:43 AM, Mark Davis ☕️ via Unicode wrote:

> I don't think there is a technical reason for disallowing variation
> selectors after any starters (ccc=000); the normalization algorithm
> doesn't care about the general category of characters.
>
> Mark
>
> On Sun, Feb 2, 2020 at 10:09 AM Richard Wordingham via Unicode wrote:
>
> > On Sun, 2 Feb 2020 07:51:56 -0800
> > Ken Whistler via Unicode wrote:
> >
> > > What it comes down to is avoidance of conundrums involving canonical
> > > reordering for normalization. The effect of variation selectors is
> > > defined in terms of an immediate adjacency. If you allowed variation
> > > selectors to be defined for combining marks of ccc!=0, then
> > > normalization of sequences could, in principle, move the two apart.
> > > That would make implementation of the intended rendering much more
> > > difficult.
> >
> > I can understand that for non-starters.  However, a lot of non-spacing
> > combining marks are starters (i.e. ccc=0), so they would not be a
> > problem.   is an unbreakable block in
> > canonical equivalence-preserving changes.  Is this restriction therefore
> > just a holdover from when canonical equivalence could be corrected?
> >
> > Richard.


Re: Combining Marks and Variation Selectors

2020-02-02 Thread Mark Davis ☕️ via Unicode
I don't think there is a technical reason for disallowing variation
selectors after any starters (ccc=000); the normalization algorithm doesn't
care about the general category of characters.
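
To illustrate the point, here is a minimal sketch of the canonical
ordering step of normalization, assuming Python's standard unicodedata
module. The function name `canonical_reorder` is hypothetical, and the
function expects input that is already fully decomposed. It consults
only the canonical combining class (ccc) and never the general
category.

```python
import unicodedata


def canonical_reorder(s: str) -> str:
    """Stable-sort each maximal run of non-starters (ccc != 0) by ccc."""
    out = []
    run = []  # pending non-starters since the last starter
    for ch in s:
        if unicodedata.combining(ch) == 0:
            # A starter ends the current run; marks never move across it.
            out.extend(sorted(run, key=unicodedata.combining))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=unicodedata.combining))
    return "".join(out)


# Acute (ccc=230) and dot below (ccc=220) swap; U+0B56 (Mn, ccc=0)
# followed by a selector stays exactly where it is.
swapped = canonical_reorder("a\u0301\u0323")
unchanged = canonical_reorder("\u0B47\u0B56\uFE00")
```

Because U+0B56 and the variation selectors all have ccc=0, each acts as
a boundary for the sort, whatever its general category.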

Mark


On Sun, Feb 2, 2020 at 10:09 AM Richard Wordingham via Unicode <
unicode@unicode.org> wrote:

> On Sun, 2 Feb 2020 07:51:56 -0800
> Ken Whistler via Unicode wrote:
>
> > What it comes down to is avoidance of conundrums involving canonical
> > reordering for normalization. The effect of variation selectors is
> > defined in terms of an immediate adjacency. If you allowed variation
> > selectors to be defined for combining marks of ccc!=0, then
> > normalization of sequences could, in principle, move the two apart.
> > That would make implementation of the intended rendering much more
> > difficult.
>
> I can understand that for non-starters.  However, a lot of non-spacing
> combining marks are starters (i.e. ccc=0), so they would not be a
> problem.   is an unbreakable block in
> canonical equivalence-preserving changes.  Is this restriction therefore
> just a holdover from when canonical equivalence could be corrected?
>
> Richard.
>


Re: Combining Marks and Variation Selectors

2020-02-02 Thread Richard Wordingham via Unicode
On Sun, 2 Feb 2020 07:51:56 -0800
Ken Whistler via Unicode wrote:

> What it comes down to is avoidance of conundrums involving canonical 
> reordering for normalization. The effect of variation selectors is 
> defined in terms of an immediate adjacency. If you allowed variation 
> selectors to be defined for combining marks of ccc!=0, then 
> normalization of sequences could, in principle, move the two apart.
> That would make implementation of the intended rendering much more
> difficult.

I can understand that for non-starters.  However, a lot of non-spacing
combining marks are starters (i.e. ccc=0), so they would not be a
problem.   is an unbreakable block in
canonical equivalence-preserving changes.  Is this restriction therefore
just a holdover from when canonical equivalence could be corrected?
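
The claim about non-spacing starters can be checked with a short scan
of the character database. A rough sketch, assuming Python's standard
unicodedata module; the exact counts depend on the Unicode version
bundled with the interpreter, so none are quoted here.

```python
import sys
import unicodedata

# All characters with general category Mn (Nonspacing_Mark).
mn = [
    chr(cp)
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Mn"
]

# The subset that are starters, i.e. have canonical combining class 0.
mn_starters = [ch for ch in mn if unicodedata.combining(ch) == 0]

print(f"{len(mn_starters)} of {len(mn)} Mn characters have ccc=0")
```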

Richard.


Re: Combining Marks and Variation Selectors

2020-02-02 Thread Ken Whistler via Unicode

Richard,

What it comes down to is avoidance of conundrums involving canonical 
reordering for normalization. The effect of variation selectors is 
defined in terms of an immediate adjacency. If you allowed variation 
selectors to be defined for combining marks of ccc!=0, then 
normalization of sequences could, in principle, move the two apart. That 
would make implementation of the intended rendering much more difficult.


That is basically why the UTC, from the start, ruled out using variation 
selectors to try to make graphic distinctions between different styles 
of acute accent marks explicit, for example.
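
One way to see the kind of conundrum involved is a small experiment,
assuming Python's standard unicodedata module. Applying U+FE00
(VARIATION SELECTOR-1) to a combining acute is purely hypothetical
here; no such variation sequence is defined. The sketch only shows what
normalization does to such a string, not how the UTC reasoned about it.

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, ccc=230
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, ccc=220
VS = "\uFE00"         # VARIATION SELECTOR-1, ccc=0; hypothetical use on a mark

# Without a selector, canonical reordering puts the dot below first.
plain = unicodedata.normalize("NFD", "a" + ACUTE + DOT_BELOW)

# With a selector after the acute, the selector is a starter and blocks
# reordering, so the two spellings below no longer normalize alike.
with_vs = unicodedata.normalize("NFD", "a" + ACUTE + VS + DOT_BELOW)
other_order = unicodedata.normalize("NFD", "a" + DOT_BELOW + ACUTE + VS)

# NFC composes a + acute into U+00E1, so the selector ends up directly
# after the precomposed letter instead of after the accent.
composed = unicodedata.normalize("NFC", "a" + ACUTE + VS)

print([f"U+{ord(c):04X}" for c in composed])
```

In the composed result the accent no longer exists as a separate
character, so a renderer would have to undo the composition to find
what the selector was meant to modify.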


--Ken

On 2/1/2020 7:30 PM, Richard Wordingham via Unicode wrote:

> Ah, I missed that change from Version 5.0, where the restriction was,
> 'The base character in a variation sequence is never a combining
> character or a decomposable character'.  I now need to rephrase the
> question.  Why are marks other than spacing marks prohibited?