Re: Variation Sequences (and L2-11/059)

2019-03-13 Thread Janusz S. Bień via Unicode
On Wed, Mar 13 2019 at  9:48 -07, Ken Whistler wrote:
> On 3/13/2019 2:42 AM, Janusz S. Bień via Unicode wrote:
>> Hi!
>>
>> On Mon, Jul 16 2018 at  7:07 +02, Janusz S. Bień via Unicode wrote:
>>> FAQ (http://unicode.org/faq/vs.html) states:
>>>
>>>  For historic scripts, the variation sequence provides a useful tool,
>>>  because it can show mistaken or nonce glyphs and relate them to the
>>>  base character. It can also be used to reflect the views of
>>>  scholars, who may see the relation between the glyphs and base
>>>  characters differently. Also, new variation sequences can be added
>>>  for new variant appearances (and their relation to the base
>>>  characters) as more evidence is discovered.
>> I'm proof-reading a paper where I quote the above fragment and to my
>> surprise I noticed it's no longer present in the FAQ.
>
> That text is, in fact, still present on the FAQ page in question:
>
> https://www.unicode.org/faq/vs.html#18

I apologize for jumping to the wrong conclusion; I should have checked
more carefully.

>
>>
>> So my questions are:
>>
>> 1. Does the change mean the change of the official policy of the
>> Consortium?
>
> Your premise here, however, is mistaken. The FAQ pages do *not*, and
> never have represented official policy of the Unicode Consortium.

That I expected but asked just to be on the safe side.

> The
> individual FAQ entries are contributed by many people -- some
> attributed, and some not. They are updated or added to periodically by
> various editors, in response to feedback, or as old entries grow
> out-dated, or new issues arise. Those updates are editorial, and do
> not reflect any official decision process by Unicode technical
> committees or officers. The FAQ main page itself points out that "The
> FAQs are contributed by many people," and invites the public to submit
> possible new entries for editing and addition to the list of FAQs.

BTW, what about the copyright of FAQ entries? Am I right in guessing that
it belongs to the Consortium? To be specific, what about using an entry in
full, in English or in translation, as or in a Wikipedia entry?

>
> For official technical content, refer to the published technical
> specifications themselves, which are carefully controlled, versioned,
> and archived.
>
> For official policies of the Unicode Consortium, refer to the Unicode
> Consortium policies page, which is also carefully controlled:
>
> https://www.unicode.org/policies/policies.html

Thanks for the reminder.


>> 2. Are the archival versions of the FAQ available somewhere?
>
> https://web.archive.org/web/*/https://www.unicode.org/faq/

Great!

Best regards

Janusz

-- 
 ,   
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien



Re: Variation Sequences (and L2-11/059)

2019-03-13 Thread Ken Whistler via Unicode



On 3/13/2019 2:42 AM, Janusz S. Bień via Unicode wrote:

Hi!

On Mon, Jul 16 2018 at  7:07 +02, Janusz S. Bień via Unicode wrote:

FAQ (http://unicode.org/faq/vs.html) states:

 For historic scripts, the variation sequence provides a useful tool,
 because it can show mistaken or nonce glyphs and relate them to the
 base character. It can also be used to reflect the views of
 scholars, who may see the relation between the glyphs and base
 characters differently. Also, new variation sequences can be added
 for new variant appearances (and their relation to the base
 characters) as more evidence is discovered.

I'm proof-reading a paper where I quote the above fragment and to my
surprise I noticed it's no longer present in the FAQ.


That text is, in fact, still present on the FAQ page in question:

https://www.unicode.org/faq/vs.html#18



So my questions are:

1. Does the change mean the change of the official policy of the
Consortium?


Your premise here, however, is mistaken. The FAQ pages do *not*, and 
never have represented official policy of the Unicode Consortium. The 
individual FAQ entries are contributed by many people -- some 
attributed, and some not. They are updated or added to periodically by 
various editors, in response to feedback, or as old entries grow 
out-dated, or new issues arise. Those updates are editorial, and do not 
reflect any official decision process by Unicode technical committees or 
officers. The FAQ main page itself points out that "The FAQs are 
contributed by many people," and invites the public to submit possible 
new entries for editing and addition to the list of FAQs.


For official technical content, refer to the published technical 
specifications themselves, which are carefully controlled, versioned, 
and archived.


For official policies of the Unicode Consortium, refer to the Unicode 
Consortium policies page, which is also carefully controlled:


https://www.unicode.org/policies/policies.html



2. Are the archival versions of the FAQ available somewhere?


https://web.archive.org/web/*/https://www.unicode.org/faq/




3. Are the changes to the FAQ documented somehow (a version control
system?)?


No.

--Ken



Re: Variation Sequences (and L2-11/059)

2019-03-13 Thread Janusz S. Bień via Unicode
Hi!

On Mon, Jul 16 2018 at  7:07 +02, Janusz S. Bień via Unicode wrote:
> FAQ (http://unicode.org/faq/vs.html) states:
>
> For historic scripts, the variation sequence provides a useful tool,
> because it can show mistaken or nonce glyphs and relate them to the
> base character. It can also be used to reflect the views of
> scholars, who may see the relation between the glyphs and base
> characters differently. Also, new variation sequences can be added
> for new variant appearances (and their relation to the base
> characters) as more evidence is discovered.

I'm proof-reading a paper where I quote the above fragment and to my
surprise I noticed it's no longer present in the FAQ.

So my questions are:

1. Does the change mean the change of the official policy of the
Consortium?

2. Are the archival versions of the FAQ available somewhere?

3. Are the changes to the FAQ documented somehow (a version control
system?)?

Best regards

Janusz

-- 
 ,   
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien



Re: Variation Sequences (and L2-11/059)

2018-07-20 Thread Janusz S. Bień via Unicode
On Thu, Jul 19 2018 at 17:47 +0100, wjgo_10...@btinternet.com writes:
> Janusz S. Bien wrote:
>
>> You seem to assume that my concern is only rendering.
>
> Well my thinking is that what you are wanting is a way to accurately
> transcribe documents and maybe printed books from Old Polish into a
> Unicode-based electronic format so that the information can be more
> readily studied, while retaining glyph information that is not
> presently representable using Unicode characters.
>
> I found the following.
>
> https://en.wikipedia.org/wiki/Old_Polish_language
>
> WJGO >> So you could if you wish try to make your own font
>
> JSB >Actually I tried:
>
> JSB > https://bitbucket.org/jsbien/parkosz-font/
>
> Thank you for the link to the font. I have studied the font in the 
> FontCreator program (version 8).
>
> I remember that I produced an OpenType font using Variation Selectors
> and OpenType Glyph Substitution back in April 2017. I wrote about it
> and provided a link to the font and a link to a typecase document.
>
> https://forum.high-logic.com/viewtopic.php?f=10&t=7033
>
> Although that font is about chess, I am thinking that that is the sort
> of font that is needed for what you are wanting to do. This could use
> variation selectors or could use circled digits as desired.
>
> I am a researcher and I am looking for a worthwhile project related to
> typography in which to participate from time to time - no money
> charged, no money to pay - and I am interested in printed books of the
> incunabula period and the early sixteenth century.
>
> I do not know any Polish, but I do not need to be involved in choosing
> which glyphs are needed, so my not knowing any Polish would not seem
> to be a problem.
>
> William Overington
>
> Thursday 19 July 2018
>

-- 
 ,   
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien


Re: Variation Sequences (and L2-11/059)

2018-07-19 Thread Janusz S. Bień via Unicode
On Thu, Jul 19 2018 at 17:47 +0100, wjgo_10...@btinternet.com writes:
> Janusz S. Bien wrote:
>
>> You seem to assume that my concern is only rendering.
>
> Well my thinking is that what you are wanting is a way to accurately
> transcribe documents and maybe printed books from Old Polish into a
> Unicode-based electronic format so that the information can be more
> readily studied, while retaining glyph information that is not
> presently representable using Unicode characters.

That's right.

As long as we have no corpus tools able to handle variation sequences,
both variation sequences and your proposal can be considered just a form
of transcription, and your proposal may perhaps have a little advantage.

However, if somebody has the time and/or money to implement new
corpus software, it makes more sense in my opinion to implement standard
variation sequences.

Of course sticking to the standard makes sense if the standard is
reasonable. In my opinion Unicode was designed with only one application
in mind: some text is input on the keyboard and has to be rendered after
some processing. However, due to mass digitization we quite often have
the reverse situation: we have scans with graphical objects which
might be difficult to identify, we have to analyse the text somehow, and
identifying the Unicode characters is the final part of the research. To
be more specific, I will quote my response to David Perry on the MUFI
list:

On Fri, Jul 20 2018 at  6:54 +0200, jsb...@mimuw.edu.pl writes:
> On Wed, Jul 18 2018 at 13:33 -0700, [...] writes:

[...]

>> If you are working to digitize the Polish dictionary you mentioned,
>> the first step would be to determine whether there is any difference
>> in meaning between the two versions of the section sign. If not, just
>> encode them all with U+00A7.
>
> I beg to disagree.
>
> The difference should be encoded in some way (at the moment I plan to
> use a simple transcription like §⤾ for a mirrored SECTION SIGN), then their
> occurrences analysed with some corpus tools (concordances etc.), and
> finally an opinion formulated about the function of the distinction or
> the lack of it.

On the other hand, I was just surprised by the information from David
Perry, who said on the MUFI list:

> Note, however, that most applications check whether VSs have been
> registered for the script in use and, if not, they will not display
> the variants even if a font maker has put them in. (I tried
> . . . )

If the Consortium is reluctant to register new sequences and
software strictly adheres to the standard, then there will be a
problem.

> I found the following.
>
> https://en.wikipedia.org/wiki/Old_Polish_language

Thank you for your interest in the Polish language. I will answer the
rest of your post a little later.

Best regards

Janusz

-- 
 ,   
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien



Re: Variation Sequences (and L2-11/059)

2018-07-19 Thread William_J_G Overington via Unicode
Janusz S. Bien wrote:

> You seem to assume that my concern is only rendering.

Well my thinking is that what you are wanting is a way to accurately transcribe 
documents and maybe printed books from Old Polish into a Unicode-based 
electronic format so that the information can be more readily studied, while 
retaining glyph information that is not presently representable using Unicode 
characters.

I found the following.

https://en.wikipedia.org/wiki/Old_Polish_language

WJGO >> So you could if you wish try to make your own font

JSB >Actually I tried:

JSB > https://bitbucket.org/jsbien/parkosz-font/

Thank you for the link to the font. I have studied the font in the FontCreator 
program (version 8).

I remember that I produced an OpenType font using Variation Selectors and 
OpenType Glyph Substitution back in April 2017. I wrote about it and provided a 
link to the font and a link to a typecase document.

https://forum.high-logic.com/viewtopic.php?f=10&t=7033

Although that font is about chess, I am thinking that that is the sort of font 
that is needed for what you are wanting to do. This could use variation 
selectors or could use circled digits as desired.

I am a researcher and I am looking for a worthwhile project related to 
typography in which to participate from time to time - no money charged, no 
money to pay - and I am interested in printed books of the incunabula period 
and the early sixteenth century.

I do not know any Polish, but I do not need to be involved in choosing which 
glyphs are needed, so my not knowing any Polish would not seem to be a problem.

William Overington

Thursday 19 July 2018



Re: Variation Sequences (and L2-11/059)

2018-07-18 Thread Asmus Freytag (c) via Unicode

On 7/17/2018 8:56 PM, Janusz S. Bień wrote:

On Tue, Jul 17 2018 at  8:34 -0700, Asmus Freytag writes:

On 7/16/2018 10:04 PM, Janusz S. Bień via Unicode wrote:

  I understand there is no sufficient demand for the Unicode Consortium
maintaining a supplementary non-ideographic variation database. Hence
for the time being  a kind of Private Use variation database seems to be
the only solution - am I right?

The question comes down to resources, among other things. As well as to whether
there are actual users / implementers waiting for and ready to adopt such a
database as solution to their problems.

I hope the resources are sufficient to improve wording of the variation
sequence FAQ. Do we agree that at present users/implementers are rather
misled by it?


Sure, we can go either of two ways: we can state that Unicode has no, 
and will not have any, solution to the issue of such variants for 
non-ideographic scripts. That part is easy.


Or, alternatively we could figure out, what the solution space might be 
(in the right circumstances), including some external resources for 
maintaining a database on an ongoing basis, and a larger well-identified 
community of scholars or archivists that sign up to use and support it.


If a non-zero solution space exists, simply saying that there will never
be any solution would be just as wrong as the current wording, which
points at something that is no longer part of the solution space . . .
(although at one point, people thought it might be).



A strawman proposal could identify these issues and some ways that they might be
addressed and then ask for criteria of what the UTC might deem sufficient.

Perhaps this statement should be put into FAQ, instead of "you should
propose your addition as a variation sequence"?


There are some additions that should be proposed for standardization, 
but the bar is relatively high.



A./


Re: Variation Sequences (and L2-11/059)

2018-07-17 Thread Janusz S. Bień via Unicode
On Tue, Jul 17 2018 at  8:34 -0700, Asmus Freytag writes:
> On 7/16/2018 10:04 PM, Janusz S. Bień via Unicode wrote:
>
>  I understand there is no sufficient demand for the Unicode Consortium
> maintaining a supplementary non-ideographic variation database. Hence
> for the time being  a kind of Private Use variation database seems to be
> the only solution - am I right?
>
> The question comes down to resources, among other things. As well as to
> whether there are actual users / implementers waiting for and ready to
> adopt such a database as solution to their problems.

I hope the resources are sufficient to improve wording of the variation
sequence FAQ. Do we agree that at present users/implementers are rather
misled by it?

> A strawman proposal could identify these issues and some ways that they
> might be addressed and then ask for criteria of what the UTC might deem
> sufficient.

Perhaps this statement should be put into FAQ, instead of "you should
propose your addition as a variation sequence"?

On Tue, Jul 17 2018 at 13:45 +0100, William_J_G Overington writes:
> Janusz S. Bien wrote:
>
>> I understand there is no sufficient demand for the Unicode
>> Consortium maintaining a supplementary non-ideographic variation
>> database. Hence for the time being a kind of Private Use variation
>> database seems to be the only solution - am I right?
>
> Well, with the greatest respect, in my opinion, no.
>
> You could use my suggestion and send a copy of your encoding to the
> Unicode Technical Committee (UTC) and maybe they will endorse it.

Difficult to do as there is no "my encoding".

>
> There is a precedent in the astronaut emoji, where in glyph
> substitution the rocket was lost and a space suit was obtained from
> somewhere.
>
> For my suggestion the circled digit would be lost and an alternate
> glyph introduced.

You seem to assume that my concern is only rendering.

[...]


On Tue, Jul 17 2018 at 14:07 +0100, William_J_G Overington writes:
> WJGO >> My suggestion is to use for each desired glyph a sequence
> consisting of three characters, and then have an OpenType font decode
> them so that the glyph can be displayed.
>
> JSB >This is a prohibitive requirement, because for years there has been a
> lack of font creators interested in old Polish.
>
> Well, I have not been aware of any call for participation. It seems an 
> interesting project.
>
> I make OpenType fonts using the FontCreator program.
>
> There is an active forum with helpful people participating.
>
> So you could if you wish try to make your own font

Actually I tried:

https://bitbucket.org/jsbien/parkosz-font/

Best regards

Janusz

-- 
 ,   
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien



Re: Variation Sequences (and L2-11/059)

2018-07-17 Thread Asmus Freytag via Unicode

  
  
On 7/16/2018 10:04 PM, Janusz S. Bień via Unicode wrote:


  I understand there is no sufficient demand for the Unicode Consortium
maintaining a supplementary non-ideographic variation database. Hence
for the time being  a kind of Private Use variation database seems to be
the only solution - am I right?

The question comes down to resources, among other things. As well as to whether
there are actual users / implementers waiting for and ready to adopt such a
database as solution to their problems.

A strawman proposal could identify these issues and some ways that they might be
addressed and then ask for criteria of what the UTC might deem sufficient.

A./
  
  



Re: Variation Sequences (and L2-11/059)

2018-07-17 Thread William_J_G Overington via Unicode
WJGO >> My suggestion is to use for each desired glyph a sequence consisting of 
three characters, and then have an OpenType font decode them so that the glyph 
can be displayed.

JSB >This is a prohibitive requirement, because for years there has been a lack
of font creators interested in old Polish.

Well, I have not been aware of any call for participation. It seems an 
interesting project.

I make OpenType fonts using the FontCreator program.

There is an active forum with helpful people participating.

So you could if you wish try to make your own font and receive help or you
could if you wish ask if people might like to join in the research project and
make fonts.

https://forum.high-logic.com/

JSB > I perceive your proposal as "visible variant selectors for private 
variation sequences", as a text encoded this way can be easily converted into a 
text using real variant selectors.

JSB > I think it might be a reasonable temporary solution, but not the ultimate 
one.

Well, based on the present practice of the way that ZERO WIDTH JOINER is being 
used for encoding emoji, I opine that it has the potential to be a permanent 
formally-encoded solution.

JSB > I would expect arguments that it has no obvious advantage over variation
sequences.

Well, when I looked at the IVD database information it seems to use plane 14 
characters.

As a practical consideration, my suggestion has the advantage that it only uses
plane 0 characters.
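
To make the plane comparison concrete, here is a minimal Python sketch. The helper name plane() is hypothetical; the code points are the standard ones for ZERO WIDTH JOINER, CIRCLED DIGIT ONE, VARIATION SELECTOR-1 and VARIATION SELECTOR-17. IVD sequences use VARIATION SELECTOR-17 and above (U+E0100..U+E01EF), which is where plane 14 comes in, whereas VARIATION SELECTOR-1 to -16 are themselves plane 0 characters.

```python
def plane(ch: str) -> int:
    """Return the Unicode plane of a single character (code point // 0x10000)."""
    return ord(ch) >> 16

# Characters relevant to the two approaches discussed in this thread.
samples = {
    "ZERO WIDTH JOINER": "\u200D",
    "CIRCLED DIGIT ONE": "\u2460",
    "VARIATION SELECTOR-1": "\uFE00",
    "VARIATION SELECTOR-17": "\U000E0100",
}

for name, ch in samples.items():
    print(f"U+{ord(ch):04X} {name}: plane {plane(ch)}")
```

Only the last entry falls outside plane 0, so the plane argument applies to IVD-style selectors and not to variation selectors in general.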

William Overington

Tuesday 17 July 2018



Re: Variation Sequences (and L2-11/059)

2018-07-17 Thread William_J_G Overington via Unicode
Janusz S. Bien wrote:

> I understand there is no sufficient demand for the Unicode Consortium 
> maintaining a supplementary non-ideographic variation database. Hence for the 
> time being a kind of Private Use variation database seems to be the only 
> solution - am I right?

Well, with the greatest respect, in my opinion, no.

You could use my suggestion and send a copy of your encoding to the Unicode 
Technical Committee (UTC) and maybe they will endorse it.

There is a precedent in the astronaut emoji, where in glyph substitution the
rocket was lost and a space suit was obtained from somewhere.

For my suggestion the circled digit would be lost and an alternate glyph 
introduced.

Whether the Unicode Technical Committee would endorse such an encoding would 
need to wait for a meeting of the UTC.

Too often in relation to Unicode matters things get done by people saying what 
they consider the UTC will say and ideas get screened out and the UTC never 
gets the opportunity to consider them.

William Overington

Tuesday 17 July 2018



Re: Variation Sequences (and L2-11/059)

2018-07-16 Thread Janusz S. Bień via Unicode
On Mon, Jul 16 2018 at 19:00 +0100, wjgo_10...@btinternet.com writes:
> Hi
>
>> I ask the question because there are now several historical corpora
>> of Polish under development, which use at present a kind of fall-back
>> or some other ad hoc solutions for "nonce glyphs", as they are called
>> in the FAQ.
>
> I wonder if you could say please what are the "kind of fall-back or
> some other ad hoc solutions" please.

I would prefer not to go into details. I think some of those "solutions"
are simply wrong but the list is not the right place to criticize them.

> The reason I ask is because I have thought of a possible solution to
> the problem that has graceful fall-back and uses only plane 0
> characters, no Private Use Area characters at all: I am wondering
> whether my suggestion will be of use or if it is just another method
> that could just be added to a collection of "kind of fall-back or some
> other ad hoc solutions".
>
> My suggestion is to use for each desired glyph a sequence consisting
> of three characters, and then have an OpenType font decode them so
> that the glyph can be displayed.

This is a prohibitive requirement, because for years there has been a lack
of font creators interested in old Polish.

> Each such sequence being of the form.
>
> Base character ZERO WIDTH JOINER then a circled digit character or a circled 
> number character.
>
> http://www.unicode.org/charts/PDF/U2460.pdf
>
> Thus there being up to twenty specific glyphs for each base character.
>
> The list of glyphs could be gradually extended as needed and if an
> attempt to display a newly added glyph is made using a font
> implemented from an earlier list then there would be graceful
> fall-back to the base character followed by a circled digit.
>
> It would be helpful for entering text into documents if the ZERO WIDTH
> JOINER character has a visible glyph within the font. Then entering
> text with OpenType glyph substitution turned off could be easier to
> carry out.

I perceive your proposal as "visible variant selectors for private
variation sequences", as a text encoded this way can be easily converted
into a text using real variant selectors.

I think it might be a reasonable temporary solution, but not the
ultimate one.
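
A minimal Python sketch of such a conversion, under stated assumptions: the mapping of circled number n to VARIATION SELECTOR-n is purely hypothetical (nothing in the proposal or in the standard defines it), and the names PATTERN, selector_for and to_variation_selectors are made up for illustration. The resulting sequences would of course be private ones, not standardized or registered.

```python
import re

ZWJ = "\u200D"  # ZERO WIDTH JOINER

# A base character, ZWJ, then CIRCLED DIGIT ONE .. CIRCLED NUMBER TWENTY.
PATTERN = re.compile("(.)" + ZWJ + "([\u2460-\u2473])", re.DOTALL)

def selector_for(n: int) -> str:
    """Map variant number n (1..20) to a variation selector (assumed mapping)."""
    if n <= 16:
        return chr(0xFE00 + n - 1)   # VARIATION SELECTOR-1 .. -16
    return chr(0xE0100 + n - 17)     # VARIATION SELECTOR-17 .. -20

def to_variation_selectors(text: str) -> str:
    """Rewrite base+ZWJ+circled-number as base+variation selector."""
    def repl(match: re.Match) -> str:
        n = ord(match.group(2)) - 0x2460 + 1
        return match.group(1) + selector_for(n)
    return PATTERN.sub(repl, text)

converted = to_variation_selectors("a\u200D\u2460b")
print([f"U+{ord(c):04X}" for c in converted])
```

The reverse direction is equally mechanical, which is why both encodings can be treated as interchangeable transcriptions for corpus work.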

> I am wondering quite how acceptable such a solution would be for
> standardization: the list of ways that something can be encoded using
> a ZWJ (ZERO WIDTH JOINER) character seems to have recently been de
> facto extended for use with generating emoji sequences - not with
> circled digits but use of ZWJ to change meaning which is a far bigger
> extension than needed for this suggestion as meaning would often be
> unaltered when using this suggestion.

I would expect arguments that it has no obvious advantage over
variation sequences.

Best regards

Janusz

-- 
 ,   
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien


Re: Variation Sequences (and L2-11/059)

2018-07-16 Thread Janusz S. Bień via Unicode
On Mon, Jul 16 2018 at  1:08 -0700, unicode@unicode.org writes:
> The use case would seem to be more properly served by some form of
> registration mechanism, like the one IVD represents for ideographs.

I agree.

>
> The use of "standardized" variation sequences with the understanding
> that those would be (fairly) widely implemented would, in contrast, be
> best reserved to cases where the encoding in the Standard resulted
> in deliberately unifying some variations for which there is
> nevertheless a common (!) use case of requiring each alternate to be
> selected.

I agree.

[...]

> On 7/15/2018 10:07 PM, Janusz S. Bień via Unicode wrote:
>
>  
> FAQ (http://unicode.org/faq/vs.html) states:
>
> For historic scripts, the variation sequence provides a useful tool,
> because it can show mistaken or nonce glyphs and relate them to the
> base character. It can also be used to reflect the views of
> scholars, who may see the relation between the glyphs and base
> characters differently. Also, new variation sequences can be added
> for new variant appearances (and their relation to the base
> characters) as more evidence is discovered.
>
> It states also:
>
>    What variation sequences are valid?
>    Only those listed in StandardizedVariants.txt...

The full answer is:

Only those listed in StandardizedVariants.txt,
emoji-variation-sequences.txt, or the registered sequences listed in
the Ideographic Variation Database (IVD).

Do we agree that the statements are not consistent, at least with your
view, which I share?
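
The validity rule quoted above can be checked mechanically. The following Python sketch is only an illustration: SAMPLE holds two lines written in the layout used by StandardizedVariants.txt (code points; description; then a "#" comment) and should be replaced by the contents of the real file, and the function names are hypothetical. It does not cover emoji-variation-sequences.txt or the IVD.

```python
# Two lines in the StandardizedVariants.txt layout; load the real file instead.
SAMPLE = """\
# sample data for illustration
0030 FE00; short diagonal stroke form; # DIGIT ZERO
2205 FE00; zero with long diagonal stroke overlay form; # EMPTY SET
"""

def parse_standardized_variants(lines):
    """Map each listed sequence (tuple of code points) to its description."""
    sequences = {}
    for line in lines:
        data = line.split("#", 1)[0].strip()   # drop comments
        if not data:
            continue
        fields = [field.strip() for field in data.split(";")]
        code_points = tuple(int(cp, 16) for cp in fields[0].split())
        sequences[code_points] = fields[1] if len(fields) > 1 else ""
    return sequences

def is_listed(text, sequences):
    """True if text is exactly one of the listed variation sequences."""
    return tuple(ord(c) for c in text) in sequences

listed = parse_standardized_variants(SAMPLE.splitlines())
print(is_listed("0\uFE00", listed))        # DIGIT ZERO + VS1
print(is_listed("\u00A7\uFE00", listed))   # SECTION SIGN + VS1
```

A sequence for a nonce glyph in a historical text fails such a check by construction, which is the inconsistency at issue.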

I understand there is no sufficient demand for the Unicode Consortium
maintaining a supplementary non-ideographic variation database. Hence
for the time being  a kind of Private Use variation database seems to be
the only solution - am I right?

Best regards

Janusz

-- 
 ,   
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien



Re: Variation Sequences (and L2-11/059)

2018-07-16 Thread William_J_G Overington via Unicode
Hi

> I ask the question because there are now several historical corpora of Polish 
> under development, which use at present a kind of fall-back or some other ad 
> hoc solutions for "nonce glyphs", as they are called in the FAQ.

I wonder if you could say please what are the "kind of fall-back or some other 
ad hoc solutions" please.

The reason I ask is because I have thought of a possible solution to the 
problem that has graceful fall-back and uses only plane 0 characters, no 
Private Use Area characters at all: I am wondering whether my suggestion will 
be of use or if it is just another method that could just be added to a 
collection of "kind of fall-back or some other ad hoc solutions".

My suggestion is to use for each desired glyph a sequence consisting of three 
characters, and then have an OpenType font decode them so that the glyph can be 
displayed.

Each such sequence being of the form.

Base character ZERO WIDTH JOINER then a circled digit character or a circled 
number character.

http://www.unicode.org/charts/PDF/U2460.pdf

Thus there being up to twenty specific glyphs for each base character.

The list of glyphs could be gradually extended as needed and if an attempt to 
display a newly added glyph is made using a font implemented from an earlier 
list then there would be graceful fall-back to the base character followed by a 
circled digit.
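
As a minimal sketch of the scheme, assuming Python and hypothetical helper names (encode_variant, fallback_text), the three-character sequence and its fall-back could look as follows. The fall-back function only approximates what a reader would see with a font lacking the substitution, on the assumption that the ZERO WIDTH JOINER is rendered invisibly.

```python
ZWJ = "\u200D"          # ZERO WIDTH JOINER
CIRCLED_ONE = 0x2460    # CIRCLED DIGIT ONE; U+2460..U+2473 cover 1..20

def encode_variant(base: str, n: int) -> str:
    """Return base + ZWJ + circled number n (hypothetical helper)."""
    if len(base) != 1:
        raise ValueError("base must be a single character")
    if not 1 <= n <= 20:
        raise ValueError("only circled numbers 1..20 are available")
    return base + ZWJ + chr(CIRCLED_ONE + n - 1)

def fallback_text(text: str) -> str:
    """Approximate the fall-back appearance: the ZWJ is invisible."""
    return text.replace(ZWJ, "")

seq = encode_variant("a", 2)
print([f"U+{ord(c):04X}" for c in seq])
print(fallback_text(seq))
```
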

It would be helpful for entering text into documents if the ZERO WIDTH JOINER 
character has a visible glyph within the font. Then entering text with OpenType 
glyph substitution turned off could be easier to carry out.

I am wondering quite how acceptable such a solution would be for 
standardization: the list of ways that something can be encoded using a ZWJ 
(ZERO WIDTH JOINER) character seems to have recently been de facto extended for 
use with generating emoji sequences - not with circled digits but use of ZWJ to 
change meaning which is a far bigger extension than needed for this suggestion 
as meaning would often be unaltered when using this suggestion.

William Overington

Monday 16 July 2018

Original message
From : unicode@unicode.org
Date : 2018/07/16 - 06:07 (GMTDT)
To : unicode@unicode.org
Subject : Variation Sequences (and L2-11/059)


FAQ (http://unicode.org/faq/vs.html) states:

For historic scripts, the variation sequence provides a useful tool,
because it can show mistaken or nonce glyphs and relate them to the
base character. It can also be used to reflect the views of
scholars, who may see the relation between the glyphs and base
characters differently. Also, new variation sequences can be added
for new variant appearances (and their relation to the base
characters) as more evidence is discovered.

It states also:

   What variation sequences are valid?
   Only those listed in StandardizedVariants.txt...

However the file in question contains only sections for mathematics and
some rather exotic scripts.

To the best of my knowledge, the only attempt to introduce additional
variation sequences was Karl Pentzlin's strongly criticised proposal
L2-11/059

http://www.unicode.org/L2/L2011/11059-latin-cyr-var.pdf

What has happened to it? I don't remember any information about it on the
list.

However my primary question is:

Are variation sequences *really* recommended for historical scripts?

I ask the question because there are now several historical corpora of
Polish under development, which use at present a kind of fall-back or
some other ad hoc solutions for "nonce glyphs", as they are called in
the FAQ.

Best regards

Janusz

-- 
 ,   
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien



Re: Variation Sequences (and L2-11/059)

2018-07-16 Thread Asmus Freytag via Unicode

  
  
The use case would seem to be more properly served by some form of
registration mechanism, like the one IVD represents for ideographs.

The use of "standardized" variation sequences with the understanding
that those would be (fairly) widely implemented would, in contrast, be
best reserved to cases where the encoding in the Standard resulted
in deliberately unifying some variations for which there is
nevertheless a common (!) use case of requiring each alternate to be
selected.

A./

On 7/15/2018 10:07 PM, Janusz S. Bień via Unicode wrote:


  
FAQ (http://unicode.org/faq/vs.html) states:

For historic scripts, the variation sequence provides a useful tool,
because it can show mistaken or nonce glyphs and relate them to the
base character. It can also be used to reflect the views of
scholars, who may see the relation between the glyphs and base
characters differently. Also, new variation sequences can be added
for new variant appearances (and their relation to the base
characters) as more evidence is discovered.

It states also:

   What variation sequences are valid?
   Only those listed in StandardizedVariants.txt...

However the file in question contains only sections for mathematics and
some rather exotic scripts.

To the best of my knowledge, the only attempt to introduce additional
variation sequences was Karl Pentzlin's strongly criticised proposal
L2-11/059

http://www.unicode.org/L2/L2011/11059-latin-cyr-var.pdf

What has happened to it? I don't remember any information about it on the
list.

However my primary question is:

Are variation sequences *really* recommended for historical scripts?

I ask the question because there are now several historical corpora of
Polish under development, which use at present a kind of fall-back or
some other ad hoc solutions for "nonce glyphs", as they are called in
the FAQ.

Best regards

Janusz