From: <[EMAIL PROTECTED]>
> Philippe Verdy wrote on 05/30/2003 09:42:53 AM:
> 
> > If this is not enough, may be we could create only a new diacritic
> > for the long leg attached on right
> 
> I think it's a bad idea to encode combining marks that do not combine
> productively but are only used with a small set of base characters, and
> that attach, meaning that special-case outlines are likely to be needed.

What about the existing "hook" diacritic? Attached diacritics are already 
encoded. We can use them as a good fallback in this context, so that the text 
will be mostly readable even by users who are not aware of that specific usage.

The exact glyph can be described in a definition or appendix, replacing the default 
glyph that would be generated by (for example) L+HOOK. This allows a font to be 
created that correctly displays L+HOOK as an L-MOLL if needed, but it still 
facilitates interchange without making the text completely unreadable for those 
who don't have that font.
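
As a minimal sketch of what I mean (the choice of U+0321 COMBINING PALATALIZED 
HOOK BELOW is only one plausible candidate for this "hook", not a settled 
mapping), the sequence is already stable under normalization in plain Python:

# Sketch: encoding "L-MOLL" as a base letter plus an existing combining
# hook. U+0321 is an assumption here; any attached hook diacritic already
# encoded in Unicode would behave the same way.
import unicodedata

l_moll = "l\u0321"  # LATIN SMALL LETTER L + COMBINING PALATALIZED HOOK BELOW

# There is no precomposed character for this pair, so normalization
# leaves the two-codepoint sequence intact and interchange is stable.
assert unicodedata.normalize("NFC", l_moll) == l_moll
assert unicodedata.normalize("NFD", l_moll) == l_moll

for ch in l_moll:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")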

I don't think it creates a semantic issue, because text using this special 
alphabet does not attach any other semantics to L+HOOK, so the sequence can be 
safely interpreted (in context) as meaning L-MOLL.

To me, the documents shown describe new glyphs, but not really new characters, 
since they clearly reuse an existing script with a very strong relation to the 
"normal" basic Latin script used in traditional French.

The "special-case outlines" fit into the category of glyphs, i.e. specific fonts, and 
the original abstraction of the author is preserved because a base letter plus a 
diacritic is considered in all French texts (and the exposed variants) as a unique 
abstract character (or grapheme cluster?). Using such diacritic will still work 
correctly with all other Unicode algorithms, and the reader will not make fales 
interpretations as this method also creates interesting easy fallbacks if the 
combination L+HOOK cannot be rendered: in the exposed documents, HOOK by itself as no 
meaning, but only the composed sequence.
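
To illustrate the grapheme-cluster point (a sketch using the third-party 
"regex" module, whose \X pattern matches extended grapheme clusters; the 
sample word and the hook codepoint are the same assumptions as above):

# Sketch: the base letter plus combining hook stays one grapheme cluster,
# so segmentation-aware processing is unaffected by the convention.
import regex  # third-party: pip install regex

word = "mol\u0321e"  # hypothetical word using l + combining hook
clusters = regex.findall(r"\X", word)
print(clusters)       # ['m', 'o', 'l\u0321', 'e'] -- four clusters, not five
assert len(clusters) == 4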

This also gives good collation orders for this special-case script (which is an 
invented notation, but not really a new abstract script), and supports other 
transforms (such as case folding). In this context, L+HOOK would clearly mean the 
L-MOLL semi-consonant, A+HOOK would clearly mean the AU vowel, and N+HOOK would 
clearly mean N-MOLL; the default rendering in non-aware applications would still 
not break the interpretation of the text, as it preserves the grapheme-cluster 
boundaries of the original.
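
For instance (a sketch only: the tailoring rule is hypothetical, and PyICU is 
just one way to express an ICU collation tailoring), default case mapping 
already does the right thing, and the sequence can be given its own slot in 
the alphabet:

# Sketch: case mapping keeps the combining hook, and a small ICU
# tailoring can sort l + hook as a distinct letter right after plain l.
import icu  # third-party: pip install PyICU

l_moll = "l\u0321"
assert l_moll.upper() == "L\u0321"      # case mapping keeps the hook
assert l_moll.casefold() == "l\u0321"   # case folding likewise

# Hypothetical tailoring rule: l + hook sorts after l as its own letter.
collator = icu.RuleBasedCollator("&l < l\u0321")
words = ["la", "l\u0321a", "lz"]
print(sorted(words, key=collator.getSortKey))  # ['la', 'lz', 'l\u0321a']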

All that is required is an agreement among the linguists who study such Old French 
texts and want to interchange their work. They can then build a common font that 
includes the proper L-MOLL glyph for the L+HOOK abstraction.

The way I see abstract characters defined in Unicode, they designate an agreement 
between users of Unicode to use standardized codepoint sequences to encode text 
sharing common semantics in a given language and script pair. The two Old French 
texts are rare enough that such an agreement between the most influential linguists 
who study them can be reached. It's up to the linguists to consider whether a 
mapping to an existing sequence of standardized codepoints would be more beneficial 
than creating a new codepoint that may simply be too rarely supported.

Personally, I would prefer to be able to study such a text while seeing an L+HOOK 
glyph instead of an L-MOLL glyph when I don't have a font that precisely matches 
the glyph design of the original. The learning curve would be extremely short, as 
this default glyph is still very similar to what the original text displayed, and 
there does not seem to be any conflict between this "special-use" script variant 
and the traditional one (so mixing the traditional Old French script, or the 
current French script, with this special-case script would not cause interpretation 
or semantic problems).

On the contrary, encoding with variation selectors or new special codepoints would 
cause many more problems and would not ease interchange (I would hate to see a 
default square glyph for all these new codepoints, and I would not appreciate 
either that the default rendering of a variation selector would be the base letter 
without the variant represented).

With some searching, you could easily find related publications of this old text 
using such diacritics, because of reproduction costs at a time when designing new 
metal fonts was too expensive; even the author may have authorized such a 
compromise to keep the text "intact". (Such variation would be no worse than what 
could be found in private handwritten correspondence between the author and the 
publisher, or with other scholars of the epoch when this script was created.)

In fact, the facsimile text shown here exhibits considerable variation in the 
glyphs, which really demonstrates that these characters were quite hard to 
reproduce exactly, due to technical constraints or to the ability of the font 
workers employed to create the reproduction plates for that publication. In those 
times, reproducing text was quite expensive for any author, and many compromises 
were needed when publishing books; I think you will find many differences between 
what the publisher did and what the author intended (it would require studying 
their handwritten correspondence or commercial contracts to see what was initially 
intended, and how other scholars regarded this creation and interpreted it for 
their own studies or use).

So when I suggested that a new diacritic could be encoded for a "long leg", that 
was not my preferred choice. Using other existing diacritics (such as a combining 
hook already encoded in Unicode) seems a more reasonable choice, one that falls 
within the limits of what authors and linguists already accepted at the time due 
to financial and technical constraints.
