Hello,

I am currently working on a blackletter (Fraktur) digitization which, among 
other things, aims to include all special characters ever used in such a 
script. Old Sorbian literature in particular (17th century until 1950) features 
a rich selection of special characters. Most of these already exist as Unicode 
characters or are straightforward combinations, but there are a few cases where 
I am not sure which encoding to use or whether they may qualify as new 
characters.

First, there are stroked variants of ‹W› and ‹w› (see appended sketch). Since, 
as far as I understand, new precomposed characters are no longer accepted into 
Unicode, I first thought to encode them using U+0337 (combining short solidus 
overlay). However, »LATIN CAPITAL LETTER L WITH STROKE« and »LATIN CAPITAL 
LETTER L, COMBINING SHORT SOLIDUS OVERLAY« are listed as confusable characters, 
which in turn confused me. So my questions are: Do these characters qualify for 
a submission to Unicode? If not: What would be the proper way to encode them? 
If yes: Is there a preferred temporary workaround? Either way, I suggest adding 
the current stance on letters with stroke to the FAQ.
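
To make the candidate encoding concrete, here is a minimal Python sketch using 
only the standard unicodedata module. The variable and function names are my 
own choices for illustration. It only shows what the sequence looks like and 
that normalization leaves it alone; it says nothing about whether this is the 
recommended encoding.

```python
import unicodedata

# Candidate encoding: base letter followed by U+0337.
STROKE = "\u0337"  # COMBINING SHORT SOLIDUS OVERLAY
w_stroke_upper = "W" + STROKE
w_stroke_lower = "w" + STROKE


def describe(text):
    """List the code points of a string together with their Unicode names."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(describe(w_stroke_upper))
print(describe(w_stroke_lower))

# No precomposed stroked W exists, so NFC leaves the sequence unchanged.
nfc_unchanged = unicodedata.normalize("NFC", w_stroke_upper) == w_stroke_upper

# U+0141 (L WITH STROKE) has no canonical decomposition, so the sequence
# <L, U+0337> is not canonically equivalent to it. The two merely look alike.
l_stroke_decomposition = unicodedata.decomposition("\u0141")
l_sequence_differs = unicodedata.normalize("NFC", "L" + STROKE) != "\u0141"

print(nfc_unchanged, repr(l_stroke_decomposition), l_sequence_differs)
```

If I read this correctly, the missing canonical equivalence is why the L pair 
shows up in the confusables list: the two spellings render alike but are 
distinct for the purposes of comparison and search.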

Second, there is an ‹a› with two vertically aligned dots above. Should this be 
encoded as ‹a + U+0307 + U+0307› (‹ȧ̇› – ‹a› with double ‹combining dot 
above›) or does it qualify for a new diacritical mark?
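
As a small illustration of how the double-dot sequence behaves, the following 
Python sketch (standard unicodedata module only; the names are my own) 
normalizes ‹a + U+0307 + U+0307›. It is meant only to show the stored code 
points, not to settle which encoding is appropriate.

```python
import unicodedata

DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE
sequence = "a" + DOT_ABOVE + DOT_ABOVE


def codepoints(text):
    """Return the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# NFC composes the first dot with the base letter into U+0227
# (LATIN SMALL LETTER A WITH DOT ABOVE); the second dot stays combining.
nfc = unicodedata.normalize("NFC", sequence)

# NFD yields the fully decomposed form again.
nfd = unicodedata.normalize("NFD", sequence)

print(codepoints(nfc))
print(codepoints(nfd))
```

So in normalized (NFC) text the sequence is stored as a precomposed ‹a› with 
dot above followed by a single combining dot above, and a font would have to 
stack the second dot on top of the first.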

Third, I noticed that the positioning of diacritical marks on certain letters 
is not straightforward. E.g., for a ‹b with acute›, the acute could be placed 
either above the bowl or above the vertical stem of the ‹b›. Am I correct in 
assuming that this ambiguity is not to be dealt with on the encoding level, but 
on the font level, e.g., with glyph variants?

If any of the above qualifies for a new character: Am I correct in assuming 
that these have not already been proposed? (I have searched 
http://www.unicode.org/alloc/Pipeline.html and 
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4031.pdf.)

Regards,
Gerrit Ansmann

<<attachment: W_with_stroke.png>>
