On Wed, 21 Jan 2004 11:13:33 -0700, John Jenkins wrote:
>
> Granted, epigraphy is tough on plain text. As Unicode starts to deal
> with dead scripts, we have to deal with the issues it raises.
> Variation selectors are one way of doing it.
>

Yes, but I'm delighted to see from document N2684 "Draft Agreement on
Old Hanzi Encoding" that variation selectors are not the method
proposed for dealing with archaic forms of the Han script. I think that
encoding the Oracle Bone, Bronze Inscription and Small Seal pre-Han
scripts separately from the modern Han script is definitely the right
thing to do. However, as glyph variation is an even bigger problem for
the ancient unstandardised scripts than for the modern script, I wonder
whether variation selectors might not play a role in the end anyway.

I'm currently working on a proposal for the extinct Jurchen script,
which also has a problem with glyph variation (about a third of the
1,355 entries in the most recent Jurchen dictionary are simple glyph
variants, many almost indistinguishable from one another), so maybe
someone on the UTC could give me some advice? Should I:

A. Stick to a strict character encoding model, and ignore glyph
   variants that have no semantic distinctions (as I did for Phags-pa).

B. Indiscriminately encode every glyph form that has ever been seen, on
   the basis that the glyph variants are given in a respected
   dictionary.

C. Propose distinct characters, but append a long list of proposed
   standardised variants to cover the simple glyph variants (some
   missing a dot here or adding a stroke there, some written in a more
   cursive manner, and some just differently proportioned).

Andrew

