I’m new to this list. Please excuse my technical incompetence.
Is there a Unicode character that says “I represent an alphanumeric
character, but I don’t know which”? This is a very common problem in the
transcription of historical texts where you have lacunae. Often, the extent of
the lacuna is known, and the alphabet is known as well. The EEBO TCP 
transcriptions of English texts before 1700 are good examples.  They are SGML 
transcriptions, where missing stuff is represented by <gap/> elements with 
attributes about this or that. This is efficient when it comes to pages, very 
inefficient when it comes to individual characters.
There is a Web character—a diamond with a question mark inside it—which means 
“I may know what this character represents, but I can’t display it”. Which is a 
very different message. On the other hand, if you extended the use of that
character, it probably wouldn’t create much ambiguity.
In the TCP project, various code points from the Geometric Shapes block were used to
represent lacunae. The black circle (\u25cf) has been used as the character for 
a missing character. This is OK and unambiguous in its context. But it would be
nice to have a special character for just that purpose, and given the number of 
emoji, this doesn’t seem to be a particularly frivolous request.  Which 
alphabet, you might ask. But that doesn’t really matter. There is a very high 
probability that the missing character comes from the character set of the 
surrounding words. And if that isn’t the case, the transcriber wouldn’t know 
it. S/he sees that there is something, perhaps even that there is just one of 
it, but doesn’t know which.
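
To make the black-circle convention concrete, here is a rough Python sketch of how such placeholders might be located and used in a search. It assumes only that U+25CF stands for exactly one missing character, as described above. The function names and the sample string are invented for illustration and do not come from any TCP tooling.

```python
import re
import unicodedata

# U+25CF BLACK CIRCLE, used here as the "one missing character" placeholder.
LACUNA = "\u25cf"


def lacunae(text):
    """Return (start, length) for each run of placeholder characters."""
    return [(m.start(), len(m.group())) for m in re.finditer(LACUNA + "+", text)]


def matches(damaged, candidate):
    """True if candidate could be the undamaged form of a damaged word.

    Each placeholder matches exactly one word character; every other
    character must match literally.
    """
    pattern = "".join(
        r"\w" if ch == LACUNA else re.escape(ch) for ch in damaged
    )
    return re.fullmatch(pattern, candidate) is not None


# Hypothetical sample text with a one-letter and a two-letter gap.
sample = "the qu\u25cfck br\u25cf\u25cfn fox"

print(unicodedata.name(LACUNA))           # BLACK CIRCLE
print(lacunae(sample))                    # [(6, 1), (12, 2)]
print(matches("br\u25cf\u25cfn", "brown"))  # True
print(matches("br\u25cf\u25cfn", "bran"))   # False
```

Because each placeholder occupies one position in the string, the extent of a gap is simply the length of the run, which is what makes a single character more convenient than an element for gaps of a few letters.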

Martin Mueller
Professor emeritus of English and Classics
Northwestern University
