Jonathan Rochkind wrote:
Hmm, you could theoretically assign chars in the private unicode area to the chars you need -- but then have your application replace those chars by small images on rendering/display.
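A minimal sketch of that render-time substitution, assuming HTML output. The function name render_html, the glyphs/ directory and the PNG naming scheme are hypothetical and not taken from any existing application:

```python
import html
import re

# The three Unicode Private Use ranges: BMP PUA, Plane 15 and Plane 16.
PUA_PATTERN = re.compile(
    "[\uE000-\uF8FF\U000F0000-\U000FFFFD\U00100000-\U0010FFFD]"
)

def render_html(text, image_dir="glyphs"):
    """Escape text for HTML, then swap each private-use char for an <img>.

    image_dir and the "<hex codepoint>.png" file naming are assumptions
    made for illustration only.
    """
    def to_img(match):
        codepoint = format(ord(match.group(0)), "04X")
        return '<img src="{}/{}.png" alt="U+{}">'.format(
            image_dir, codepoint, codepoint
        )

    # Escape first so the generated <img> markup is not escaped itself.
    return PUA_PATTERN.sub(to_img, html.escape(text))
```

Characters outside the private-use ranges pass through unchanged, so text that already has standard codepoints stays copyable and searchable.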

This seems as clean a solution as you are likely to find. Your TEI solution still requires chars-as-images for these unusual chars, right? So this is no better with regard to copying-and-pasting, browser display, and general interoperability than your TEI solution, but no worse either -- it's pretty much the same thing. But it may be better in terms of those considerations for chars that actually ARE currently unicode codepoints.

I think you misunderstand the TEI option that we're using. The TEI option gives us a full abstraction of the novel glyphs, including abstract names, etc. Even without the images, the TEI is readable / maintainable / manipulatable.

If any of your "private" chars later become non-private unicode codepoints, you could always globally replace your private codepoints with the new standard ones.
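Such a global replacement could look roughly like the sketch below. The mapping is purely hypothetical: both the private codepoint and the standard codepoint it is "promoted" to are placeholders, not real assignments.

```python
# Hypothetical table: private-use codepoint -> newly standardised codepoint.
# Both sides are placeholders chosen for illustration.
PROMOTED = {
    0xE000: "\u2C7F",
}

def promote_private(text, table=PROMOTED):
    """Replace every private-use char listed in table with its standard form."""
    # str.translate accepts a dict keyed by integer codepoints.
    return text.translate(table)
```

Private-use characters that are not in the table are left untouched, so the replacement can be repeated as further codepoints are standardised.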

With 137K "private codepoints" available, you _probably_ wouldn't run out.
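For reference, the 137K figure follows from the three Private Use ranges defined by Unicode. A small sketch of the arithmetic (the names PUA_RANGES, pua_total and is_private_use are only illustrative):

```python
import unicodedata

# Private Use ranges: BMP PUA, Supplementary PUA-A (Plane 15),
# Supplementary PUA-B (Plane 16). Bounds are inclusive.
PUA_RANGES = [
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
]

def pua_total():
    """Total number of private-use codepoints across all three ranges."""
    return sum(high - low + 1 for low, high in PUA_RANGES)

def is_private_use(char):
    """True if char has the Unicode general category Co (private use)."""
    return unicodedata.category(char) == "Co"

print(pua_total())  # 6400 + 65534 + 65534 = 137468
```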

That's the same order of magnitude as the number of characters in a large novel. If you have a bunch of hand-written novel-length works and you're not 100% sure of the boundary between the glyphs for one letter and the glyphs for another, there won't be enough Unicode private use points to encode them, but the TEI approach has no problem.

Actually, the TEI approach has several different ways of dealing with this kind of problem, all of which scale very nicely in my experience, so it's probably best to ask for advice on the TEI mailing list if you're faced with a problem like this.

Stuart Yeates       New Zealand Electronic Text Centre     Institutional Repository
