On 11/4/06, Philipp Reichmuth wrote:
> I've been starting to reuse some of this work in a script to do active
> character assignment for XeTeX depending on what glyphs are present in
> an OpenType font, so that those characters for which the font doesn't
> have a glyph are generated by ConTeXt. Basically I want to produce
> something like this:
>
> \ifnum\XeTeXcharglyph"010D=0
>   \catcode`č=\active \def č{\ccaron}
> \else
>   \catcode`č=\letter
> \fi % ConTeXt knows this letter -> better hyphenation
>
> \ifnum\XeTeXcharglyph"1E0D=0
>   \catcode`ḍ=\active \def ḍ{\b{d}}
> \else
>   \catcode`ḍ=\letter
> \fi % ConTeXt doesn't know this letter

No reason for not adding it.

> (with \other, respectively, for non-letters). Being somewhat of a
> novice to TeX programming, I'm not sure if this will work, though, and
> I'm also not sure if it's better to generate static scripts that do this
> for every font (so the resulting TeX file is a font-specific big list of
> \catcode`$CHARACTERs) or to do this dynamically on every font change,
> maybe limited to selectable Unicode ranges (which is more general but
> also a lot slower).

Generating this for every single font would be stupid. This should be
part of low-level XeTeX (Jonathan has promised to look into it some
time). In my opinion the best way to deal with it would be the ability
to define a fallback definition for every missing letter in a font.
If, say, "ddotbelow" is missing from your font, XeTeX would ask ConTeXt
whether a fallback definition has been provided for that glyph. If so,
it would fall back to it ("\b{d}"); if the glyph is present in the
font, XeTeX would use the glyph itself.

> > I'd prefer to see a context encoding added to GNU recode for the
> > benefit of future archeologists trying to decipher ancient documents.
>
> That would be better I guess, but isn't ConTeXt encoding a moving target
> in that characters can still get added? Or is the list fixed to AGL
> glyph names and nothing else?

No, it's certainly not fixed to AGL. But I wouldn't object to adding it
to GNU recode (on top of "(La)TeX", which also recognizes \v, \b, ...)
if someone decided to make a good revision of it, if more people
thought it would be useful, and if the developers were open to the
idea. I try to use Unicode when writing sources whenever possible.
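
Coming back to the glyph test quoted at the top: until something like
this exists inside XeTeX, the per-character test could perhaps be
wrapped in a macro so that it does not have to be written out for
every letter. The following is only a rough sketch, not something that
ConTeXt or XeTeX provides. The name \definefallback is made up for
this example, and it assumes that a native (OpenType) font is current
when it is called, since \XeTeXcharglyph only works with such fonts.

```tex
% Hypothetical helper, not part of ConTeXt or XeTeX.
% #1 = the character, #2 = replacement used when the glyph is missing.
\def\definefallback#1#2{%
  \ifnum\XeTeXcharglyph`#1=0 % glyph index 0: the current font lacks it
    \catcode`#1=13 % make the character active
    % \lowercase turns the active ~ into an active #1, which \def then defines
    \begingroup \lccode`\~=`#1 \lowercase{\endgroup \def~}{#2}%
  \fi}

% Usage, with the fallback taken from the example above:
\definefallback ḍ{\b{d}}
```

If the glyph is present, the sketch leaves the catcode untouched
instead of forcing it to letter. The test runs only once, at the time
of the call, so it would have to be repeated after every font switch,
which is exactly the cost Philipp mentions.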

Mojca

PS for Philipp: I didn't try out your definitions, but here is an
excerpt from an older conversation as an example of what certainly
doesn't work under XeTeX ;) (the answer was written by Jonathan Kew).
I was trying to write a few macros to support the old tfm-based fonts,
but figured out that that was the wrong starting point (and for a
different reason than yours).

> \catcode`ð=\active \defð{^^f0}
> \starttext
> Testing ... ð
> \stoptext
>
> and it seems to enter some infinite loop when ð is encountered (I can
> define any other letter as well, but only ^^f0 is causing problems).

No, this seems to me like it's the wrong way to define the character!
And I think you would have the same problem with other letters if
trying to define them as their own codes; the ones that work for you
must be getting defined as *different* codes from the original input.

The ^^xx notation is converted to a literal character by TeX's input
scanning routine, so it behaves exactly as if it were that character
itself. And ^^f0 in Latin-1 (or Unicode) is the ð character. So this
definition works exactly the same as if you were to say

    \catcode`ð=\active \defð{ð}

which is clearly recursive.

Given that you don't need to remap ð in the input to some other Unicode
character for printing, there should be no need for this at all. The
only reason to use a definition like this would be if the input text
used a *different* character where you want to print eth; or you want
to print something *other* than character F0 for the input ð.

In general, a "safe" form of the definition would be to use \chardef:

    \catcode`ð=\active \chardefð="F0

This makes ð into a macro that expands to the character "F0; there is
an important difference between this and ^^f0, which actually "becomes"
the character ð itself as the input is read (and therefore inherits its
catcode, definition, etc).

_______________________________________________
ntg-context mailing list
ntg-context@ntg.nl
http://www.ntg.nl/mailman/listinfo/ntg-context