On Wed, 21 Jan 2015 07:07:44 -0600 Ken Schutte wrote:
> Is it possible to do "shaping" (not sure if that's the correct term
> here) without given a font?
> I realize different fonts will support different features, but I want
> to input a unicode string and get information like,
>
> - this 'ARABIC LETTER BEH' should use 'ARABIC LETTER BEH INITIAL FORM'
> - this 'ARABIC SHADDA' will be combined with previous character
> - mandatory ligatures (lam+alif)
> etc
> (of course will not get glyph coordinates)

There's not much to stop you encapsulating this information in a font
with dummy glyphs (not hard to create), stored in the directory of your
choice, and examining the glyphs you get out. Your font should have a
glyph for every character you are interested in, and for each of their
transforms. There is a limit of 64K glyphs in OpenType, so you might
need to handle CJK characters separately. There may be some issues with
control characters that aren't rendered.

To consider your examples:

For ARABIC LETTER BEH, you would set up the 'init' feature for the
Arabic script to convert it to the appropriate glyph, which may be the
one you would map ARABIC LETTER BEH INITIAL FORM to. As I think has
been mentioned, Arabic-script letters that are not used in Arabic
itself tend to lack encoded presentation forms.

For ARABIC SHADDA, what Harfbuzz will tell you is that there is some
form of interaction between it and a previous character. They will be
in a single 'cluster' after shaping, and this is the only indication
you would get. You would get the same indication for a vowel written
before the consonant but stored after it, as occurs in most Indic
scripts.

It would not tell you that lam+alif was a mandatory ligature. Of
course, the dummy font could record that this yielded a ligature.

There isn't much information in Harfbuzz that isn't already in the
Unicode Character Database. The most significant extra information is
that it will tell you that THAI CHARACTER SARA AM decomposes. It may
even tell you, after a fashion, that part of this character ends up
between the preceding consonant and a tone mark. However, you would
have to trace the relationship between the characters and the glyphs.
Harfbuzz itself won't tell you the same for the equivalent Lanna script
sequence <U+1A61 TAI THAM VOWEL SIGN A, U+1A74 TAI THAM SIGN MAI KANG>,
for the very good reason that this is a stylistic decision. The
information about this has to be stored in the font. Similarly, it
won't tell you which consonant character U+1A58 TAI THAM SIGN MAI KANG
LAI ends up above (one gets different answers in Burma and Thailand).

> Can I use harfbuzz for this or does it always require a font?

You will have to decide whether my answer is 'yes' or 'no'. There are
tools for adding the GSUB rules to a font, and it isn't too difficult
to generate a formal font purely textually. (There are compilers around
that will take transformation rules and add them to the glyph
definitions.)

I hope this helps.

Richard.
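
As a rough illustration of the remark that much of this is already in
the Unicode Character Database, the sketch below queries Python's
standard unicodedata module for the three examples in the question. It
does not use Harfbuzz at all. The helper name presentation_forms is
made up for this example, and the scan is limited to the Arabic
Presentation Forms-B block, which is an assumption of the sketch. Note
that it only reports which presentation forms are encoded; it does not
decide which form a letter takes in a given context, since the joining
rules are not exposed by unicodedata.

```python
import unicodedata


def presentation_forms(base, tag):
    """Hypothetical helper: list characters in Arabic Presentation
    Forms-B (U+FE70..U+FEFF) whose UCD compatibility decomposition is
    '<tag>' followed by the code points of `base`."""
    wanted = "<%s> %s" % (tag, " ".join("%04X" % ord(c) for c in base))
    return [
        chr(cp)
        for cp in range(0xFE70, 0xFF00)
        if unicodedata.decomposition(chr(cp)) == wanted
    ]


# ARABIC LETTER BEH (U+0628) has an encoded initial form.
for ch in presentation_forms("\u0628", "initial"):
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))

# lam + alif (U+0644 U+0627) has an encoded isolated ligature form.
for ch in presentation_forms("\u0644\u0627", "isolated"):
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))

# ARABIC SHADDA (U+0651) is a nonspacing mark with a non-zero combining
# class, i.e. it attaches to a preceding character.
print(unicodedata.category("\u0651"), unicodedata.combining("\u0651"))

# THAI CHARACTER SARA AM (U+0E33) has a compatibility decomposition.
print(unicodedata.decomposition("\u0E33"))
```

Letters with no encoded presentation forms simply yield an empty list
here, which is the case where a dummy font with its own 'init' rules
would still be needed.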
