Hi everyone, I pushed harfbuzz-0.9.13 out earlier today, and a hackfest report is overdue.
Jonathan Kew and I met for the week of February 11 in London, UK and did more HarfBuzz hacking. Martin Hosken joined us on Tuesday to share his valuable insight into Myanmar and other South-East Asian scripts. Here is what we achieved:

= Myanmar

We implemented a brand new Myanmar complex shaper based on the spec released by Microsoft [1], which is also what went into Windows 8. The Myanmar spec is so much simpler than the Indic specs that we decided a separate shaper with a separate state machine was better suited to the task. Thanks to the spec and to the powerful ragel tool, the resulting shaper turned out to be very straightforward, and it was easy to get it to match Windows 8 results. We are essentially matching Uniscribe in every case except for a small corner-case bug in Uniscribe. The two should be indistinguishable to font developers and users.

[1] http://www.microsoft.com/typography/OpenTypeDev/myanmar/intro.htm

= Tai Tham, Cham, and New Tai Lue

While at it, we added a new South-East Asian shaper (called 'sea' in the code) to handle simpler scripts that only have left-matras and prebase-reordering medials. Tai Tham, Cham, and New Tai Lue go through the new shaper, and all three work as expected as far as our testing goes.

= Devanagari

Fixed the (embarrassing) issue with eyelash Ra in fonts with old-style Devanagari spec. We match Uniscribe in that case now.

= Malayalam

Fixed a bug with the interaction of dotless-reph and prebase-reordering Ra. It happens that Uniscribe has the same bug.

= Kannada

Fixed a couple of bugs in the lookup processing that, while not Kannada-specific, were being hit with various Kannada fonts.

= 'Phags-Pa

Fixed shaping of 'Phags-Pa U+A872, which is the first character in Unicode to have Arabic_Joining=L.

= "Default_Ignorables"

While fixing some Kannada issues, we ended up implementing a rather sophisticated way of handling Default_Ignorable characters.

Default_Ignorable is a category of Unicode characters that are by default not shown on the screen. These include things like ZWJ, ZWNJ, and SOFT-HYPHEN, among others.

Put the joiners aside. For the others, you really don't want them to affect your GSUB/GPOS matching. I.e., a SOFT-HYPHEN shouldn't break your ligature or kerning.

ZWJ/ZWNJ are more /complicated/. According to Unicode, ZWNJ should disable a ligature, while ZWJ should encourage it. Before this change, and in every other engine we have tested, inserting a ZWJ in fact breaks ligatures, as it blocks the GSUB rules.

With this change, this is what we do now:

* When matching GSUB rules: whenever we see a glyph for a Default_Ignorable character other than ZWNJ, if that glyph matches the GSUB rule, we proceed normally. Otherwise, instead of jumping to a "no match", we skip the Default_Ignorable glyph and keep matching. As such, if the font has, e.g., a rule that matches the sequence 'f',ZWJ,'i', that rule will still match 'f',ZWJ,'i' in the text. But so will a rule for the sequence 'f','i', which skips the ZWJ automatically.

* When matching GPOS rules: we simply ignore all Default_Ignorable glyphs, including ZWNJ.

* For "basic shaping features" of Indic-like shapers, we disable the automatic rules above for ZWJ and ZWNJ (but not other Default_Ignorables). Indic-like scripts have very specific meanings attached to ZWJ and ZWNJ, and we leave it completely to the font designer to tell us what to do.

We think that this is a major improvement over what we used to do (and every other engine still does). Feedback appreciated.

= Misc

Fixed a tricky bug with sanitizing fonts that have overlapping (and broken) tables.

We also streamlined handling of zero-width marks for Indic and non-Indic scripts, to match what Uniscribe does.

= Summary

I think this was yet another tremendously productive week of pair-programming with Jonathan, and I would like to thank him for finding the time to make this happen.
I would also like to thank Martin Hosken, whose expertise in the scripts covered in this hackfest was key to the progress we made.

The hackfest also marked two major milestones for HarfBuzz the shaper:

* We fixed all shaping bugs known to us.

* As far as we know, we correctly shape every script that Windows 8 shapes, and then some.

Last but certainly not least, I would like to thank all the other people on the list, without whose testing and feedback we couldn't have gotten this far. To avoid embarrassingly missing people out, I pass on listing, but you know who you are!

I would also like to thank our employers, Google and Mozilla, for graciously funding and hosting the hackfest.

Is there a script you want to see HarfBuzz support that it currently doesn't? Just ask, and we'll make it happen.

Cheers,
-- 
behdad
http://behdad.org/
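
For readers who want the Default_Ignorable matching rules from the "Default_Ignorables" section in more concrete form, here is a rough sketch in Python. It is not the HarfBuzz implementation. The function name `match_sequence`, the `table` parameter, and the three-character `DEFAULT_IGNORABLES` set are all made up for illustration, and the sketch matches code points directly where a real shaper matches glyphs. It also reads "ignore all Default_Ignorable glyphs" for GPOS as "always skip them", which is an assumption, and it does not model the Indic-like exception for basic shaping features.

```python
ZWNJ = 0x200C         # ZERO WIDTH NON-JOINER
ZWJ = 0x200D          # ZERO WIDTH JOINER
SOFT_HYPHEN = 0x00AD  # SOFT HYPHEN

# Tiny illustrative subset; the real Default_Ignorable set is much larger.
DEFAULT_IGNORABLES = {ZWNJ, ZWJ, SOFT_HYPHEN}


def match_sequence(text, start, pattern, table="GSUB"):
    """Match `pattern` against `text` starting at index `start`.

    `text` and `pattern` are lists of code points (a simplification:
    a real shaper works on glyph ids). Returns the list of indices in
    `text` that were matched, or None if the rule does not apply.
    """
    matched = []
    pos = start
    for want in pattern:
        while True:
            if pos >= len(text):
                return None  # ran out of input before the rule completed
            cp = text[pos]
            ignorable = cp in DEFAULT_IGNORABLES
            if table == "GPOS" and ignorable:
                # GPOS: ignore every Default_Ignorable, ZWNJ included.
                pos += 1
                continue
            if cp == want:
                # Matches the rule (possibly an explicit ZWJ): proceed normally.
                matched.append(pos)
                pos += 1
                break
            if table == "GSUB" and ignorable and cp != ZWNJ:
                # GSUB: skip a non-matching ignorable instead of failing.
                pos += 1
                continue
            return None  # real mismatch, or a ZWNJ blocking a GSUB rule
    return matched


f, i = ord("f"), ord("i")
print(match_sequence([f, ZWJ, i], 0, [f, i]))           # rule 'f','i' skips the ZWJ
print(match_sequence([f, ZWJ, i], 0, [f, ZWJ, i]))      # explicit rule still matches
print(match_sequence([f, ZWNJ, i], 0, [f, i]))          # ZWNJ blocks the ligature
print(match_sequence([f, ZWNJ, i], 0, [f, i], "GPOS"))  # but not positioning
```

In this sketch a font needs no special rules to get the ZWJ and SOFT-HYPHEN behaviour, since a plain 'f','i' rule applies across either one, while ZWNJ keeps its Unicode meaning of preventing the ligature.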
