ICU has quick-check functions
<http://icu-project.org/apiref/icu4c/unorm2_8h.html#ad81711834f00bbeb97738004f4f08450>
which can return YES, NO, or MAYBE as to whether the text is already
normalized, i.e. whether a normalization pass is required at all. If you're
making a pass over the data anyway, this is not *much* more expensive than
just checking for non-ASCII. Something to consider, either if ICU is used,
or in principle.
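To illustrate the "in principle" part, here is a rough sketch of a threshold-style quick check done in a single pass over the code points. All names are hypothetical; this is neither ICU's nor HarfBuzz's API. With a threshold of 0x80 it is the plain non-ASCII test. The 0x300 threshold relies on U+0300 being the lowest code point whose NFC_Quick_Check property is not Yes, so it only applies to standard NFC; a shaper's own normalization pass may have different requirements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is ICU or HarfBuzz API. */
typedef enum {
    SKETCH_QC_YES,   /* every code point is below the threshold: pass can be skipped */
    SKETCH_QC_MAYBE  /* something at or above the threshold: run the full pass */
} sketch_qc_result;

/* Threshold for the plain "is there any non-ASCII?" test. */
#define SKETCH_FIRST_NON_ASCII  0x80u
/* Threshold for standard NFC only: U+0300 is the lowest code point
 * whose NFC_Quick_Check value is not Yes. Assumption: the caller wants NFC. */
#define SKETCH_FIRST_NFC_UNSAFE 0x300u

/* One pass over UTF-32 code points, one comparison per code point. */
static sketch_qc_result
sketch_quick_check(const uint32_t *text, size_t len, uint32_t first_unsafe)
{
    assert(text != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (text[i] >= first_unsafe)
            return SKETCH_QC_MAYBE;
    }
    return SKETCH_QC_YES;
}
```

A caller would run the normalization pass only when the result is SKETCH_QC_MAYBE. A real quick check, such as ICU's, consults per-character properties and can also answer NO.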
-s

On Thu, Aug 9, 2012 at 10:32 AM, Jonathan Kew <[email protected]> wrote:

> Hi Behdad,
>
> While complex-script shaping is obviously far more interesting, in
> practice there is a lot of very simple ASCII text on the web. So what would
> you think of adding a minor optimization that looks like it can give us
> about 10% gain on shaping ASCII text with simple fonts? The idea is to make
> hb_buffer_add check whether any non-ASCII characters have been put in the
> buffer; and if not, there's no need to run the normalization pass.
>
> (Of course, there are plenty of non-ASCII characters that could also be
> present without normalization becoming relevant, but I didn't want to make
> the check any more expensive than a simple character-code comparison, and
> optimizing performance of ASCII-only runs will benefit a lot of real-world
> text for minimal effort.)
>
> This was prompted by profile data such as
> http://people.mozilla.com/~bgirard/cleopatra/?report=c2e6bea3647461c0675e59441b78c0f5c409ac0d
> (see https://bugzilla.mozilla.org/show_bug.cgi?id=762710#c25),
> which relates to layout of a large, almost purely ASCII document. This
> shows the normalization pass - which we know is redundant for ASCII-only
> text - contributing around 10% of the total shaping time. With this patch,
> that time simply vanishes from the profile.
>
> JK
>
> _______________________________________________
> HarfBuzz mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/harfbuzz
