On Nov 19, 2008, at 10:42 AM, Jungshik Shin (신정식, 申政湜) wrote:



2008/11/6 Prunthaban Kanthakumar <[EMAIL PROTECTED]>

Now we can do the following,
1. Add an additional condition in styleDidChange method to check if the font-family is supported by our transcoder (At present a fast look-up table should do because we plan to support only limited set of fonts) - This condition will be #ifdefed on ENABLE(TRANSCODER_SUPPORT).

Shouldn't this be triggered by (font-family, site) rather than just font-family?

Since we're looking at this as a legacy compatibility feature, and would like future sites to move to proper Unicode-encoded text, my first instinct would be {font, site} pairs. But that depends on whether we can achieve acceptable Indic browsing results with just a fixed list of sites.
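
To make the {font, site} idea concrete, here is a minimal sketch of such a check, assuming a small hard-coded allowlist. The names `FontSitePair`, `legacyFontSitePairs` and `shouldTranscode`, and the single table entry, are hypothetical placeholders and not existing WebKit code; in the proposal above, a check like this would sit behind ENABLE(TRANSCODER_SUPPORT).

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical (font-family, host) pair; both parts are stored in lower case.
using FontSitePair = std::pair<std::string, std::string>;

// ASCII-only lower-casing, since font-family names and host names are
// compared case-insensitively.
inline std::string toLowerASCII(std::string s)
{
    for (char& c : s) {
        if (c >= 'A' && c <= 'Z')
            c = static_cast<char>(c - 'A' + 'a');
    }
    return s;
}

// Placeholder allowlist; a real list would name actual legacy fonts and sites.
inline const std::set<FontSitePair>& legacyFontSitePairs()
{
    static const std::set<FontSitePair> pairs = {
        FontSitePair("examplelegacyfont", "news.example.org"),
    };
    return pairs;
}

// Returns true only when this font is known to need transcoding on this site.
inline bool shouldTranscode(const std::string& fontFamily, const std::string& host)
{
    FontSitePair key(toLowerASCII(fontFamily), toLowerASCII(host));
    return legacyFontSitePairs().count(key) > 0;
}
```

With this shape, `shouldTranscode("ExampleLegacyFont", "news.example.org")` is true, while the same font on any other host is left alone, which is the behavior that keeps the feature scoped to known legacy sites.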


On a related note, I would like to mention that we cannot take the approach of one look-up table per font face with a single transcoder doing the look-up for all fonts. The problem is that many Indic languages use multiple code points to represent one character, and different fonts use different standards! For example, there are situations where one glyph in EOT needs to be transcoded to 5+ Unicode code points. The reverse situation is also possible. Because of these issues, a simple look-up table will not work for all fonts. This forces us to write some specialized code to handle each font (though there may also be some fonts where a one-to-one look-up table is enough).
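
The m-to-n problem described above can be sketched as a greedy longest-match replacement over code point sequences. This is only an illustration under stated assumptions: `MappingTable` and `transcode` are hypothetical names, the table contents are left to the caller, and real legacy Indic fonts may also need reordering or context rules that a plain sequence table cannot express.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical table: a sequence of one or more source code points maps to
// a sequence of one or more Unicode code points (m-to-n).
using MappingTable = std::map<std::u32string, std::u32string>;

// At each position, try the longest source sequence first; code points with
// no table entry are copied through unchanged.
inline std::u32string transcode(const std::u32string& input, const MappingTable& table)
{
    std::size_t maxKeyLength = 0;
    for (const auto& entry : table)
        maxKeyLength = std::max(maxKeyLength, entry.first.size());

    std::u32string output;
    std::size_t pos = 0;
    while (pos < input.size()) {
        bool matched = false;
        std::size_t longest = std::min(maxKeyLength, input.size() - pos);
        for (std::size_t len = longest; len > 0; --len) {
            auto it = table.find(input.substr(pos, len));
            if (it != table.end()) {
                output += it->second; // one match may emit several code points
                pos += len;           // and may consume several code points
                matched = true;
                break;
            }
        }
        if (!matched)
            output += input[pos++];
    }
    return output;
}
```

A one-to-one look-up table is the special case where every key and every value has length 1, so fonts that need nothing more could share this code path with a simpler table.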

In October, I listed two alternatives for this transformation. One is adding ICU converters for Indic font encodings (they can deal with m-to-n mappings), and the other is implementing your own. The first was ruled out because it's not easy to add new converters on Mac OS X, where ICU is a part of the OS. There's another approach you can take: you can build ICU transliterator rules, which seems to be the cleanest way to do this. You don't need to port or implement conversion code from another project (e.g. Padma); you just need to 'port' the conversion tables to ICU transliterator rules.

This transcoding will be invoked on the content of a text node that is already in Unicode, just as 'text-transform: capitalize' or 'text-transform: lowercase' is. An ICU transform converts one chunk of Unicode text into another chunk of Unicode text ( http://www.icu-project.org/userguide/Transform.html ), so it appears to be an almost perfect fit.

This sounds like it would work for any ICU-based port, though it would prevent the feature from working for ports that use something other than ICU for Unicode and text transcoding support, most notably the Qt port. Would it simplify the code significantly to make it an ICU transformer rather than something custom?

Regards,
Maciej

_______________________________________________
webkit-dev mailing list
[email protected]
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
