Let's continue this particular discussion here: https://github.com/behdad/harfbuzz/commit/81ef4f407d9c7bd98cf62cef951dc538b13442eb#commitcomment-9469767
I want to come to a conclusion on this one as well.

b

On 15-03-28 10:38 AM, Konstantin Ritt wrote:
> This seems to be deferred for ever :)
> With the latest HarfBuzz, I still have to fix-up glyph/metrics for the
> White_Spaces of GC=Cc|Zl|Zp to avoid the "missing glyph" boxes on rendering.
>
> From http://www.unicode.org/faq/unsup_char.html#2 :
>> Q: Which characters should be displayed as a visible but blank space?
>> A: This is the easy one: all the characters that have the White_Space
>> property, also generically known as “whitespace characters”. This set
>> includes SPACE, of course, but also such characters as the tab control
>> character, NO-BREAK SPACE, LINE SEPARATOR, and so on. For the full list,
>> see the White_Space values in PropList.txt
>> <http://www.unicode.org/Public/UCD/latest/ucd/PropList.txt>.
>
> And from PropList.txt :
> 0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
> 0020          ; White_Space # Zs       SPACE
> 0085          ; White_Space # Cc       <control-0085>
> 00A0          ; White_Space # Zs       NO-BREAK SPACE
> 1680          ; White_Space # Zs       OGHAM SPACE MARK
> 2000..200A    ; White_Space # Zs  [11] EN QUAD..HAIR SPACE
> 2028          ; White_Space # Zl       LINE SEPARATOR
> 2029          ; White_Space # Zp       PARAGRAPH SEPARATOR
> 202F          ; White_Space # Zs       NARROW NO-BREAK SPACE
> 205F          ; White_Space # Zs       MEDIUM MATHEMATICAL SPACE
> 3000          ; White_Space # Zs       IDEOGRAPHIC SPACE
>
>
> My proposition is the following:
> - The glyph for White_Spaces should be replaced with the glyph for U+0020
>   (except for U+0020 itself).
>   This is a good first approximation which guarantees we would never get a
>   box for White_Spaces.
> - If there is no glyph for White_Space in the font (and we just replaced it
>   with the glyph for U+0020), simply dup the metrics for U+0020 as well;
>   otherwise believe the font provides a correct metrics.
>   This doesn't care about ie. half-width spaces but also a good
>   approximation for the most-common case.
> This only applicable when no HB_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES has
> been set; otherwise do nothing.
>
> Regards,
> Konstantin
>
>
> 2014-09-25 20:30 GMT+04:00 Behdad Esfahbod <[email protected]>:
>
>     Thanks James and Jonathan for taking care of this on the CSS side.
>     Working-group resolved to change this to display Cc characters
>     (other than HT, LF, CR):
>
>     http://log.csswg.org/irc.w3.org/css/2014-09-08/#e469835
>
>     On 14-03-20 03:19 AM, James Clark wrote:
>     > On Thu, Mar 20, 2014 at 6:04 AM, Behdad Esfahbod <[email protected]>
>     > wrote:
>     >
>     >
>     >     Also, Unicode says GC=Cc should just render as boxed if not
>     >     supported.
>     >
>     >
>     > However, it also says that characters with the White_Space property
>     > true it should be rendered as space.  In addition to 0x9, 0xA and 0xD
>     > (which both CSS and HTML treat as white space), these are 0xB (VT),
>     > 0xC (FF), and 0x85 (NEL).
>     >
>     >     The reason we want them removed here is really an artifact of the
>     >     HTML spec.
>     >
>     >
>     > The requirement of ignoring all GC=Cc characters seems to be an
>     > artifact of the CSS3 Text WD
>     > (http://www.w3.org/TR/css-text-3/#white-space-processing),
>     > which is not yet set in stone.  Note that it's different from CSS2.1
>     > (http://www.w3.org/TR/CSS2/text.html#ctrlchars) which says that they
>     > render as usual.
>     >
>     > The CSS3 text behaviour seems like a bad idea to me, because
>     >
>     > a) it conflicts with Unicode, and
>     > b) legacy Windows encodings use C1 code points (in the range 0x80 -
>     > 0x9F) for real characters; if a page using eg Windows-1252 encoding is
>     > mislabelled as ISO-8859-1 (which can definitely happen) then all the
>     > code points in this range would be silently be ignored rather than
>     > showing up as boxes.
>     >
>     >     WDYT?
>     >
>     >
>     > I think the default should be to do what Unicode says.  Also ask the
>     > CSS3 text folks why they are proposing this handling of Cc.
>     >
>     > James
>
>     --
>     behdad
>     http://behdad.org/
>     _______________________________________________
>     HarfBuzz mailing list
>     [email protected]
>     http://lists.freedesktop.org/mailman/listinfo/harfbuzz
>
>

-- 
behdad
http://behdad.org/
_______________________________________________
HarfBuzz mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/harfbuzz
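
For reference, here is one way to read Konstantin's proposal quoted above, as a small standalone Python sketch. It does not use the HarfBuzz API at all: `fixup_white_space`, `cmap`, `advances` and `preserve_default_ignorables` are hypothetical names made up for illustration, with the last one standing in for HB_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES. The White_Space set is copied from the PropList.txt excerpt in the thread. The handling of the second bullet (keep the font's own advance when it has a glyph for the character, otherwise copy the advance of U+0020) is an interpretation, not something the thread settles.

```python
# White_Space code points, taken from the PropList.txt excerpt quoted in the thread.
WHITE_SPACE = (
    set(range(0x0009, 0x000D + 1))
    | {0x0020, 0x0085, 0x00A0, 0x1680}
    | set(range(0x2000, 0x200A + 1))
    | {0x2028, 0x2029, 0x202F, 0x205F, 0x3000}
)

SPACE = 0x0020
NOTDEF = 0  # glyph index conventionally used for "missing glyph"


def fixup_white_space(codepoints, cmap, advances, preserve_default_ignorables=False):
    """Hypothetical helper: return a list of (glyph, advance) pairs.

    cmap maps code point -> glyph index; advances maps glyph index -> advance.
    Both are plain dicts standing in for font data (an assumption of this sketch).
    """
    space_glyph = cmap.get(SPACE)
    out = []
    for cp in codepoints:
        own_glyph = cmap.get(cp, NOTDEF)
        replace = (
            not preserve_default_ignorables  # "otherwise do nothing"
            and space_glyph is not None      # nothing to substitute without a space glyph
            and cp in WHITE_SPACE
            and cp != SPACE                  # "except for U+0020 itself"
        )
        if replace:
            # Always show the space glyph; trust the font's metrics only if it
            # actually has a glyph for this character, else dup U+0020's advance.
            source = own_glyph if own_glyph != NOTDEF else space_glyph
            out.append((space_glyph, advances[source]))
        else:
            out.append((own_glyph, advances[own_glyph]))
    return out


# Toy font: has SPACE, 'A' and EM SPACE (U+2003), but no LINE SEPARATOR (U+2028).
cmap = {0x0020: 3, 0x0041: 36, 0x2003: 5}
advances = {0: 500, 3: 250, 5: 1000, 36: 600}
shaped = fixup_white_space([0x0041, 0x2028, 0x2003, 0x0020], cmap, advances)
```

With this toy font, U+2028 ends up with the space glyph and the space advance instead of a box, while U+2003 gets the space glyph but keeps the font's own (wider) advance. Passing `preserve_default_ignorables=True` leaves the font's mapping untouched.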
