Sorry, no progress so far. But for tracking purposes: https://github.com/harfbuzz/harfbuzz/issues/1011
On Sat, Jan 20, 2018 at 6:22 PM, Eric Muller <emul...@amazon.com> wrote:

> The easiest would be to add a new API analogous to hb_ot_font_set_funcs(),
> that does NOT have the symbol shift in it
>
> That works.
>
> Thanks,
> Eric.
>
> On 1/19/18 4:43 PM, Behdad Esfahbod wrote:
>
> Ok, let's see how we can address this...
>
> I don't like a setting on the buffer as currently the get_glyph() callback
> has no way of accessing that information. The easiest would be to add a
> new API analogous to hb_ot_font_set_funcs(), that does NOT have the symbol
> shift in it. It's not the most elegant solution but easiest. Would that
> work for you?
>
> That said, this issue is also related, as it pertains to another non-Unicode
> encoding, though, in the font not the buffer:
>
> https://github.com/harfbuzz/harfbuzz/issues/681
>
> On Thu, Jan 18, 2018 at 11:27 PM, Eric Muller <emul...@amazon.com> wrote:
>
>> I want to build a rendering system where U+0041 renders as an "A",
>> regardless of the selected font.
>>
>> Eric.
>>
>> On 1/17/18 3:48 PM, Behdad Esfahbod wrote:
>>
>> What's the actual problem you are facing?
>>
>> On Mon, Jan 15, 2018 at 9:58 AM, Eric Muller <emul...@amazon.com> wrote:
>>
>>> It's clear that if the symbol font is asked by name, we should do the
>>> shift.
>>>
>>> I think I disagree, in the sense that HB should not impose that behavior
>>> on its clients. HB is clearly the right place to implement the behavior,
>>> but the choice of having that behavior or not should be with the client.
>>>
>>> For any document format, rendering the moral equivalent of <p
>>> font-family='symbol'>A</p> with something else than an "A"
>>> implies that all ASCII is PUA. That's a choice Word, InDesign, Notepad may
>>> make if they want, but it should not be imposed on all users of HB.
>>>
>>> Personally, I think it is a very bad choice for HTML, and Firefox seems
>>> to agree. It seems nice and user friendly at first, but this makes the
>>> document ambiguous. What about <p font-family='minion,
>>> symbol'>A</p>? It's an A or not an A depending on the presence of
>>> "minion" in the client. What does the document mean?
>>>
>>> Of course, <p font-family='symbol'></p> should render with the
>>> glyph symbol.cmap(F041). So even if the shift is never done, the glyph is
>>> usable. It's just that you don't have the convenience of an IME-like
>>> mechanism provided by the shaping engine, but you gain a reliable semantic
>>> for the text.
>>>
>>> That's good behavior [in Word], but beyond what HarfBuzz can do.
>>>
>>> Yes, which is why the shift may be acceptable or even desirable for some
>>> clients, and so hopefully the client could choose.
>>>
>>> What would clients do with that control then? How would they set it?
>>>
>>> If I build an app that is meant to work like other GDI apps, I allow the
>>> shift (and maybe add mitigating measures like Word). If I build an app
>>> such as Firefox, I don't allow it. The choice is entirely driven by the
>>> type of application I want to build, and how I want to define my document
>>> format.
>>>
>>> If you were to implement this choice, I can see it either in the
>>> construction of the HB unicode functions, or in the hb_buffer (either
>>> globally, or on a character-by-character basis). I have a preference for
>>> the latter: this choice could be passed down to the cmap lookup functions,
>>> HB or not; it could also be different on different parts of a document,
>>> maybe reacting to markup.
>>>
>>> Eric.
>>>
>>> On 1/15/18 6:46 AM, Behdad Esfahbod wrote:
>>>
>>> Hi Eric,
>>>
>>> On Mon, Jan 15, 2018 at 2:25 AM, Eric Muller <emul...@amazon.com> wrote:
>>>
>>>> It seems that with a font that has only a 3, 0 cmap subtable (and maybe
>>>> some Macintosh subtables), then HB will automatically do the shift by
>>>> F000 (in the function get_glyph_from_symbol) for code points below U+00FF
>>>> that are not mapped by the subtable.
>>>>
>>>
>>> Right. Only in hb-ot-func though. Client font funcs can do otherwise.
>>>
>>>> It is clear that when U+0041 A is set with a symbol font, then that
>>>> U+0041 has actually the semantics of a PUA code point, and certainly should
>>>> not be treated as an "A". That's the whole point of a 3,0 cmap subtable.
>>>>
>>>
>>> Correct.
>>>
>>>> Consider an HTML page. The font-family is only a request and there is
>>>> no guarantee that the actual font will or will not be a symbol font. Thus
>>>> the semantic of the HTML page can change depending on the browser
>>>> environment. Outside a browser, it seems that the safe treatment is
>>>> therefore to consider all code points below U+00FF as PUA, which is clearly
>>>> not tenable. So in that environment, I think that the shift should not be
>>>> done. Of course, U+F041 should work.
>>>>
>>>
>>> My take on this is that it's a bug of the font fallback logic if it
>>> falls back to a symbol font. I changed fontconfig to never do that.
>>>
>>>> Note that behavior of Word 2016 on Windows is actually more elaborate:
>>>> enter U+0041, and set it with a non-symbol font; copy/paste or save to a
>>>> text file, and the result is U+0041; but set this A in a symbol font, and
>>>> copy/paste or save to a text file, and the result is U+F041.
>>>>
>>>
>>> That's good behavior, but beyond what HarfBuzz can do.
>>>
>>>> I think that the shift should be controllable by the client, rather
>>>> than systematically applied. I don't have a strong opinion about the
>>>> default behavior (i.e. when HB's client does not specify whether the shift
>>>> should be done or not).
>>>>
>>>
>>> What would clients do with that control then? How would they set it?
>>>
>>> I implemented this shift in fontconfig and then harfbuzz because in
>>> LibreOffice and other software, there were existing documents that referred
>>> to Wingdings or other symbol fonts and encoding characters in the ASCII
>>> range. It's clear that if the symbol font is asked by name, we should do
>>> the shift. If it's NOT, then it should not be chosen to render text to
>>> begin with, which means the shift can be applied unconditionally.
>>>
>>> How does that sound?
>>> behdad
>>>
>>>> Thoughts?
>>>>
>>>> Thanks,
>>>> Eric.
>>>>
>>>
>>> --
>>> behdad
>>> http://behdad.org/
>>>
>>
>> --
>> behdad
>> http://behdad.org/
>>
>
> --
> behdad
> http://behdad.org/
>

--
behdad
http://behdad.org/
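For readers following the thread, here is a minimal sketch in C of the behavior being discussed: a lookup that falls back to the F000 shift for low code points, with a flag that lets the client turn the shift off. It is not HarfBuzz code. The names `cmap_lookup_fn`, `toy_symbol_cmap` and `lookup_glyph` are hypothetical, the toy font maps a single code point, and treating U+00FF itself as inside the shifted range is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a font's (3,0) cmap subtable lookup.
 * Returns true and stores the glyph index if the code point is mapped. */
typedef bool (*cmap_lookup_fn)(uint32_t codepoint, uint32_t *glyph);

/* Toy symbol font for illustration: only U+F041 is mapped, to glyph 7. */
static bool toy_symbol_cmap(uint32_t codepoint, uint32_t *glyph)
{
    if (codepoint == 0xF041u) {
        *glyph = 7;
        return true;
    }
    return false;
}

/* Look up a glyph; if the direct lookup fails and the client allows it,
 * retry low code points in the PUA range by adding 0xF000. */
static bool lookup_glyph(cmap_lookup_fn cmap, uint32_t codepoint,
                         bool allow_symbol_shift, uint32_t *glyph)
{
    assert(cmap != NULL && glyph != NULL);

    if (cmap(codepoint, glyph))
        return true;

    /* Assumption: the shifted range is U+0000..U+00FF inclusive. */
    if (allow_symbol_shift && codepoint <= 0x00FFu)
        return cmap(0xF000u + codepoint, glyph);

    return false;
}
```

With the shift allowed, `lookup_glyph(toy_symbol_cmap, 0x0041, true, &g)` finds the glyph mapped at U+F041. With the shift disabled, the same call reports no glyph, while U+F041 itself still resolves. That matches the two client behaviors described in the thread (GDI-like applications versus a Firefox-like one).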
_______________________________________________
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/harfbuzz