from:"Khaled Hosny"

Re: [HarfBuzz] Ligatures

2020-05-26 Thread Khaled Hosny




> On May 24, 2020, at 6:34 PM, Eli Zaretskii  wrote:
> 
>> From: Khaled Hosny 
>> Date: Sun, 24 May 2020 18:00:45 +0200
>> Cc: harfbuzz@lists.freedesktop.org
>> 
> 
>> This, for example, ensures that HarfBuzz can do basic Arabic-like shaping 
>> across item boundaries e.g. if you break items in the middle of an Arabic 
>> word (due to font change, for example), you still get the 
>> initial/medial/final forms across the boundary as appropriate. Or to put a 
>> combining mark at the start of a paragraph on a dotted circle as it 
>> otherwise has no base.
>> 
>> If this is not possible, then you can try to pass enough context, like reach 
>> back and forward to first character that is not a combining mark. This may 
>> or may not be enough.
>> 
>> Shaping space-delimited words is orthogonal to that, context is better be 
>> always provided.
> 
> So this sounds like passing a physical line that ends in a newline
> should be good enough?  Or are there issues that cross newlines as
> well?

It should be enough.
> 
> And what is a "paragraph" in this context?

The same as in UAX#9.

>> Some fonts do have OpenType lookups that interact with space (e.g. kerning 
>> pairs involving space, or even substitutions involving space), so shaping 
>> words independently will give suboptimal result. You can use HarfBuzz API to 
>> find out if the font has OpenType layout rules involving space, or decide to 
>> live with this limitation.
> 
> Which API provides this information?

https://harfbuzz.github.io/harfbuzz-hb-ot-layout.html#hb-ot-layout-lookup-collect-glyphs

But it requires some understanding of how OpenType lookups are structured.
Checking how Firefox uses it might help.

Regards,
Khaled
___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/harfbuzz

Re: [HarfBuzz] Ligatures

2020-05-24 Thread Khaled Hosny

> On May 24, 2020, at 5:41 PM, Eli Zaretskii  wrote:
> 
>>> I almost understand (and agree), sans one part: the "arbitrary parts"
>>> of what you wrote.  If we want to produce a ligature out of "ffi", the
>>> shaper will get "ffi" and nothing more.  Which part here is arbitrary?
>> 
>> Sending "ffi" alone is an arbitrary decision. The font might have kerning 
>> between "ffi" and what comes before and after it, but you won't get it. The 
>> font might not have a ligature for "ffi" at all, but using kerning instead, 
>> so you will get kerning between "ffi" glyphs and not other glyphs which is 
>> arbitrary. It might be a cursive font that changes glyph shapes based on 
>> surrounding glyphs, and you will get that for "ffi" and not elsewhere which 
>> is arbitrary.
>> 
>> That is just plain wrong, there is no way around it.
> 
> So, to make sure I understand the correct solution: you are saying
> that all the text to be displayed should go through the shaper, is
> that right?
> 
> If so, how large should be the chunks of text to be passed to the
> shaper in any one call, in order to have a correct result?  Would it
> be enough to pass whitespace-separated words one by one? or do we need
> to send entire physical lines (up to the terminating newline
> character)? or maybe an entire paragraph?  What is the recommendation
> here?

In general, the safest approach is to pass the whole paragraph of text along
with the start and length of each item (an item being a run with the same font,
direction, script, and language).
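
As a rough illustration of the item idea, the sketch below splits a paragraph
into (start, length) runs whose properties are all equal. The `itemize` helper,
the font names and the per-character property tuples are hypothetical; a real
client would derive them from its own font fallback and bidi/script analysis.
Offsets here are in code points, whereas HarfBuzz's `hb_buffer_add_utf8()`
takes its `item_offset` and `item_length` in UTF-8 code units.

```python
from itertools import groupby


def itemize(props):
    """Split per-character properties into (start, length) items.

    props holds one hashable value per character, for example a
    (font, direction, script, language) tuple. Consecutive characters
    with equal properties end up in the same item.
    """
    items = []
    start = 0
    for _, group in groupby(props):
        length = len(list(group))
        items.append((start, length))
        start += length
    return items


# Hypothetical paragraph: one Arabic word whose last two letters
# fall back to a different font.
paragraph = "\u0643\u062A\u0627\u0628"
props = [
    ("FontA", "rtl", "Arab", "ar"),
    ("FontA", "rtl", "Arab", "ar"),
    ("FontB", "rtl", "Arab", "ar"),
    ("FontB", "rtl", "Arab", "ar"),
]
items = itemize(props)

# Each item would be shaped on its own, but with the whole paragraph
# supplied as context so joining can work across the font boundary.
for start, length in items:
    print(start, length, repr(paragraph[start:start + length]))
```

The point of keeping the paragraph intact is that the shaper can still see the
letters on the other side of the item boundary.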

This, for example, ensures that HarfBuzz can do basic Arabic-like shaping
across item boundaries: if you break items in the middle of an Arabic word
(due to a font change, for example), you still get the initial/medial/final
forms across the boundary as appropriate. It also lets HarfBuzz put a combining
mark at the start of a paragraph on a dotted circle, as it otherwise has no
base.

If this is not possible, then you can try to pass enough context, for example
by reaching back and forward to the first character that is not a combining
mark. This may or may not be enough.
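
A minimal sketch of that fallback, assuming the client works with code point
indices: widen a range in both directions until a character that is not a
combining mark has been included. The helper names are hypothetical, and
"combining mark" is approximated here by the Unicode general categories Mn, Mc
and Me.

```python
import unicodedata


def is_mark(ch):
    # General categories Mn, Mc and Me are the combining marks.
    return unicodedata.category(ch).startswith("M")


def widen_to_non_marks(text, start, end):
    """Widen text[start:end] to reach the nearest non-mark on each side."""
    ctx_start = start
    while ctx_start > 0:
        ctx_start -= 1
        if not is_mark(text[ctx_start]):
            break
    ctx_end = end
    while ctx_end < len(text):
        ctx_end += 1
        if not is_mark(text[ctx_end - 1]):
            break
    return ctx_start, ctx_end


# 'a', acute, 'b', acute, 'c', acute
text = "a\u0301b\u0301c\u0301"
print(widen_to_non_marks(text, 2, 4))
```

As the message says, this much context may or may not be enough; it only
guarantees that a mark at the edge of the range is not left without a
neighbouring base.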

Shaping space-delimited words is orthogonal to that; context had better always
be provided.

Some fonts do have OpenType lookups that interact with space (e.g. kerning
pairs involving space, or even substitutions involving space), so shaping words
independently will give suboptimal results. You can use the HarfBuzz API to find
out if the font has OpenType layout rules involving space, or decide to live
with this limitation. Firefox does this check as it wants to cache individually
shaped words when possible, and Chrome used to do that too, but I think they now
make sure to retain enough information to avoid unnecessary reshaping, so such a
word cache is not needed.

Regards,
Khaled

Re: [HarfBuzz] Ligatures

2020-05-23 Thread Khaled Hosny



> On May 23, 2020, at 8:34 PM, Eli Zaretskii  wrote:
> 
>> From: Khaled Hosny 
>> Date: Sat, 23 May 2020 20:18:33 +0200
>> Cc: harfbuzz@lists.freedesktop.org
>> 
>>> The Emacs display engine examines the text to be displayed and laid
>>> out one character at a time, and makes layout decisions after each
>>> character or grapheme cluster it lays out.  Its design is therefore
>>> fundamentally incompatible with shaping large substrings of buffer
>>> text at once.  We do support that for short sequences of characters,
>>> which seems to work well enough for complex shaping (a.k.a. "character
>>> compositions") of scripts that require that, but we still do that one
>>> grapheme cluster at a time.  
>> 
>> That wouldn’t work for Arabic. You can’t shape Arabic one grapheme cluster 
>> at a time (or any other text actually, but the brokenness in Arabic will be 
>> immediately obvious), so I’m most certain that is not exactly how Arabic is 
>> handled in Emacs right now.
> 
> We pass to the shaper the part of text that matches the regexps you
> can see at the end of misc-lang.el, then display the glyphs the shaper
> returns.  The above description is a high-level overview; there are
> many details that I cannot describe in a short message.  For example,
> for Arabic, when we get back the grapheme clusters, we lay them out,
> then skip to the end of the text that we passed to the shaper.

You mean this:
https://repo.or.cz/emacs.git/blob/HEAD:/lisp/language/misc-lang.el#l78

I’m not sure how to read it, but it seems to be missing the entire Arabic
Extended-A and Arabic Mathematical Alphabetic Symbols blocks. I’m also not sure
how it would handle using combining marks from other blocks with Arabic text
(say putting U+20D6 over an Arabic letter).
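
To make the concern concrete, here is a small sketch with a hypothetical
range-based classifier. It is not the actual pattern from misc-lang.el; it only
assumes a check limited to the basic Arabic block (U+0600 to U+06FF) and shows
which of the characters mentioned above such a check would miss.

```python
import unicodedata


# Hypothetical classifier that only knows the basic Arabic block.
# This is NOT the actual pattern from misc-lang.el.
def naive_is_arabic(ch):
    return 0x0600 <= ord(ch) <= 0x06FF


samples = [
    "\u0628",      # basic Arabic block
    "\u08A0",      # Arabic Extended-A
    "\U0001EE00",  # Arabic Mathematical Alphabetic Symbols
    "\u20D6",      # combining mark from a non-Arabic block
]

for ch in samples:
    print(
        "U+%04X" % ord(ch),
        unicodedata.name(ch),
        unicodedata.category(ch),
        naive_is_arabic(ch),
    )

# Characters that would never be handed to the shaper by this check.
missed = [ch for ch in samples if not naive_is_arabic(ch)]
```

Any classification by fixed ranges has this shape of problem: marks and letters
outside the listed ranges silently fall out of the shaped run.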

What happens if one edits a file that contains only Arabic text, and why can’t
that (whatever it is) be extended to any text?

>>> The character composition is implemented
>>> in Lisp, which is called by the display engine, and which then calls
>>> back into C to invoke the shaper.  This implementation is meant to
>>> allow a great deal of control on what should be composed and how.  But
>>> it is also relatively slow, which is another reason why doing that for
>>> all the text to be laid out is impractical: it slows down redisplay to
>>> the degree that it becomes annoying to users.
>> 
>> Having more control should not be at the price of doing things wrong.
> 
> No one said it should, that's just how things are.
> 
>> The whole composition concept of Emacs does not make any sense to me, all 
>> text is “composed”. You can have a special mode that would disable shaping 
>> for specific purposes (opening huge log files, wanting to see raw text with 
>> no bidi or shaping, etc), but this can be done in cooperation with HarfBuzz 
>> and not by bypassing it entirely.
> 
> We are talking about a piece of software designed 21 years ago.  I
> realize that it makes no sense to you, but that's what we have, and
> will probably have for the next 10 years or so.  We must make the most
> out of what we have.

So nearly as old as the first release of OpenOffice (not counting its
StarOffice days). Anyway, bad decisions about text layout are quite rampant in
software (old and new) and need to be fixed, but that is not my call.


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Khaled Hosny

> On May 23, 2020, at 8:26 PM, Eli Zaretskii  wrote:
> 
>> From: Khaled Hosny 
>> Date: Sat, 23 May 2020 20:09:50 +0200
>> Cc: harfbuzz@lists.freedesktop.org
>> 
>> Overall, if you can’t send the whole text (words are the absolute minimum, 
>> but this has its issues as well), don’t just send arbitrary parts of it as 
>> the result will be some inconsistent mess.
> 
> I almost understand (and agree), sans one part: the "arbitrary parts"
> of what you wrote.  If we want to produce a ligature out of "ffi", the
> shaper will get "ffi" and nothing more.  Which part here is arbitrary?

Sending “ffi” alone is an arbitrary decision. The font might have kerning
between “ffi” and what comes before and after it, but you won’t get it. The
font might not have a ligature for “ffi” at all, but use kerning instead, so
you will get kerning between the “ffi” glyphs and not other glyphs, which is
arbitrary. It might be a cursive font that changes glyph shapes based on
surrounding glyphs, and you will get that for “ffi” and not elsewhere, which is
arbitrary.

That is just plain wrong, there is no way around it.

Re: [HarfBuzz] Ligatures

2020-05-23 Thread Khaled Hosny

> On May 23, 2020, at 10:35 AM, Eli Zaretskii  wrote:
> 
>> From: Khaled Hosny 
>> Date: Sat, 23 May 2020 09:59:15 +0200
>> Cc: harfbuzz@lists.freedesktop.org
>> 
>> Also either Emacs is currently treating text that it enables shaping for as 
>> second-class citizens where limitations/degraded performance is acceptable 
>> (which is really really bad)
> 
> Could you tell more about which limitations and degraded performance
> you had in mind?  I'm not sure we have this, but cannot tell without
> understanding the issues.

I have no idea. I’m just guessing why you think the Emacs display engine can’t 
handle all text like it handles Arabic. Either it does not handle Arabic 
correctly, or it can handle all text like it handles Arabic.

>> or “redesigning the entire Emacs display engine” is not really needed as you 
>> can just declare all text as text that needs to be shaped and be done with 
>> it.
> 
> The Emacs display engine examines the text to be displayed and laid
> out one character at a time, and makes layout decisions after each
> character or grapheme cluster it lays out.  Its design is therefore
> fundamentally incompatible with shaping large substrings of buffer
> text at once.  We do support that for short sequences of characters,
> which seems to work well enough for complex shaping (a.k.a. "character
> compositions") of scripts that require that, but we still do that one
> grapheme cluster at a time.  

That wouldn’t work for Arabic. You can’t shape Arabic one grapheme cluster at a 
time (or any other text actually, but the brokenness in Arabic will be 
immediately obvious), so I’m most certain that is not exactly how Arabic is 
handled in Emacs right now.

> The character composition is implemented
> in Lisp, which is called by the display engine, and which then calls
> back into C to invoke the shaper.  This implementation is meant to
> allow a great deal of control on what should be composed and how.  But
> it is also relatively slow, which is another reason why doing that for
> all the text to be laid out is impractical: it slows down redisplay to
> the degree that it becomes annoying to users.

Having more control should not be at the price of doing things wrong. The whole 
composition concept of Emacs does not make any sense to me, all text is 
“composed”. You can have a special mode that would disable shaping for specific 
purposes (opening huge log files, wanting to see raw text with no bidi or 
shaping, etc), but this can be done in cooperation with HarfBuzz and not by 
bypassing it entirely.

Regards,
Khaled

Re: [HarfBuzz] Ligatures

2020-05-23 Thread Khaled Hosny

> On May 23, 2020, at 10:25 AM, Eli Zaretskii  wrote:
> 
>> From: Khaled Hosny 
>> Date: Sat, 23 May 2020 09:51:21 +0200
>> Cc: harfbuzz@lists.freedesktop.org
>> 
>>> Thanks.  Since (b) is not really feasible without redesigning the
>>> entire Emacs display engine (for which I see no volunteers lining up
>>> any time soon), I guess we will have to use some more-or-less
>>> reasonable and somewhat unreliable heuristics by supporting only some
>>> ligatures that are known in advance.
>> 
>> What are you going to do about kerning, or mark positioning? Partially 
>> kerning arbitrary glyphs (because the sub string match some regular 
>> expression) is worse than not kerning at all.
> 
> I don't think I understand the question.  How is kerning related to
> the issue at hand?

Kerning is part of text layout. You are only considering ligatures, but they
are a small part of text layout, and your proposal does not seem to consider
anything other than ligatures, which is an arbitrary division and does not make
much sense to me. Some fonts provide ligatures to fix f-collisions, others fix
them with contextual alternates, and others fix them with kerning. Your proposed
solution does not address this. Also, when you pass certain text to the layout
engine, you get everything the font provides, not just ligatures, so you would
end up kerning certain letter combinations (those you send to the layout engine)
and not others, which is inconsistent and ugly.

Overall, if you can’t send the whole text (words are the absolute minimum, but 
this has its issues as well), don’t just send arbitrary parts of it as the 
result will be some inconsistent mess.

Regards,
Khaled


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Khaled Hosny



> On May 23, 2020, at 9:51 AM, Khaled Hosny  wrote:
> 
> 
> 
>> On May 23, 2020, at 9:44 AM, Eli Zaretskii  wrote:
>> 
>>> From: Khaled Hosny 
>>> Date: Sat, 23 May 2020 08:36:10 +0200
>>> Cc: harfbuzz@lists.freedesktop.org
>>> 
>>>>  The only way of
>>>> doing this right, I'm told, is to either (a) query the font to get the
>>>> list of all the ligatures it supports, or (b) assume any combination
>>>> of characters can produce a ligature, and therefore we need to pass
>>>> all the characters intended for display through hb_shape.  The latter
>>>> in particular is in stark contrast to how the current Emacs display
>>>> code is designed and implemented.
>>> 
>>> (a) is not realistically possible as doing it properly has pretty much the 
>>> same cost as shaping the text. So your only reliable option is (b).
>> 
>> Thanks.  Since (b) is not really feasible without redesigning the
>> entire Emacs display engine (for which I see no volunteers lining up
>> any time soon), I guess we will have to use some more-or-less
>> reasonable and somewhat unreliable heuristics by supporting only some
>> ligatures that are known in advance.
> 
> What are you going to do about kerning, or mark positioning? Partially 
> kerning arbitrary glyphs (because the sub string match some regular 
> expression) is worse than not kerning at all.

Also either Emacs is currently treating text that it enables shaping for as 
second-class citizens where limitations/degraded performance is acceptable 
(which is really really bad), or “redesigning the entire Emacs display engine” 
is not really needed as you can just declare all text as text that needs to be 
shaped and be done with it.

Re: [HarfBuzz] Ligatures

2020-05-23 Thread Khaled Hosny




> On May 23, 2020, at 9:44 AM, Eli Zaretskii  wrote:
> 
>> From: Khaled Hosny 
>> Date: Sat, 23 May 2020 08:36:10 +0200
>> Cc: harfbuzz@lists.freedesktop.org
>> 
>>>   The only way of
>>> doing this right, I'm told, is to either (a) query the font to get the
>>> list of all the ligatures it supports, or (b) assume any combination
>>> of characters can produce a ligature, and therefore we need to pass
>>> all the characters intended for display through hb_shape.  The latter
>>> in particular is in stark contrast to how the current Emacs display
>>> code is designed and implemented.
>> 
>> (a) is not realistically possible as doing it properly has pretty much the 
>> same cost as shaping the text. So your only reliable option is (b).
> 
> Thanks.  Since (b) is not really feasible without redesigning the
> entire Emacs display engine (for which I see no volunteers lining up
> any time soon), I guess we will have to use some more-or-less
> reasonable and somewhat unreliable heuristics by supporting only some
> ligatures that are known in advance.

What are you going to do about kerning, or mark positioning? Partially kerning
arbitrary glyphs (because the substring matches some regular expression) is
worse than not kerning at all.


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Khaled Hosny

> On May 22, 2020, at 9:32 PM, Eli Zaretskii  wrote:
> 
> Hi,
> 
> This is a bit off-topic, but I thought it could be appropriate to ask
> here, since we have here some of the best experts on this subject.
> 
> We are discussing support for ligatures in Emacs, specifically when
> using HarfBuzz as the shaping engine.  See the discussion from
> 
>  https://lists.gnu.org/archive/html/emacs-devel/2020-05/msg02493.html
> 
> The current support for producing ligatures works in the same way as
> complex text shaping for scripts that require that, like Arabic and
> Khmer: the sequences of characters that can be displayed as ligatures
> are identified in advance with suitable regular expressions, and the
> display engine then passes these sequences to hb_shape to produce the
> ligatures.
> 
> This works well for scripts that require complex shaping, because such
> scripts generally have well-defined rules for the sequences of
> codepoints that need shaping.  My original thoughts were that
> ligatures could be supported in the same way, based on the assumption
> that the list of possible ligatures is finite and can be stored in a
> suitable data structure in advance.

I might be stating the obvious, but what Emacs is doing is a very outdated view 
of text layout. The schism between so called complex text and simple text does 
not actually exist. There are script-specific shaping rules that layout engines 
know and apply, and there are additional/complementary rules provided by the 
font that layout engines also apply.

For all applications care about, they have text with certain properties and 
fonts, and they hand them to the layout engine and get back positioned glyphs. 
Any attempt to second guess the layout engine and classify the text into parts 
that need or do not need shaping is futile.

Fonts can, and do, provide any number of arbitrary glyph interactions (not just 
ligatures), and the only reliable way to know that is to shape and check the 
output.

I think I already said this before, but Emacs should indiscriminately give all
the text to HarfBuzz (or any other text layout engine it additionally supports)
and give up on trying to pre-classify text, which is what pretty much any other
sensible application is doing already. There are many ways to solve potential
performance issues that do not involve compromising on the text layout.

> However, I'm being told that this assumption is false, and that each
> font defines ligatures from any number of arbitrary combinations of
> characters, and therefore the exhaustive list of the ligatures is in
> practice infinite and cannot be provided in advance.

That is true.

> The only way of
> doing this right, I'm told, is to either (a) query the font to get the
> list of all the ligatures it supports, or (b) assume any combination
> of characters can produce a ligature, and therefore we need to pass
> all the characters intended for display through hb_shape.  The latter
> in particular is in stark contrast to how the current Emacs display
> code is designed and implemented.

(a) is not realistically possible as doing it properly has pretty much the same 
cost as shaping the text. So your only reliable option is (b).

Regards,
Khaled

Re: [HarfBuzz] Can't map characters to glyphInfos using clusters

2020-03-03 Thread Khaled Hosny




> On Mar 4, 2020, at 12:08 AM, bo samson  wrote:
> 
> Hi all,
> When rendering each of my glyphs, I'd like to know if there are, for example, 
> linebreaks, so that I can skip them and go to the next line. (or at the very 
> least, skip that character)
> 
> According to the documentation, I should be using the glyphinfo.cluster for 
> this, but this info is unusable, even in ClusterLevel 2. The cluster number 
> will even increment past the length of my string. 

The cluster value is in input string code units (if you use the add_utf*
functions), so if your input is a UTF-8 string it will be in bytes.
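
A small sketch of what that means for a client holding a string of code points:
build a table from UTF-8 byte offsets to character indices, then look cluster
values up in it. The helper name and the sample cluster values are
illustrative, not output from an actual shaping run.

```python
def byte_offset_to_char_index(text):
    """Map each UTF-8 byte offset that starts a character to its index."""
    mapping = {}
    offset = 0
    for index, ch in enumerate(text):
        mapping[offset] = index
        offset += len(ch.encode("utf-8"))
    return mapping


# 'a', e-acute (2 bytes), newline, Hebrew bet (2 bytes)
text = "a\u00e9\n\u05d1"
mapping = byte_offset_to_char_index(text)

# Cluster values as they could be reported for UTF-8 input
# (illustrative values, one per character here).
clusters = [0, 1, 3, 4]
chars = [text[mapping[c]] for c in clusters]

# The byte length exceeds the character count, which is why cluster
# values can run past the string length when read as character indices.
print(len(text), len(text.encode("utf-8")), mapping)
```

With such a table, finding the glyphs that belong to a line break is a matter
of checking whether `text[mapping[cluster]]` is a newline.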

Regards,
Khaled

Re: [HarfBuzz] Alternate/random glyphs

2020-02-13 Thread Khaled Hosny



> On Feb 12, 2020, at 9:28 PM, Aleš Mlakar  wrote:
> 
> Hi Khaled,
> I did a quick debug through that part of HarfBuzz and it seems it's doing 
> lookups and never gets to the random code.

What random code?

> Tomorrow I'll try to put something together, would a link to a screen shot 
> and font be sufficient for starters?

It might help, you might also want to check with hb-view to see if it is giving 
the same output as your application.

Regards,
Khaled

Re: [HarfBuzz] Alternate/random glyphs

2020-02-12 Thread Khaled Hosny

It depends on how the font is doing the randomization: using contextual lookups
(most likely) or the rand feature (less likely, as this basically does not work
outside of HarfBuzz and a few other less common applications).

We would need to see the font and specific examples with output of both to form 
any meaningful opinion.

Regards,
Khaled

> On Feb 12, 2020, at 8:15 PM, Aleš Mlakar  wrote:
> 
> Hi Nikolay,
> I have a font "Daft Brush" for example, and if I write "gg" I get 4 
> different glyphs "randomized", and the result is different in Indesign and in 
> my own application with Harfbuzz.
> So my question is, who is doing it wrong, Indesign or Harfbuzz, or neither 
> and is actually implementation defined.
> 
> Best,
> Ales
> 
> On Wed, Feb 12, 2020 at 7:01 PM Nikolay Sivov  wrote:
> 
> 
> On Wed, Feb 12, 2020 at 8:58 PM Aleš Mlakar  wrote:
> Hey all, 
> I've been trying to mimic font shaping in Adobe Indesign with Harfbuzz, most 
> of it works great, but when random/alternate glyphs (for the fonts that have 
> multiple glyphs for the same code point) are used it's not even remotely 
> similar anymore.
> 
> Sooo my question is basically this - is there any standard for glyph 
> randomization or is this application controlled?
> 
> If you're talking about alternate forms features and the like, application 
> usually controls whether feature is enabled or not. The output of the feature 
> is determined by font data, and shaping logic obviously. It's not random, if 
> we're talking about the same thing.
>  
> 
> Thanks!
> 
> Regards,
> Ales
> 
> 
> 
> -- 
> Aleš Mlakar, 
> Programmer/Consultant
> am.bits


Re: [HarfBuzz] The show must go on...

2020-01-09 Thread Khaled Hosny

Thanks Behdad for all the work you have done in HarfBuzz and for typography
and internationalization of FOSS, and looking forward to seeing more of
your work. I wish you the best with your life and new adventures.

Thanks also to Ebrahim for all his work and looking after HarfBuzz and I
wish him the best in his new role and duties.

Regards,
Khaled

On Wed, Jan 8, 2020, 2:47 AM Behdad Esfahbod  wrote:

> Heya,
>
> I took over maintenance of Pango 15 years ago.  Code in there that I
> wanted to replace turned into HarfBuzz, and the rest is history.
>
> Lately, as you might have noticed, I've been slow to respond to my
> maintainer duties.  At the same time, Ebrahim rose to fill in for me.  He
> has proven to be very capable to replace me at this point, so I believe
> it's time to officiate that.
>
> My involvement in the project wouldn't otherwise change.  I assure you
> all, my change of employment has nothing to do with this.  It's just that..
> it's time for me to stop blocking others' contributions while I focus on
> other things in life.
>
> Ebrahim, thank you for your leadership.  I'm looking forward to seeing you
> make mistakes and learn from it. :)  You are awesome.  I look forward to
> talk to you in person next week.
>
> Cheers,
>
> behdad
> Tehran
>
> PS. Talk about "bus factor".  Or is that called "bomb factor" these days..
>

Re: [HarfBuzz] Exposing attachment tree / Arabic joining to shaping clients

2019-11-14 Thread Khaled Hosny


> Not sure how this can be explained, maybe someone has attempted to prevent 
> some fonts to get Kashida justification and maybe the detection was font 
> based, the reason it is still failing for IranNastaliq but not for Amiri, but 
> in any case it is imperfect.

Amiri has a zero-width kashida glyph (that gets mapped to from cmap, then
replaced by the actual kashida using GSUB), a hack I implemented to prevent
kashida justification in LibreOffice (which is why I also made LibreOffice check
that the font has a kashida glyph with positive width before it tries to do
kashida justification). I can imagine IE doing a similar check. You can try with
the Aref Ruqaa font, which does a similar hack.

Re: [HarfBuzz] Closing down the list, 2019 edition

2019-08-19 Thread Khaled Hosny



> On Aug 19, 2019, at 7:58 PM, Behdad Esfahbod  wrote:
> 
> Hi all,
> 
> I know I asked this before but... is there still value in keeping the list 
> around?  I find myself preferring github issues over mailing list threads all 
> the time.
> 
> I know our homepage / documentation / etc are lacking in a lot of ways and 
> maybe we just need to promote using github issues for all kinds of inquiries 
> instead...   Anyway, just gauging how others feel, mid-2019 edition.

GitHub is fine by me, keeping discussion in one place sounds good.

Regards,
Khaled

Re: [HarfBuzz] Use of bool and stdbool.h

2019-08-09 Thread Khaled Hosny

> On Aug 9, 2019, at 7:33 PM, Ebrahim Byagowi  wrote:
> 
> made me wonder why HarfBuzz went for hb_bool_t

For C89 compatibility? Some not so old versions of MSVC didn’t have it either.

Re: [HarfBuzz] Failure in hb_font_get_nominal_glyph

2019-07-24 Thread Khaled Hosny

> On Jul 24, 2019, at 9:13 PM, Eli Zaretskii  wrote:
> 
>> From: Behdad Esfahbod 
>> Date: Wed, 24 Jul 2019 15:11:03 -0400
>> Cc: "harfbuzz@lists.freedesktop.org" 
>> 
>> Nothing stands out to me.
> 
> Thanks for taking a look.
> 
> Could something like that be caused by an old version of Freetype
> library used with HarfBuzz?  I believe when the OP upgraded his
> HarfBuzz he also upgraded Freetype as its dependency.

Emacs doesn’t seem to be using the FreeType integration in your Windows code, so
that seems unlikely.

I think Emacs is missing a call to hb_ot_font_set_funcs() after creating the
font. This has been the default only for the last few releases, and the Emacs
code seems to assume it. This is not needed on Linux since the FreeType
functions are used there.

Regards,
Khaled

Re: [HarfBuzz] Order of combining diacriticals

2019-06-20 Thread Khaled Hosny

On Fri, Jun 14, 2019 at 9:06 PM Eli Zaretskii  wrote:
>
> > From: Behdad Esfahbod 
> > Date: Fri, 14 Jun 2019 11:34:17 -0700
> > Cc: Khaled Hosny ,
> >   "harfbuzz@lists.freedesktop.org" 
> >
> > On Thu, Jun 13, 2019 at 2:18 AM Eli Zaretskii  wrote:
> >
> >  For fonts that have no 'hebr' features, Emacs performs
> >  substitution of known precomposed characters before it invokes the
> >  shaping engine.  In this case, it substituted U+FB31 for the
> >  sequence U+05D1,U+05BC, and passed the sequence U+FB31,U+05B0 to
> >  HarfBuzz.
> >
> > You should remove all such hacks.
>
> I understand that for HarfBuzz they are probably not needed, if the
> necessary functions for accessing the glyphs are provided (something
> that might not be true on Windows, where we don't use Freetype
> directly).

This functionality depends either on Unicode decompositions or (in the
case of Hebrew) on hard-coded tables in HarfBuzz, so the font functions
used make no difference.
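
The Unicode side of this can be checked with Python's standard library, as a
sketch. U+FB31 has a canonical decomposition to U+05D1,U+05BC, but Unicode
normalization never composes the pair back into U+FB31, so Unicode composition
data alone cannot produce it. Reading that as the reason for tables inside
HarfBuzz is an assumption on the part of this example, not something the
message states.

```python
import unicodedata

precomposed = "\uFB31"  # HEBREW LETTER BET WITH DAGESH

# Canonical decomposition: the precomposed form splits into bet + dagesh.
decomposed = unicodedata.normalize("NFD", precomposed)
print(["U+%04X" % ord(c) for c in decomposed])

# Canonical composition leaves bet + dagesh as two code points,
# because U+FB31 is excluded from composition.
recomposed = unicodedata.normalize("NFC", "\u05D1\u05BC")
print(["U+%04X" % ord(c) for c in recomposed])
```

Neither step involves the font, which matches the point that the choice of font
functions does not matter here.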

Regards,
Khaled

Re: [HarfBuzz] Order of combining diacriticals

2019-06-12 Thread Khaled Hosny

On Wed, Jun 12, 2019 at 10:22:48PM +0300, Eli Zaretskii wrote:
> In Emacs, we use HB_BUFFER_CLUSTER_LEVEL_MONOTONE_GRAPHEMES cluster
> level, because HB_BUFFER_CLUSTER_LEVEL_MONOTONE_CHARACTERS produced
> incorrect display.

The cluster levels shouldn’t affect display; the glyph positions are
exactly the same for all three:

$ hb-shape NotoSerifHebrew-Regular.ttf --unicodes="U+05D1,U+05B0,U+05BC" --cluster-level=0
[uni05B0=0@178,0+0|uni05BC=0@153,0+0|uni05D1=0+539]

$ hb-shape NotoSerifHebrew-Regular.ttf --unicodes="U+05D1,U+05B0,U+05BC" --cluster-level=1
[uni05B0=1@178,0+0|uni05BC=1@153,0+0|uni05D1=0+539]

$ hb-shape NotoSerifHebrew-Regular.ttf --unicodes="U+05D1,U+05B0,U+05BC" --cluster-level=2
[uni05B0=1@178,0+0|uni05BC=2@153,0+0|uni05D1=0+539]

This might indicate an issue with the Emacs display code.
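
The comparison can also be done mechanically. The sketch below parses the three
hb-shape output lines quoted above and checks that only the cluster values
differ. The parser is a hypothetical helper written for this example; it
handles the `name=cluster@dx,dy+advance` items seen here and nothing more.

```python
import re

GLYPH_RE = re.compile(
    r"(?P<name>[^=|\[\]]+)=(?P<cluster>\d+)"
    r"(?:@(?P<dx>-?\d+),(?P<dy>-?\d+))?"
    r"\+(?P<advance>-?\d+)"
)


def parse_hb_shape(line):
    """Parse one output line into (name, cluster, dx, dy, advance) tuples."""
    glyphs = []
    for item in line.strip().strip("[]").split("|"):
        m = GLYPH_RE.fullmatch(item)
        if m is None:
            raise ValueError("unrecognised glyph item: %r" % item)
        glyphs.append((
            m.group("name"),
            int(m.group("cluster")),
            int(m.group("dx") or 0),  # offsets are omitted when zero
            int(m.group("dy") or 0),
            int(m.group("advance")),
        ))
    return glyphs


def without_clusters(glyphs):
    return [(name, dx, dy, adv) for name, _, dx, dy, adv in glyphs]


# Output lines copied from the message, keyed by cluster level.
outputs = {
    0: "[uni05B0=0@178,0+0|uni05BC=0@153,0+0|uni05D1=0+539]",
    1: "[uni05B0=1@178,0+0|uni05BC=1@153,0+0|uni05D1=0+539]",
    2: "[uni05B0=1@178,0+0|uni05BC=2@153,0+0|uni05D1=0+539]",
}
parsed = {level: parse_hb_shape(line) for level, line in outputs.items()}
positions = {level: without_clusters(g) for level, g in parsed.items()}
clusters = {level: [g[1] for g in glyphs] for level, glyphs in parsed.items()}

same_positions = positions[0] == positions[1] == positions[2]
print(same_positions, clusters)
```

Glyph names, offsets and advances agree across the three levels, so a display
difference would have to come from how the client uses the cluster values.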

> With this level, whenever I type a Hebrew base
> character with more than one diacritical, I need to type them in
> certain order, otherwise the display is incorrect.
> 
> For example, in this series of characters:
> 
>   U+05D1 HEBREW LETTER BET
>   U+05B0 HEBREW POINT SHEVA
>   U+05BC HEBREW POINT DAGESH

> 
> I need to type them in the above order; if I type DAGESH before SHEVA,
> the produced display is incorrect.

The glyph order and positions are the same regardless of the input order
(which is what I’d expect since HarfBuzz normalizes mark order), the
only difference is cluster values which is also expected AFAICT:

$ hb-shape NotoSerifHebrew-Regular.ttf --unicodes="U+05D1,U+05B0,U+05BC" --cluster-level=1
[uni05B0=1@178,0+0|uni05BC=1@153,0+0|uni05D1=0+539]

$ hb-shape NotoSerifHebrew-Regular.ttf --unicodes="U+05D1,U+05BC,U+05B0" --cluster-level=1
[uni05B0=2@178,0+0|uni05BC=1@153,0+0|uni05D1=0+539]
 
> Is this expected with level-0 clusters?  Or should I look for a bug in
> how Emacs uses HarfBuzz?

Might be a result of hb_buffer_reverse_clusters() used by Emacs.

Regards,
Khaled

Re: [HarfBuzz] Selecting fonts for HarfBuzz

2019-06-06 Thread Khaled Hosny

On Thu, Jun 06, 2019 at 05:29:07AM +0300, Eli Zaretskii wrote:
> > From: Behdad Esfahbod 
> > Date: Wed, 5 Jun 2019 12:45:00 -0700
> > Cc: "harfbuzz@lists.freedesktop.org" 
> > 
> > HarfBuzz handles everything it understands.  It was designed, in fact,
> > such that when combined with FreeType or other external font funcs
> > implementation, it even "handles" font formats it does not understand.
> > Eg. HarfBuzz doesn't understand BDF, PCF, etc, but if you use hb-ft, you
> > can use hb-ft for everything, and BDF, PCF etc also magically work
> > because HarfBuzz defers to FreeType for glyph access, and simply
> > "passes through" for the rest.  It was designed such that you can keep
> > one shaping code path.
> 
> We don't currently use hb-ft on Windows.  But thanks, I think I
> understand.

You can achieve the same by implementing font functions for the font
formats HarfBuzz does not directly support, using e.g. GDI API to access
glyph info in these fonts (see hb_font_funcs_set_* functions).

Regards,
Khaled

Re: [HarfBuzz] HarfBuzz shaping of R2L text

2019-05-30 Thread Khaled Hosny

On Wed, May 29, 2019 at 10:32:12PM +0300, Eli Zaretskii wrote:
> > No idea how Emacs would deal with reordered Indic glyphs which don’t
> > always follow the input order.
> 
> Can you show an example of such a situation and what is expected from
> the correct shaping and display?  I could then see what happens in
> Emacs.

The combining marks in strings like بَّا with the font from
https://github.com/khaledhosny/noname-fixed (I don’t recall if I tested
with other fonts, and can’t re-test now) would be drawn in the wrong order
without reversing the clusters. Or maybe that was a different problem,
I’m not sure anymore. Try removing the reverse_clusters() call and see
what happens.

Regards,
Khaled

Re: [HarfBuzz] HarfBuzz shaping of R2L text

2019-05-29 Thread Khaled Hosny

On Wed, May 29, 2019 at 06:30:08PM +0300, Eli Zaretskii wrote:
> Hi,
> 
> While testing the results of hb_shape_full called to shape R2L text, I
> observed behavior that surprised me: shaping an R2L base letter with a
> diacritical produces a sequence of glyphs in reverse order, i.e. the
> glyph for the diacritical comes first, before the base letter.
> 
> For example, if I shape the sequence (in the logical order)
> 
>   U+05EA HEBREW LETTER TAV
>   U+05BB HEBREW POINT QUBUTS
> 
> the glyphs left in the buffer by the shaper are in reverse order,
> first QUBUTS, then TAV.  I thought that this was because of bidi
> reordering, but the result doesn't change if I set the buffer
> direction to LTR before calling the shaper.  The order of the clusters
> does change with the direction, i.e. with LTR the first cluster is
> zero, followed by 1, etc., whereas with RTL the clusters are in the
> decreasing order.  But the glyphs are always in the same order: the
> point first, then the letter.
> 
> I see the same with the Arabic script if I shape U+0633 followed by
> U+0651 (in logical order).
> 
> This doesn't happen with LTR text in unidirectional scripts, including
> with Latin text when shaping a base letter followed by a diacritical.
> 
> Is this expected behavior?  If so, what are the reasons?  Also, can it
> be controlled by the client application?  E.g., Uniscribe can be told
> to produce glyphs in the logical order, after shaping them for RTL
> display.

AFAIK, yes, this is expected. Usually the glyph order shouldn’t matter:
one just draws the glyphs as they are ordered by HarfBuzz, and for anything
that requires glyph to character mapping, the clusters provide
all the information needed.

As it happens, something in Emacs does not like that for whatever reason
and would draw the glyphs in the wrong order, so in my HarfBuzz integration
code for Emacs I used hb_buffer_reverse_clusters() right after shaping
to get the glyphs drawn correctly. No idea how Emacs would deal with
reordered Indic glyphs, which don’t always follow the input order.

Regards,
Khaled

Re: [HarfBuzz] How to get a glyph code for a character?

2019-05-25 Thread Khaled Hosny

On Sat, May 25, 2019 at 06:08:42PM +0300, Eli Zaretskii wrote:
> > Date: Sat, 25 May 2019 15:50:38 +0100
> > From: Richard Wordingham 
> > 
> > I presume you're after the glyph indicated by the raw cmap, e.g.
> > without localisation.
> 
> Not sure what kind of localisation are you alluding to here.  I must
> confess that I'm relatively ignorant about fonts, glyphs, and shaping,
> so I'm probably missing a lot here.  For example, I have no idea what
> is a "raw cmap".

For any given script and language, the font might provide a different
localized glyph than the default one. Only hb_shape[_full]() will apply
such localization.

> > Using hb_shape could very well result in the addition of a dotted
> > circle for a combining mark - is that what you want?
> 
> AFAIK, this method is only called in Emacs for a combining mark when
> we indeed want it displayed as a separate character, with the dotted
> circle.  It is normally called for base (non-combining) characters.

Then hb_shape() is the right tool here. HarfBuzz will also automatically
insert a dotted circle for combining marks that are at the start of the
text string if HB_BUFFER_FLAG_BOT is set on the buffer. You can safely
set HB_BUFFER_FLAG_BOT and HB_BUFFER_FLAG_EOT on any buffer as long as
the text passed to the hb_buffer_add* functions is the full paragraph text,
not just a chunk of it (that is another reason why one should pass the
full paragraph and the item offset and length to these functions instead
of just the substring).

Regards,
Khaled

Re: [HarfBuzz] Why harfbuzz doesn't handle ligature carets itself?

2018-12-14 Thread Khaled Hosny

On Fri, Dec 14, 2018 at 07:00:43PM +0330, Ebrahim Byagowi wrote:
> Hey there, just occurred to me this [hopefully not deeply incorrect] why
> harfbuzz itself doesn't handle ligature carets, distributing the ligature
> cluster advance with ignorable clusters followed by using GDEF/lcar info,
> with falling back to equal dividing?

The current API does not allow for such a fallback, as it takes in a glyph
index only, but you need the text string of the ligature. You will also need
to do grapheme cluster segmentation, which (IIRC) HarfBuzz does not
currently fully handle.

Regards,
Khaled

Re: [HarfBuzz] Cluster question (Was Cluster soap box time)

2018-12-14 Thread Khaled Hosny

On Thu, Dec 13, 2018 at 08:05:19PM -0800, Ansel Sermersheim wrote:
> On 11/29/18 7:29 AM, Behdad Esfahbod wrote:
> 
> > On Tue, Nov 27, 2018 at 8:34 PM Ansel Sermersheim wrote:
> > 
> > On 11/19/2018 07:16 PM, Behdad Esfahbod wrote:
> > > Hi Ansel,
> > > 
> > > > On Mon, Nov 19, 2018 at 7:44 PM Ansel Sermersheim
> > > > <an...@copperspice.com> wrote:
> > > 
> > > > ...We have tried cluster levels 0 and 1, and neither one worked
> > > > as we expected. In every case, combining accents are marked as
> > > > being in a separate cluster to the base codepoint. For example,
> > > > U+0061 Latin Small Letter A followed by U+0308 Combining
> > > > Diaeresis are being placed in adjacent clusters rather than the
> > > > same cluster.
> > > 
> > > 
> > > That doesn't sound right.  Are you setting any custom
> > > unicode-funcs on the buffer?  Only thing I can think of that can
> > > do this is faulty / missing Unicode funcs.
> > 
> > We had a feeling something was missing. No, we are not supplying
> > any unicode funcs. Do you have a sample or documentation reference
> > for what we need to supply?
> > 
> > 
> > Not supplying anything is good.  Was just ruling out that as a cause.
> > 
> > I have looked at the online documentation without seeing a clear
> > list of what is required. We are specifically looking to use
> > harfbuzz to decipher special case grapheme breaks.
> > 
> > 
> > Can you check with hb-shape command-line tool, to make sure what you
> > expect is what HarfBuzz produces there?
> 
> Sorry it took a while to get back with you, we really do appreciate your
> help. We have been looking over the code and we believe we are having a
> problem with missing unicode callback functions. We are compiling HarfBuzz
> with the following options turned on:
> 
> >    -DHAVE_ATEXIT
> >    -DHB_EXTERN=
> >    -DHB_NO_UNICODE_FUNCS
> >    -DHB_NDEBUG
> 
> I am particularly suspicious of the HB_NO_UNICODE_FUNCS define. Am I correct
> in thinking that this is suppressing the built-in harfbuzz unicode
> functions, so we must supply our own?

Yes. The simplest solution would be to remove the define and make sure
src/hb-ucdn.cc and src/hb-ucdn are built (or, if your code already
depends on ICU or GLib, you can alternatively build the corresponding
Unicode functions implementation).

Regards,
Khaled

Re: [HarfBuzz] Clusters chapter

2018-11-02 Thread Khaled Hosny

On Fri, Nov 02, 2018 at 03:47:31PM -0500, Nathan Willis wrote:
> Finally, I am adding a short "why your software cares about clusters"
> paragraph to the beginning. I've got cursor positioning, coloring
> diacritics, and line breaking in mind; anything else worth mentioning?

In addition to what Behdad mentioned, there is applying text attributes in
general (color, underline, overline, etc.). Doing this properly requires
shaping first, then finding which glyphs have which attributes using
cluster values. Justification can require character properties as well
(Japanese, Kashida, etc.) but needs to be done after shaping, so mapping
glyphs back to input characters is needed.
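
As a rough illustration of the attribute case, the sketch below assigns a
color to each output glyph from per-character colors, using only the
cluster values. It assumes clusters are indices into the input text and
that each cluster covers the characters up to the next larger cluster
value; the function name color_glyphs and the sample data are made up.

```python
def color_glyphs(glyph_clusters, char_colors):
    """Return one color per glyph, or None where a cluster mixes colors.

    glyph_clusters: cluster value of each output glyph, in glyph order.
    char_colors: color of each input character, in logical order.
    """
    starts = sorted(set(glyph_clusters))
    # Each cluster runs from its own value up to the next larger cluster
    # value (or the end of the text for the last one).
    end_of = dict(zip(starts, starts[1:] + [len(char_colors)]))
    result = []
    for cluster in glyph_clusters:
        colors = set(char_colors[cluster:end_of[cluster]])
        # A cluster whose characters disagree cannot be split by cluster
        # values alone, so it is flagged with None here.
        result.append(colors.pop() if len(colors) == 1 else None)
    return result


# Hypothetical result for "a" + U+0308 + "b": the base and its mark share
# cluster 0, and "b" is cluster 2.
clusters = [0, 0, 2]
print(color_glyphs(clusters, ["red", "red", "blue"]))
print(color_glyphs(clusters, ["red", "blue", "blue"]))
```

The second call shows why attributes are applied after shaping: a color
change in the middle of a cluster has no single right answer at the glyph
level, and the client has to decide what to do with it.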

Re: [HarfBuzz] [ARABIC] - 'hb_buffer_len' returning unexpected value after shaping

2018-10-31 Thread Khaled Hosny

On Wed, Oct 31, 2018 at 11:28:11AM +, Laurent CRUAU wrote:
> Hello there,
> 
> I am pretty new to harfbuzz but anyway I had not been into trouble for long 
> using arabic shaping until recently.
> And now I am submitted something weird with very few Arabic strings (the vast 
> majority of them do not cause any problem).
> 
> I use HB v1.0.1 on Ubuntu 16, using the regular ArialTTF mscorefont. I also 
> tried HB v2.0.2. on an embedded target and got the same issue.
> 
> Consider the following utf16 string:
> "\x8D\xFE" "\xDF\xFE" "\xB4\xFE" "\xE0\xFE" "\x8E\xFE" "\xE1\xFE" "\x20\x00" 
> "\xCB\xFE" "\xE0\xFE" "\xF4\xFE" "\xDC\xFE" "\xE2\xE"
> Or the following UTF8:
> "\xEF\xBA\x8D\xEF\xBB\x9F\xEF\xBA\xB4\xEF\xBB\xA0\xEF\xBA\x8E\xEF\xBB\xA1\x20\xEF\xBB\x8B\xEF\xBB\xA0\xEF\xBB\xB4\xEF\xBB\x9C\xEF\xBB\xA2\x00";

How did you get the string? It uses Arabic Presentation Forms, and
though it is technically valid Unicode text, that is not usually the
kind of input HarfBuzz should be taking.

> After shaping has been performed, the following string is counted for 11 
> glyphs (i.e. w/ hb_buffer_len).

The number of output glyphs does not have to be the same as the number
of input characters. If there are ligatures then the number of glyphs
can be less, and if there are any decompositions, then the number of
glyphs can be more. In general your code should not make any assumptions
about the number of glyphs based on the number of input characters.

To match output glyphs with input characters, you should use the cluster
field of glyph info.
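
A minimal sketch of that mapping, assuming the cluster values are indices
into the input text and that the glyph data has already been copied out of
the buffer. The names cluster_ranges and glyphs_per_cluster, and the
"office" sample with an ffi ligature, are invented for illustration.

```python
def cluster_ranges(clusters, text_length):
    """Map each cluster value to the half-open input range it covers."""
    starts = sorted(set(clusters))
    ends = starts[1:] + [text_length]
    return dict(zip(starts, zip(starts, ends)))


def glyphs_per_cluster(glyphs):
    """Group glyph names by cluster; glyphs is a list of (name, cluster)."""
    grouped = {}
    for name, cluster in glyphs:
        grouped.setdefault(cluster, []).append(name)
    return grouped


# Hypothetical output for the 6-character text "office", where the font
# forms an ffi ligature: 4 glyphs come back for 6 characters.
glyphs = [("o", 0), ("f_f_i", 1), ("c", 4), ("e", 5)]
ranges = cluster_ranges([cluster for _name, cluster in glyphs], 6)
for cluster, names in glyphs_per_cluster(glyphs).items():
    start, end = ranges[cluster]
    print(f"characters {start}..{end - 1} -> {names}")
```

Nothing here depends on the glyph count matching the character count,
which is the point: the ligature simply shows up as one cluster covering
three characters.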

Regards,
Khaled

Re: [HarfBuzz] hb_shape API failing for MAC default Indic MT and Sangam MN fonts

2018-10-11 Thread Khaled Hosny

On Thu, Oct 11, 2018 at 10:09:46AM +, Vijendra Singh wrote:
> Hi All,
> 
> I am using Harfbuzz 1.7.6 for Indic languages in my application but
> failed to get correct result from hb_shape API for all MAC default
> Indic MT and MN fonts like- Devanagari MT and Devanagari Sangam MN
> fonts.

These are AAT fonts. Are you building HarfBuzz with Core Text support?
On master, AAT fonts should work without Core Text support, but this is
still a work in progress.

Regards,
Khaled

Re: [HarfBuzz] query cvXX feature name table references

2018-04-30 Thread Khaled Hosny

On Mon, Apr 30, 2018 at 08:50:57PM +0700, Martin Hosken wrote:
> Dear Behdad,
> 
> Do you have any plans (pretty please) to add an API to enable a client
> to query a font to get hold of the name table references for the
> various cvXX features in a font?

See https://github.com/harfbuzz/harfbuzz/pull/976; maybe you can
comment on the proposed API?

Regards,
Khaled

Re: [HarfBuzz] unsafe to break

2017-11-11 Thread Khaled Hosny

On Sat, Nov 11, 2017 at 08:39:09AM +0700, Martin Hosken wrote:
> Dear Behdad,
> 
> Please could you explain the purpose and function of
> HB_GLYPH_FLAG_UNSAFE_TO_BREAK. Is this about line breaking? grapheme
> clustering?

It is about shaping after line breaking. IIUC, unsafe to break means you must
reshape if you break here (up to the next/previous safe to break point),
but actual break points have to be identified by the client as usual.

IMHO, that is an optimization for the clients that want to do the right
thing after breaking but don’t want to re-shape text needlessly.
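
To make that concrete, here is a small sketch of how a client might use
the flag once it has picked a break position. It assumes left-to-right
text with ascending clusters, and that the (cluster, flags) pairs were
copied out of the shaped buffer; reshape_range and the sample data are
made up, and only the flag value comes from hb-buffer.h.

```python
HB_GLYPH_FLAG_UNSAFE_TO_BREAK = 0x00000001  # value defined in hb-buffer.h


def safe_break_clusters(glyphs):
    """Clusters at whose start the shaped run can be cut without reshaping.

    glyphs: list of (cluster, flags) pairs in glyph order.
    """
    unsafe = {c for c, flags in glyphs if flags & HB_GLYPH_FLAG_UNSAFE_TO_BREAK}
    return sorted({c for c, _flags in glyphs} - unsafe)


def reshape_range(glyphs, break_at, text_length):
    """Input range to reshape if the client breaks at index break_at.

    Returns None when the existing shaping result can be reused as is.
    """
    safe = safe_break_clusters(glyphs)
    if break_at in safe:
        return None
    # Otherwise reshape from the previous safe point to the next one.
    start = max((c for c in safe if c < break_at), default=0)
    end = min((c for c in safe if c > break_at), default=text_length)
    return (start, end)


# Hypothetical 5-character run where clusters 2 and 3 are unsafe to break.
glyphs = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 0)]
print(reshape_range(glyphs, 4, 5))  # safe point, nothing to reshape
print(reshape_range(glyphs, 2, 5))  # reshape between the nearest safe points
```

Choosing break_at is still entirely up to the client's line breaking
logic; the flag only says how much shaping work the chosen break costs.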

Regards,
Khaled

Re: [HarfBuzz] Pre-base Vowel Following Cluster Used to be Rendered Afterwards

2017-07-12 Thread Khaled Hosny

On Tue, Jul 11, 2017 at 11:20:19PM +0100, 'Richard Wordingham' via Khaled Hosny wrote:
> This bug (https://github.com/behdad/harfbuzz/issues/180) was fixed on
> 17 December 2015.  It was present in at least HarfBuzz Versions 1.0.1
> and 1.1.2, and was gone in Version 1.1.3.
> 
> It's been annoying me in LibreOffice for some time now.  The problem is
> that I'm using a long-term-support version of Ubuntu, 16.04 Xenial.
> Xenial's distribution provides Version 1.0.1 of HarfBuzz - Package
> libharfbuzz0b Version 1.0.1-1ubuntu0.1 to be precise.  We can cite it
> for at least two living scripts, Balinese and Tai Tham (for Tai
> Khuen - example string is ᩉ᩠ᩅᩯᩁ /wɛːn/ 'ring (n.)').

This is out of the HarfBuzz developers’ control. But what you can do is
backport the patch to 1.0.1 and convince Ubuntu to include it in their
package. The patch seems to apply almost cleanly to 1.0.1, with a small
conflict in the test harness, so here is a backported version of the patch
that applies cleanly, if it is of any help (you can also set up a PPA with
the patched version if you can’t convince the Ubuntu maintainers).

Regards,
Khaled
From c9a0f7e97329579ed09949bcd8762f75d7dfd45a Mon Sep 17 00:00:00 2001
From: Behdad Esfahbod <beh...@behdad.org>
Date: Thu, 17 Dec 2015 11:59:15 +
Subject: [PATCH] [use] Fix halant detection

Before, we were just checking the use_category().  This detects as
halant a ligature that had the halant as first glyph (as seen in
NotoSansBalinese.)  Change that to use the is_ligated() glyph prop
bit.  The font is forming this ligature in ccmp, which is before
the rphf / pref tests.  So we need to make sure the "ligated" bit
survives those tests.  Since those only check the "substituted" bit,
we now only clear that bit for them and "ligated" survives.

Fixes https://github.com/behdad/harfbuzz/issues/180
---
 src/hb-ot-layout-private.hh  |   6 ++
 src/hb-ot-shape-complex-use.cc   |  19 ---
 test/shaping/Makefile.am |   1 +
 test/shaping/fonts/sha1sum/MANIFEST  |   1 +
 .../fbb6c84c9e1fe0c39e152fbe845e51fd81f6748e.ttf | Bin 0 -> 2616 bytes
 test/shaping/tests/MANIFEST  |   1 +
 test/shaping/tests/use.tests |   1 +
 7 files changed, 18 insertions(+), 11 deletions(-)
 create mode 100644 test/shaping/fonts/sha1sum/fbb6c84c9e1fe0c39e152fbe845e51fd81f6748e.ttf
 create mode 100644 test/shaping/tests/use.tests

diff --git a/src/hb-ot-layout-private.hh b/src/hb-ot-layout-private.hh
index d168e27f..bf69d468 100644
--- a/src/hb-ot-layout-private.hh
+++ b/src/hb-ot-layout-private.hh
@@ -442,11 +442,9 @@ _hb_glyph_info_clear_ligated_and_multiplied (hb_glyph_info_t *info)
 }
 
 static inline void
-_hb_glyph_info_clear_substituted_and_ligated_and_multiplied (hb_glyph_info_t *info)
+_hb_glyph_info_clear_substituted (hb_glyph_info_t *info)
 {
-  info->glyph_props() &= ~(HB_OT_LAYOUT_GLYPH_PROPS_SUBSTITUTED |
-			   HB_OT_LAYOUT_GLYPH_PROPS_LIGATED |
-			   HB_OT_LAYOUT_GLYPH_PROPS_MULTIPLIED);
+  info->glyph_props() &= ~(HB_OT_LAYOUT_GLYPH_PROPS_SUBSTITUTED);
 }
 
 
diff --git a/src/hb-ot-shape-complex-use.cc b/src/hb-ot-shape-complex-use.cc
index 1d44d220..1639ff09 100644
--- a/src/hb-ot-shape-complex-use.cc
+++ b/src/hb-ot-shape-complex-use.cc
@@ -368,7 +368,7 @@ clear_substitution_flags (const hb_ot_shape_plan_t *plan,
   hb_glyph_info_t *info = buffer->info;
   unsigned int count = buffer->len;
   for (unsigned int i = 0; i < count; i++)
-    _hb_glyph_info_clear_substituted_and_ligated_and_multiplied (&info[i]);
+    _hb_glyph_info_clear_substituted (&info[i]);
 }
 
 static void
@@ -413,6 +413,12 @@ record_pref (const hb_ot_shape_plan_t *plan,
   }
 }
 
+static inline bool
+is_halant (const hb_glyph_info_t &info)
+{
+  return info.use_category() == USE_H && !_hb_glyph_info_ligated (&info);
+}
+
 static void
 reorder_syllable (hb_buffer_t *buffer, unsigned int start, unsigned int end)
 {
@@ -428,7 +434,6 @@ reorder_syllable (hb_buffer_t *buffer, unsigned int start, unsigned int end)
 
   hb_glyph_info_t *info = buffer->info;
 
-#define HALANT_FLAGS FLAG(USE_H)
 #define BASE_FLAGS (FLAG (USE_B) | FLAG (USE_GB) | FLAG (USE_IV))
 
   /* Move things forward. */
@@ -436,12 +441,12 @@ reorder_syllable (hb_buffer_t *buffer, unsigned int start, unsigned int end)
   {
 /* Got a repha.  Reorder it to after first base, before first halant. */
 for (unsigned int i = start + 1; i < end; i++)
-  if (FLAG_UNSAFE (info[i].use_category()) & (HALANT_FLAGS | BASE_FLAGS))
+  if ((FLAG_UNSAFE (info[i].use_category()) & (BASE_FLAGS)) || is_halant (info[i]))
   {
 	/* If we hit a halant, move before it; otherwise it's a base: move to it's
 	 * place, and shift things in between backward. */
 
-	if (info[i].use_category() == USE_H)
+	if (is_halant (info[i]))
 	  i--;

Re: [HarfBuzz] Getting glyph information using Harfbuzz API

2017-05-24 Thread Khaled Hosny

On Wed, May 24, 2017 at 03:54:59PM -0700, Behdad Esfahbod wrote:
> Hi Deepak,
> 
> On Tue, May 23, 2017 at 8:45 PM, Deepak Jois  wrote:
> 
> > 3 I suppose if I have (1) above I can get a hb_glyph_extents_t for each
> > glyph. I am not sure how to convert it to a value that makes sense to
> > LuaTeX which requires a width, depth and height.
> >
> 
> Right.  Simon Cozens also wanted this for Sile... Problem is, in the TeX
> model, the depth and height are "logical" boundaries, not glyph ink
> extents.  Those numbers do not exist in OpenType.  For width, you want
> advance width.  For depth and height, you really are on your own IMO.
> Glyph extents are a bad fallback.

I don’t know about Sile, but in XeTeX we get height and depth from glyph
extents and it works rather well in practice (TFM files can sometimes do
creative things to the depth and height like having the same depth and
height for plus and minus glyphs so they align in math mode, but people
don’t seem to be missing this flexibility that much).
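
For what it is worth, the conversion can be sketched as below. This is
not XeTeX's actual code, just an illustration of the arithmetic, assuming
the extents follow the hb_glyph_extents_t convention where y_bearing is
the top of the ink above the baseline and height is the (usually
negative) distance from the top of the ink to its bottom. The function
name tex_box_from_extents and the sample numbers are made up.

```python
def tex_box_from_extents(y_bearing, height, x_advance):
    """Derive TeX-style width, height and depth from glyph ink extents.

    y_bearing: top of the ink relative to the baseline (up is positive).
    height: distance from the top of the ink to the bottom, negative
            when the y axis points up.
    x_advance: advance width, used as the box width.
    """
    top = y_bearing
    bottom = y_bearing + height
    # TeX depth is measured downwards from the baseline, so flip the sign.
    return {"width": x_advance, "height": top, "depth": -bottom}


# Hypothetical glyph whose ink runs from 200 units below the baseline to
# 700 units above it.
print(tex_box_from_extents(y_bearing=700, height=-900, x_advance=540))
```

For glyphs that sit entirely above the baseline this yields a negative
depth; whether to clamp that to zero is a choice left to the client.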

Regards,
Khaled

Re: [HarfBuzz] Shape plan and user features

2016-12-17 Thread Khaled Hosny

On Sat, Dec 17, 2016 at 04:06:28PM -0600, Behdad Esfahbod wrote:
> On Thu, Oct 20, 2016 at 4:04 AM, Khaled Hosny <khaledho...@eglug.org> wrote:
> 
> > It is not clear whether I should pass user features to
> > hb_shape_plan_create_cached(), hb_shape_plan_execute() or both. If I
> > pass them to the former but not the later features will be applied just
> > fine, but if I do the reverse they will not, so when do I need to pass
> > the features to hb_shape_plan_execute()?
> >
> 
> You need to pass the same features to plan_execute().  The plan_create call
> just cares about whether a feature is global or has a range.  It doesn't
> care about the actual range.

Good to know, that is what I did already.


Regards,
Khaled

Re: [HarfBuzz] ICU and NMake build

2016-12-05 Thread Khaled Hosny

On Mon, Dec 05, 2016 at 03:15:04PM +, Elmar Braun wrote:
> Hello,
> 
> I've built HarfBuzz 1.3.3 with Visual Studio using the provided NMake build
> files. I have two questions though:
> 
> Building with "ICU=1" produces two DLLs: harfbuzz-vs14.dll, which, as far as
> I can tell, uses ucdn for Unicode data, and does not depend on ICU; and
> harfbuzz-icu-vs14.dll, which isn't a complete HarfBuzz DLL. (It exports just
> three symbols: hb_icu_get_unicode_funcs, hb_icu_script_from_script, and
> hb_icu_script_to_script.) I'm sure there's a perfectly good explanation for
> that, but that explanation is not obvious to me. Why doesn't it simply build
> HarfBuzz in a single DLL that depends directly on ICU?

That is to avoid having HarfBuzz depend on ICU and thus every HarfBuzz
client depending on ICU, which can be problematic for Linux distributions
and other systems with package managers. The autoconf build supports
--with-icu=builtin, which does what you need. I am not sure if the NMake
build has such an option, but if it doesn’t then that would be a good
thing to have (or even be the default when ICU=1); please open an issue
on GitHub.

> Secondly, in win32\README.txt it describes the "ICU" option as follows:
> 
>  > ICU: Enables the build HarfBuzz-ICU, which is now the recommended layout
> engine
>  > for ICU (International Components for Unicode), which deprecated ICU
> LE.
> 
> That reads like a description of icu-le-hb to me. But my understanding is
> that icu-le-hb is separate, and not included with HarfBuzz. So isn't that
> text plain wrong?

Right, that sounds wrong. Thanks for spotting this.

Regards,
Khaled

Re: [HarfBuzz] Building hb-view

2016-11-10 Thread Khaled Hosny

On Thu, Nov 10, 2016 at 09:31:44PM +, Richard Wordingham wrote:
> On Fri, 28 Oct 2016 20:29:54 +0200
> Khaled Hosny <khaledho...@eglug.org> wrote:
> 
> > Then please attach the full build log, may be someone can spot the
> > issue.
> 
> The command sequence in a new directory was:

It still does not tell much, unfortunately. Try running make in verbose
mode: make V=1 instead of make.

Regards,
Khaled

Re: [HarfBuzz] Building hb-view

2016-10-28 Thread Khaled Hosny

On Fri, Oct 28, 2016 at 06:04:43PM +0100, Richard Wordingham wrote:
> % CFLAGS=-g CXXFLAGS=-g ../configure

You should get a summary at the end here, check if Cairo support is
enabled as hb-view requires it.

Regards,
Khaled

[HarfBuzz] Horizontal positions for vertical text?

2016-10-24 Thread Khaled Hosny

Hi,

It seems that almost every HarfBuzz client doing vertical text that I
checked will reverse the positions returned by HarfBuzz for vertical
text to use them as if they were horizontal. Maybe we should have a
buffer option to do this in HarfBuzz, since it seems to be the most
common use? (This is mainly because every time I do it I get it wrong
for the first few iterations :)

Regards,
Khaled

[HarfBuzz] Fallback vertical shaping?

2016-10-24 Thread Khaled Hosny

Hi,

I was looking into Firefox’s code to check how they handle vertical text
(to see why they get better results for the same font than my code), and
noticed that they try to use vertical presentation forms [1] when the
font lacks ‘vert’ feature [2]. I was wondering if this is something
HarfBuzz should be doing instead?
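
To show the kind of fallback I mean, here is a rough sketch that derives
a character to vertical presentation form table from the Unicode
compatibility decompositions tagged <vertical>. This is not the table
Firefox uses (theirs is in the code linked below); it is only an
approximation built from Python's unicodedata module, and the function
names are made up.

```python
import unicodedata


def vertical_form_map():
    """Map characters to their vertical presentation form, where one exists."""
    mapping = {}
    # U+FE10..U+FE4F covers the Vertical Forms and CJK Compatibility Forms
    # blocks (the Combining Half Marks in between have no decomposition).
    for cp in range(0xFE10, 0xFE50):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep only "<vertical> XXXX" entries with a single target character.
        if len(parts) == 2 and parts[0] == "<vertical>":
            # Several forms can share one target; keep the first one found.
            mapping.setdefault(chr(int(parts[1], 16)), chr(cp))
    return mapping


def to_vertical_forms(text, mapping):
    """Replace characters that have a vertical presentation form."""
    return "".join(mapping.get(ch, ch) for ch in text)


forms = vertical_form_map()
converted = to_vertical_forms("(a)", forms)
print([f"U+{ord(ch):04X}" for ch in converted])
```

A real fallback would also have to check that the font actually has a
glyph for the presentation form before substituting it.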

Regards,
Khaled

1. https://dxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxHarfBuzzShaper.cpp#208-279
2. https://dxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxHarfBuzzShaper.cpp#1438


[HarfBuzz] Shape plan and user features

2016-10-20 Thread Khaled Hosny

It is not clear whether I should pass user features to
hb_shape_plan_create_cached(), hb_shape_plan_execute() or both. If I
pass them to the former but not the latter, features will be applied just
fine, but if I do the reverse they will not, so when do I need to pass
the features to hb_shape_plan_execute()?

Regards,
Khaled

Re: [HarfBuzz] parse_one_feature

2016-10-17 Thread Khaled Hosny

On Mon, Oct 17, 2016 at 01:31:18PM +0100, Martin Hosken wrote:
> Dear Behdad,
> 
> I notice that hb-shape has a parse_one_feature function, but that
> nothing refers to it and it is not in the public API. Does this mean
> that it is deprecated or that one day you will publicise it?

Public API is hb_feature_from_string().

Regards,
Khaled

Re: [HarfBuzz] Opentype features

2016-07-01 Thread Khaled Hosny

On Fri, Jul 01, 2016 at 03:02:45PM -0400, Kelvin Ma wrote:
> so if this
> 
> isn’t lying then it looks like i gotta do
> 
> otint = hb.tag_from_string(list(map(ord, 'onum')))

hb.tag_from_string(b'onum')

> otfeature = [' ', ' ', ' ', ' ']
> hb.tag_to_string(otint, otfeature)
> print(otfeature)
> 
> to round-trip a opentype feature through harfbuzz…
> and of course
> 
> >>> Segmentation fault (core dumped)

Bug in the annotation:
https://github.com/behdad/harfbuzz/pull/286
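
Until the binding is fixed, the round trip itself is easy to do by hand,
since a tag is just four bytes packed into a 32-bit integer, most
significant byte first. A minimal sketch in plain Python, without the
bindings (pack_tag and unpack_tag are made-up names, and padding short
tags with spaces is the usual OpenType convention):

```python
def pack_tag(tag):
    """Pack up to four bytes into a 32-bit tag value, padding with spaces."""
    if not 0 < len(tag) <= 4:
        raise ValueError("a tag has between one and four bytes")
    return int.from_bytes(tag.ljust(4, b" "), "big")


def unpack_tag(value):
    """Turn a 32-bit tag value back into its four bytes."""
    return value.to_bytes(4, "big")


onum = pack_tag(b"onum")
print(hex(onum))         # 0x6f6e756d
print(unpack_tag(onum))  # b'onum'
```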

Regards,
Khaled

Re: [HarfBuzz] How to increase harfbuzz numerical precision

2016-06-28 Thread Khaled Hosny

Device tables affect glyph positioning, which is done by HarfBuzz, not
Cairo.

On Tue, Jun 28, 2016 at 07:50:51PM -0400, Kelvin Ma wrote:
> The rendering is separate from harfbuzz i thought, the glyphs get passed to
> cairo’s cr.show_glyphs() and it uses its own font structure that has to be
> loaded separately from harfbuzz’s fonts. So any value set on the harfbuzz
> font is not known by the renderer
> 
> On Tue, Jun 28, 2016 at 7:48 PM, Khaled Hosny <khaledho...@eglug.org> wrote:
> 
> > For device tables (as I said in my first reply) and I think hinting
> > (though I don’t think hinting stuff affects HarfBuzz right now).
> >
> > On Tue, Jun 28, 2016 at 07:25:27PM -0400, Kelvin Ma wrote:
> > > ok this might be a dumb question but what is ppem used for anyway? I
> > > thought it was a font value that harfbuzz just lets you read off of the
> > > font (like upem, advance width, or glyph index) so you can do your own
> > math
> > > on the font outside of harfbuzz.
> > >
> > > On Tue, Jun 28, 2016 at 7:18 PM, Khaled Hosny <khaledho...@eglug.org>
> > wrote:
> > >
> > > > It is, but you have to set it separately, and then you can set the font
> > > > scale to whatever value you need without both being interdependent.
> > > >
> > > > On Tue, Jun 28, 2016 at 07:15:23PM -0400, Kelvin Ma wrote:
> > > > > I thought ppem was dependent on UPEM and font scale, is it not?
> > > > >
> > > > > On Tue, Jun 28, 2016 at 7:12 PM, Khaled Hosny <khaledho...@eglug.org
> > >
> > > > wrote:
> > > > >
> > > > > > Device tables depend on ppem, so despite the scale being set
> > > > > > on the font, you should still set the exact ppem.
> > > > > >
> > > > > > On Tue, Jun 28, 2016 at 06:43:58PM -0400, Kelvin Ma wrote:
> > > > > > > so that’s the only way huh…
> > > > > > > doesn’t that kind of defeat the purpose of hb.font_create() and
> > > > having
> > > > > > many
> > > > > > > scaled versions of the same font? You would only ever need one
> > font
> > > > of
> > > > > > each
> > > > > > > face, scaled to the UPEM, if the fontsize was to be applied
> > > > externally
> > > > > > > after shaping already occurred.
> > > > > > >
> > > > > > > On Tue, Jun 28, 2016 at 5:51 PM, Behdad Esfahbod <
> > beh...@behdad.org>
> > > > > > wrote:
> > > > > > >
> > > > > > > > HarfBuzz coordinates work in a int32 space.  You are free to
> > set
> > > > > > whatever
> > > > > > > > scales you want on the font.  For example, use 6 or 8 or 10 or
> > 16
> > > > bits
> > > > > > of
> > > > > > > > sub-pixel precision by multiplying your scale by a number.
> > > > > > > >
> > > > > > > > On Tue, Jun 28, 2016 at 4:57 PM, Kelvin Ma <
> > > > kelvinsthirt...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > >> It appears that harfbuzz rounds all decimals to integers when
> > > > giving
> > > > > > > >> glyph advances and offsets. This is causing some ugly
> > misalignment
> > > > > > problems
> > > > > > > >> in arabic shaping, as well as latin cursive fonts. (see
> > pictures)
> > > > > > > >>
> > > > > > > >> [image: Inline image 1]
> > > > > > > >>
> > > > > > > >> [image: Inline image 2]
> > > > > > > >>
> > > > > > > >> [image: Inline image 3]
> > > > > > > >> How do I get harfbuzz to preserve the floats?
> > > > > > > >>
> > > > > > > >> ___
> > > > > > > >> HarfBuzz mailing list
> > > > > > > >> HarfBuzz@lists.freedesktop.org
> > > > > > > >> https://lists.freedesktop.org/mailman/listinfo/harfbuzz
> > > > > > > >>
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > behdad
> > > > > > > > http://behdad.org/
> > > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > ___
> > > > > > > HarfBuzz mailing list
> > > > > > > HarfBuzz@lists.freedesktop.org
> > > > > > > https://lists.freedesktop.org/mailman/listinfo/harfbuzz
> > > > > >
> > > > > >
> > > >
> >

Re: [HarfBuzz] How to increase harfbuzz numerical precision

2016-06-28 Thread Khaled Hosny

For device tables (as I said in my first reply) and I think hinting
(though I don’t think hinting stuff affects HarfBuzz right now).
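
To summarize the arithmetic discussed in this thread, here is a small
sketch, assuming 6 bits of sub-pixel precision for the scale (one of the
values suggested in the quoted reply) and a ppem kept in whole pixels. In
C these two values would go to hb_font_set_scale() and hb_font_set_ppem();
the helper names, the rounding, and the sample numbers here are mine and
only approximate what the shaper does internally.

```python
SUBPIXEL_BITS = 6  # 1/64 pixel units; 8, 10 or 16 bits work the same way


def font_scale(font_size_px):
    """Scale to set on the font: the font size in 1/64 pixel units."""
    return round(font_size_px * (1 << SUBPIXEL_BITS))


def font_ppem(font_size_px):
    """ppem is set separately from the scale and stays in whole pixels."""
    return round(font_size_px)


def scaled_advance(advance_font_units, upem, scale):
    """Integer advance in scale units (rounding here is an assumption)."""
    return round(advance_font_units * scale / upem)


def to_pixels(value):
    """Convert an integer position in scale units back to float pixels."""
    return value / (1 << SUBPIXEL_BITS)


# Hypothetical glyph: 539 font units wide in a 1000 upem font at 12 px.
scale = font_scale(12)
advance = scaled_advance(539, 1000, scale)
print(scale, font_ppem(12))                       # 768 12
print(advance, to_pixels(advance))                # 414 6.46875
print(scaled_advance(539, 1000, 12))              # 6 with a plain pixel scale
```

The last line is the situation from the original question: with the scale
set to the bare pixel size, the fractional part of every advance is lost.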

On Tue, Jun 28, 2016 at 07:25:27PM -0400, Kelvin Ma wrote:
> ok this might be a dumb question but what is ppem used for anyway? I
> thought it was a font value that harfbuzz just lets you read off of the
> font (like upem, advance width, or glyph index) so you can do your own math
> on the font outside of harfbuzz.
> 
> On Tue, Jun 28, 2016 at 7:18 PM, Khaled Hosny <khaledho...@eglug.org> wrote:
> 
> > It is, but you have to set it separately, and then you can set the font
> > scale to whatever value you need without both being interdependent.
> >
> > On Tue, Jun 28, 2016 at 07:15:23PM -0400, Kelvin Ma wrote:
> > > I thought ppem was dependent on UPEM and font scale, is it not?
> > >
> > > > On Tue, Jun 28, 2016 at 7:12 PM, Khaled Hosny <khaledho...@eglug.org> wrote:
> > >
> > > > Device tables depend on ppem, so despite the scale being set
> > > > on the font, you should still set the exact ppem.
> > > >
> > > > On Tue, Jun 28, 2016 at 06:43:58PM -0400, Kelvin Ma wrote:
> > > > > so that’s the only way huh…
> > > > > > doesn’t that kind of defeat the purpose of hb.font_create() and having many
> > > > > > scaled versions of the same font? You would only ever need one font of each
> > > > > > face, scaled to the UPEM, if the fontsize was to be applied externally
> > > > > > after shaping already occurred.
> > > > > >
> > > > > > On Tue, Jun 28, 2016 at 5:51 PM, Behdad Esfahbod <beh...@behdad.org> wrote:
> > > > > >
> > > > > > > HarfBuzz coordinates work in an int32 space.  You are free to set whatever
> > > > > > > scales you want on the font.  For example, use 6 or 8 or 10 or 16 bits of
> > > > > > > sub-pixel precision by multiplying your scale by a number.
> > > > > > >
> > > > > > > On Tue, Jun 28, 2016 at 4:57 PM, Kelvin Ma <kelvinsthirt...@gmail.com> wrote:
> > > > > > >
> > > > > > >> It appears that harfbuzz rounds all decimals to integers when giving
> > > > > > >> glyph advances and offsets. This is causing some ugly misalignment problems
> > > > > > >> in arabic shaping, as well as latin cursive fonts. (see pictures)
> > > > > >>
> > > > > >> [image: Inline image 1]
> > > > > >>
> > > > > >> [image: Inline image 2]
> > > > > >>
> > > > > >> [image: Inline image 3]
> > > > > >> How do I get harfbuzz to preserve the floats?
> > > > > >>

Re: [HarfBuzz] How to increase harfbuzz numerical precision

2016-06-28 Thread Khaled Hosny

It is, but you have to set it separately, and then you can set the font
scale to whatever value you need without both being interdependent.
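
As a rough illustration of why a larger scale helps with precision, the
sketch below compares an integer-pixel scale with one multiplied by 64
(6 bits of sub-pixel precision). The numbers are made up and the rounding
only approximates the idea, it is not HarfBuzz's own code; in the C API
the two independent settings would be made with hb_font_set_scale() and
hb_font_set_ppem().

```python
# Hypothetical numbers: a 1000-unit em, a 569-unit advance, 12.5 px font size.
UPEM = 1000
ADVANCE_UNITS = 569
FONT_SIZE_PX = 12.5
SUBPIXEL_BITS = 6  # 26.6 fixed point: 64 steps per pixel


def scaled_advance(advance_units, scale, upem):
    """Scale a font-unit advance into the integer space given by `scale`.

    This mirrors only the arithmetic idea; HarfBuzz's own rounding may differ.
    """
    return round(advance_units * scale / upem)


# Integer pixel scale: the fractional part of the size and advance is lost.
coarse_scale = int(FONT_SIZE_PX)
coarse = scaled_advance(ADVANCE_UNITS, coarse_scale, UPEM)

# Scale multiplied by 2**6: results stay integers but carry 1/64 px steps.
fine_scale = round(FONT_SIZE_PX * (1 << SUBPIXEL_BITS))
fine = scaled_advance(ADVANCE_UNITS, fine_scale, UPEM)
fine_px = fine / (1 << SUBPIXEL_BITS)  # convert back only when drawing

print(coarse, fine, fine_px)
```

The shaper still returns integers in both cases; the caller divides by 64
at drawing time to recover fractional pixels.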

On Tue, Jun 28, 2016 at 07:15:23PM -0400, Kelvin Ma wrote:
> I thought ppem was dependent on UPEM and font scale, is it not?
> 
> On Tue, Jun 28, 2016 at 7:12 PM, Khaled Hosny <khaledho...@eglug.org> wrote:
> 
> > Device tables depend on ppem, so despite the scale being set
> > on the font, you should still set the exact ppem.
> >
> > On Tue, Jun 28, 2016 at 06:43:58PM -0400, Kelvin Ma wrote:
> > > so that’s the only way huh…
> > > doesn’t that kind of defeat the purpose of hb.font_create() and having many
> > > scaled versions of the same font? You would only ever need one font of each
> > > face, scaled to the UPEM, if the fontsize was to be applied externally
> > > after shaping already occurred.
> > >
> > > On Tue, Jun 28, 2016 at 5:51 PM, Behdad Esfahbod <beh...@behdad.org> wrote:
> > >
> > > > HarfBuzz coordinates work in an int32 space.  You are free to set whatever
> > > > scales you want on the font.  For example, use 6 or 8 or 10 or 16 bits of
> > > > sub-pixel precision by multiplying your scale by a number.
> > > >
> > > > On Tue, Jun 28, 2016 at 4:57 PM, Kelvin Ma <kelvinsthirt...@gmail.com> wrote:
> > > >
> > > >> It appears that harfbuzz rounds all decimals to integers when giving
> > > >> glyph advances and offsets. This is causing some ugly misalignment problems
> > > >> in arabic shaping, as well as latin cursive fonts. (see pictures)
> > > >>
> > > >> [image: Inline image 1]
> > > >>
> > > >> [image: Inline image 2]
> > > >>
> > > >> [image: Inline image 3]
> > > >> How do I get harfbuzz to preserve the floats?
> > > >>

Re: [HarfBuzz] font_get_h_extents & font_get_v_extents

2016-06-27 Thread Khaled Hosny

On Sun, Jun 26, 2016 at 11:11:11PM -0400, kelvinsthirt...@gmail.com wrote:
> 
> 
> > On Jun 26, 2016, at 10:25 PM, Khaled Hosny <khaledho...@eglug.org> wrote:
> > 
> >> On Sun, Jun 26, 2016 at 10:05:15PM -0400, Kelvin Ma wrote:
> >> How do you get the plain ascent and descent of a font? font_get_h_extents()
> >> gives the HHead values of the font, not the regular ascent and descent.
> > 
> > There are three different settings for ascent and descent in OpenType
> > fonts, which one are you after?
> 
> The ones that partition the EM square. The ascent and descent should add up 
> to the UPEM value.

I don’t know if there is such a thing, but it would be pretty useless
for many fonts as there should be no relation between EM size and
vertical metrics.

Re: [HarfBuzz] font_get_h_extents & font_get_v_extents

2016-06-26 Thread Khaled Hosny

On Sun, Jun 26, 2016 at 10:05:15PM -0400, Kelvin Ma wrote:
> How do you get the plain ascent and descent of a font? font_get_h_extents()
> gives the HHead values of the font, not the regular ascent and descent.

There are three different settings for ascent and descent in OpenType
fonts, which one are you after?

Re: [HarfBuzz] Setting initial cluster value

2016-06-25 Thread Khaled Hosny

On Sat, Jun 25, 2016 at 01:07:29PM -0400, Kelvin Ma wrote:
> for the same reason kerning cannot happen between two different font runs,
> how can arabic shaping happen across different font runs?

Basic Arabic shaping involves textual analysis. This depends on Unicode
character properties and neighboring characters and is font-independent,
so it can be done across different fonts.
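
A much simplified sketch of that analysis, to show that it needs only
characters and no font. The joining types below are a tiny hand-picked
subset of the Unicode data, and marks and other transparent characters
are ignored, so this is an illustration under those assumptions rather
than what HarfBuzz actually runs:

```python
# Joining types for a handful of letters (a tiny subset of the Unicode
# ArabicShaping data): D = dual-joining, R = right-joining.
JOINING_TYPE = {
    "\u0628": "D",  # beh
    "\u062A": "D",  # teh
    "\u064A": "D",  # yeh
    "\u0627": "R",  # alef
    "\u062F": "R",  # dal
}


def joining_forms(text, start=0, end=None):
    """Return the form of each character in text[start:end].

    Neighbors outside the slice still count as context, which is the point:
    the analysis looks at characters, not at fonts or run boundaries.
    """
    end = len(text) if end is None else end

    def jt(i):
        return JOINING_TYPE.get(text[i]) if 0 <= i < len(text) else None

    forms = []
    for i in range(start, end):
        # Joins to the previous letter if that letter can join forward (D)
        # and this one can join backward (D or R).
        joins_prev = jt(i - 1) == "D" and jt(i) in ("D", "R")
        # Joins to the next letter if this one can join forward (D)
        # and the next one can join backward (D or R).
        joins_next = jt(i) == "D" and jt(i + 1) in ("D", "R")
        if joins_prev and joins_next:
            forms.append("medial")
        elif joins_prev:
            forms.append("final")
        elif joins_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms


word = "\u0628\u064A\u062A"  # beh, yeh, teh
with_context = joining_forms(word, start=1)  # run starts after beh, context kept
without_context = joining_forms(word[1:])    # same run, context dropped
print(with_context, without_context)
```

With the context the yeh comes out medial; without it, it wrongly comes
out initial, which is the broken shaping at a font boundary.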

Regards,
Khaled

Re: [HarfBuzz] Setting initial cluster value

2016-06-25 Thread Khaled Hosny

On Sat, Jun 25, 2016 at 01:06:27PM -0400, Kelvin Ma wrote:
> > > > > Don’t you need
> > > > > context to be ignored if the boundaries of the text you want to shape fall
> > > > > inside a cluster? Like in the string 'af[fluency s]tate' where only the
> > > > > 'fluency s' is supposed to be shaped?
> > > >
> > > > Depends on why you are shaping “fluency s” alone, if it is because of,
> > > > say, font change, then you need HarfBuzz to know the context otherwise
> > > > you get broken Arabic shaping.
> > >
> > > Well font change would produce a separate run that wouldn’t know about the
> > > other runs so context can only be within a same-direction, same-font run.
> >
> > This is wrong, font change shouldn’t break Arabic shaping, so you have
> > to pass the context even in this case.
> >
> 
> If the text consists of text strings separated by formatting objects, each
> text string doesn’t know about what’s around it. Because that’s at a much
> higher level in the code and harfbuzz can only handle a single font in a
> single run at a time. To artificially jam in the neighboring runs for each
> shaping attempt would involve an inordinate amount of string concatenation
> and searching on the fly.

One can always fix one’s code to not make wrong assumptions. When doing
text layout you always need the full paragraph, and you should have it
around after itemisation. Itemisation does not have to be done by
splitting text; you can just store run start indices and lengths.
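
A minimal sketch of that idea, with runs stored as indices into the
paragraph. The Run type and the font names are hypothetical; the
start/length pair corresponds to the item_offset and item_length
arguments of the buffer_add functions, which are given the whole text.

```python
from dataclasses import dataclass


@dataclass
class Run:
    start: int   # index of the first character of the run in the paragraph
    length: int  # number of characters in the run
    font: str    # hypothetical font identifier


def run_with_context(paragraph, run):
    """Return (pre-context, item, post-context) for one run, sliced on demand."""
    end = run.start + run.length
    return paragraph[:run.start], paragraph[run.start:end], paragraph[end:]


paragraph = "affluency state"
runs = [
    Run(0, 2, "regular"),
    Run(2, 9, "bold"),      # the 'fluency s' item from the thread
    Run(11, 4, "regular"),
]

pre, item, post = run_with_context(paragraph, runs[1])
print(repr(pre), repr(item), repr(post))
```

No string concatenation or searching is needed: the paragraph is kept
once and every run just points into it.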

Regards,
Khaled

Re: [HarfBuzz] Setting initial cluster value

2016-06-25 Thread Khaled Hosny

On Fri, Jun 24, 2016 at 10:06:07PM -0400, Kelvin Ma wrote:
> How do you set the initial cluster value? So that harfbuzz will start
> counting from some number like 25 instead of 0.

Use hb_buffer_add(), or just do 25 + cluster in your code.
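
The second option is a one-liner on the caller's side; a sketch with
made-up cluster values:

```python
def rebase_clusters(clusters, base):
    """Shift shaper-reported cluster values so they count from `base`."""
    return [base + c for c in clusters]


# Hypothetical cluster values for four glyphs, the second being a ligature
# that covers two characters.
shaped = [0, 1, 3, 4]
print(rebase_clusters(shaped, 25))
```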

> Also what’s the point of
> *item_offset* and *item_length* in the buffer_add functions?

It is explained in the documentation:
http://behdad.github.io/harfbuzz/harfbuzz-Buffers.html#hb-buffer-add-codepoints

>   Don’t you need
> context to be ignored if the boundaries of the text you want to shape fall
> inside a cluster? Like in the string 'af[fluency s]tate' where only the
> 'fluency s' is supposed to be shaped?

That depends on why you are shaping “fluency s” alone. If it is because of,
say, a font change, then HarfBuzz needs to know the context, otherwise
you get broken Arabic shaping.

Regards,
Khaled

[HarfBuzz] Travis build failure

2016-06-17 Thread Khaled Hosny

It seems that use.tests has been failing on Travis for a while (which
makes all pull requests fail as well). Behdad, do you have any idea
why it is failing?

Regards,
Khaled

Re: [HarfBuzz] What is wrong with unicode in harfbuzz?

2016-06-16 Thread Khaled Hosny

On Thu, Jun 16, 2016 at 11:10:03PM -0400, Kelvin Ma wrote:
> ok thanks!! && in that case yall best fix the example at
> https://github.com/behdad/harfbuzz/blob/master/src/sample.py then because
> it just uses string.encode('utf-x'). which is confusing.

The example is fine: for UTF-8 the API expects a bytes array, which is what
str.encode() gives you. Maybe it can be extended to show how to handle
other encodings; can you send a pull request?
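
One thing to watch with encoded bytes is that positions then count bytes,
not characters. The sketch below assumes that cluster values reported for
a UTF-8 buffer are byte offsets into the encoded text (worth verifying
against the bindings in use) and builds a map back to character indices:

```python
def byte_to_char_index(text):
    """Map each UTF-8 byte offset that starts a character to its index in text."""
    mapping = {}
    offset = 0
    for index, ch in enumerate(text):
        mapping[offset] = index
        offset += len(ch.encode("utf-8"))
    return mapping


text = "a\u0628c"            # Latin a, Arabic beh, Latin c
data = text.encode("utf-8")  # the bytes object a UTF-8 entry point expects
offsets = byte_to_char_index(text)
print(len(data), offsets)
```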

Regards,
Khaled

Re: [HarfBuzz] Fwd: Harfbuzz with linebreaking

2016-06-15 Thread Khaled Hosny

On Wed, Jun 15, 2016 at 09:29:34AM +0700, Martin Hosken wrote:
>I would suggest that you don't need to reshape if the start
> of the next line is in a different cluster to the end of the previous
> line. There are cases where you may need to do some positional tidying
> (deciding where the new 0 is in the line), but you can't ligate across
> a cluster boundary (by definition in OT).

This is only true for (some) Indic scripts that do not use the Universal
Shaping Engine, other shapers do not have this limitation.

Regards,
Khaled

Re: [HarfBuzz] Windows build of harfbuzz with nmake

2016-06-13 Thread Khaled Hosny

On Mon, Jun 13, 2016 at 10:34:24AM +0100, John Emmas wrote:
> On 13/06/2016 10:09, Juha Martikainen wrote:
> > 
> > I had a second build attempt where I made my own vcxproj file. There I
> > get the following kind of errors:
> > 
> > 1>..\..\src\hb-directwrite.cc(246): error C2039: 'directwrite' : is not
> > a member of 'hb_shaper_data_t'
> > 1>..\..\src\hb-directwrite.cc(257): error C3861:
> > 'hb_directwrite_shaper_face_data_ensure': identifier not found
> > 
> 
> Hi Juha, I think you accidentally posted off-list.
> 
> You're right about the above and it's hugely frustrating.  I build with
> MSVC-8 and I had to comment out the whole of 'hb-directwrite.cc' because I
> couldn't make it compile.

That is why you need a config.h file: to enable or disable optional
features as suitable. For example, the DirectWrite and Uniscribe backends
should be disabled by default since they are only for testing purposes
(comparing HarfBuzz output to MS implementations) and are not needed in
production code.

Or maybe the issue is that you are using your own build system and just
included all the source files; in this case you can just skip
hb-directwrite.* and hb-uniscribe.* (among others; reading
src/Makefile.am should give some clues).

Regards,
Khaled

Re: [HarfBuzz] Itemising Japanese scripts

2016-04-24 Thread Khaled Hosny

On Mon, Apr 25, 2016 at 08:18:14AM +1000, Simon Cozens wrote:
> On 25/04/2016 08:05, Khaled Hosny wrote:
> > The problem with merging is which script tag to select for the merged run,
> > Kana or Hani or “it depends on the font”.
> 
> Why does it matter what script tag to apply if there are no opentype
> interactions with Japanese?

If that is really the case, then yes, it doesn’t matter. I know nearly
nothing about CJK typesetting and fonts, but I see features like “cpct”,
“fwid”, “halt”, “hwid”, etc. I even recall seeing “kern” and “mark”
used in some fonts as well.

> On the other hand, I have just remembered one interaction: a pan-CJK
> font such as Source Han Sans / Noto Sans CJK will have variant forms of
> the kanji for Chinese, Japanese and Korean. But even then the selection
> should be done on language, not on script - I haven't checked how it works.

Depending on the font, if you don’t select the right script, selecting
the language will have no effect.

> So if pushed I would say Kana, just in case. But it really shouldn't matter.

Re: [HarfBuzz] Itemising Japanese scripts

2016-04-24 Thread Khaled Hosny

On Mon, Apr 25, 2016 at 07:41:39AM +1000, Simon Cozens wrote:
> On 25/04/2016 07:22, Khaled Hosny wrote:
> > This leaves Han which has its own OpenType tag and that is what I have
> > been seeing most. So I wonder what other applications do, should I try
> > something clever like see what scripts/features/lookups are in the font
> > and decide to merge the scripts if it is safe (i.e. merging or not should
> > have no effect on features applied), or should I just leave the current
> > behaviour and not worry about it?
> 
> Just merge them. From an OpenType perspective, Japanese is boring.
> Japanese characters are treated as discrete, independent units except in
> extreme cases, (e.g. Kazuraki which uses ligatures to attempt to
> replicate a hand-written look) so there should not be any OpenType
> interactions between characters. I remember Paul Hunt saying that when
> they developed Kazuraki they had to rewrite chunks of their OpenType
> shaper to deal with it, as it was such a special case.
> 
> *Even if* there are OpenType interactions, these should take place
> within the context of a Japanese font which has both kana and kanji
> sets, so merging would be the right thing to do.

The problem with merging is which script tag to select for the merged run,
Kana or Hani or “it depends on the font”.

> The real fun you're going to have is getting the layout correct, but you
> can look at my code for how to implement that. :-)

Will do :)

Re: [HarfBuzz] Itemising Japanese scripts

2016-04-24 Thread Khaled Hosny

On Sun, Apr 24, 2016 at 05:36:22PM +0200, Adam Twardoch (List) wrote:
> I think they should always be merged. They were encoded as three
> scripts in Unicode in the early days when it was not at all obvious
> how the script property is to be used. Certainly the notion of script
> itemisation in OpenType came much later and the fact that OpenType
> unifies them under one "kana" tag clearly indicates the preferred
> usage in OT context.

Right, I totally forgot about Hiragana and Katakana having the same
OpenType tag.
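
A small sketch of what that means for run counts. The block ranges below
are only a rough stand-in for the real Unicode script property, and the
"DFLT" entry for everything else is a placeholder for this illustration:

```python
from itertools import groupby


def unicode_script(ch):
    """Very rough script lookup by block range (real itemisers use full data)."""
    cp = ord(ch)
    if 0x3041 <= cp <= 0x309F:
        return "Hiragana"
    if 0x30A1 <= cp <= 0x30FA:
        return "Katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "Han"
    return "Common"


# Hiragana and Katakana share one OpenType script tag; Han has its own.
OT_TAG = {"Hiragana": "kana", "Katakana": "kana", "Han": "hani", "Common": "DFLT"}


def runs(text, key):
    """Split text into (label, start, length) runs of equal key(ch)."""
    result, start = [], 0
    for label, group in groupby(text, key=key):
        length = len(list(group))
        result.append((label, start, length))
        start += length
    return result


text = "\u6F22\u5B57\u3068\u30AB\u30BF\u30AB\u30CA"  # two kanji, one hiragana, four katakana
by_script = runs(text, unicode_script)
by_ot_tag = runs(text, lambda ch: OT_TAG[unicode_script(ch)])
print(len(by_script), len(by_ot_tag))
```

Itemising by OpenType tag instead of Unicode script already merges the
two kana scripts into one run; only the Han/kana boundary remains.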

This leaves Han, which has its own OpenType tag, and that is what I have
been seeing most. So I wonder what other applications do: should I try
something clever like seeing what scripts/features/lookups are in the font
and deciding to merge the scripts if it is safe (i.e. merging or not should
have no effect on the features applied), or should I just leave the current
behaviour and not worry about it?

Regards,
Khaled

Re: [HarfBuzz] Itemising Japanese scripts

2016-04-24 Thread Khaled Hosny

On Mon, Apr 25, 2016 at 12:46:21AM +0900, suzuki toshiya wrote:
> Hi,
> 
> I will try to contact with W3C JLREQ experts to ask some
> recommended reference.  Before it, I list a few points;
> 
> * it is not rare to assign a font for Han character and
> different font for kana (I'm unfamiliar with the case
> 3 fonts for Han/Hiragana/Katakana are assigned in parallel).
> the most popular case would be the texts in the comic books.
> also, there are many kana-only font.
>
> * however, for PostScript-based systems, making a composite
> font (rearranged font) from multiple fonts might be popular
> technology. in the other words, the people using such technology
> did not split the text into Han-run & Kana-run. they use
> same text run including both character, and the font was
> switched by the codepoints of the character.

Script itemisation does not affect font selection in my case: the
application (Scribus) does not even do any font fallback, all font
selection is controlled by the user (for better or worse), and selecting
different fonts will split the runs.

> * it is difficult to handle the Latin characters in Japanese
> text schematically. Sometimes, they are expected to be rendered
> as Latin texts, but in other cases, the consisting Latin
> characters are dealt as Han/Kana characters. It is very
> confusing when we are trying to render vertical writing mode.

Fortunately (for me at least) I’m not doing vertical text at this
point, so it is not a concern. I’m mostly worried about efficiency and
whether there need to be any OpenType interaction between the different
scripts which would not be allowed due to the run splitting.

Regards,
Khaled

[HarfBuzz] Itemising Japanese scripts

2016-04-24 Thread Khaled Hosny

I’m wondering what the best practice is for itemising Japanese scripts
(Han, Hiragana, Katakana): should they be merged somehow, or is it better
to keep them in separate runs?

I’m currently treating them as separate scripts so they end up in
separate runs, but in the ~7000 characters of Japanese text I’m testing
with I get ~2000 runs. If it were English or Arabic text it would
be just 1 run, so this seems quite inefficient (though I didn’t make any
measurements).

Regards,
Khaled
___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/harfbuzz

Re: [HarfBuzz] have markfiltersets ever worked?

2016-04-04 Thread Khaled Hosny

On Mon, Apr 04, 2016 at 11:39:53PM +0200, Khaled Hosny wrote:
> On Mon, Apr 04, 2016 at 02:31:59PM -0700, Behdad Esfahbod wrote:
> > Khaled, what sequence can I use for testing?  Want to add to the test suite.
> 
> Any of:
> تختة تخنة تخئة تخثة تخٹة

Using https://github.com/khaledhosny/hussaini-nastaleeq (I thought I
mentioned it in my first reply).

Regards,
Khaled

Re: [HarfBuzz] have markfiltersets ever worked?

2016-04-04 Thread Khaled Hosny

On Mon, Apr 04, 2016 at 02:31:59PM -0700, Behdad Esfahbod wrote:
> Khaled, what sequence can I use for testing?  Want to add to the test suite.

Any of:
تختة تخنة تخئة تخثة تخٹة

Regards,
Khaled

Re: [HarfBuzz] Detecting mandatory ligatures

2016-04-04 Thread Khaled Hosny

On Mon, Apr 04, 2016 at 01:02:56PM -0700, Behdad Esfahbod wrote:
> On Mon, Apr 4, 2016 at 12:45 PM, Khaled Hosny <khaledho...@eglug.org> wrote:
> 
> > On Tue, Mar 22, 2016 at 10:52:43PM +, Jamie Dale wrote:
> > > Hey all,
> > >
> > > I've spent today fixing some issues in our editable text controls, mostly
> > > relating to issues caused by the difference between characters and grapheme
> > > clusters.
> > >
> > > I've sorted most of my issues now, but I'm still having an issue with the لا
> > > ligature in Arabic.
> > >
> > > My current code that performs picking on text, or applies formatting to
> > > text that spans a ligature, assumes that any ligature can be split into its
> > > component grapheme clusters, however this assumption does not hold true for
> > > that ligature as it cannot be split.
> > >
> > > Does HarfBuzz have a way to identify these mandatory ligatures, or failing
> > > that, how do people generally deal with this sort of thing? I have ICU
> > > available if it has anything that can help?
> >
> > You just don’t try to identify mandatory ligatures. What we are doing in
> > Scribus (that bit of code is not published yet) is to treat all
> > ligatures as unbreakable. You simply find how many characters in a
> >
> 
> s/characters/Unicode graphemes/

Right.

Regards,
Khaled

Re: [HarfBuzz] Detecting mandatory ligatures

2016-04-04 Thread Khaled Hosny

On Tue, Mar 22, 2016 at 10:52:43PM +, Jamie Dale wrote:
> Hey all,
> 
> I've spent today fixing some issues in our editable text controls, mostly
> relating to issues caused by the difference between characters and grapheme
> clusters.
> 
> I've sorted most of my issues now, but I'm still having an issue with the لا
> ligature in Arabic.
> 
> My current code that performs picking on text, or applies formatting to
> text that spans a ligature, assumes that any ligature can be split into its
> component grapheme clusters, however this assumption does not hold true for
> that ligature as it cannot be split.
> 
> Does HarfBuzz have a way to identify these mandatory ligatures, or failing
> that, how do people generally deal with this sort of thing? I have ICU
> available if it has anything that can help?

You just don’t try to identify mandatory ligatures. What we are doing in
Scribus (that bit of code is not published yet) is to treat all
ligatures as unbreakable. You simply find how many characters are in a
ligature, distribute the width over them and find the width of the
selected part. Then you draw the selection rectangle and render the
ligature twice: once with the highlight color, clipped to the width of
the selected area, and once with the regular color, clipped to the rest
of the glyph width. You can try to use hb_ot_layout_get_ligature_carets()
to get better positions than simply distributing the width over the
number of components, but very few fonts support it and you will need
fallback code anyway.

I believe this is essentially what Firefox does as well.
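
The fallback arithmetic can be sketched as follows. The function names
and numbers are made up for illustration, and text direction is ignored
(for right-to-left text the component order would be mirrored):

```python
def component_edges(x, advance, components):
    """Evenly spaced caret positions across a ligature glyph.

    Returns components + 1 x positions, from the glyph origin to its end.
    """
    step = advance / components
    return [x + i * step for i in range(components + 1)]


def selection_span(x, advance, components, first, last):
    """x range covering components first..last (inclusive, 0-based)."""
    edges = component_edges(x, advance, components)
    return edges[first], edges[last + 1]


# Hypothetical two-component ligature, 600 units wide, placed at x = 100,
# with only its first component selected.
left, right = selection_span(100, 600, 2, 0, 0)
print(left, right)
```

The returned span is the clip rectangle for the highlighted rendering;
the rest of the glyph width is the clip for the regular one.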

Regards,
Khaled
> 
> Thanks,
> Jamie.


Re: [HarfBuzz] have markfiltersets ever worked?

2016-04-01 Thread Khaled Hosny

On Fri, Apr 01, 2016 at 10:30:43AM +0700, Martin Hosken wrote:
> Dear All,
> 
> Has anyone had any success with mark filter sets?

It used to work in the past, but I tried it just now and it does not seem
to work.

Regards,
Khaled

Re: [HarfBuzz] hb_ft_face_create_referenced and the hb_face_t is uninitialized

2016-03-19 Thread Khaled Hosny

On Fri, Mar 18, 2016 at 11:49:10AM -0400, Liam wrote:
> Hello,
> 
> I have a few questions about this method, I am looking to obtain the
> table information to find the values for substitution tables so that I can
> input the correct value for value when I create a hb_feature_t.
> 
> I figured the hb_ot_layout_table_get_feature_tags is the method call that
> would work, this method requires hb_face_t.
> 
> I am using freetype in this project and I have used freetype to create a
> FT_Face, and to save myself the headache I used hb_ft_create_referenced, and

Do you mean hb_ft_face_create_referenced() or
hb_ft_font_create_referenced()?

Regards,
Khaled

Re: [HarfBuzz] Beginner question: What are cluster levels?

2016-01-08 Thread Khaled Hosny

On Fri, Jan 08, 2016 at 04:47:14PM +, Behdad Esfahbod wrote:
> Now, if for example, B and C ligate, then the clusters to which they belong
> "merge".  The merged cluster gets the number that is the minimum of the
> cluster number of the clusters that went in.  In this case, we get:
> 
>   A,BC,D,E
>   0,1 ,3,4
> 
> Now let's assume that the BC glyph decomposes into three components, and D
> also decomposes into two.  The components all inherit the cluster value of
> the parent:
> 
>   A,BC0,BC1,BC2,D0,D1,E
>   0,1  ,1  ,1  ,2 ,2 ,3

Shouldn’t that be:
0,1  ,1  ,1  ,3 ,3 ,4

and so on for the rest of the example?
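
The two rules quoted above (a merged cluster takes the minimum, and
components inherit the parent's value) can be checked with a few lines.
This is only a model of the rules as described, not HarfBuzz code:

```python
def ligate(clusters, start, count):
    """Replace `count` glyphs at `start` by one glyph with the minimum cluster."""
    merged = min(clusters[start:start + count])
    return clusters[:start] + [merged] + clusters[start + count:]


def decompose(clusters, index, parts):
    """Replace the glyph at `index` by `parts` glyphs inheriting its cluster."""
    return clusters[:index] + [clusters[index]] * parts + clusters[index + 1:]


clusters = [0, 1, 2, 3, 4]            # A, B, C, D, E
clusters = ligate(clusters, 1, 2)     # B and C ligate -> A, BC, D, E
after_ligature = list(clusters)
clusters = decompose(clusters, 2, 2)  # D  -> D0, D1
clusters = decompose(clusters, 1, 3)  # BC -> BC0, BC1, BC2
print(after_ligature, clusters)
```

Applying the rules this way, D keeps cluster 3 and E keeps cluster 4
after the decomposition.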

Regards,
Khaled

[HarfBuzz] Some questions for the documentation

2016-01-02 Thread Khaled Hosny

Hi,

I’m trying to document all of the buffer functions [1], and some things are
not very clear to me.

* What is the purpose of hb_buffer_get_empty() and how is it different
  from hb_buffer_create()?

* What are the implications of using different cluster levels?
  My understanding is that level 0 merges clusters of combining marks into
  their bases; anything else? And what is the difference between levels
  1 and 2?

Regards,
Khaled

1. http://behdad.github.io/harfbuzz/harfbuzz-Buffers.html

Re: [HarfBuzz] HarfBuzz glyph offsets

2015-12-28 Thread Khaled Hosny

That was because of the new HB_EXTERN decorator, fixed in:
https://github.com/behdad/harfbuzz/pull/202

Regards,
Khaled

On Sat, Dec 26, 2015 at 03:25:29AM +0400, Khaled Hosny wrote:
> I just noticed now that almost all functions are missing from the
> generated documentation. When I run the build locally I see lots of:
> 
>    ./harfbuzz-sections.txt:422: warning: No declaration found for hb_feature_to_string.
> 
> which would explain why they are missing from the docs, but I couldn’t
> manage to find why it can’t find them with my limited understanding of
> gtk-doc.
> 
> On Fri, Dec 25, 2015 at 06:46:22PM +0100, Behdad Esfahbod wrote:
> > This is all live now:
> > http://behdad.github.io/harfbuzz/
> > 
> > 
> > On 15-12-24 04:32 AM, Simon Cozens wrote:
> > > On 24/12/2015 11:39, Deepak Jois wrote:
> > >> Here is an old thread that I have bookmarked, regarding whatever little
> > >> documentation that does exist:
> > >>
> > >> http://lists.freedesktop.org/archives/harfbuzz/2015-August/005036.html
> > > 
> > > When Khaled's PR lands, there'll be docs available at
> > > http://behdad.github.io/harfbuzz/
> > > 
> > > (In the meantime the docs are at http://khaledhosny.github.io/harfbuzz/
> > > - like I said, sorry I've dropped the ball on the user manual. As well
> > > as the skeleton that's there, there's an awful lot more I need to add to
> > > it. But finding the time...)
> > > 
> > > Behdad, any reason this shouldn't be merged?
> > > 

Re: [HarfBuzz] HarfBuzz glyph offsets

2015-12-25 Thread Khaled Hosny

I just noticed now that almost all functions are missing from the
generated documentation. When I run the build locally I see lots of:

   ./harfbuzz-sections.txt:422: warning: No declaration found for hb_feature_to_string.

which would explain why they are missing from the docs, but I couldn’t
manage to find why it can’t find them with my limited understanding of
gtk-doc.

On Fri, Dec 25, 2015 at 06:46:22PM +0100, Behdad Esfahbod wrote:
> This is all live now:
> http://behdad.github.io/harfbuzz/
> 
> 
> On 15-12-24 04:32 AM, Simon Cozens wrote:
> > On 24/12/2015 11:39, Deepak Jois wrote:
> >> Here is an old thread that I have bookmarked, regarding whatever little
> >> documentation that does exist:
> >>
> >> http://lists.freedesktop.org/archives/harfbuzz/2015-August/005036.html
> > 
> > When Khaled's PR lands, there'll be docs available at
> > http://behdad.github.io/harfbuzz/
> > 
> > (In the meantime the docs are at http://khaledhosny.github.io/harfbuzz/
> > - like I said, sorry I've dropped the ball on the user manual. As well
> > as the skeleton that's there, there's an awful lot more I need to add to
> > it. But finding the time...)
> > 
> > Behdad, any reason this shouldn't be merged?
> > 
Re: [HarfBuzz] HarfBuzz glyph offsets

2015-12-24 Thread Khaled Hosny

On Thu, Dec 24, 2015 at 12:50:42PM -0800, Jonathan Blow wrote:
> Khaled wrote:
> 
> 
> 
> > Each Unicode character has a script property, so you don’t need to hard
> > code it for the text. The only complication is inherited or common
> > characters, but there is a simple heuristic to handle them, see for
> > example:
> > https://github.com/HOST-Oman/libraqm/blob/master/raqm.c#L289
> >
> > But if you are sure your text is always single script and language (I
> > see the Arabic has English words, so doesn’t seem to be the case), then
> > you can hard code the script values.
> >
> 
> Does this mean that passing UNKNOWN and letting HB figure it out is the
> right thing then?

HarfBuzz does not do any such detection by default (there is the guess
segment properties function, but it does very simplistic detection and
is meant only for quick testing, not real-world use).
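
Since the caller has to assign scripts itself, here is one simple form of
the heuristic for common and inherited characters. It assumes the
per-character script codes come from some Unicode library, and it is not
necessarily what libraqm does (real implementations handle more cases,
such as paired brackets):

```python
UNRESOLVED = {"Zyyy", "Zinh"}  # ISO 15924 codes for Common and Inherited


def resolve_scripts(scripts):
    """Give Common/Inherited characters the script of the preceding character;
    leading ones take the first real script that follows."""
    resolved = list(scripts)
    last = None
    for i, script in enumerate(resolved):
        if script in UNRESOLVED:
            if last is not None:
                resolved[i] = last
        else:
            last = script
    # Leading Common/Inherited characters: look ahead for the first real script.
    first_real = next((s for s in resolved if s not in UNRESOLVED), None)
    if first_real is not None:
        for i, script in enumerate(resolved):
            if script not in UNRESOLVED:
                break
            resolved[i] = first_real
    return resolved


# Hypothetical per-character scripts for a short mixed Latin/Arabic string.
mixed = ["Zyyy", "Latn", "Zyyy", "Arab", "Zinh", "Zyyy", "Latn"]
print(resolve_scripts(mixed))
```

Consecutive characters with the same resolved script then form one run,
each shaped with its own script value.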

> For example: Is there some sample text in mixed Arabic w/ bidi English
> names, etc, that will come out wrong if I just set the language to "arb"
> and script to arabic? That is what I am doing in those screenshots, and
> whereas "they look fine to me" we all know that is no guarantee things
> aren't horrible in some corner case.

Here is a quick test:
~$ hb-shape DejaVuSans.ttf fiAV --script=latn --direction=ltr
[fi=0+1290|A=2+1270|V=3+1401]

~$ hb-shape DejaVuSans.ttf fiAV --script=arab --direction=ltr
[f=0+721|i=1+569|A=2+1401|V=3+1401]

You get no ligature or kerning in the second case, and probably no
Latin-specific features will be activated at all. Not all fonts will
fail like this, but many will.

Regards,
Khaled

Re: [HarfBuzz] HarfBuzz glyph offsets

2015-12-23 Thread Khaled Hosny

On Wed, Dec 23, 2015 at 01:27:45PM -0800, Jonathan Blow wrote:
> I am having a weird problem where I can render text with HarfBuzz and it is
> generally doing the right thing in terms of shaping, but glyph offsets seem
> to be coming out zero all the time, leading to general badness.
> 
> Here's a very clear case, I am rendering Chinese with the initial character
> being this left-bracket-quote-thing that is supposed to be offset pretty
> far to the right:
> 
> [image: Inline image 1]
> 
> (Yeah, nothing is really happening here to shape this text, I am just
> saying that the shaping and all that is working fine with Arabic etc, but
> this particular square-bracket quote is a really obvious case of an offset
> being wrong).
> 
> The hb_glyph_position_t for this glyph gives me:
> 
> x_advance = 2880
> y_advance = 0
> x_offset = 0
> y_offset = 0
> 
> Now, when I ask FreeType about the glyph,
> 
> format = FT_GLYPH_FORMAT_BITMAP
> bitmap_left = 29
> bitmap_top = 39
> advance.x = 2880
> advance.y = 0
> metrics.horiBearingX = 1856
> metrics.horiBearingY = 2496
> metrics.horiAdvance = 2880
> 
> So the advances agree, but the offsets are getting dropped on the floor
> somehow.

FreeType does not give you any offsets. Unless you mean the side
bearings, but they are very different things and shouldn’t be used for
glyph placement at all. You are getting 0 offsets because either the
font does not do any special positioning here (not uncommon for old
Chinese fonts), or something is wrong in the way the text is shaped.

> (My Arabic rendering is messed up too, by the way ... you can see the
> current results at this blog posting:
> http://the-witness.net/news/2015/12/entering-the-home-stretch/ ... so it is
> not like this is a corner case.)

For Arabic, it looks like you are applying the kerning in the wrong
direction, see how the ا in لطراز is moving closer to the ز instead of
the ر as I’d expect it.

> 
> To shape I am doing this:
> 
> hb_buffer_clear_contents(hb_buffer);
> hb_buffer_set_direction(hb_buffer, HB_DIRECTION_LTR);
> hb_buffer_set_script(hb_buffer, HB_SCRIPT_UNKNOWN);

If you are not setting the correct text script, then you are unlikely to
get correct output. You need to analyse the text and pass the correct
script to HarfBuzz (resolving characters with common and inherited
script properties, etc).

Regards,
Khaled

Re: [HarfBuzz] HarfBuzz glyph offsets

2015-12-23 Thread Khaled Hosny

On Wed, Dec 23, 2015 at 03:20:10PM -0800, Jonathan Blow wrote:
> >
> >
> >
> > You don't need to look into bearings metrics if you want to render
> > something. And no, they are not offsets.
> >
> > HB gives you advances and offset vectors. So what you need to do is:
> >
> > (origin) -> (render glyph at origin+offset) -> (move origin by advance)
> > -> repeat.
> >
> 
> I guess the point of confusion is what does "render glyph" mean in your
> flowchart here. I am not able to conceive of any version of "render glyph"
> whose implementation does not involve adding horiBearingX, which is why I
> wonder if you guys are thinking about it as calling some API call that
> people commonly presume but which is not what is going on in my case.
> 

The problem is that you are mixing “render glyph” internals with the rest
of this flow chart. How you render a glyph is up to what font rendering
API you use, just don’t mix it with how the glyph origin is placed.
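
To make the flow chart concrete, here is a small C sketch of the placement loop on its own, with rendering left out entirely. The struct mirrors the four fields of hb_glyph_position_t but is defined locally so the sketch compiles without HarfBuzz; the function and type names are illustrative only.

```c
#include <stddef.h>

/* Local stand-in for the fields of hb_glyph_position_t used here. */
struct glyph_pos { int x_advance, y_advance, x_offset, y_offset; };
struct point { int x, y; };

/* For each glyph: its origin is the pen position plus the offset, then the
 * pen moves by the advance. The offset never moves the pen.
 * Returns the pen position after the last glyph. */
struct point place_glyphs(const struct glyph_pos *pos, size_t count,
                          struct point pen, struct point *origins)
{
    size_t i;
    for (i = 0; i < count; i++) {
        origins[i].x = pen.x + pos[i].x_offset;
        origins[i].y = pen.y + pos[i].y_offset;
        pen.x += pos[i].x_advance;
        pen.y += pos[i].y_advance;
    }
    return pen;
}
```

Whatever draws the glyph then receives origins[i] and nothing else from the layout step. Note that these values are in the font's y-up coordinate space, so a y-down screen needs the usual sign flip.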

Regards,
Khaled

Re: [HarfBuzz] HarfBuzz glyph offsets

2015-12-23 Thread Khaled Hosny

On Wed, Dec 23, 2015 at 07:03:25PM -0800, Jonathan Blow wrote:
> The most frustrating thing about sending mail to a mailing list asking a
> question is that everyone treats you like a junior programmer.
> 
> 
> > > I am not able to conceive of any version of "render glyph" whose
> > > implementation does not involve adding horiBearingX
> >
> > Can you show your code? What are you using to turn the glyph onto pixels
> > on the screen?
> >
> > Normally people doing text layout don't do that bit themselves - they
> > use a third party rendering library to do it. *That* library needs to
> > care about sidebearings, but those of us working on the text layout
> > layer just need to worry about where the box starts and where the next
> > box is.
> >
> 
> "*That* library" is my code that I am talking about.
> 
> The first time I encounter a glyph that has not been rendered, I allocate
> space for it in a texture map in a reasonably-tightly-packed way according
> to the metrics given by FT, use FT_Load_Glyph(..., FT_LOAD_RENDER) to
> rasterize the glyph, copy the rasterized data into the texture map, then
> flag the texture as dirty so it will be uploaded to the GPU next time I use
> it.
> 
> In order to do this kind of thing, one only allocates space for the
> rectangular area in the glyph that actually uses any pixels, because
> packing empty space is dumb and wasteful. So when rendering glyphs, what
> one thinks of as "the glyph" is the actual rectangle that actually contains
> pixels in it. The bearing is then just the delta from the cursor position
> to the place at which you actually want to draw the glyph.
> 
> In other words, it is an offset. "Bearings" and "offsets" are Exactly. The.
> Same. Thing. The difference is just that the "bearing" is constant per
> glyph but HB's "offset" which I guess is due to kerning (and whatever else)
> varies per instance of a glyph. That is the only difference. You only think
> they aren't the same thing if you are hiding implementation details from
> yourself (which for many people is fine, if you don't need to think about
> lower-level code, great, go to town).

For us (text layout people and type designers) the glyph is the whole
thing with the space around it, whereas you are talking about the “ink”
part of it. HarfBuzz as a layout engine is not concerned with the ink
inside the glyph and doesn’t even know anything about it.
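
A short C sketch of how the two views fit together in a texture-atlas renderer like the one described: the layout step supplies the glyph origin, and the rasteriser's bitmap_left/bitmap_top (the FreeType glyph slot fields quoted earlier in the thread) say where the ink rectangle sits relative to that origin. It assumes the origin has already been converted to integer pixels in a y-down grid; the names are illustrative.

```c
struct pixel_pos { int x, y; };

/* Top-left corner of the rasterised ink rectangle for a glyph whose origin
 * was placed at (origin_x, origin_y) by the layout step.
 * bitmap_left: pixels from the origin to the left edge of the bitmap.
 * bitmap_top:  pixels from the baseline up to the top edge of the bitmap.
 * The screen is assumed y-down, hence the subtraction. */
struct pixel_pos bitmap_top_left(int origin_x, int origin_y,
                                 int bitmap_left, int bitmap_top)
{
    struct pixel_pos p;
    p.x = origin_x + bitmap_left;
    p.y = origin_y - bitmap_top;
    return p;
}
```

The bearing is applied here, inside "render glyph", once per draw. It never feeds back into the pen position or into the offsets reported by the shaper.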

Regards,
Khaled

Re: [HarfBuzz] HarfBuzz glyph offsets

2015-12-23 Thread Khaled Hosny

On Wed, Dec 23, 2015 at 02:37:38PM -0800, Jonathan Blow wrote:
> >
> >
> >
> > FreeType does not give you any offsets. Unless you mean the side
> > bearings, but they are very different things and shouldn’t be used for
> > glyph placement at all. You are getting 0 offsets because either the
> > font does not do any special positioning here (not uncommon for old
> > Chinese fonts), or something is wrong in the way the text is shaped.
> >
> 
> Okay, this is confusing. Maybe this has to do with different cultural
> assumptions or different ideas about what the software looks like, but
> bearings are just offsets to me. When I look at this diagram from the
> FreeType documentation:
> 
> http://www.freetype.org/freetype2/docs/tutorial/step2.html
> 
> It says that wherever the cursor is, add bearingX as an offset and draw the
> thing in the box. And in fact this is how I draw things when they show up
> in the proper places. So I am not sure what you mean here; my guess is that
> you are thinking of there being a little bit of a software stack living
> below you that itself adds bearingX to draw the thing in the box, so that
> it would be incorrect for you to do so?
>
> So then my interpretation is that the 'offsets' from HB are additional and
> are not meant to include bearings, and are only for kerning or for like
> things, and I should actually add both offsets together I guess.

HarfBuzz gives you the placement of each glyph (that is, its origin
point). How to render individual glyphs at this origin point is up to
the software stack you use; conflating glyph positioning with the way
the FreeType API works will just lead to trouble, as you are
experiencing yourself.

> > For Arabic, it looks like you are applying the kerning in the wrong
> > direction, see how the ا in لطراز is moving closer to the ز instead of
> > the ر as I’d expect it.
> >
> 
> That's quite possible, I will investigate. My old code, pre-HB, would
> switch to drawing glyphs right-to-left for RTL languages, but upon starting
> to use HB I was pretty confused because it looks to me like HB is always
> giving me glyphs in left-to-right order, even for RTL languages (if there
> is a setting that changes this, I have no idea). So at some point I just
> simplified all the rendering code to always be LTR, though it is possible I
> made a mistake or else for some reason kerning is RTL while everything else
> is LTR which would be really weird??
>
> Oh wait ... I remember something about kerning in otf/ttf always having to
> do with the logical order of glyphs and not the display order; does HB keep
> this convention? I had interpreted offsets as being screen positions, not
> deltas in some abstract kerning space, so that could be the source of the
> issue.
>  

The OpenType model positions glyphs from left to right always, HarfBuzz
should give you the glyphs and positions in their visual order, so after
shaping you don’t need to care whether the text was LTR or RTL (until
you start doing line breaking, because it needs to be done in logical
order).

> > If you are not setting the correct text script, then you are unlikely to
> > get correct output. You need to analyse the text and pass the correct
> > script to HarfBuzz (resolving characters with common and inherited
> > script properties, etc).
> >
> 
> In this case we have a fixed set of languages (currently 15) that we are
> just swapping in and out, so I can set this stuff on a per-language basis,
> so I guess for each one I need to know:
> 
> * what to pass to hb_buffer_set_script
> 
> * what to pass to hb_buffer_set_language (I am not sure how this latter
> affects results in different ways than the former)
> 
> ... Anything else?

Each Unicode character has a script property, so you don’t need to hard
code it for the text. The only complication is inherited or common
characters, but there is a simple heuristic to handle them, see for
example:
https://github.com/HOST-Oman/libraqm/blob/master/raqm.c#L289

But if you are sure your text is always single script and language (I
see the Arabic has English words, so doesn’t seem to be the case), then
you can hard code the script values.
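
As a rough illustration of the kind of heuristic meant here, the C sketch below resolves Common and Inherited characters by giving them the script of the preceding character, and falls back to the following script for neutrals at the start of the text. This is a simplified assumption about the approach, not a copy of the libraqm code (which, among other things, may treat paired brackets specially), and the enum is a stand-in for real Unicode script values.

```c
#include <stddef.h>

/* Illustrative script tags; a real program would take these from its
 * Unicode library. */
enum script { SCRIPT_COMMON, SCRIPT_INHERITED, SCRIPT_LATIN, SCRIPT_ARABIC };

int is_neutral(enum script s)
{
    return s == SCRIPT_COMMON || s == SCRIPT_INHERITED;
}

/* Resolve neutral entries in place:
 *  1. a neutral character takes the script of the character before it
 *     (already resolved, since the scan runs left to right);
 *  2. neutrals at the very start take the first real script that follows.
 * Entries stay neutral only if the whole text is neutral. */
void resolve_scripts(enum script *scripts, size_t len)
{
    size_t i, first = len;

    for (i = 0; i < len; i++) {
        if (!is_neutral(scripts[i])) {
            if (first == len)
                first = i;
        } else if (i > 0 && !is_neutral(scripts[i - 1])) {
            scripts[i] = scripts[i - 1];
        }
    }
    for (i = 0; first < len && i < first; i++)
        scripts[i] = scripts[first];
}
```

After this pass, consecutive characters with the same script form one run, and each run is shaped with its own script value.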

Regards,
Khaled

Re: [HarfBuzz] harfbuzz versus fribidi

2015-12-22 Thread Khaled Hosny

On Wed, Aug 26, 2015 at 11:27:40AM +0200, Eduardo Castineyra wrote:
> On 8/25/2015 2:51 PM, Graham Douglas wrote:
> >On 25/08/2015 13:26, Behdad Esfahbod wrote:
> >>I should add API to FriBidi to align it better for use with HarfBuzz...
> >Hi Behdad
> >
> >Yes, yes + yes --- that would be awesome :-)
> 
> Second.

There is now fribidi_reorder_runs():
https://github.com/behdad/fribidi/pull/10

Regards,
Khaled

Re: [HarfBuzz] HarfBuzz + SDL_ttf

2015-12-22 Thread Khaled Hosny

On Wed, Aug 19, 2015 at 12:01:48PM +0100, Behdad Esfahbod wrote:
> Sylvain Becker wrote to me to point out that he has updated SDL_ttf to use
> HarfBuzz:
> 
>   https://bugzilla.libsdl.org/show_bug.cgi?id=3046
> 
> Not sure exactly what the implications are, but I thought I share.

Here is another patch that uses both HarfBuzz and FriBiDi:
https://bugzilla.libsdl.org/show_bug.cgi?id=3211

Regards,
Khaled

Re: [HarfBuzz] Question about zero width glyphs in shaping output

2015-12-08 Thread Khaled Hosny

On Mon, Dec 07, 2015 at 09:13:23PM +0530, Deepak Jois wrote:
> Maybe this is a bit related to Khaled’s question earlier about control
> characters inside ligatures, but I wanted to start a new thread.
> 
> When I shape text with Noto Nastaliq, I notice a bunch of zero-width
> glyphs generated
> 
> $> hb-unicode-encode U+06CC,U+06C1 |  hb-shape notonastaliq.ttf
> [HehFin=1+472|TwoDotsBelowNS=0@310,-383+0|sp2=0+0|BehxIni.outS1=0@0,-68+731]
> 
> 1. What is the purpose of these zero-width glyphs?

That is something internal to the font, they are not glyphs inserted by
HarfBuzz.

> 2. If I am rendering the shaped output to a PDF file (for e.g. when
> using Harfbuzz with LuaTeX), do I need to care about these zero-width
> glyphs at all? How will they affect rendering

You should just output the glyphs as returned by HarfBuzz;
second-guessing them is likely to go wrong. If a character should be
invisible, HarfBuzz will replace it with the space glyph, so you need not
worry about this unless you really know what you are doing.

Regards,
Khaled

Re: [HarfBuzz] Control characters inside ligatures

2015-12-07 Thread Khaled Hosny

On Mon, Dec 07, 2015 at 09:14:19AM +0100, Behdad Esfahbod wrote:
> On 15-12-05 03:31 PM, Khaled Hosny wrote:
> > Hi,
> > 
> > I just noticed that when there is a control character between characters
> > that form a ligature, there is a zero width space after the ligature
> > with a cluster value of the first character in the ligature, for
> > example:
> > 
> > $ hb-unicode-encode U+0066,U+200C,U+0069 | hb-shape amiri-regular.ttf
> > [f_i=0+1064|space=0+0]
> > 
> > or 
> > 
> > $ hb-unicode-encode U+0066,U+00AD,U+0069 | hb-shape amiri-regular.ttf 
> > [f_i=0+1064|space=0+0]
> > 
> > This is rather surprising as I was expecting the control character to be
> > consumed inside the ligature and only the ligature glyph would remain. I
> > think the current behaviour makes mapping glyphs to text indices harder
> > in this case. WDYT?
> 
> I don't think it makes any difference.  It's a zero-width glyph, so it
> contributes nothing to the cluster as a whole, so you still have to divide the
> sum of the widths of the glyphs by the number of cursor stops and that works
> the same both ways.  No?

I was thinking in terms of line breaks: since the soft hyphen is a break
opportunity, I need to know that the sequence f, soft hyphen, i became
the f_i glyph, but I’m not sure how to do that with the extra glyph
with the same cluster value. But maybe I’m looking at it from the wrong
angle, and I simply need to reshape the left side (probably with a real
hyphen) and the right side and just break the line there.
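
For reference, the cluster arithmetic Behdad describes can be sketched in C as below: add up the advances of all glyphs sharing a cluster value, then divide that width by the number of characters in the cluster to get caret stops. The struct and function names are illustrative, not HarfBuzz API, and the even split is only an approximation of where the characters visually sit.

```c
#include <stddef.h>

/* Illustrative stand-in for the per-glyph cluster and advance. */
struct shaped_glyph { unsigned cluster; int x_advance; };

/* Total advance of every glyph that carries the given cluster value.
 * A zero-width extra glyph in the cluster contributes nothing. */
int cluster_width(const struct shaped_glyph *glyphs, size_t count,
                  unsigned cluster)
{
    int width = 0;
    size_t i;
    for (i = 0; i < count; i++)
        if (glyphs[i].cluster == cluster)
            width += glyphs[i].x_advance;
    return width;
}

/* Offset of the caret placed before character `index` (0..n_chars) of a
 * cluster, dividing the cluster width evenly among its characters. */
int caret_offset(int width, int n_chars, int index)
{
    return (int)((long long)width * index / n_chars);
}
```

With the f_i=0+1064 and space=0+0 output above and three characters in the cluster, the width is 1064 either way, so the extra zero-width glyph does not change the caret positions.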

Regards,
Khaled

[HarfBuzz] Control characters inside ligatures

2015-12-05 Thread Khaled Hosny

Hi,

I just noticed that when there is a control character between characters
that form a ligature, there is a zero width space after the ligature
with a cluster value of the first character in the ligature, for
example:

$ hb-unicode-encode U+0066,U+200C,U+0069 | hb-shape amiri-regular.ttf
[f_i=0+1064|space=0+0]

or 

$ hb-unicode-encode U+0066,U+00AD,U+0069 | hb-shape amiri-regular.ttf 
[f_i=0+1064|space=0+0]

This is rather surprising as I was expecting the control character to be
consumed inside the ligature and only the ligature glyph would remain. I
think the current behaviour makes mapping glyphs to text indices harder
in this case. WDYT?

Regards,
Khaled

Re: [HarfBuzz] Some questions about scripts and languages

2015-12-05 Thread Khaled Hosny

On Sat, Dec 05, 2015 at 04:52:13PM +0530, Deepak Jois wrote:
> I have a few questions about script and language handling in Harfbuzz APIs
> 
> 1. It seems that hb_buffer_guess_segment_properties uses the LC_TYPE
> as the language. In my case, for whatever reason it is reporting it as
> ‘c’.

If your application does not call setlocale(), you will get the C
locale.
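
A tiny C sketch of the underlying libc behaviour, independent of HarfBuzz: querying the locale with a NULL argument changes nothing and, in a program that has not called setlocale() with a real value, reports the "C" locale. The function name is just for illustration.

```c
#include <locale.h>

/* Name of the current LC_CTYPE locale. Passing NULL only queries the
 * setting; it does not modify it. */
const char *current_ctype_locale(void)
{
    return setlocale(LC_CTYPE, NULL);
}
```

Calling setlocale(LC_ALL, "") early in main() makes the program adopt the locale from the user's environment, after which the query above would return that locale's name instead.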

> How does that effect shaping, if at all?

It matters if the font has language-specific features, for example:
https://en.wikipedia.org/wiki/Serbian_Cyrillic_alphabet#Differences_from_other_Cyrillic_alphabets

Another example is how Urdu-specific number shapes are implemented in most fonts.

> 2. Is there a canonical list of languages that are defined in any font
> standard? Can/does Harfbuzz validate against them? The answer seems no
> from a cursory look at the code.

https://www.microsoft.com/typography/otspec/languagetags.htm

Regards,
Khaled

Re: [HarfBuzz] Some questions about scripts and languages

2015-12-05 Thread Khaled Hosny

On Sat, Dec 05, 2015 at 06:52:34PM +0400, Khaled Hosny wrote:
> On Sat, Dec 05, 2015 at 04:52:13PM +0530, Deepak Jois wrote:
> > I have a few questions about script and language handling in Harfbuzz APIs
> > 
> > 1. It seems that hb_buffer_guess_segment_properties uses the LC_TYPE
> > as the language. In my case, for whatever reason it is reporting it as
> > ‘c’.
> 
> If your application does not call setlocale(), you will get the C
> locale.

BTW, you shouldn’t really be using hb_buffer_guess_segment_properties;
it is just for quick testing, or a last resort when you absolutely have
to use it. Text direction should be determined by applying the Unicode
bidi algorithm, and script has to be derived from the Unicode character
properties after resolving characters with common and inherited script
properties. Language can’t be auto-detected, so the user has to supply
it; using the system locale is just a good-enough measure in case the
user can’t select a language, otherwise using the `dflt` language might
be better (it ensures the text renders the same regardless of the locale).

BTW, here is a tiny library that uses FriBiDi, HarfBuzz, FreeType and
its own script resolving code to lay out text:

https://github.com/HOST-Oman/libraqm

The code is a bit messy right now, though.

Regards,
Khaled

Re: [HarfBuzz] Streamlining hb_font_t some more

2015-10-26 Thread Khaled Hosny

On Mon, Oct 26, 2015 at 05:30:20PM +0900, Simon Cozens wrote:
> On 09/10/2015 15:09, Khaled Hosny wrote:
> >> should just use the typographical ascender/descender of the font and hence 
> >> not
> >> need glyph bounding boxes in Sile at all.
> > 
> > Yes please, an approach similar to what browsers do would be much
> > appreciated.
> 
> OK; I've implemented support in Harfbuzz for getting these metrics out
> of the OS/2 table.
> 
> Now that I've fiddled about with it, I don't agree that the correct way
> to determine line space for *print* is the way that browsers do it - I'm
> actually more convinced that using glyph metrics makes the most sense.

The problem with using glyph metrics is that you don’t get consistent
line spacing. For example, in Arabic with a font like Amiri, some lines
contain tall glyphs with stacked marks and some do not, so you end up
with big interline spacing between some lines, a very bad result
overall. I had to jump through hoops to get anything remotely consistent
(by forcing the line spacing throughout the document to be as big
as the biggest one, and it looks rather ugly). TeX’s idea is to avoid
overlap, but this assumes that just because a line has a few tall
glyphs, it will overlap with the previous line. In practice this rarely
happens, because there are usually enough gaps in the other line to
accommodate those glyphs.

On the web I just get proper line spacing out of the box, and when my
document requires bigger line spacing than the default (fully vocalised
text, for example), I just set the CSS line-height to a good value and
get a consistent result.
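
For comparison, the font-metrics approach reduces to one formula that does not depend on which glyphs happen to be on the line. The C sketch below assumes the usual convention that the ascender is positive and the descender negative, both in font units; the function name is illustrative, and which table the three values come from (hhea or the OS/2 typographic metrics) is left to the caller.

```c
/* Default line height, in the same unit as font_size, from font-wide
 * metrics given in font units. descender is expected to be negative. */
double font_line_height(int ascender, int descender, int line_gap,
                        unsigned units_per_em, double font_size)
{
    return (double)(ascender - descender + line_gap) * font_size
           / (double)units_per_em;
}
```

As a made-up example, a font with ascender 800, descender -200 and line gap 200 on a 1000-unit em gives 1.2 times the font size, so 14.4pt for 12pt text, on every line regardless of stacked marks. An explicit line-height setting then simply replaces this default.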

> The problem comes when you have mixed font sizes (not something that
> really *should* happen in print, I know, but I still want to do the
> right thing when it does). The question is, what is the right way to lay out
> 
> 1) \font[size=30pt]{Lorem} ipsum dolor...
> 
> and
> 
> 2) \font[size=30pt]{Lorem ipsum} dolor...
> 
> In InDesign, both (1) and (2) get 12 x 1.2 = 14.4pt interline space.
> This means that the descender of the "p" in "ipsum" will bump into
> letters on the next line. That's clearly wrong.

You can always have a way to set the interline spacing overriding the
automatic calculation, just like in CSS (a sane way to do it, not the
convoluted interdependent parameters that TeX uses and only ~2 persons
in the world understand them).

Regards,
Khaled

Re: [HarfBuzz] Streamlining hb_font_t some more

2015-10-26 Thread Khaled Hosny

On Mon, Oct 26, 2015 at 03:59:54PM +0100, Werner LEMBERG wrote:
> 
> > I am not a typographer, I just play one on the Internet, so I am not
> > sure what someone who was actually typesetting a book would do in
> > that situation.  My guess would be that they would, basically, do
> > what SILE does right now (and what TeX does; perhaps Knuth knew what
> > he was doing after all) - use consistent 14.4pt (or whatever) line
> > spacing in situation (1) and use larger line spacing which fits in
> > the descender in situation (2).  But I would have to ask a real
> > typesetter to know.
> 
> Knuth knew *very well* what he was doing, and the TeX typesetting
> model works just fine for almost all cases, even more than 30 years
> later.

That is such a big claim, judging by the number of people struggling with
TeX line spacing on tex.stackexchange.com:
http://tex.stackexchange.com/questions/tagged/line-spacing

> Given that the typographic values from the `OS/2' SFNT table are crap
> in far too much fonts, and that Apple and MS differ on the right
> approach, I would really not using it.  Your (1) and (2) are the way
> to go, IMHO.

I have only seen a handful of utterly broken fonts; the rest of the
world is doing just fine.

Regards,
Khaled

Re: [HarfBuzz] Different results when shaping sub-sections of text

2015-10-10 Thread Khaled Hosny

On Wed, Oct 07, 2015 at 12:41:56PM +0100, Jamie Dale wrote:
> I'll admit that colour only was a bad example, but aside from also being
> able to change the font or font size, our rich-text can also contain
> completely user-defined widgets. This can make extracting out the style
> information... tricky, since I don't really know how it's being used (and
> may actually be part of a nested control, such as a button or hyperlink).

That is your call, but I’d go for a solution that at least covers known
formatting properties. As I said, such shape splitting is bad and should
be avoided whenever possible.
 
> Rich-text itself is actually a secondary concern right now, my primary
> concern is selection highlighting (which uses a similar mechanism, as text
> is broken into runs where it is selected, since selection can change the
> text colour). That said, selection isn't allowed to change the font used so
> I can more easily combine the selected and non-selected text into a single
> shape, however I'm still unsure how ligatures would be handled in that case.
> 
> I'll use English for simplicity since I can actually read it. Imagine I
> have the text "Magnificent", where the "fi" has been combined into a
> ligature. If I were to select "Magnif", then in order to change the colour
> of that portion of the text, the ligature would have to be split. This
> doesn't present a readability issue for English, but would it present
> issues for other languages?

You would be getting completely different glyphs for selected and
unselected text, which strikes me as a rather bad user experience. I
have never used an application that does anything like this. What I have
seen is that naive applications either color the whole
ligature or not at all, while more sophisticated applications use
clipping to color just the part of the glyph they think belongs to the
highlighted characters (and determining this can be done either by just evenly
distributing the ligature advance width over its components or by using
hb_ot_layout_get_ligature_carets(), with the former method as a
fallback). Also note that splitting the text is not only about the
ligatures, in the Amiri case you showed no ligatures were involved at
all so you should have no problem coloring the highlighted part without
playing any tricks, and there are Latin fonts that also handle
f-ligatures by using contextual forms and no actual ligatures.
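
The clipping approach can be sketched in C as follows. The function takes the ligature's advance plus an optional array of caret positions (the kind of data hb_ot_layout_get_ligature_carets() can return when the font provides it) and falls back to an even split when no usable carets are given. Everything here is an illustrative assumption: the names are made up, the glyph is assumed to be laid out left to right, and carets are assumed to be measured from the glyph origin in increasing order.

```c
#include <stddef.h>

struct x_range { int start, end; };

/* Position of the boundary before component `index` (0..n_components),
 * relative to the glyph origin. Uses the supplied carets when there is
 * exactly one per internal boundary, otherwise divides the advance evenly. */
int component_boundary(int advance, int n_components,
                       const int *carets, size_t caret_count, int index)
{
    if (index <= 0)
        return 0;
    if (index >= n_components)
        return advance;
    if (carets != NULL && caret_count == (size_t)(n_components - 1))
        return carets[index - 1];
    return (int)((long long)advance * index / n_components);
}

/* Horizontal clip range covering components [first, last) of a ligature
 * whose origin is at origin_x. */
struct x_range ligature_clip(int origin_x, int advance, int n_components,
                             const int *carets, size_t caret_count,
                             int first, int last)
{
    struct x_range r;
    r.start = origin_x + component_boundary(advance, n_components,
                                            carets, caret_count, first);
    r.end = origin_x + component_boundary(advance, n_components,
                                          carets, caret_count, last);
    return r;
}
```

The renderer draws the ligature once in the normal color, then again in the highlight color with the clip rectangle set to the returned range, so the shaping itself is never split.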

Regards,
Khaled

> 
> -Jamie.
> 
> On 6 October 2015 at 22:45, Khaled Hosny <khaledho...@eglug.org> wrote:
> 
> > On Tue, Oct 06, 2015 at 08:08:00PM +0100, Jamie Dale wrote:
> > > I suspect that the first shape has used some ligatures, and the second
> > > shape was unable to do that due to being unable to combine the glyphs (I
> > > have previously seen this with the "fi" ligature in English).
> > >
> > > If both of these forms are considered acceptable, then I'm happy enough,
> >
> > Shaping parts of text separately is generally a bad idea as you lose any
> > OpenType interaction between these parts, so you only do it when it is
> > absolutely necessary (e.g. due to font change). Though your second image
> > is still barely legible, it loses all the contextual substitutions
> > specified in the font and gives a very suboptimal result, but it can
> > make the text illegible in many other cases, for example when shaping
> > "لا". I expect Indic scripts to suffer more legibility-wise.
> >
> > The proper way is to identify rich-text attributes that shouldn’t break
> > shaping (color, underline, overline, etc.) and apply them after shaping,
> > using cluster values to do the reverse glyph to character index mapping
> > (while at it, use HB_BUFFER_CLUSTER_LEVEL_MONOTONE_CHARACTERS so that
> > you get finer cluster mapping).
> >
> > Regards,
> > Khaled
> >

Re: [HarfBuzz] Different results when shaping sub-sections of text

2015-10-10 Thread Khaled Hosny

On Wed, Oct 07, 2015 at 04:50:25PM +0300, Nikolay Sivov wrote:
> I just tried that in LibreOffice Writer, and it seems like changing color in
> Arabic string disables some advance adjustments, but overall shape is
> intact. That's especially visible if you apply strikeout style to whole text
> - this results in gaps in strikeout line. For English it does indeed break
> ligatures if you try to color 'f' separately from 'i'.

LibreOffice wouldn’t be such a good example to choose as it handles this
very poorly, even MS Office is much better in this regard. I’d check
with Firefox as it generally has much better typographic support.

Regards,
Khaled

Re: [HarfBuzz] Streamlining hb_font_t some more

2015-10-09 Thread Khaled Hosny

On Thu, Oct 08, 2015 at 11:54:09AM -0400, Behdad Esfahbod wrote:
>   So, from my
> point of view, you should NOT use this for line height calculation.  You
> should just use the typographical ascender/descender of the font and hence not
> need glyph bounding boxes in Sile at all.

Yes please, an approach similar to what browsers do would be much
appreciated. The TeX way of handling interline spacing has always been
cumbersome and confusing (not that I know exactly what SILE does now, but
the further from the TeX way here the better).

Regards,
Khaled

Re: [HarfBuzz] Different results when shaping sub-sections of text

2015-10-06 Thread Khaled Hosny

On Tue, Oct 06, 2015 at 08:08:00PM +0100, Jamie Dale wrote:
> I suspect that the first shape has used some ligatures, and the second
> shape was unable to do that due to being unable to combine the glyphs (I
> have previously seen this with the "fi" ligature in English).
> 
> If both of these forms are considered acceptable, then I'm happy enough,

Shaping parts of text separately is generally a bad idea as you lose any
OpenType interaction between these parts, so you only do it when it is
absolutely necessary (e.g. due to font change). Your second image
is still legible, barely, but it loses all the contextual substitutions
specified in the font and gives a very suboptimal result, and splitting
can make the text illegible in many other cases, for example when shaping
"لا". I expect Indic scripts to suffer more legibility-wise.

The proper way is to identify rich-text attributes that shouldn’t break
shaping (color, underline, overline, etc.) and apply them after shaping,
using cluster values to do the reverse glyph to character index mapping
(while at it, use HB_BUFFER_CLUSTER_LEVEL_MONOTONE_CHARACTERS so that
you get finer cluster mapping).
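
A minimal C sketch of that reverse mapping: shape the whole run once, then decide per glyph whether an attribute applies by testing its cluster value against the character range. The function name and the flat array of cluster values are illustrative; in a real program the clusters would come from the shaped buffer's glyph infos.

```c
#include <stddef.h>

/* Mark every glyph whose cluster value lies in the character range
 * [start, end). selected[i] is set to 1 or 0 for each glyph.
 * Returns the number of glyphs marked. */
size_t mark_selected(const unsigned *clusters, size_t count,
                     unsigned start, unsigned end,
                     unsigned char *selected)
{
    size_t i, n = 0;
    for (i = 0; i < count; i++) {
        selected[i] = (unsigned char)(clusters[i] >= start &&
                                      clusters[i] < end);
        n += selected[i];
    }
    return n;
}
```

A glyph that covers several characters, such as an fi ligature, carries a single cluster value, so this simple test colors it entirely or not at all. Coloring only part of such a glyph needs additional clipping on top of this mapping.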

Regards,
Khaled

Re: [HarfBuzz] Issue with cursive attachment

2015-08-25 Thread Khaled Hosny

On Tue, Aug 25, 2015 at 05:06:12PM +0100, Behdad Esfahbod wrote:
> Basically what's happening is this:
> 
>   - Lookup 0 joins gid9 to gid6 saying that gid9 should stay on the baseline,
> 
> then:
> 
>   - Lookup 1 joins gid6 to gid8 saying that gid8 should stay on the baseline.
> 
> Essentially gid6 is attached to two different sides...  I call this a faulty
> font,

It is a hack (but a necessary one) that I didn’t even expect to work, but
it worked in Uniscribe.

> but I'll work something out to match Uniscribe to the extent possible.
> I suppose the later connection shall win, ie. gid8 stays on baseline and
> everything else adjusted to follow.

Yes, that is what Uniscribe is doing and my “expected” result.

Regards,
Khaled

> b
> 
> On 15-08-23 04:58 PM, Khaled Hosny wrote:
> > I’ve a problem with mixing cursive attachment lookups with RTL flag set
> > on some of them and not the others.
> > 
> > I’m attaching two minimal fonts showing this issue, the only difference
> > between the good and bad fonts is that the bad font has a cursive
> > attachment lookup with RTL flag not set while all the others have it
> > set, and HarfBuzz seems not to apply that lookup on the same word
> > where the other lookups are applied.
> > 
> > Testing with the word كمثل, the anchor between the 2nd and 3rd glyphs
> > (from right) is not applied in the bad font but applied in the good one.
> > Both DirectWrite/Uniscribe and Core Text apply it fine in both fonts
> > (though Core Text has other serious problems).
> > 
> > If I unset the RTL bit on all the cursive attachment lookups, it renders
> > correctly in HarfBuzz.
> > 
> > (If you are wondering why I’m doing this, it is because for some of my
> > cursive attachments I want rightmost glyph to stay in its natural
> > position and the other glyphs attached to it instead of having the
> > leftmost glyph stay in its natural position.)
> > 
> > Regards,
> > Khaled
> > 
> > ___
> > HarfBuzz mailing list
> > HarfBuzz@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/harfbuzz

Re: [HarfBuzz] Issue with cursive attachment

2015-08-25 Thread Khaled Hosny

On Tue, Aug 25, 2015 at 03:54:53PM +0100, Behdad Esfahbod wrote:
> Thanks Khaled.  I'm taking a look at this now.  I was fairly sure that this
> was exercised by Noto Nastaliq Urdu, but maybe it wasn't.
> 
> Anyway, can I use these in our test suite please?

Sure, that is just a subset of (unreleased version of) Aref Ruqaa:
https://github.com/khaledhosny/aref-ruqaa

Regards,
Khaled


Re: [HarfBuzz] Issue with cursive attachment

2015-08-25 Thread Khaled Hosny

Seems to work as expected now, thanks!

On Tue, Aug 25, 2015 at 08:31:09PM +0100, Behdad Esfahbod wrote:
 Fixed now.  I actually like the new code better.
 

[HarfBuzz] Issue with cursive attachment

2015-08-23 Thread Khaled Hosny

I’ve a problem with mixing cursive attachment lookups with RTL flag set
on some of them and not the others.

I’m attaching two minimal fonts showing this issue. The only difference
between the good and bad fonts is that the bad font has one cursive
attachment lookup with the RTL flag not set while all the others have it
set, and HarfBuzz seems not to apply that lookup on the same word
where the other lookups are applied.

Testing with the word كمثل, the anchor between the 2nd and 3rd glyphs
(from right) is not applied in the bad font but applied in the good one.
Both DirectWrite/Uniscribe and Core Text apply it fine in both fonts
(though Core Text has other serious problems).

If I unset the RTL bit on all the cursive attachment lookups, it renders
correctly in HarfBuzz.

(If you are wondering why I’m doing this, it is because for some of my
cursive attachments I want the rightmost glyph to stay in its natural
position and the other glyphs attached to it, instead of having the
leftmost glyph stay in its natural position.)
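
To make the effect of the flag concrete, here is a minimal sketch of how the choice of fixed glyph changes the vertical offsets in a cursively attached chain. It is a simplified model and not HarfBuzz's implementation: it assumes every connection in the chain comes from lookups that share the same flag (so it does not cover the mixed case reported here), it only tracks vertical offsets, and the names cursive_y_offsets and LOOKUP_FLAG_RIGHT_TO_LEFT are hypothetical. The value 0x0001 is the rightToLeft bit of the OpenType LookupFlag field. Glyphs are in logical order, so in Arabic text the first glyph is the rightmost one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* OpenType LookupFlag bit 0 (rightToLeft); the macro name is made up. */
#define LOOKUP_FLAG_RIGHT_TO_LEFT 0x0001u

/*
 * Hypothetical helper: compute vertical offsets for n glyphs (logical
 * order) so that the exit anchor of glyph i meets the entry anchor of
 * glyph i + 1, i.e. y_off[i] + exit_y[i] == y_off[i + 1] + entry_y[i + 1].
 *
 * Flag not set: the first glyph stays at its natural position and the
 * following glyphs are adjusted to it.
 * Flag set: the last glyph stays at its natural position and the
 * preceding glyphs are adjusted to it.
 */
void cursive_y_offsets(const int32_t *entry_y, const int32_t *exit_y,
                       size_t n, uint16_t lookup_flag, int32_t *y_off)
{
    assert(n > 0);

    if (lookup_flag & LOOKUP_FLAG_RIGHT_TO_LEFT) {
        y_off[n - 1] = 0;
        for (size_t i = n - 1; i > 0; i--)
            y_off[i - 1] = y_off[i] + entry_y[i] - exit_y[i - 1];
    } else {
        y_off[0] = 0;
        for (size_t i = 1; i < n; i++)
            y_off[i] = y_off[i - 1] + exit_y[i - 1] - entry_y[i];
    }
}
```

With the same anchors, the two settings produce the same shape but shifted vertically, depending on which end of the chain is pinned to the baseline.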

Regards,
Khaled


good.ttf
Description: application/font-ttf


bad.ttf
Description: application/font-ttf

Re: [HarfBuzz] Zero-width joiner has width

2015-08-08 Thread Khaled Hosny

On Sat, Aug 08, 2015 at 04:25:07PM +0200, Behdad Esfahbod wrote:
 On 15-08-02 07:42 PM, Simon Cozens wrote:
  
  On Aug 2, 2015, at 18:08, Jonathan Kew jfkth...@gmail.com wrote:
  Which suggests there's something odd about how you're using harfbuzz.
  
  Ok, that makes sense. And yes, I was ignoring the advance for glyphs and 
  instead using Freetype to return the glyph width. I think I stole that bit 
  of code from xetex. :-)
 
 Really?  Khaled, what's that about?

I don’t think XeTeX ever did this.

Regards,
Khaled
 
 b

Re: [HarfBuzz] Problems with TTB Japanese

2015-06-12 Thread Khaled Hosny

On Fri, Jun 12, 2015 at 12:55:27PM +0900, Simon Cozens wrote:
 On 12/06/2015 09:17, Behdad Esfahbod wrote:
  This happens because HarfBuzz thinks your font instance is set for 
  horizontal
  typesetting.  That is, this returns offsets that work with a font that has
  origin at baseline-left.  What you expect instead can be achieved by
  configuring the font to use a top-center origin.
  
  What font funcs are you using?
 
 This may be the problem. I'm using hb_ft_font_create, getting the glyph
 information and positions. I use x_offset and y_offset from
 hb_buffer_get_glyph_positions to alter the cursor position. (because
 otherwise Arabic doesn't work),  and then use a function like this to
 get the metrics to put the glyph in a (TeX-like) box:
 
 (Following code mostly stolen from XeTeX, with the TTB special case
 added yesterday after talking to the rubber duck.)

Note that XeTeX does not really support vertical typesetting. It lays
the vertical text out horizontally and expects the user to rotate it, so
it might not be the best source of inspiration (unless this is how your
code works, of course).

Regards,
Khaled

Re: [HarfBuzz] how to detect missing glyphs e.g. for font substitition

2015-05-13 Thread Khaled Hosny

On Wed, May 13, 2015 at 10:14:48PM +0400, Konstantin Ritt wrote:
 2015-05-12 14:12 GMT+04:00 Konstantin Ritt ritt...@gmail.com:
 
  2015-05-12 5:41 GMT+04:00 Behdad Esfahbod behdad.esfah...@gmail.com:
 
  On 15-05-11 08:06 AM, Konstantin Ritt wrote:
   Uniscribe has an API to override the .notdef glyph's value
 
  I didn't know this.  Which API is that?
 
 
  Turns out I was lying a bit ;)
 
  There is an API to *retrieve* the .notdef glyph's value:
 
  https://msdn.microsoft.com/en-us/library/windows/desktop/dd368802(v=vs.85).aspx
  (see wgDefault and wgInvalid)
 
 
 wgDefault is probably the 'OS/2' table's usDefaultChar field, which is
 usually set to 0x or to 0x0020 (dunno if we use it somewhere).

Use of usDefaultChar is “strongly discouraged” per the spec anyway.

Regards,
Khaled

Re: [HarfBuzz] how to detect missing glyphs e.g. for font substitition

2015-05-12 Thread Khaled Hosny

On Mon, May 11, 2015 at 08:49:48PM +, Louis Semprini wrote:

  Date: Mon, 11 May 2015 21:35:49 +0200
  From: khaledho...@eglug.org
  To: lsempr...@hotmail.com
  CC: harfbuzz@lists.freedesktop.org
  Subject: Re: [HarfBuzz] how to detect missing glyphs e.g. for font 
  substitition

  On Mon, May 11, 2015 at 07:56:19AM +, Louis Semprini wrote:
   Or, must Harfbuzz callers first do a complete, separate pass where
   they run all code points of the input through some kind of mapping
   routine that uses the fonts' 'cmap' and other tables?  The latter
   would be a shame because it would require the Harfbuzz caller to
   duplicate a vast amount of the complexity that is nicely hidden in
   Harfbuzz in their own code.  It's also a shame because in most cases,
   no font substitution would be needed and so it would be inefficient in
   the average case.

  Some HarfBuzz users do that, i.e. check the font’s cmap table to see what
  characters it supports and select fallback fonts for what it doesn’t
  before even calling HarfBuzz. Others rely on HarfBuzz: for example, in
  LibreOffice the run is first shaped with the user-selected font, then
  any contiguous runs of missing glyphs are reshaped with fallback fonts.
  This also has the advantage of letting HarfBuzz do its normalisation,
  which can result in the font supporting more characters than it declares
  in its cmap table.

 That's good to know, but for the second group of users, how do they
 detect the missing glyphs?  By looking for glyph index 0?

Yes (but as Konstantin said, it depends on what your font functions
return for missing glyphs).
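
A minimal sketch of the second approach, assuming the glyph IDs have already been copied out of the shaped buffer into a plain array. The helper find_missing_runs and the missing_run_t type are hypothetical names, and the "missing" glyph ID is a parameter because, as noted above, it depends on what the font functions return (0 is the usual value). In a real caller, the runs would then be mapped back to text ranges through the cluster values and reshaped with a fallback font.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous run of missing glyphs, as indices into the glyph array. */
typedef struct {
    size_t start; /* index of the first missing glyph */
    size_t end;   /* one past the last missing glyph */
} missing_run_t;

/*
 * Hypothetical helper: scan shaped glyph IDs and record every contiguous
 * run whose glyph ID equals `notdef`. At most max_runs runs are stored;
 * the return value is the total number of runs found, so the caller can
 * tell whether the output array was too small.
 */
size_t find_missing_runs(const uint32_t *glyphs, size_t count,
                         uint32_t notdef,
                         missing_run_t *runs, size_t max_runs)
{
    assert(glyphs != NULL || count == 0);

    size_t found = 0;
    size_t i = 0;
    while (i < count) {
        if (glyphs[i] != notdef) {
            i++;
            continue;
        }
        size_t start = i;
        while (i < count && glyphs[i] == notdef)
            i++;
        if (found < max_runs) {
            runs[found].start = start;
            runs[found].end = i;
        }
        found++;
    }
    return found;
}
```

Reshaping whole runs rather than single glyphs matters because the fallback font may need neighbouring characters as context.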

Regards,
Khaled

Re: [HarfBuzz] how to detect missing glyphs e.g. for font substitition

2015-05-11 Thread Khaled Hosny

On Mon, May 11, 2015 at 07:56:19AM +, Louis Semprini wrote:
 Or, must Harfbuzz callers first do a complete, separate pass where
 they run all code points of the input through some kind of mapping
 routine that uses the fonts' 'cmap' and other tables?  The latter
 would be a shame because it would require the Harfbuzz caller to
 duplicate a vast amount of the complexity that is nicely hidden in
 Harfbuzz in their own code.  It's also a shame because in most cases,
 no font substitution would be needed and so it would be inefficient in
 the average case.

Some HarfBuzz users do that, i.e. check the font’s cmap table to see what
characters it supports and select fallback fonts for what it doesn’t
before even calling HarfBuzz. Others rely on HarfBuzz: for example, in
LibreOffice the run is first shaped with the user-selected font, then
any contiguous runs of missing glyphs are reshaped with fallback fonts.
This also has the advantage of letting HarfBuzz do its normalisation,
which can result in the font supporting more characters than it declares
in its cmap table.

Regards,
Khaled

Re: [HarfBuzz] Example application using FriBidi, FreeType, and HarfBuzz

2015-03-10 Thread Khaled Hosny

At least mpv seems to be using libass to render all subtitles, so it might
be worth checking what they are doing.

Regards,
Khaled

 On Mar 10, 2015 12:20 PM, Salah-Eddin Shaban salshaa...@gmail.com
wrote:

 That's right. VLC uses libass to render SSA/ASS subtitles. But it has
 its own text rendering modules to render other subtitle formats.

 The most common format usually found on sites like opensubtitles.org
 is SubRip (.srt).
 So to use libass here the user has to convert the files to SSA
 beforehand, or VLC should convert the text to SSA at runtime, which is
 not the best option.

 On Tue, Mar 10, 2015 at 4:35 AM, Behdad Esfahbod
 behdad.esfah...@gmail.com wrote:
  Hi there,
 
  That's very nice.  Thanks for sharing.
 
  That said, I was under the impression that VLC could already use
 HarfBuzz /
  FriBidi to render subtitles using libass.  Is that not the case?
 
  behdad
 
  On 15-03-08 12:58 PM, Salah-Eddin Shaban wrote:
  Hello,
 
  I would like to thank you very much for HarfBuzz.
 
  I'm currently trying to use it to add complex-script support to the
  text renderer of VLC Media Player, mainly for Arabic subtitle files. I
  tried first using Pango but was told there were compatibility issues
  with GLib, so HarfBuzz is the way to go.
 
  I had to struggle a little with the Unicode bidirectional algorithm
  and with getting to understand how Fribidi, FreeType, and HarfBuzz
  work together. Documentation is rather scarce so I had to dig through
  some source code, and to experiment with a test application.
 
  I still can't claim to understand everything about the algorithm or
  the process of rendering text. When I compare my code to the source
  code of Pango it seems too simple to be correct. But it's producing
  correct results, as far as I can see, for all text I test it on, and
  not just subtitle files.
 
  Anyway, I thought it might help someone get the general picture, if
  you think it's appropriate as another user example on the HarfBuzz
  wiki.
 
  Any comments or corrections are greatly appreciated.
 
  https://github.com/salshaaban/BidiRenderer

Re: [HarfBuzz] Example application using FriBidi, FreeType, and HarfBuzz

2015-03-10 Thread Khaled Hosny

On Wed, Mar 11, 2015 at 12:04:01AM +0200, Salah-Eddin Shaban wrote:
 MPlayer (and mpv) converts all text to SSA in order to use libass.
 There's no other way to use libass.

I see, thanks for the explanation (I always assumed that libass can
render other subtitle formats because of this).

Regards,
Khaled

 And, again, I don't think that's the best option. The FreeType module
 in VLC is very mature, it does not feel right to just dump it instead
 of walking that last mile to support complex-scripts. And besides,
 there's work to be done either way, but I prefer to learn about the
 Bidi algorithm and complex-script shaping than converting between
 subtitle formats :)
 
 The latter won't help me with other projects I'm currently
 contemplating. Like adding complex-script support to some text
 editors.
 

Re: [HarfBuzz] Testing recent changes in HarfBuzz

2015-02-27 Thread Khaled Hosny

On Thu, Feb 26, 2015 at 02:01:20PM -0800, Behdad Esfahbod wrote:
 Khaled, Jonathan, others,
 
 While I was away I implemented a few optimizations and other fixes that I
 think I might have got wrong a bit.  I'm fairly confident in them, but then
 can use some testing, specially with complex recursive lookups, etc.  Would
 you mind giving it some testing (Amiri, whatever test suites you might be able
 to run)?

I ran the Amiri test suite against it and didn’t see any breakages. The
test suite is not extensive, though it covers some very complex
contextual substitutions.

Regards,
Khaled

Re: [HarfBuzz] Font-independent shaping

2015-01-21 Thread Khaled Hosny

On Wed, Jan 21, 2015 at 07:07:44AM -0600, Ken Schutte wrote:
 I realize different fonts will support different features, but I want to
 input a unicode string and get information like,
 
 - this 'ARABIC LETTER BEH' should use 'ARABIC LETTER BEH INITIAL FORM'
 - this 'ARABIC SHADDA' will be combined with previous character
 - mandatory ligatures (lam+alif)
 etc
 (of course will not get glyph coordinates)

What is the intended use for this?

Regards,
Khaled

Re: [HarfBuzz] Dealing with ligatures and text input

2015-01-21 Thread Khaled Hosny

On Wed, Jan 21, 2015 at 04:20:30PM +0100, Diederick Huijbers wrote:
 Hi,
 
 I'm working on a C/C++/OpenGL text input field and wondering if
 someone can give me some advice on how to deal with ligatures and text
 input.
 
 Let's say I have an input field and I type fi. Now, HarfBuzz will
 replace these two characters by the fi ligature.
 
HarfBuzz is merely respecting the font designer’s decision of having or
not having any given ligature.

 To be honest, I've no idea if the ligature fi makes the text more
 readable, but I assume it does.
 
 But if I use ligatures (because of the reasons harfbuzz uses them),
 how would I deal with e.g. the placement of the caret in the text
 input field. I should allow the user to move the caret between the f
 and i for example. Though how would I know the correct (visual) x
 position ?

If you know that a given glyph is a ligature (note that some fonts
implement “ligatures” in a way that gives you the same number of glyphs
as input characters, so no assumptions should be made here), you
can call “hb_ot_layout_get_ligature_carets()” to get the proper caret
positions inside the ligature. Unfortunately, very few fonts provide the
data used by this API (because almost no applications use it), so you
will need a fallback approximation. AFAIK most applications just
divide the ligature width by the number of its components, so for “fi”
you would place the caret in the middle of the ligature, and so on.
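
The fallback approximation can be sketched as follows. This is only an illustration: fallback_ligature_carets is a hypothetical helper, and it assumes the caller already knows the ligature's advance width and how many components (input characters) it covers, which is information the application has to work out itself. It would only be used when the font provides no caret data for the glyph.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical fallback: split a ligature's advance evenly between its
 * components. Writes components - 1 caret positions, relative to the
 * glyph origin, into carets[] and returns how many were written.
 * `advance` is in the same units as the shaper's positions.
 */
unsigned fallback_ligature_carets(int32_t advance, unsigned components,
                                  int32_t *carets)
{
    assert(components >= 1);

    for (unsigned i = 1; i < components; i++) {
        /* 64-bit intermediate so large advances cannot overflow. */
        carets[i - 1] = (int32_t)((int64_t)advance * i / components);
    }
    return components - 1;
}
```

For a two-component “fi” ligature this yields a single caret halfway across the glyph; a three-component ligature gets carets at one third and two thirds of its width.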

 And lets say I use the input field so a user can register him/herself
 and the value is stored in a database. Would I store the fi ligature
 or both characters separately?

Your text processing should always be done on the input text stream, not
on the glyph indices returned by HarfBuzz.

 Currently I think disabling ligatures is my best option, but I'm not
 sure how I can disable this with harfbuzz.

Even if you can get away with disabling ligatures in Latin script, what
about other scripts where they might be required for proper text
shaping?

Regards,
Khaled