[HarfBuzz] Rendering of Arabic shadda-kasrah

2020-08-19 Thread Eli Zaretskii
Could someone please look at the discussion and the data of the Emacs
bug#34035 (https://debbugs.gnu.org/cgi/bugreport.cgi?bug=34035) and
tell whether the fonts that produce incorrect display are faulty, and
if so, what is the problem with those fonts?  Also, is there perhaps
some way around these problems that would yield better results even
with the fonts which currently display the kasrah below the base
letter?  Because I tried many different fonts with reasonable coverage
of Arabic, and the vast majority of them produce this problem, so it
seems like the fonts which don't are quite rare.

(As you see from the last messages, hb-view produces the same display
as Emacs with HarfBuzz, so at least we are doing no worse in these
cases, and we have reason to believe this is not an Emacs-specific
problem.)

TIA
___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/harfbuzz


Re: [HarfBuzz] vertical text for RTL scripts?

2020-07-15 Thread Eli Zaretskii
> From: Phil M Perry 
> Date: Wed, 15 Jul 2020 12:38:49 -0400
> 
> my TTBHebrew example in HarfBuzz.pdf (did anyone look at it?)

I did.


[HarfBuzz] HarfBuzz crash when shaping Arabic?

2020-07-14 Thread Eli Zaretskii
Could someone please look at this Emacs bug report:

  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=42352

and tell if something like this rings a bell?

According to the backtrace, the crash happened inside HarfBuzz (the
backtrace levels above that are the Emacs signal handling mechanism).
The user who reported that uses HarfBuzz 2.3.1 as packaged by Debian:

  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=42352#28

The crash happened when some Arabic text was passed to hb_shape_full
without providing the explicit direction of the text, but instead
relying on hb_buffer_guess_segment_properties to guess it.

Was there perhaps a problem in that version of HarfBuzz that could
crash like that?

TIA


Re: [HarfBuzz] vertical text for RTL scripts?

2020-07-13 Thread Eli Zaretskii
> Cc: harfbuzz@lists.freedesktop.org
> From: Phil M Perry 
> Date: Mon, 13 Jul 2020 11:11:51 -0400
> 
> Eli, I realize that (except for Chinese, Japanese, and possibly Korean), 
> text is normally written horizontally (LTR or RTL). Vertical text is for 
> special uses such as signage and advertising.

Yes, I understand that, and was replying with that in mind.

> Anyway, I'm still not sure what the convention is for writing vertical 
> text in RTL languages. There's not much discussion of this online, 
> except for "I want to get a Hebrew tattoo down my spine saying 'daughter 
> of Jehovah' -- which way will read correctly?" The convention for LTR 
> scripts is to start at the top and grow downwards, which is like taking 
> the original LTR coordinate system and rotating it 90 degrees clockwise 
> (with individual letters rotated back). The next line (column) is to the 
> LEFT.

Yes, agreed.

> For RTL, my sources suggest that the last letter input (first one 
> read)

This is fundamentally incorrect: both input from keyboard and reading
are done in the same order, even for RTL languages.  The only order
which is reversed for RTL languages is the left-to-right order on
display: the first RTL letter read is generally the rightmost, unlike
with LTR scripts.

I think the above observation is important, because I'm guessing it is
the basis of your confusion regarding the vertical layout.  In the
vertical layout, the left vs right issue no longer exists (at least as
long as we are talking about a single column), so the distinction
between LTR and RTL scripts also disappears.

Therefore:

> should be at the TOP of the text column, which means rotating the 
> original horizontal coordinate system 90 degrees COUNTERclockwise. For 
> TTB of a RTL script, it is like a clockwise rotation, with the first 
> input letter at the top, but reading from the bottom/original right. 

No, the first input letter is at the top, and the first one you read
is also at the top.

> Embedded LTR text is read TTB. For BTT, it is like a COUNTERclockwise 
> rotation, with the first input letter at the bottom, reading from 
> top/original right. Unfortunately, this leaves embedded LTR text 
> backwards from what would be expected

No.  Embedded LTR text will also be laid out TTB, i.e. without
reordering it.

In short, in vertical layout there's no bidi reordering at all: both
LTR and RTL characters are displayed in the logical order, top to
bottom.  Technically, I think this happens because bidi reordering per
UAX#9 works on the line level, so when each character is a separate
line, reordering has no effect.
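
A deliberately simplified sketch of that point, in Python: it handles only a run made entirely of RTL letters and does not implement UAX#9. Both helper names are made up for illustration. On a horizontal line the screen order (left to right) is the reverse of the logical order, while in a single vertical column the logical order is used as-is.

```python
def horizontal_rtl_line(text):
    """Left-to-right screen order of a line made only of RTL letters."""
    # The first letter read sits at the right edge, so the screen order,
    # scanned from the left, is the logical order reversed.
    return text[::-1]


def vertical_column(text):
    """Top-to-bottom order of a single column: logical order, unchanged."""
    # Each character is effectively its own one-character line, so there
    # is nothing left for line-level reordering to rearrange.
    return list(text)


hebrew = "\u05d0\u05d1\u05d2"          # alef, bet, gimel in logical order
on_screen = horizontal_rtl_line(hebrew)  # gimel, bet, alef from the left
column = vertical_column(hebrew)         # alef on top, gimel at the bottom
```

The same `vertical_column` call applies unchanged to embedded LTR text, which is why it also ends up top to bottom.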

> Also, for BTT, is it correct that the next line (column) is to the
> RIGHT?

Yes, I believe the columns should progress from right to left for the
RTL text (modulo the base paragraph direction issue, which your
description completely ignores, so my assumption is that you are
talking about RTL text in a right-to-left paragraph and LTR text in a
left-to-right paragraph, not the other way around).

> Finally, I tried some English (LTR Latin) text vertically with "field" 
> in it, WITHOUT explicitly turning off ligatures (-liga), and it kept the 
> "f" and "i" separate (good)... does this mean that HarfBuzz officially 
> knows not to do ligatures with vertical text? Kerning doesn't appear to 
> be a problem, either.

That's something for the HarfBuzz experts here to answer; I'm not such
an expert.

HTH


Re: [HarfBuzz] vertical text for RTL scripts?

2020-07-12 Thread Eli Zaretskii
> From: Phil M Perry 
> Date: Sun, 12 Jul 2020 10:15:31 -0400
> 
> Now, if I specify TTB direction, what should I see? Likewise, what 
> should BTT direction show? I know very little about RTL/bidi scripts, 
> and googling for examples gives ambiguous and conflicting information. I 
> realize that most scripts and languages are rarely written vertically, 
> except for East Asian (CJK) languages, but it would be nice to know that 
> the code is handling them correctly.
> 
> If you want to write Hebrew vertically, would you choose TTB or BTT?

Hebrew is not written vertically, no more than English or German are.
So if you must write it vertically, I guess TTB would be the preferred
layout, like with Latin scripts.  For example, that's how
vertically-laid-out shop signs are made.



Re: [HarfBuzz] Ligatures

2020-05-25 Thread Eli Zaretskii
> Date: Sun, 24 May 2020 20:27:26 +0100
> From: Richard Wordingham 
> Cc: harfbuzz@lists.freedesktop.org
> 
> It seems to me that Emacs knows what script a cluster is in; perhaps
> it just hasn't united the concepts.

It's a kind of coincidence: different scripts almost always require
different fonts, and Emacs only composes characters displayed in the
same font.

> Users may have written some weird clustering combinations, and I can
> imagine some weird combinations in the Private Use Areas.  I should
> investigate.

Don't expect anything about PUA, Emacs doesn't assign any useful
properties to them.

> > That's a feature (you can disable it with disable-point-adjustment).
> 
> Is this documented in info, or does one have to trawl the code to find
> out what it does?

Every variable in Emacs has a doc string, and you can search them with
several apropos commands.  We don't describe in the manual every
obscure variable, there are too many of them.

> It seems that Emacs needs several levels of movement
> - by codepoints, by grapheme cluster, by akshara (will be the same as
> grapheme cluster in many cases) and by HarfBuzz cluster, or whatever
> is used to make access into lam-alif impossible.

I have no idea which one Emacs uses, not in these terms.  All I can
say is that, in HarfBuzz terms, we get the number of "elements" from
hb_buffer_get_length, and then index the arrays returned by
hb_buffer_get_glyph_infos.  Each "element" thus indexed is a separate
"thing" for display purposes, and Emacs by default won't let you
"enter" such a "thing", it will move across it in its entirety in one
go.
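
To make the "move across it in one go" behaviour concrete, here is a minimal Python sketch. It is not how Emacs implements cursor motion. The `cluster` field mirrors the member of the same name in HarfBuzz's `hb_glyph_info_t`; the `GlyphInfo` class, the `cursor_stops` helper, and the glyph numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class GlyphInfo:
    glyph_id: int  # made-up glyph index
    cluster: int   # index of the first character this glyph belongs to


def cursor_stops(infos, text_length):
    """Character positions where the cursor may rest.

    Characters that were merged into one cluster share a single stop,
    so the cursor cannot land between them.
    """
    stops = sorted({g.cluster for g in infos})
    stops.append(text_length)  # position after the last character
    return stops


# lam + alif shaped into one ligature glyph, then beh as its own glyph
infos = [GlyphInfo(501, 0), GlyphInfo(77, 2)]
stops = cursor_stops(infos, 3)
```

With these made-up values there is no stop at position 1, so the cursor jumps over the lam-alif pair as a unit.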


Re: [HarfBuzz] Ligatures

2020-05-24 Thread Eli Zaretskii
> From: Khaled Hosny 
> Date: Sun, 24 May 2020 18:00:45 +0200
> Cc: harfbuzz@lists.freedesktop.org
> 
> In general the safest is to pass the whole paragraph of text and the start 
> and length of each item (item being a run with same font, direction, script, 
> and language).

I was talking about text that has a single font, direction, script,
and language.

> This, for example, ensures that HarfBuzz can do basic Arabic-like shaping 
> across item boundaries e.g. if you break items in the middle of an Arabic 
> word (due to font change, for example), you still get the 
> initial/medial/final forms across the boundary as appropriate. Or to put a 
> combining mark at the start of a paragraph on a dotted circle as it otherwise 
> has no base.
> 
> If this is not possible, then you can try to pass enough context, like reach 
> back and forward to first character that is not a combining mark. This may or 
> may not be enough.
> 
> Shaping space-delimited words is orthogonal to that; context had better
> always be provided.

So this sounds like passing a physical line that ends in a newline
should be good enough?  Or are there issues that cross newlines as
well?
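
One way to read the "reach back and forward" suggestion quoted above is sketched below in Python, using only the standard library. `extend_context` is a hypothetical helper; including the marks that trail the following base character is an added assumption, made so that the base is not cut off from its own marks.

```python
import unicodedata


def is_mark(ch):
    """True for combining marks (general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")


def extend_context(text, start, end):
    """Widen [start, end) to the nearest non-mark character on each side."""
    ctx_start = start
    # Walk back over marks and stop once a non-mark character is included.
    while ctx_start > 0:
        ctx_start -= 1
        if not is_mark(text[ctx_start]):
            break
    ctx_end = end
    # Walk forward until the first non-mark character is included...
    while ctx_end < len(text):
        ctx_end += 1
        if not is_mark(text[ctx_end - 1]):
            break
    # ...and keep the marks attached to it (assumption, see above).
    while ctx_end < len(text) and is_mark(text[ctx_end]):
        ctx_end += 1
    return ctx_start, ctx_end


text = "a\u0301bc\u0301d"            # a + acute, b, c + acute, d
context = extend_context(text, 2, 3)  # the item is just "b"
```

In HarfBuzz terms, the widened range would be the text handed to `hb_buffer_add_utf8`, with the original item given by its `item_offset` and `item_length` arguments. As the quoted message says, this may or may not be enough context.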

And what is a "paragraph" in this context?

> Some fonts do have OpenType lookups that interact with space (e.g. kerning 
> pairs involving space, or even substitutions involving space), so shaping 
> words independently will give suboptimal result. You can use HarfBuzz API to 
> find out if the font has OpenType layout rules involving space, or decide to 
> live with this limitation.

Which API provides this information?

Thanks.


Re: [HarfBuzz] Ligatures

2020-05-24 Thread Eli Zaretskii
> > I almost understand (and agree), sans one part: the "arbitrary parts"
> > of what you wrote.  If we want to produce a ligature out of "ffi", the
> > shaper will get "ffi" and nothing more.  Which part here is arbitrary?
> 
> Sending "ffi" alone is an arbitrary decision. The font might have kerning 
> between "ffi" and what comes before and after it, but you won't get it. The 
> font might not have a ligature for "ffi" at all, but using kerning instead, 
> so you will get kerning between "ffi" glyphs and not other glyphs which is 
> arbitrary. It might be a cursive font that changes glyph shapes based on 
> surrounding glyphs, and you will get that for "ffi" and not elsewhere which 
> is arbitrary.
> 
> That is just plain wrong, there is no way around it.

So, to make sure I understand the correct solution: you are saying
that all the text to be displayed should go through the shaper, is
that right?

If so, how large should be the chunks of text to be passed to the
shaper in any one call, in order to have a correct result?  Would it
be enough to pass whitespace-separated words one by one? or do we need
to send entire physical lines (up to the terminating newline
character)? or maybe an entire paragraph?  What is the recommendation
here?

Thanks.


Re: [HarfBuzz] Ligatures

2020-05-24 Thread Eli Zaretskii
> Date: Sat, 23 May 2020 21:42:24 +0100
> From: Richard Wordingham 
> 
> > As for different scripts: if the character codepoints are the same,
> > Emacs currently assigns each character to a single script.
> 
> I'll need to dig deeper.  Composition of both 'a' and Greek alpha with
> an acute accent works, which suggest that the problem isn't there for
> characters with a script property of 'inherited'.

Emacs currently leaves it up to HarfBuzz to guess the script, as it
doesn't yet have the necessary smarts.

> > Emacs 24.4 is very old, and doesn't use HarfBuzz.  Please try Emacs 27
> > instead, it has several bugs in this area fixed, and will use HarfBuzz
> > if available at build time.
> 
> The behaviour in 27.05 is almost the same as for 24.4, but the
> breaking in item (1) is automatically repaired.  The process seems slow
> - I can see the glyph become final and then revert back to being
> medial.  I'm puzzled by not being able to step into lam-alif but being
> able to step through a series of 'beh's.  The step into command for
> advancing codepoint by codepoint semiworks.  The cluster shaping
> doesn't break at the cursor - Handa gave me a C code fix so I could
> achieve that - but the number of steps needed to pass through a cluster
> matches the number of codepoints.
> 
> Pressing the 'delete' key still deletes a single character, but that may
> be because it's mapped to tpu-delete-current-char.

If you press DEL (or Backspace), it will delete a single codepoint.

> So, what's not working in Arabic is that one can't move the cursor
> through ligatures.

That's a feature (you can disable it with disable-point-adjustment).

The rest of your observations seem to be too Emacs-specific to discuss
here.  You are welcome to submit an Emacs bug report if you think
something isn't working as it should, or would like to discuss
Emacs-specific details.

Thanks.


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> Cc: harfbuzz@lists.freedesktop.org
> From: Simon Cozens 
> Date: Sat, 23 May 2020 20:14:16 +0100
> 
> On 23/05/2020 08:44, Eli Zaretskii wrote:
> > Thanks.  Since (b) is not really feasible without redesigning the
> > entire Emacs display engine (for which I see no volunteers lining up
> > any time soon), I guess we will have to use some more-or-less
> > reasonable and somewhat unreliable heuristics by supporting only some
> > ligatures that are known in advance.
> 
> Travelling further in the wrong direction is always an option, but don't 
> expect it to get you closer to the right destination.

I don't think this is an adequate analogy.  What Emacs does is an
approximation to what should be done.  The approximation falls short
of the target, that's true, and might even produce clearly incorrect
results in some cases (although I've yet to see such cases, and I've
been using Emacs for editing non-ASCII text for 20 years).  But it is still
an approximation, so it is not really "the wrong direction" (which you
seem to interpret as 180 degrees off, otherwise even going in the
wrong direction might bring me closer to the destination, right?).


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> Date: Sat, 23 May 2020 20:06:32 +0100
> From: Richard Wordingham 
> 
> There are three different tools for producing what looks like an "ffi"
> ligature:
> 
> 1) Make a ligature
> 2) Contextual substitution
> 3) A mix of contextual substitution and kerning.
> 
> A font that uses the first will produce a ligature for Emacs.
> 
> A font that uses contextual substitution will not work - you will just
> see the 3 unligated characters with their default glyphs.
> 
> A font that uses a mix of contextual substitution and kerning will
> likewise fail.  However, it is possible that you might get the "ff"
> ligature and a normal 'i', or a normal 'f' and an "fi" ligature.
> 
> From the point of view of someone who expects full shaping, what result
> you get will be arbitrary, depending on how the font designer has
> marshalled his tools.

I understand.  Still, the result looks reasonably good in most cases,
especially in an editor whose main purpose is to edit programs, and
which doesn't pretend to produce typographical accuracy.


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> From: Khaled Hosny 
> Date: Sat, 23 May 2020 20:54:15 +0200
> Cc: harfbuzz@lists.freedesktop.org
> 
> > We pass to the shaper the part of text that matches the regexps you
> > can see at the end of misc-lang.el, then display the glyphs the shaper
> > returns.  The above description is a high-level overview; there are
> > many details that I cannot describe in a short message.  For example,
> > for Arabic, when we get back the grapheme clusters, we lay them out,
> > then skip to the end of the text that we passed to the shaper.
> 
> You mean this:
> https://repo.or.cz/emacs.git/blob/HEAD:/lisp/language/misc-lang.el#l78
> 
> I’m not sure how to read it, but it seems to be missing the entire Arabic
> Extended-A and Arabic Mathematical Alphabetic Symbols blocks. I’m also not
> sure how it would handle using combining marks from other blocks with Arabic
> text (say putting U+20D6 over an Arabic letter).

If you can suggest improvements to those patterns, please do, and
thanks.


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> From: Khaled Hosny 
> Date: Sat, 23 May 2020 20:40:44 +0200
> Cc: harfbuzz@lists.freedesktop.org
> 
> Sending “ffi” alone is an arbitrary decision. The font might have kerning 
> between “ffi” and what comes before and after it, but you won’t get it. The 
> font might not have a ligature for “ffi” at all, but using kerning instead, so
> you will get kerning between “ffi” glyphs and not other glyphs which is 
> arbitrary. It might be a cursive font that changes glyph shapes based on 
> surrounding glyphs, and you will get that for “ffi” and not elsewhere which 
> is arbitrary.
> 
> That is just plain wrong, there is no way around it.

OK, thanks.


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> From: Khaled Hosny 
> Date: Sat, 23 May 2020 20:18:33 +0200
> Cc: harfbuzz@lists.freedesktop.org
> 
> > The Emacs display engine examines the text to be displayed and laid
> > out one character at a time, and makes layout decisions after each
> > character or grapheme cluster it lays out.  Its design is therefore
> > fundamentally incompatible with shaping large substrings of buffer
> > text at once.  We do support that for short sequences of characters,
> > which seems to work well enough for complex shaping (a.k.a. "character
> > compositions") of scripts that require that, but we still do that one
> > grapheme cluster at a time.  
> 
> That wouldn’t work for Arabic. You can’t shape Arabic one grapheme cluster at 
> a time (or any other text actually, but the brokenness in Arabic will be 
> immediately obvious), so I’m most certain that is not exactly how Arabic is 
> handled in Emacs right now.

We pass to the shaper the part of text that matches the regexps you
can see at the end of misc-lang.el, then display the glyphs the shaper
returns.  The above description is a high-level overview; there are
many details that I cannot describe in a short message.  For example,
for Arabic, when we get back the grapheme clusters, we lay them out,
then skip to the end of the text that we passed to the shaper.
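
A rough Python sketch of the regexp-driven selection described above. The pattern is hypothetical and far simpler than the ones in misc-lang.el: it covers only the basic Arabic block (U+0600..U+06FF), and `runs_for_shaper` is an invented name.

```python
import re

# Hypothetical pattern: runs of characters from the basic Arabic block only.
ARABIC_RUN = re.compile("[\u0600-\u06FF]+")


def runs_for_shaper(text):
    """Return (start, end) ranges of text that would be sent to the shaper."""
    return [(m.start(), m.end()) for m in ARABIC_RUN.finditer(text)]


sample = "ab \u0633\u0644\u0627\u0645 cd"  # Latin, an Arabic word, Latin
runs = runs_for_shaper(sample)              # only the Arabic word is selected

# A letter from Arabic Extended-A lies outside the pattern, so a regexp
# limited to one block never hands it to the shaper at all.
missed = runs_for_shaper("\u08a0")
```

Everything outside the returned ranges would be displayed without shaping, which is where the coverage of the patterns starts to matter.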

> > The character composition is implemented
> > in Lisp, which is called by the display engine, and which then calls
> > back into C to invoke the shaper.  This implementation is meant to
> > allow a great deal of control on what should be composed and how.  But
> > it is also relatively slow, which is another reason why doing that for
> > all the text to be laid out is impractical: it slows down redisplay to
> > the degree that it becomes annoying to users.
> 
> Having more control should not be at the price of doing things wrong.

No one said it should, that's just how things are.

> The whole composition concept of Emacs does not make any sense to me, all 
> text is “composed”. You can have a special mode that would disable shaping 
> for specific purposes (opening huge log files, wanting to see raw text with 
> no bidi or shaping, etc), but this can be done in cooperation with HarfBuzz 
> and not by bypassing it entirely.

We are talking about a piece of software designed 21 years ago.  I
realize that it makes no sense to you, but that's what we have, and
will probably have for the next 10 years or so.  We must make the most
out of what we have.


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> From: Khaled Hosny 
> Date: Sat, 23 May 2020 20:09:50 +0200
> Cc: harfbuzz@lists.freedesktop.org
> 
> Overall, if you can’t send the whole text (words are the absolute minimum, 
> but this has its issues as well), don’t just send arbitrary parts of it as 
> the result will be some inconsistent mess.

I almost understand (and agree), sans one part: the "arbitrary parts"
of what you wrote.  If we want to produce a ligature out of "ffi", the
shaper will get "ffi" and nothing more.  Which part here is arbitrary?

Thanks.


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> Date: Sat, 23 May 2020 16:54:51 +0100
> From: Richard Wordingham 
> Cc: harfbuzz@lists.freedesktop.org
> 
> > Emacs supports more than one rule for each composable sequence of
> > characters.
> 
> That doesn't help when the rules give conflicting divisions into
> clusters, which is the case with Tai Tham.

The assumption is that either the rules can be arranged in an order
that allows using the first matching rule, or, failing that, that you
write your own composing function that implements whatever logic is
required to select the right rule.
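
A minimal sketch of "first matching rule wins", in Python rather than Emacs Lisp. The rule table and the function name are made up for illustration and are not Emacs's actual composition rules; the only point is that the order of the rules decides which one applies.

```python
import re

# Hypothetical rules, most specific first.  Swapping the order would let
# "ff" win over "ffi" whenever both match at the same position.
RULES = [
    ("ffi", re.compile("ffi")),
    ("ff", re.compile("ff")),
    ("fi", re.compile("fi")),
]


def first_matching_rule(text, pos):
    """Return (rule name, end position) of the first rule matching at pos."""
    for name, pattern in RULES:
        m = pattern.match(text, pos)
        if m:
            return name, m.end()
    # No rule applies: treat the single character as its own unit.
    return None, pos + 1


hit = first_matching_rule("office", 1)
```

When no ordering gives the right answer for every input, the fallback mentioned above is a custom function that looks at the text and picks the rule itself.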

> The Devanagari rule only covers the Vedic marks in the Devanagari block,
> the 'stress signs' according to the comments.  Can rules essentially
> for different scripts now share combining marks?  The newer Vedic marks
> were supposed to be available to at least all Indian Indic scripts.

I don't know enough about this to make sure I even understand the
question, let alone can provide an answer.  One thing I can say is
that the regexp pattern in a rule can specify different context (the
surrounding characters) even if the character that triggers the rule
is the same.  Failing that, I guess the solution will again be the
function that produces the composition.

As for different scripts: if the character codepoints are the same,
Emacs currently assigns each character to a single script.

> > Does Emacs indeed fail to wrap Arabic text?  Can you show an example?
> 
> Character level wrapping still almost works down at Emacs 24.4, but I
> don't know that it wasn't broken in later enhancements.  There are three
> features that make me think Emacs 24.4 might be different to the
> current state of affairs:
> 
> (1) Clicking into the text breaks text before the cursor, but not after
> it.
> (2) I can't step into lam-alif the way I step into Indic clusters.
> (3) Lam-alif isn't broken by line wrap.

Emacs 24.4 is very old, and doesn't use HarfBuzz.  Please try Emacs 27
instead, it has several bugs in this area fixed, and will use HarfBuzz
if available at build time.


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> Date: Sat, 23 May 2020 16:33:12 +0100
> From: Richard Wordingham 
> 
> On Sat, 23 May 2020 11:25:38 +0300
> Eli Zaretskii  wrote:
> 
> > > From: Khaled Hosny 
> > > Date: Sat, 23 May 2020 09:51:21 +0200
> > > Cc: harfbuzz@lists.freedesktop.org
> > > What are you going to do about kerning, or mark positioning?
> > > > Partially kerning arbitrary glyphs (because the substring matches
> > > > some regular expression) is worse than not kerning at all.
> > 
> > I don't think I understand the question.  How is kerning related to
> > the issue at hand?  I'm not an expert on typesetting text (so maybe I
> > don't even understand what exactly is meant by "kerning" in this
> > context), so please tell more details about this.
> 
> The simplest way of laying out proportionally spaced text is to have a
> fixed glyph-dependent distance ('advance width') from the 'origin' of a
> glyph to the origin of the next glyph and simply lay them out in a
> sequence, like movable type. However, if one chooses widths suitable
> for the sequences 'AM' and 'MV', then there may be an unsightly gap in
> the middle of 'AV'. Kerning is basically the process of adjusting those
> gaps.  Kerning is done by the shaper.  To do it, it needs the
> whole sequence of characters.

Ah, okay, thanks.  Then yes, Emacs just uses the advance width that we
get from the metrics of each glyph.
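
The difference between the two approaches can be shown with a small Python sketch. The advance widths and the single kerning pair are made-up numbers in font units, and `origins` is an invented helper, not a HarfBuzz or Emacs function.

```python
# Made-up advance widths and one kerning pair, in font units.
ADVANCE = {"A": 700, "M": 900, "V": 700}
KERN = {("A", "V"): -80}


def origins(text, use_kerning):
    """x position of each glyph origin when laying text out left to right."""
    x = 0
    result = []
    for i, ch in enumerate(text):
        if use_kerning and i > 0:
            # Adjust the gap according to the pair (previous, current).
            x += KERN.get((text[i - 1], ch), 0)
        result.append(x)
        x += ADVANCE[ch]
    return result


plain = origins("AV", use_kerning=False)   # advance widths only
kerned = origins("AV", use_kerning=True)   # the V is pulled toward the A
```

The kerning lookup needs both neighbours at once, which is why the adjustment is lost when "A" and "V" reach the shaper in separate calls.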


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> Date: Sat, 23 May 2020 14:51:53 +0100
> From: Richard Wordingham 
> 
> > > They may of course have more than one set of such rules, with the
> > > rule sets defining different sets of sequences.  
> > 
> > Who are "they" in this context?
> 
> Devanagari and Tai Tham are two examples I am aware of.

Emacs supports more than one rule for each composable sequence of
characters.

> Devanagari has different rules for positioning of Vedic marks between
> fonts using the script tags dev and dev2 for it on one hand and the
> unofficial script tag dev3, which follows the USE rules for character
> ordering.  For tag dev, Microsoft says that  candrabindu, consonant> is one cluster; others, including Unicode, say
> it's two.  Candrabindu in the middle and candrabindu at the end mean
> different things; the former nasalises a consonant, while the latter
> nasalises a vowel.  The visual distinction exists, at least when
> half-forms are used.

See the rules set up near the end of indian.el in Emacs.  If they
don't cover what you describe, we can add more.

> > I'm not talking about Arabic.  Emacs has a set of regular expressions
> > for sequences of Arabic characters that need shaping, misc-lang.el in
> > Emacs.  If the set is incomplete, we can augment it.
> 
> That regular expression treats every Arabic word as in need of shaping. 
> 
> > If a font requires special shaping for any sequence of any number of
> > 26 (or maybe 52) ASCII letters, then the Emacs display engine will
> > need to be redesigned.  So this extreme possibility doesn't bother me.
> 
> In general, they do require it.  But how is this worse than handling
> Arabic?

I don't know.  Maybe it isn't.  Or maybe the slowdown while displaying
ASCII and moving the cursor through it will be unbearable.

> Is the problem that you want to keep the option of line
> wrapping splitting words for ASCII, but are not bothered for Arabic or
> other human languages?

Does Emacs indeed fail to wrap Arabic text?  Can you show an example?

> > > How would you handle the possibility that all three of <æ>, 
> > > and  might be rendered by the same glyph, although they
> > > are comprised of 1, 2 and 3 characters respectively?  
> > 
> > By using a composition rule that matches both  and .
> > The rules are regexp-based, and expressing the above as a regexp is
> > simple.  Once a sequence of characters matches the regexp, Emacs calls
> > the shaper (hb_shape etc.) to produce the font glyphs for the
> > sequence, and displays the glyphs that the shaper returns.
> 
> I think you mean that Emacs would store the position of components by
> an index that was the sequence of characters, not the glyph ID.  That
> would also deal with precomposed characters - it would be the character
> sequence that mattered, and for cursor movement and rendering,
> the canonically equivalent sequence(s) and the precomposed character
> would remain distinct.

Sorry, I don't follow: what do you mean by "store"?  Emacs stores the
rules used to compose characters, and it stores the results of the
compositions already done by applying those rules, as part of
displaying some chunk of text.  Which one of these did you have in
mind?


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> From: Khaled Hosny 
> Date: Sat, 23 May 2020 09:59:15 +0200
> Cc: harfbuzz@lists.freedesktop.org
> 
> Also either Emacs is currently treating text that it enables shaping for as 
> second-class citizens where limitations/degraded performance is acceptable 
> (which is really really bad)

Could you tell more about which limitations and degraded performance
you had in mind?  I'm not sure we have this, but cannot tell without
understanding the issues.

> or “redesigning the entire Emacs display engine” is not really needed as you 
> can just declare all text as text that needs to be shaped and be done with it.

The Emacs display engine examines the text to be displayed and laid
out one character at a time, and makes layout decisions after each
character or grapheme cluster it lays out.  Its design is therefore
fundamentally incompatible with shaping large substrings of buffer
text at once.  We do support that for short sequences of characters,
which seems to work well enough for complex shaping (a.k.a. "character
compositions") of scripts that require that, but we still do that one
grapheme cluster at a time.  The character composition is implemented
in Lisp, which is called by the display engine, and which then calls
back into C to invoke the shaper.  This implementation is meant to
allow a great deal of control on what should be composed and how.  But
it is also relatively slow, which is another reason why doing that for
all the text to be laid out is impractical: it slows down redisplay to
the degree that it becomes annoying to users.

That is why solving these problems in the way that you suggest
requires a complete rewrite of the Emacs display code.  It simply
cannot currently support what you expect.


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> From: Khaled Hosny 
> Date: Sat, 23 May 2020 09:51:21 +0200
> Cc: harfbuzz@lists.freedesktop.org
> 
> > Thanks.  Since (b) is not really feasible without redesigning the
> > entire Emacs display engine (for which I see no volunteers lining up
> > any time soon), I guess we will have to use some more-or-less
> > reasonable and somewhat unreliable heuristics by supporting only some
> > ligatures that are known in advance.
> 
> What are you going to do about kerning, or mark positioning? Partially 
> kerning arbitrary glyphs (because the substring matches some regular
> expression) is worse than not kerning at all.

I don't think I understand the question.  How is kerning related to
the issue at hand?  I'm not an expert on typesetting text (so maybe I
don't even understand what exactly is meant by "kerning" in this
context), so please tell more details about this.

Thanks.


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> From: Khaled Hosny 
> Date: Sat, 23 May 2020 08:36:10 +0200
> Cc: harfbuzz@lists.freedesktop.org
> 
> >The only way of
> > doing this right, I'm told, is to either (a) query the font to get the
> > list of all the ligatures it supports, or (b) assume any combination
> > of characters can produce a ligature, and therefore we need to pass
> > all the characters intended for display through hb_shape.  The latter
> > in particular is in stark contrast to how the current Emacs display
> > code is designed and implemented.
> 
> (a) is not realistically possible as doing it properly has pretty much the 
> same cost as shaping the text. So your only reliable option is (b).

Thanks.  Since (b) is not really feasible without redesigning the
entire Emacs display engine (for which I see no volunteers lining up
any time soon), I guess we will have to use some more-or-less
reasonable and somewhat unreliable heuristics by supporting only some
ligatures that are known in advance.
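
To make the heuristic concrete, here is a minimal sketch of matching
against a fixed list of ligatures known in advance.  It is not Emacs
code: the table contents and the function name are assumptions made
up for this illustration, and Emacs itself uses regexp-based
composition rules rather than a C table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Ligature sequences assumed to be known in advance.  The list is
   illustrative only; a given font may define others.  */
static const char *const known_ligatures[] = {
  "ffi", "ffl", "ff", "fi", "fl", "Th", "==>", "=>"
};

/* Return the length in bytes of the longest known ligature that
   starts at TEXT[POS], or 0 if none starts there.  Only such runs
   would be handed to the shaper.  */
size_t
known_ligature_length (const char *text, size_t pos)
{
  size_t best = 0;
  size_t n = sizeof known_ligatures / sizeof known_ligatures[0];

  assert (text != NULL);
  if (pos > strlen (text))
    return 0;
  for (size_t i = 0; i < n; i++)
    {
      size_t len = strlen (known_ligatures[i]);
      if (len > best
          && strncmp (text + pos, known_ligatures[i], len) == 0)
        best = len;
    }
  return best;
}
```

Any ligature that a font defines but the table lacks is silently
missed, which is exactly the unreliability mentioned above.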


Re: [HarfBuzz] Ligatures

2020-05-23 Thread Eli Zaretskii
> Date: Fri, 22 May 2020 22:22:49 +0100
> From: Richard Wordingham 
> 
> > The current support for producing ligatures works in the same way as
> > complex text shaping for scripts that require that, like Arabic and
> > Khmer: the sequences of characters that can be displayed as ligatures
> > are identified in advance with suitable regular expressions, and the
> > display engine then passes these sequences to hb_shape to produce the
> > ligatures.
> > 
> > This works well for scripts that require complex shaping, because such
> > scripts generally have well-defined rules for the sequences of
> > codepoints that need shaping.
> 
> They may of course have more than one set of such rules, with the rule
> sets defining different sets of sequences.

Who are "they" in this context?

> > However, I'm being told that this assumption is false, and that each
> > font defines ligatures from any number of arbitrary combinations of
> > characters, and therefore the exhaustive list of the ligatures is in
> > practice infinite and cannot be provided in advance.
> 
> This arbitrariness is true.  Over the set of all credible fonts for a
> given character repertoire, the number of ligating combinations is
> unbounded.

I understand that the number of combinations is theoretically
unbounded.  I'm asking if it is also unbounded in practice.  That is,
do font designers add ligatures for arbitrary combinations of
characters, regardless of some reasonable set of requirements?  For
example, is the set of ligatures of Latin characters shown here:

  https://en.wikipedia.org/wiki/Orthographic_ligature#Latin_alphabet

reasonably complete, or should I expect any number of other arbitrary
combinations of Latin characters popping up in fonts?  And if the
latter, then what is the purpose of providing such arbitrary
ligatures?

> > To be specific, I'm talking about 2 kinds of ligatures:
> > 
> >   . ligatures made of Latin characters, like "ffi" and "Th"
> >   . ligatures produced from symbols, like "==>" that is
> > converted into ⟹

Yes, these are the only cases that I'm asking here about.  I'm not
asking about shaping complex scripts such as Arabic, where this
problem doesn't exist AFAIK.

> Have you addressed the cursive scripts yet, such as Arabic?  At its
> simplest, most consonants have four shapes, initial, medial, final and
> isolated, and roughly speaking the shape used depends on the adjacent
> spacing characters.  For the most part, Emacs would have to pass whole
> words into HarfBuzz for shaping.  In some of the more advanced fonts,
> the vowel marks in a word may also affect the shape of the consonant
> skeleton.  And of course, sometimes the Arabic script prefers to join
> letters vertically, as well as having a few straightforward ligatures.

I'm not talking about Arabic.  Emacs has a set of regular expressions
for sequences of Arabic characters that need shaping, misc-lang.el in
Emacs.  If the set is incomplete, we can augment it.

> A cursive Latin script font may behave in the same way, with the shape
> of letters depending on what precedes and follows them.  With a small
> enough character repertoire, there might be no ligatures, but your
> rendering logic would fail miserably.

If a font requires special shaping for any sequence of any number of
26 (or maybe 52) ASCII letters, then the Emacs display engine will
need to be redesigned.  So this extreme possibility doesn't bother me.

> How would you handle the possibility that all three of <æ>,  and
>  might be rendered by the same glyph, although they are
> comprised of 1, 2 and 3 characters respectively?

By using a composition rule that matches both  and .
The rules are regexp-based, and expressing the above as a regexp is
simple.  Once a sequence of characters matches the regexp, Emacs calls
the shaper (hb_shape etc.) to produce the font glyphs for the
sequence, and displays the glyphs that the shaper returns.

> And if Emacs is not imposing a normalisation, then all the
> precomposed characters in Unicode might have been entered as one or
> as more than one character?

If you are talking about composition with combining characters, Emacs
already has the rules to compose them as described above.  You can try
this in your Emacs: insert a, then U+0301 COMBINING ACUTE ACCENT, and
you should see them composed into a single glyph (provided that you
use a suitable font).

But I'm not asking about character composition in general, I'm asking
specifically about ligatures of ASCII characters, without any
non-ASCII codepoints or combining accents.


[HarfBuzz] Ligatures

2020-05-22 Thread Eli Zaretskii
Hi,

This is a bit off-topic, but I thought it could be appropriate to ask
here, since we have here some of the best experts on this subject.

We are discussing support for ligatures in Emacs, specifically when
using HarfBuzz as the shaping engine.  See the discussion from

  https://lists.gnu.org/archive/html/emacs-devel/2020-05/msg02493.html

The current support for producing ligatures works in the same way as
complex text shaping for scripts that require that, like Arabic and
Khmer: the sequences of characters that can be displayed as ligatures
are identified in advance with suitable regular expressions, and the
display engine then passes these sequences to hb_shape to produce the
ligatures.

This works well for scripts that require complex shaping, because such
scripts generally have well-defined rules for the sequences of
codepoints that need shaping.  My original thoughts were that
ligatures could be supported in the same way, based on the assumption
that the list of possible ligatures is finite and can be stored in a
suitable data structure in advance.

However, I'm being told that this assumption is false, and that each
font defines ligatures from any number of arbitrary combinations of
characters, and therefore the exhaustive list of the ligatures is in
practice infinite and cannot be provided in advance.  The only way of
doing this right, I'm told, is to either (a) query the font to get the
list of all the ligatures it supports, or (b) assume any combination
of characters can produce a ligature, and therefore we need to pass
all the characters intended for display through hb_shape.  The latter
in particular is in stark contrast to how the current Emacs display
code is designed and implemented.

To be specific, I'm talking about 2 kinds of ligatures:

  . ligatures made of Latin characters, like "ffi" and "Th"
  . ligatures produced from symbols, like "==>" that is
    converted into ⟹

Can someone please tell what are the recommended practices regarding
these ligatures?  Is the set of possible ligatures indeed infinite and
impossible to know in advance?  And does HarfBuzz have APIs to query a
font about the ligatures it supports?

Thanks in advance for any help.


Re: [HarfBuzz] Support for Stylistic Sets

2019-09-15 Thread Eli Zaretskii
> Date: Sun, 15 Sep 2019 12:37:25 +0100
> From: Richard Wordingham 
> 
> > > > Does HarfBuzz guess the language?  
> > >
> > > Yes.
> 
> It seems to use the current locale.  That will usually be wrong for
> cuneiform, and generally be wrong for multilingual text.

But Emacs currently doesn't know better anyway.  When it does, we will
pass that information to HarfBuzz, but for now I see no reason to
replace HarfBuzz's guess based on the locale by Emacs's guess based on
that same locale.

Re: [HarfBuzz] Support for Stylistic Sets

2019-09-15 Thread Eli Zaretskii
> From: Nikolay Sivov 
> Date: Sun, 15 Sep 2019 10:03:01 +0300
> Cc: Richard Wordingham , 
>   Harfbuzz 
> 
>  > Essentially yes, i.e. unsupported features will simply be ignored.
> 
>  Then there's no need to know whether a feature is supported.  Thanks.
> 
> MS Word for example shows a preview for each supported ssXX feature, and the user 
> can select one they want.
> 
> I don't know how (or why) you plan to use that for emacs, but you'll need to 
> have some logic to figure out
> which one to enable.

I think this should be up to the user and/or the application,
i.e. Lisp program that wants to take advantage of these features.

Or maybe I misunderstand what you mean by "figure out which one to
enable"?  Can you elaborate on the potential pitfalls?

Re: [HarfBuzz] Support for Stylistic Sets

2019-09-14 Thread Eli Zaretskii
> Date: Sat, 14 Sep 2019 21:33:00 +0100
> From: Richard Wordingham 
> Cc: harfbuzz@lists.freedesktop.org
> 
> On Sat, 14 Sep 2019 21:15:04 +0300
> Eli Zaretskii  wrote:
> 
> > > Date: Sat, 14 Sep 2019 18:13:25 +0100
> > > From: Richard Wordingham 
> > > 
> > > I think it's safe to specify the use of unsupported features, in
> > > which case this is a luxury feature.  
> > 
> > You mean, specifying an unsupported feature will not cause hb_shape to
> > fail, but instead just use the nominal glyphs?
> 
> Essentially yes, i.e. unsupported features will simply be ignored.

Then there's no need to know whether a feature is supported.  Thanks.

> > Emacs currently leaves it to HarfBuzz to guess the language, so I
> > don't think this is an issue.
> 
> Does HarfBuzz guess the language?

Yes.

Re: [HarfBuzz] Support for Stylistic Sets

2019-09-14 Thread Eli Zaretskii
> Date: Sat, 14 Sep 2019 18:13:25 +0100
> From: Richard Wordingham 
> 
> I think it's safe to specify the use of unsupported features, in which
> case this is a luxury feature.

You mean, specifying an unsupported feature will not cause hb_shape to
fail, but instead just use the nominal glyphs?

> One complication is that features are provided by a font on a (per
> script) per language basis.

Why is that a complication?  The user who requests the feature should
do so only for text of a suitable script, no?

> For example, my Da Lekh font provides feature ss19 for the default
> language, but not for Lao, Tai Lü or 'Shan'. In this font, Feature
> ss19 means apply Lao style, and that is applied automatically if the
> font is told it is being used for Lao. It would be a bit off to tag
> aerated Pali text as Lao just to get a Lao style. Aerated Pali has
> different line-breaking rules to Lao, which is written without visible
> word separation.

Emacs currently leaves it to HarfBuzz to guess the language, so I
don't think this is an issue.

Re: [HarfBuzz] Support for Stylistic Sets

2019-09-14 Thread Eli Zaretskii
> From: Nikolay Sivov 
> Date: Sat, 14 Sep 2019 17:00:53 +0300
> Cc: Harfbuzz 
> 
>  AFAIU, HarfBuzz does support Stylistic Sets, but it is not clear to me
>  what should an application do to request glyphs corresponding to a
>  certain stylistic set.
> 
>  Suppose an application wants to display a text string using a specific
>  stylistic set -- could someone please outline the sequence of API
>  calls to get that, or point me to some documentation which describes
>  that?
> 
> Hi, Eli.
> 
> I think it must be a matter of enabling features explicitly, in case you're 
> asking about it would be features
> ss01-ss20, see hb_shape() arguments documentation.
> Basically, you set hb_feature_t fields to appropriate tag, value (1 for 
> enabled), and start/end limits. That should
> do it.

I'm beginning to see the light, thanks.

So hb_feature_t's 'value' field should always be 1 for an enabled
feature, and its 'tag' field should be something like

  HB_TAG('s', 's', '0', '1')

is that right?
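
For concreteness, a small self-contained sketch of the packing as I
understand it: HB_TAG puts the four ASCII bytes into a 32-bit value,
first byte most significant.  The helper names below (make_tag,
stylistic_set_tag) are made up for this example and are not HarfBuzz
API; with the real library one would use HB_TAG itself and store the
result in the 'tag' field of hb_feature_t.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four ASCII bytes into a 32-bit OpenType tag, first byte in
   the most significant position (the layout HB_TAG produces).  */
uint32_t
make_tag (char c1, char c2, char c3, char c4)
{
  return ((uint32_t) (unsigned char) c1 << 24)
         | ((uint32_t) (unsigned char) c2 << 16)
         | ((uint32_t) (unsigned char) c3 << 8)
         | (uint32_t) (unsigned char) c4;
}

/* Return the tag for stylistic set N (1..20), i.e. "ss01".."ss20".
   Hypothetical helper, not part of HarfBuzz.  */
uint32_t
stylistic_set_tag (int n)
{
  assert (n >= 1 && n <= 20);
  return make_tag ('s', 's', (char) ('0' + n / 10),
                   (char) ('0' + n % 10));
}
```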

The next question is how to know whether a given hb_font_t supports a
given feature?

Thanks.

[HarfBuzz] Support for Stylistic Sets

2019-09-14 Thread Eli Zaretskii
Hi,

AFAIU, HarfBuzz does support Stylistic Sets, but it is not clear to me
what should an application do to request glyphs corresponding to a
certain stylistic set.

Suppose an application wants to display a text string using a specific
stylistic set -- could someone please outline the sequence of API
calls to get that, or point me to some documentation which describes
that?

TIA

Re: [HarfBuzz] Display issue with DejaVu Sans Mono font

2019-08-19 Thread Eli Zaretskii
> From: Khaled Hosny 
> Date: Mon, 19 Aug 2019 01:05:51 +0200
> Cc: Harfbuzz 
> 
> > So this is indeed some problem with that particular font?
> 
> It is partly a font issue (missing anchors and combining marks default 
> position to the left of base glyph), and partly HarfBuzz design decision of 
> preferring composed forms. See 
> https://github.com/harfbuzz/harfbuzz/issues/653.

Thanks for the pointer.

Re: [HarfBuzz] Display issue with DejaVu Sans Mono font

2019-08-17 Thread Eli Zaretskii
> From: Khaled Hosny 
> Date: Sun, 18 Aug 2019 00:48:14 +0200
> Cc: Harfbuzz 
> 
> >  https://lists.gnu.org/archive/html/bug-gnu-emacs/2019-08/msg01082.html
> > 
> > Is there something wrong with this font when displaying this sequence,
> > or is there some kind of bug in Emacs and/or HarfBuzz?
> 
> The second accent is placed next to the glyph, but hb-view is incorrectly 
> clipping the image, as you can see from hb-shape output:
> 
> $ hb-shape DejaVuSansMono.ttf -u '061,301,302'
> [aacute=0+1233|uni0302=0+0]
> 
> Adding some margins gives:
> 
> $ hb-view DejaVuSansMono.ttf -u '061,301,302' --margin=0,150,0,0
> 
> 
> 
> HarfBuzz will compose U+0061 + U+0301 to U+00E1 (since it prefers composed 
> form when supported by the font), and that glyph does not have anchors to 
> position any marks above it, so the circumflex ends up with its default 
> position next to the glyph.

So this is indeed some problem with that particular font?  Because
other fonts, including monospaced ones, don't seem to produce the same
problem: the U+0302 glyph is correctly placed on the base character.
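
For the record, here is how I read the hb-shape output quoted above,
as a small sketch.  It assumes the usual hb-shape text format,
NAME=CLUSTER@XOFFSET,YOFFSET+XADVANCE, with the "@..." part omitted
when both offsets are zero.  The struct and the function name are
made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct shaped_glyph
{
  char name[64];
  unsigned cluster;
  int x_offset, y_offset;
  int x_advance;
};

/* Parse one item of the form NAME=CLUSTER[@XOFF,YOFF]+XADVANCE.
   Return 1 on success, 0 on a malformed item.  */
int
parse_shaped_glyph (const char *item, struct shaped_glyph *g)
{
  int consumed = 0;

  assert (item != NULL && g != NULL);
  memset (g, 0, sizeof *g);
  if (sscanf (item, "%63[^=]=%u%n", g->name, &g->cluster,
              &consumed) != 2)
    return 0;
  item += consumed;
  if (*item == '@')
    {
      /* Optional positioning offsets.  */
      if (sscanf (item, "@%d,%d%n", &g->x_offset, &g->y_offset,
                  &consumed) != 2)
        return 0;
      item += consumed;
    }
  return sscanf (item, "+%d", &g->x_advance) == 1;
}
```

Fed "aacute=0+1233" and "uni0302=0+0", this gives two glyphs in
cluster 0, the second with zero advance and no offsets, i.e. the
circumflex was not moved from its default position.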

Thanks.

Re: [HarfBuzz] Failure in hb_font_get_nominal_glyph

2019-07-25 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Thu, 25 Jul 2019 12:08:43 -0400
> Cc: Khaled Hosny , harfbuzz@lists.freedesktop.org
> 
> Looks good to me. 

Thanks!

Re: [HarfBuzz] Failure in hb_font_get_nominal_glyph

2019-07-25 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Wed, 24 Jul 2019 15:21:15 -0400
> Cc: Eli Zaretskii , 
>   "harfbuzz@lists.freedesktop.org" 
> 
> Ah, right.  Yes.  Before 2.0.0 you'd have to call hb_ot_font_set_funcs() 
> explicitly...
> 
> Thanks Khaled!

Thanks.

Just to be sure I understand: is the below the right fix?

diff --git a/src/w32uniscribe.c b/src/w32uniscribe.c
index aa6bebd..8fbbe7e 100644
--- a/src/w32uniscribe.c
+++ b/src/w32uniscribe.c
@@ -32,6 +32,7 @@ #define _WIN32_WINNT 0x0600
 #include 
 #ifdef HAVE_HARFBUZZ
 # include <hb.h>
+# include <hb-ot.h> /* for hb_ot_font_set_funcs */
 # if GNUC_PREREQ (4, 3, 0)
 #  define bswap_32(v)  __builtin_bswap32(v)
 # else
@@ -1305,7 +1308,12 @@ w32hb_get_font (struct font *font, double *scale)
   hb_face_t *hb_face =
 hb_face_create_for_tables (w32hb_get_font_table, font_handle, NULL);
   if (hb_face_get_glyph_count (hb_face) > 0)
-hb_font = hb_font_create (hb_face);
+{
+  hb_font = hb_font_create (hb_face);
+  /* This is needed for HarfBuzz before 2.0.0; it is the default
+in later versions.  */
+  hb_ot_font_set_funcs (hb_font);
+}
 
   struct uniscribe_font_info *uniscribe_font =
 (struct uniscribe_font_info *) font;

[HarfBuzz] Failure in hb_font_get_nominal_glyph

2019-07-23 Thread Eli Zaretskii
Could someone please take a look at the problems described here:

  https://lists.gnu.org/archive/html/emacs-devel/2019-07/msg00540.html
  https://lists.gnu.org/archive/html/emacs-devel/2019-07/msg00557.html
  https://lists.gnu.org/archive/html/emacs-devel/2019-07/msg00558.html
  https://lists.gnu.org/archive/html/emacs-devel/2019-07/msg00561.html

and tell whether it is expected that HarfBuzz 1.7.5 is too old to
support hb_font_get_nominal_glyph reliably on MS-Windows?  According
to the HarfBuzz docs, that function is available since v1.2.3.

Or maybe the code we have in Emacs has a bug?  If you want to have a
look at the code that fails, it is here:

  http://git.savannah.gnu.org/cgit/emacs.git/tree/src/w32uniscribe.c#n1328

In a nutshell, the question is: why would hb_font_get_nominal_glyph
fail for the Courier New font, even when we are requesting a glyph for
an ASCII character?

TIA

Re: [HarfBuzz] Order of combining diacriticals

2019-06-20 Thread Eli Zaretskii
> From: Khaled Hosny 
> Date: Thu, 20 Jun 2019 22:09:24 +0200
> Cc: Behdad Esfahbod , Harfbuzz 
> 
> 
> I mean whether you are using HarfBuzz with FreeType font functions,
> internal ones or something custom does not matter for fallback Hebrew
> shaping.
> 
> If you want to additionally use HarfBuzz with bitmap or Type 1 fonts
> on Windows, you would need to implement custom font functions for
> those that would use GDI API to access glyph metrics and kerning, but
> this is orthogonal to fallback shaping.

Ah, okay.  I understand now, thanks.

Re: [HarfBuzz] Order of combining diacriticals

2019-06-20 Thread Eli Zaretskii
> From: Khaled Hosny 
> Date: Thu, 20 Jun 2019 17:33:47 +0200
> Cc: Behdad Esfahbod , Harfbuzz 
> 
> 
> > >. For fonts that have no 'hebr' features, Emacs performs
> > >  substitution of known precomposed characters before it invokes the
> > >  shaping engine.  In this case, it substituted U+FB31 for the
> > >  sequence U+05D1,U+05BC, and passed the sequence U+FB31,U+05B0 to
> > >  HarfBuzz.
> > >
> > > You should remove all such hacks.
> >
> > I understand that for HarfBuzz they are probably not needed, if the
> > necessary functions for accessing the glyphs are provided (something
> > that might not be true on Windows, where we don't use Freetype
> > directly).
> 
> This functionality either depends on Unicode decompositions (or in
> case of Hebrew hard-coded tables in HarfBuzz), so the font functions
> used make no difference.

I'm not sure I understand what font functions you are talking about
here.

The simplest font backends in Emacs: Xfont on Unix and GDI on
MS-Windows, when working with fonts that don't have the necessary OTF
features, might be unable to figure out that certain combinations of
base character and combining mark have precomposed glyphs in the font
being used.  So Emacs feeds them the precomposed characters instead.
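
As a rough illustration of that substitution (not the actual Emacs
code, which is not written this way), here is a minimal sketch.  The
table holds only the pair discussed in this thread, and the struct
and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct precomposed { uint32_t base, mark, composed; };

/* Tiny illustrative table: only the pair from this thread.  */
static const struct precomposed precomposed_table[] = {
  { 0x05D1, 0x05BC, 0xFB31 },  /* BET + DAGESH -> BET WITH DAGESH */
};

/* Replace each adjacent BASE,MARK pair found in the table by its
   precomposed character, in place.  Return the new length.  */
size_t
substitute_precomposed (uint32_t *cp, size_t len)
{
  size_t out = 0;
  size_t n = sizeof precomposed_table / sizeof precomposed_table[0];

  assert (cp != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    {
      uint32_t c = cp[i];
      if (i + 1 < len)
        for (size_t k = 0; k < n; k++)
          if (cp[i] == precomposed_table[k].base
              && cp[i + 1] == precomposed_table[k].mark)
            {
              c = precomposed_table[k].composed;
              i++;              /* skip the mark just consumed */
              break;
            }
      cp[out++] = c;
    }
  return out;
}
```

With this table, typing DAGESH first (U+05D1, U+05BC, U+05B0) yields
U+FB31, U+05B0, while typing SHEVA first leaves the three characters
alone, since BET and DAGESH are then not adjacent.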

How are font functions related to this?

Thanks.

Re: [HarfBuzz] HB config

2019-06-19 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Tue, 18 Jun 2019 12:12:44 -0700
> Cc: "harfbuzz@lists.freedesktop.org" 
> 
> Hi Jonathan, Dominik, others,
> 
> You might have noticed I spent last couple of months trimming down HarfBuzz 
> binary size.  I put some notes
> together in the repo:
> 
>   https://github.com/harfbuzz/harfbuzz/blob/master/CONFIG.md
> 
> I like to hear any feedback, as well as any other tricks that need to be 
> documented.

Thank you very much, there's a lot of very useful information there.

Re: [HarfBuzz] Order of combining diacriticals

2019-06-14 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Fri, 14 Jun 2019 11:34:17 -0700
> Cc: Khaled Hosny , 
>   "harfbuzz@lists.freedesktop.org" 
> 
> On Thu, Jun 13, 2019 at 2:18 AM Eli Zaretskii  wrote:
> 
>. For fonts that have no 'hebr' features, Emacs performs
>  substitution of known precomposed characters before it invokes the
>  shaping engine.  In this case, it substituted U+FB31 for the
>  sequence U+05D1,U+05BC, and passed the sequence U+FB31,U+05B0 to
>  HarfBuzz.
> 
> You should remove all such hacks.

I understand that for HarfBuzz they are probably not needed, if the
necessary functions for accessing the glyphs are provided (something
that might not be true on Windows, where we don't use Freetype
directly).  But Emacs also has other font backends, which are not as
capable.

In any case, this particular situation uncovered a subtle bug in how
Emacs uses the information provided by HarfBuzz, so it was a Good
Thing we did have this particular hack.

Thanks.

[HarfBuzz] Order of combining diacriticals

2019-06-12 Thread Eli Zaretskii
In Emacs, we use HB_BUFFER_CLUSTER_LEVEL_MONOTONE_GRAPHEMES cluster
level, because HB_BUFFER_CLUSTER_LEVEL_MONOTONE_CHARACTERS produced
incorrect display.  With this level, whenever I type a Hebrew base
character with more than one diacritical, I need to type them in
certain order, otherwise the display is incorrect.

For example, in this series of characters:

  U+05D1 HEBREW LETTER BET
  U+05B0 HEBREW POINT SHEVA
  U+05BC HEBREW POINT DAGESH

I need to type them in the above order; if I type DAGESH before SHEVA,
the produced display is incorrect.

Is this expected with level-0 clusters?  Or should I look for a bug in
how Emacs uses HarfBuzz?
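
For reference, the two typing orders are canonically equivalent in
Unicode terms: SHEVA has canonical combining class 10 and DAGESH has
21, so canonical reordering puts SHEVA first either way.  The sketch
below only illustrates that reordering with a two-entry class table;
it says nothing about what HarfBuzz or Emacs actually do, and the
function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Canonical combining class for the few code points used here; 0
   for everything else (correct for base letters such as BET).  */
int
combining_class (uint32_t cp)
{
  switch (cp)
    {
    case 0x05B0: return 10;     /* HEBREW POINT SHEVA */
    case 0x05BC: return 21;     /* HEBREW POINT DAGESH */
    default:     return 0;
    }
}

/* Stable-sort each run of marks (class != 0) by ascending class,
   in place; characters of class 0 are never moved.  */
void
canonical_reorder (uint32_t *cp, size_t len)
{
  assert (cp != NULL || len == 0);
  for (size_t i = 1; i < len; i++)
    for (size_t j = i; j > 0; j--)
      {
        int prev = combining_class (cp[j - 1]);
        int cur = combining_class (cp[j]);
        if (cur == 0 || prev <= cur)
          break;
        uint32_t tmp = cp[j - 1];
        cp[j - 1] = cp[j];
        cp[j] = tmp;
      }
}
```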

Thanks.

Re: [HarfBuzz] Selecting fonts for HarfBuzz

2019-06-07 Thread Eli Zaretskii
> Date: Fri, 7 Jun 2019 05:31:33 +0200
> From: Khaled Hosny 
> Cc: Behdad Esfahbod , harfbuzz@lists.freedesktop.org
> 
> > > HarfBuzz handles everything it understands.  It was designed, in fact, 
> > > such that when combined with
> > > FreeType or other external font funcs implementation, it even "handles" 
> > > font formats it does not understand. 
> > > Eg. HarfBuzz doesn't understand BDF, PCF, etc, but if you use hb-ft, you 
> > > can use hb-ft for everything, and
> > > BDF, PCF etc also magically work because HarfBuzz defers to FreeType for 
> > > glyph access, and simply
> > > "passes through" for the rest.  It was designed such that you can keep 
> > > one shaping code path.
> > 
> > We don't currently use hb-ft on Windows.  But thanks, I think I
> > understand.
> 
> You can achieve the same by implementing font functions for the font
> formats HarfBuzz does not directly support, using e.g. GDI API to access
> glyph info in these fonts (see hb_font_funcs_set_* functions).

Thanks, I will look into this, time permitting.

[HarfBuzz] Emacs now uses HarfBuzz

2019-06-07 Thread Eli Zaretskii
This is to let you know that the master branch of Emacs now uses
HarfBuzz as its shaping engine.

I would like to thank everyone here for your help in making this
happen, whether by contributing code or by advice (or both).

You may now wish to add Emacs to the list of projects which use
HarfBuzz.

Re: [HarfBuzz] Selecting fonts for HarfBuzz

2019-06-06 Thread Eli Zaretskii
> Date: Thu, 6 Jun 2019 01:31:19 +0100
> From: Richard Wordingham 
> 
> On Wed, 05 Jun 2019 20:26:41 +0300
> Eli Zaretskii  wrote:
> 
> > To make the question perhaps more concrete: the current code considers
> > a font to be a match for shaping with HarfBuzz if it's either OTF or
> > TTF, and covers at least one Unicode sub-range above U+00FF
> > codepoint.  Is this a reasonable test, or should the code consider
> > additional font features?
> 
> Even that's fraught.  For example, my Tai Tham font Da Lekh includes
> some Thai characters because they're used with Tai Tham text, but
> doesn't include Thai script characters that aren't.  I trust you're
> allowing for the fact that a font for an Indian script will typically
> use the dandas from the Devanagari block, without the font supporting
> anything else from the Devanagari block.

That's another layer of matching in Emacs.  The lower layer constructs
a list of all fonts that could match, and then a higher layer tests
which one of those actually match the requirements of the script.  I
was talking about the former one, you are talking about the latter.

> 1) Some good old faces may lack punctuation characters and logograms.
> This doesn't mean the fonts haven't been equipped with new, good GSUB
> and GPOS tables.
> 
> 2) There seems to be an implication that Lao usage only uses one set of
> digits.
> 
> 3) A Lao-based font would omit some consonants because they aren't used
> in the Lao tradition.
> 
> 4) Some of the consonant marks are alien to modern Northern Thai
> habits, and may therefore be omitted from an old typeface.
> 
> Some fonts omit explicit shaping for Tai Tham because they entirely
> reasonably want to avoid the USE.  (Rumour has it that Andrew Glass
> wants to ban some words from being shaped properly.)  They rely on the
> shaping being done by other features as applied to the default script.
> This doesn't work well on Windows, but could work well with HarfBuzz as
> the renderer. It's only a heuristic that they have a restricted
> repertoire - proper DIY Indic rearrangement is a pain, but even I can
> achieve it.
> 
> Restricted repertoire would be very reasonable for a Myanmar script
> font - it's a more extreme version of the fact that Icelandic and
> German don't have the same set of letters.

If all else fails, Emacs offers a facility for specifying the fonts to
be used, which could go down to individual codepoints.

Re: [HarfBuzz] Selecting fonts for HarfBuzz

2019-06-06 Thread Eli Zaretskii
> Date: Thu, 6 Jun 2019 09:56:18 +0700
> From: Martin Hosken 
> 
> In case it is unclear, harfbuzz can quite happily handle any TTF or OTF 
> whether or not it is designed to be shaped with OpenType or not. So you only 
> need one code path and can simply pass any font to harfbuzz for shaping and 
> harfbuzz will do the Right Thing (TM). Good news, I would suggest :)

Thanks, I think it's clear.

Re: [HarfBuzz] Selecting fonts for HarfBuzz

2019-06-05 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Wed, 5 Jun 2019 12:45:00 -0700
> Cc: "harfbuzz@lists.freedesktop.org" 
> 
> HarfBuzz handles everything it understands.  It was designed, in fact, such 
> that when combined with
> FreeType or other external font funcs implementation, it even "handles" font 
> formats it does not understand. 
> Eg. HarfBuzz doesn't understand BDF, PCF, etc, but if you use hb-ft, you can 
> use hb-ft for everything, and
> BDF, PCF etc also magically work because HarfBuzz defers to FreeType for 
> glyph access, and simply
> "passes through" for the rest.  It was designed such that you can keep one 
> shaping code path.

We don't currently use hb-ft on Windows.  But thanks, I think I
understand.

Re: [HarfBuzz] Selecting fonts for HarfBuzz

2019-06-05 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Wed, 5 Jun 2019 12:07:36 -0700
> Cc: "harfbuzz@lists.freedesktop.org" 
> 
> In other words, I don't know of a legitimate way to filter out broken fonts 
> like code2000.  If that's what you are
> asking for.

No, I wasn't asking about Code2000, I was asking a more general
question.

> Let me ask it differently: why do you think you need to filter anything out?

I assumed that some fonts will not benefit from HarfBuzz, i.e. will
not support complex script shaping, because they lack some fundamental
features HarfBuzz needs.

When Emacs needs to find a font for displaying a character which is
not supported by the default font, it scans the available fonts on the
system, looking for matching fonts.  On Windows, we currently have 2
matching criteria: one for fonts suitable for shaping with Uniscribe,
the other for all the rest (the latter generally don't support complex
script shaping).  For HarfBuzz, the code currently employs the same
matching criteria as for Uniscribe (I described them roughly in a
previous message).  I was asking whether HarfBuzz has additional
requirements from fonts, or whether any font that's good for Uniscribe
will be good for HarfBuzz.

Re: [HarfBuzz] Selecting fonts for HarfBuzz

2019-06-05 Thread Eli Zaretskii
> Date: Wed, 05 Jun 2019 05:36:11 +0300
> From: Eli Zaretskii 
> Cc: harfbuzz@lists.freedesktop.org
> 
> > We assume fonts support shaping.  Ie. we don't have a way to check for font 
> > suitability for correct shaping.
> 
> I understand, thanks.  I wasn't asking how to do that with HarfBuzz, I
> was asking what font features should my font matching function examine
> to make sure the font will "support shaping" in the HarfBuzz sense.
> Features that can be tested without actually shaping some text, of
> course, i.e. without actually opening the font and using it.

To make the question perhaps more concrete: the current code considers
a font to be a match for shaping with HarfBuzz if it's either OTF or
TTF, and covers at least one Unicode sub-range above U+00FF
codepoint.  Is this a reasonable test, or should the code consider
additional font features?
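
In code terms, the test amounts to something like the sketch below.
This is not the Emacs code; the struct, its fields and the function
name are stand-ins invented for the illustration, and the coverage
is modeled as a plain list of code point ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct code_range { uint32_t first, last; };

/* Hypothetical description of a candidate font.  */
struct font_candidate
{
  bool is_otf_or_ttf;
  const struct code_range *ranges;  /* covered code point ranges */
  size_t n_ranges;
};

/* Return true if FONT is OTF or TTF and covers at least one range
   that reaches above U+00FF.  */
bool
font_matches_for_shaping (const struct font_candidate *font)
{
  assert (font != NULL);
  if (!font->is_otf_or_ttf)
    return false;
  for (size_t i = 0; i < font->n_ranges; i++)
    if (font->ranges[i].last > 0x00FF)
      return true;
  return false;
}
```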

TIA

Re: [HarfBuzz] HarfBuzz shaping of R2L text

2019-06-02 Thread Eli Zaretskii
> Date: Sun, 2 Jun 2019 20:29:15 +0100
> From: Richard Wordingham 
> Cc: harfbuzz@lists.freedesktop.org
> 
> On Sun, 02 Jun 2019 21:01:35 +0300
> Eli Zaretskii  wrote:
> 
> > The version of HarfBuzz I built on Windows and am using with Emacs has
> > Graphite support, so I reckon I don't have to worry about picking up a
> > Graphite shaper?
> 
> It depends what you want to do with the shaper.  If you want to study
> what it does in the way of sequencing the glyphs, you need to ensure
> you use the shaper you want to study!  The order the glyphs are
> presented to the renderer may be very different between using a
> Graphite shaper and using the HarfBuzz OpenType shaper.  For one thing,
> swapping glyphs round is easy in Graphite and complicated in OpenType.

I don't think I understand what you mean by "Graphite shaper".  I'm
using just HarfBuzz (which has Graphite capabilities); no other shaper
is involved.

[HarfBuzz] Selecting fonts for HarfBuzz

2019-06-02 Thread Eli Zaretskii
When searching the system for suitable fonts, are there any
considerations or features the client should prefer, or prefer not to
have, besides preferring OTF/TTF fonts, to produce the best shaping
via HarfBuzz?

Thanks.

Re: [HarfBuzz] HarfBuzz shaping of R2L text

2019-06-02 Thread Eli Zaretskii
> Date: Sun, 2 Jun 2019 18:35:07 +0100
> From: Richard Wordingham 
> 
> It looks as though you will have to resort to Padauk for Windows 7
> Uniscribe shaping for Myanmar, and trust that you don't accidentally
> pick up a Graphite shaper.

With Emacs learning to shape text via HarfBuzz, Uniscribe is about to
become deprecated for Emacs on Windows.  Which is good, since
Microsoft want the users to move away from Uniscribe, and the
replacement DirectWrite will probably never be supported by Emacs.

The version of HarfBuzz I built on Windows and am using with Emacs has
Graphite support, so I reckon I don't have to worry about picking up a
Graphite shaper?

[HarfBuzz] Patches for building HarfBuzz with mingw.org's MinGW

2019-06-02 Thread Eli Zaretskii
Hi,

I'd like to submit a few small patches that allow HarfBuzz to be built
on Windows with mingw.org's MinGW toolchain.  (And before you ask: the
reason you don't see the problems I describe below in your MinGW
builds is that you use MinGW64, which is a different flavor of MinGW.)

The patches are against HarfBuzz 2.5.1.

Here are the patches, with explanations:

1. This patch is needed because MinGW doesn't have _BitScanForward and
_BitScanReverse.  They are only used with old GCC versions, so
conditioning their calls by those old versions of GCC is good enough,
IMO.

--- src/hb-algs.hh~0  2019-06-01 08:49:47.0 +0300
+++ src/hb-algs.hh  2019-06-02 11:03:52.373677900 +0300
@@ -400,7 +400,7 @@
 return sizeof (unsigned long long) * 8 - __builtin_clzll (v);
 #endif
 
-#if (defined(_MSC_VER) && _MSC_VER >= 1500) || defined(__MINGW32__)
+#if (defined(_MSC_VER) && _MSC_VER >= 1500) || (defined(__MINGW32__) && (__GNUC__ < 4))
   if (sizeof (T) <= sizeof (unsigned int))
   {
 unsigned long where;
@@ -474,7 +474,7 @@
 return __builtin_ctzll (v);
 #endif
 
-#if (defined(_MSC_VER) && _MSC_VER >= 1500) || defined(__MINGW32__)
+#if (defined(_MSC_VER) && _MSC_VER >= 1500) || (defined(__MINGW32__) && (__GNUC__ < 4))
   if (sizeof (T) <= sizeof (unsigned int))
   {
 unsigned long where;


2. This patch is needed because mingw.org's MinGW defines
MemoryBarrier as an inline function, not as a macro.
__MINGW32_VERSION is defined only by mingw.org's MinGW, so the change
shouldn't affect MinGW64.

--- src/hb-atomic.hh~0  2019-05-27 20:07:58.0 +0300
+++ src/hb-atomic.hh  2019-06-02 10:55:49.013099500 +0300
@@ -107,7 +107,7 @@
 
 static inline void _hb_memory_barrier ()
 {
-#ifndef MemoryBarrier
+#if !defined(MemoryBarrier) && !defined(__MINGW32_VERSION)
   /* MinGW has a convoluted history of supporting MemoryBarrier. */
   LONG dummy = 0;
   InterlockedExchange (&dummy, 1);


3. This patch is needed because MinGW doesn't define
E_NOT_SUFFICIENT_BUFFER.

--- src/hb-uniscribe.cc~0   2019-05-14 03:28:16.0 +0300
+++ src/hb-uniscribe.cc 2019-06-02 11:04:43.843081900 +0300
@@ -31,6 +31,10 @@
 #include 
 #include 
 
+#ifndef E_NOT_SUFFICIENT_BUFFER
+#define E_NOT_SUFFICIENT_BUFFER HRESULT_FROM_WIN32 (ERROR_INSUFFICIENT_BUFFER)
+#endif
+
 #include "hb-uniscribe.h"
 
 #include "hb-open-file.hh"


4. This patch is needed because mingw.org's MinGW doesn't have the
intrin.h header file; instead, the intrinsics are declared by
including windows.h.

--- src/hb.hh~0 2019-05-14 09:42:00.0 +0300
+++ src/hb.hh   2019-06-02 11:06:01.413041500 +0300
@@ -183,8 +183,15 @@
 #include 
 
 #if (defined(_MSC_VER) && _MSC_VER >= 1500) || defined(__MINGW32__)
+#ifdef __MINGW32_VERSION
+#ifndef WIN32_LEAN_AND_MEAN
+#define WIN32_LEAN_AND_MEAN 1
+#endif
+#include <windows.h>
+#else
 #include <intrin.h>
 #endif
+#endif
 
 #define HB_PASTE1(a,b) a##b
 #define HB_PASTE(a,b) HB_PASTE1(a,b)
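
For readers unfamiliar with what patches 1 and 3 touch, here is a
small standalone sketch in C.  The first two functions are portable
loops that compute the kind of result the __builtin_clz/__builtin_ctz
and _BitScanReverse/_BitScanForward code paths compute; the third
spells out the arithmetic behind HRESULT_FROM_WIN32 for a positive
error code.  The function names are mine, and the return value chosen
for a zero argument in lowest_set_bit is an assumption of this sketch,
not necessarily what HarfBuzz does.

```c
#include <stdint.h>

/* Number of bits needed to store V (0 for V == 0).  */
unsigned int
bit_storage (uint32_t v)
{
  unsigned int n = 0;
  while (v)
    {
      n++;
      v >>= 1;
    }
  return n;
}

/* Index of the lowest set bit of V, counting from 0.
   Returns 32 for V == 0 (a choice made for this sketch).  */
unsigned int
lowest_set_bit (uint32_t v)
{
  unsigned int n = 0;
  if (!v)
    return 32;
  while (!(v & 1))
    {
      n++;
      v >>= 1;
    }
  return n;
}

/* The value HRESULT_FROM_WIN32 yields for a positive Win32 error
   code: severity bit set, facility 7 (FACILITY_WIN32), and the
   error code in the low 16 bits.  */
uint32_t
hresult_from_win32 (uint32_t code)
{
  return (code & 0xFFFFu) | (7u << 16) | 0x80000000u;
}
```

ERROR_INSUFFICIENT_BUFFER is 122, so the fallback definition in
patch 3 should evaluate to 0x8007007A.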


Thank you for developing HarfBuzz.

Re: [HarfBuzz] HarfBuzz shaping of R2L text

2019-06-02 Thread Eli Zaretskii
> Date: Fri, 31 May 2019 08:54:50 +0300
> From: Eli Zaretskii 
> Cc: harfbuzz@lists.freedesktop.org
> 
> > Date: Thu, 30 May 2019 21:19:00 +0100
> > From: Richard Wordingham 
> > Cc: harfbuzz@lists.freedesktop.org
> > 
> > > I don't see any reordering here (with HarfBuzz), but maybe it's
> > > because the only font I have that covers Myanmar is Code2000.
> > 
> > That's probably the problem.  I have Version 1.171 of the font, and the
> > closest it comes to layout support for Myanmar is empty lists of
> > lookups for undefined script "myan".  The script tag should be "mymr",
> > so HarfBuzz applies no script-specific shaping.  There may be other
> > issues, as changing "myan" to "mymr" doesn't fix the problem.
> 
> That figures, as Emacs by default claims there are no fonts on this
> system that support Myanmar, and I need to force it to use Code2000.
> 
> I will try with other fonts later.

With Da Lekh I do see the reordering, but only with HarfBuzz as the
font backend; Uniscribe doesn't seem to support that, at least not the
version on Windows 7.

Re: [HarfBuzz] HarfBuzz shaping of R2L text

2019-05-30 Thread Eli Zaretskii
> Date: Thu, 30 May 2019 21:19:00 +0100
> From: Richard Wordingham 
> Cc: harfbuzz@lists.freedesktop.org
> 
> > I don't see any reordering here (with HarfBuzz), but maybe it's
> > because the only font I have that covers Myanmar is Code2000.
> 
> That's probably the problem.  I have Version 1.171 of the font, and the
> closest it comes to layout support for Myanmar is empty lists of
> lookups for undefined script "myan".  The script tag should be "mymr",
> so HarfBuzz applies no script-specific shaping.  There may be other
> issues, as changing "myan" to "mymr" doesn't fix the problem.

That figures, as Emacs by default claims there are no fonts on this
system that support Myanmar, and I need to force it to use Code2000.

I will try with other fonts later.

Re: [HarfBuzz] HarfBuzz shaping of R2L text

2019-05-30 Thread Eli Zaretskii
> Date: Thu, 30 May 2019 19:48:34 +0100
> From: Richard Wordingham 
> 
> The reordering is that the order in the backing store is:
> 
> 
> 
> but the ordering in the display, left to right, is:
> 
> 
> 
> I'd be surprised if this caused much problem.  I think the big issue is
> related to the different meaning of advance width for left-to right and
> right-to-left layout.  The OpenType scheme just changes the order of
> the major base glyphs for (non-Kharoshthi) Indic reordering, so what you
> see on the page is what you have in the glyph sequence. 

I don't see any reordering here (with HarfBuzz), but maybe it's
because the only font I have that covers Myanmar is Code2000.

Re: [HarfBuzz] HarfBuzz shaping of R2L text

2019-05-30 Thread Eli Zaretskii
> Date: Thu, 30 May 2019 01:18:24 +0100
> From: Richard Wordingham 
> 
> On Wed, 29 May 2019 22:32:12 +0300
> Eli Zaretskii  wrote:
> 
> The attached files shows a rendering of  KA, U+1A6E TAI THAM VOWEL SIGN E, U+1A63 TAI THAM VOWEL SIGN AA>; one
> could equally well use .  The
> visual order (in the direction of the script, from left to right) is
> . 

What font(s) do you use for these scripts?

Also, I'm not sure I understand why you describe some kind of
reordering in this case: AFAICT, all of the characters you mentioned
have strong L directionality.  So why would they need to be reordered?

Re: [HarfBuzz] How to make sure an hb_font_t object is valid?

2019-05-30 Thread Eli Zaretskii
> From: Ebrahim Byagowi 
> Date: Thu, 30 May 2019 11:14:13 +0430
> Cc: "harfbuzz@lists.freedesktop.org" 
> 
> Oh, hb_font_t, I am sorry.  As far as I know they are always valid; I
> don't know of a case where one can be invalid other than having an
> invalid hb_face_t.  Maybe others can help better with this.  Relying on
> the hb_shape_full result is not a common practice, as most clients
> don't use it; they use hb_shape, which returns void.  I suggest you
> stick to that as well.

As far as I remember, it was Khaled who wrote the shaper, and his
original code used hb_shape_full.  We just didn't dare to change that,
although I can see that the arguments we actually pass to the shaper
don't really justify calling hb_shape_full.  Unlike hb_shape, it
provides a return value, so maybe Khaled wanted that value for more
solid code?

Re: [HarfBuzz] HarfBuzz shaping of R2L text

2019-05-29 Thread Eli Zaretskii
> Date: Wed, 29 May 2019 21:18:48 +0200
> From: Khaled Hosny 
> Cc: harfbuzz@lists.freedesktop.org
> 
> AFAIK, yes this is expected. Usually the glyph order shouldn’t matter,
> one just draws them as they are ordered by HarfBuzz and for anything
> that requires glyph to character mapping, the clusters provide
> all the information needed.

The display looks correct, I was just surprised that the order was
reversed regardless of the buffer's direction.

> As it happens, some code in Emacs does not like that for whatever reason
> and would draw the glyphs in the wrong order, so in my HarfBuzz-in-Emacs
> integration code I used hb_buffer_reverse_clusters() right after shaping
> to get the glyphs correctly drawn.

AFAICT, hb_buffer_reverse_clusters doesn't reverse the order of the
glyphs, it only renumbers the clusters such that they are in ascending
order.  And in the specific case I described, there's only one cluster
anyway (I use HB_BUFFER_CLUSTER_LEVEL_MONOTONE_GRAPHEMES, because
HB_BUFFER_CLUSTER_LEVEL_MONOTONE_CHARACTERS caused problems on
Windows).

> No idea how Emacs would deal with reordered Indic glyphs which don’t
> always follow the input order.

Can you show an example of such a situation and what is expected from
the correct shaping and display?  I could then see what happens in
Emacs.

Thanks.

[HarfBuzz] How to make sure an hb_font_t object is valid?

2019-05-29 Thread Eli Zaretskii
Last time I asked a similar question, I was told to use
hb_face_get_glyph_count.  But eventually I need to know that an
hb_font_t I create from the face is valid and can be used for
shaping.  What are the best practices for doing that?

Or maybe the shaper will return 'false' when given an invalid font,
and all I need is to test the return value of the shaper?

TIA

[HarfBuzz] HarfBuzz shaping of R2L text

2019-05-29 Thread Eli Zaretskii
Hi,

While testing the results of hb_shape_full called to shape R2L text, I
observed behavior that surprised me: shaping an R2L base letter with a
diacritical produces a sequence of glyphs in reverse order, i.e. the
glyph for the diacritical comes first, before the base letter.

For example, if I shape the sequence (in the logical order)

  U+05EA HEBREW LETTER TAV
  U+05BB HEBREW POINT QUBUTS

the glyphs left in the buffer by the shaper are in reverse order,
first QUBUTS, then TAV.  I thought that this was because of bidi
reordering, but the result doesn't change if I set the buffer
direction to LTR before calling the shaper.  The order of the clusters
does change with the direction, i.e. with LTR the first cluster is
zero, followed by 1, etc., whereas with RTL the clusters are in the
decreasing order.  But the glyphs are always in the same order: the
point first, then the letter.

I see the same with the Arabic script if I shape U+0633 followed by
U+0651 (in logical order).

This doesn't happen with LTR text in unidirectional scripts, including
with Latin text when shaping a base letter followed by a diacritical.

Is this expected behavior?  If so, what are the reasons?  Also, can it
be controlled by the client application?  E.g., Uniscribe can be told
to produce glyphs in the logical order, after shaping them for RTL
display.
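
To illustrate what I mean by getting the glyphs back in logical
order: if the shaper leaves the glyphs of an RTL run in visual order,
with decreasing cluster values as described above, a client that has
copied them into its own array could simply reverse that array.  The
following is only a sketch; struct glyph_rec and reverse_glyphs are
hypothetical names of mine, not HarfBuzz types or functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical client-side record; not HarfBuzz's hb_glyph_info_t.  */
struct glyph_rec
{
  uint32_t glyph;    /* glyph index in the font */
  uint32_t cluster;  /* index of the source character */
};

/* Reverse an array of N glyph records in place.  */
void
reverse_glyphs (struct glyph_rec *g, size_t n)
{
  if (n < 2)
    return;
  for (size_t i = 0, j = n - 1; i < j; i++, j--)
    {
      struct glyph_rec tmp = g[i];
      g[i] = g[j];
      g[j] = tmp;
    }
}
```

After the reversal the cluster values are in increasing order, and a
mark that preceded its base letter in the shaper's output follows it
again.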

TIA

Re: [HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-29 Thread Eli Zaretskii
> Cc: beh...@behdad.org, harfbuzz@lists.freedesktop.org
> From: Jonathan Kew 
> Date: Sat, 11 May 2019 22:15:46 +0100
> 
> > Would wrapping in a blob the buffer returned by GetFontData be enough?
> 
> If you use GetFontData to get the complete font as a single buffer (i.e. 
> pass zero for the dwTable parameter), yes.
> 
> Alternatively, you could use hb_face_create_for_tables, with a 
> reference_table_func that uses GetFontData to read individual tables 
> when harfbuzz asks for them.

FTR, I found that using GetFontData to produce a blob that wraps the
entire data of a font does work, but is not really practical, except
in small test programs.  If you have a program that occasionally needs
to load many fonts in order to display many different scripts at the
same time (Emacs basically does that all the time), you will likely
run out of memory, especially in 32-bit builds, because some fonts are
simply huge (I've seen fonts of several dozen MBs).  A 32-bit build of
Emacs ran out of memory when displaying the HELLO file, which shows a
greeting in many different scripts.

So eventually, I went with the hb_face_create_for_tables method.

Re: [HarfBuzz] Units of members of hb_glyph_position_t

2019-05-28 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Tue, 28 May 2019 15:03:48 -0400
> Cc: "harfbuzz@lists.freedesktop.org" 
> 
> > > You pick what value you want to represent one pixel as.  Say, you
> > > choose 1024.  Then if you want to render at "16px" font size, you
> > > set scale to 16*1024.  That's all.
> > 
> > And then the values of hb_glyph_position_t should be divided by 1024
> > to produce pixels when using this hb_font_t object?
> 
> Yes.

OK, thanks.

Re: [HarfBuzz] Units of members of hb_glyph_position_t

2019-05-28 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Tue, 28 May 2019 14:46:45 -0400
> Cc: "harfbuzz@lists.freedesktop.org" 
> 
> You pick what value you want to represent one pixel as.  Say, you choose
> 1024.  Then if you want to render at "16px" font size, you set scale to
> 16*1024.  That's all.

And then the values of hb_glyph_position_t should be divided by 1024
to produce pixels when using this hb_font_t object?
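
In other words, if I understand the suggestion correctly, the
arithmetic would be something like the sketch below.  UNITS_PER_PIXEL
and the two function names are hypothetical; the only value taken
from the message above is the example factor of 1024.

```c
#include <stdint.h>

/* Hypothetical choice: how many position units represent one pixel.  */
#define UNITS_PER_PIXEL 1024

/* The scale value to set on the font for a given pixel size,
   e.g. 16 px -> 16 * 1024.  */
int
scale_for_pixel_size (int pixel_size)
{
  return pixel_size * UNITS_PER_PIXEL;
}

/* Convert a position value (x_advance, x_offset, ...) reported with
   that scale back to pixels.  */
double
position_to_pixels (int32_t value)
{
  return (double) value / UNITS_PER_PIXEL;
}
```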

Re: [HarfBuzz] Units of members of hb_glyph_position_t

2019-05-28 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Mon, 27 May 2019 21:21:10 -0400
> Cc: "harfbuzz@lists.freedesktop.org" 
> 
> > control those units (if they are under the client program's control).
> 
> They are controlled mainly using hb_font_set_scale().
> 
> > In particular, I get a huge value of x_advance for the letter U+05EA
> > HEBREW LETTER TAV when it is followed by U+05BB HEBREW POINT QUBUTS.
> > The value of x_advance I get is 1229, which is too large even after
> > dividing by 64 (which, btw, I still am not sure is TRT in my case,
> 
> FreeType works in 26.6 fixed-point, i.e. 64 units per 1.0.  That's where
> the 64 value comes from.  And you don't see it in your code because
> hb_ft_font_create* sets that on hb_font for you.
> 
> In your Windows code, you should call hb_font_set_scale().  I believe
> right now you are *not* calling it, and you get values in the face's
> UPEM.  That's the default scale for fonts.  You can get the face UPEM
> using hb_face_get_upem().

OK, I figured out how to scale the units from UPEM to pixels for a
given font size, and now I see reasonable results after such scaling.

However, I think something is still amiss, because I still don't
understand how to determine the values with which to call
hb_font_set_scale.  Say I call it with an integer value N, what will
that produce in terms of values of hb_glyph_position_t?  Will the
values there be in the 0..N range, where N means the full height of
the em box?  If so, how would I then convert those values to pixels --
this conversion will need the font size as well, right?  And if so, I
might well leave the values in UPEM units, and convert them to pixels
by hand.  I feel that I'm still missing something, since you said "you
should call hb_font_set_scale".  So presumably if I call that
function, conversion to pixels will somehow become easier?
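
For the record, the by-hand conversion I'm using now amounts to the
sketch below.  The function name is hypothetical; "scale" stands for
whatever value is in effect for the font, which according to the
message quoted above is the face's UPEM when hb_font_set_scale was
not called.

```c
#include <stdint.h>

/* Convert a position value to pixels, given the scale in effect for
   the font (the face's UPEM by default) and the font size in pixels.  */
double
position_units_to_pixels (int32_t value, int32_t scale, double pixel_size)
{
  return (double) value * pixel_size / (double) scale;
}
```

For example, if the font's UPEM happened to be 2048 (an assumption
for the sake of illustration, not a value I verified for this font),
the x_advance of 1229 at a 16-pixel size would come out as about 9.6
pixels.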

Thanks.

Re: [HarfBuzz] Units of members of hb_glyph_position_t

2019-05-27 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Mon, 27 May 2019 21:21:10 -0400
> Cc: "harfbuzz@lists.freedesktop.org" 
> 
> > control those units (if they are under the client program's control).
> 
> They are controlled mainly using hb_font_set_scale().

What happens if hb_font_set_scale is not called?  Is there some kind
of default?

> > In particular, I get a huge value of x_advance for the letter U+05EA
> > HEBREW LETTER TAV when it is followed by U+05BB HEBREW POINT QUBUTS.
> > The value of x_advance I get is 1229, which is too large even after
> > dividing by 64 (which, btw, I still am not sure is TRT in my case,
> 
> FreeType works in 26.6 fixed-point, i.e. 64 units per 1.0.  That's where
> the 64 value comes from.  And you don't see it in your code because
> hb_ft_font_create* sets that on hb_font for you.

hb_ft_font_create is not used in the Windows code, because the Windows
code doesn't use Freetype to open and otherwise manipulate fonts.

> In your Windows code, you should call hb_font_set_scale().  I believe
> right now you are *not* calling it, and you get values in the face's
> UPEM.  That's the default scale for fonts.  You can get the face UPEM
> using hb_face_get_upem().

OK, thanks, I will look into this.

Re: [HarfBuzz] Units of members of hb_glyph_position_t

2019-05-27 Thread Eli Zaretskii
> Date: Mon, 27 May 2019 21:21:47 +0300
> From: Eli Zaretskii 
> 
> I cannot figure out in what units are these values reported, or how to
> control those units (if they are under the client program's control).
> In particular, I get a huge value of x_advance for the letter U+05EA
> HEBREW LETTER TAV when it is followed by U+05BB HEBREW POINT QUBUTS.
> The value of x_advance I get is 1229, which is too large even after
> dividing by 64 (which, btw, I still am not sure is TRT in my case,
> because I don't understand the source of the 64 value).

Btw, if someone wants to look at the code I'm using to call the
shaper, it's here:

  http://git.savannah.gnu.org/cgit/emacs.git/tree/src/ftfont.c?h=harfbuzz

The function that calls the HarfBuzz shaper starts at line 2978 in
that file.  This code is for GNU/Linux, but the code I'm using on
Windows (which is not yet in the repository) is an exact copy of that
function.

Thanks.

[HarfBuzz] Units of members of hb_glyph_position_t

2019-05-27 Thread Eli Zaretskii
I cannot figure out in what units are these values reported, or how to
control those units (if they are under the client program's control).
In particular, I get a huge value of x_advance for the letter U+05EA
HEBREW LETTER TAV when it is followed by U+05BB HEBREW POINT QUBUTS.
The value of x_advance I get is 1229, which is too large even after
dividing by 64 (which, btw, I still am not sure is TRT in my case,
because I don't understand the source of the 64 value).

Can someone please help me figure out what I am doing wrong?

TIA

Re: [HarfBuzz] How to get a glyph code for a character?

2019-05-25 Thread Eli Zaretskii
> Date: Sat, 25 May 2019 17:17:23 +0200
> From: Khaled Hosny 
> Cc: Richard Wordingham ,
>   harfbuzz@lists.freedesktop.org
> 
> On Sat, May 25, 2019 at 06:08:42PM +0300, Eli Zaretskii wrote:
> > > Date: Sat, 25 May 2019 15:50:38 +0100
> > > From: Richard Wordingham 
> > > 
> > > I presume you're after the glyph indicated by the raw cmap, e.g.
> > > without localisation.
> > 
> > Not sure what kind of localisation are you alluding to here.  I must
> > confess that I'm relatively ignorant about fonts, glyphs, and shaping,
> > so I'm probably missing a lot here.  For example, I have no idea what
> > is a "raw cmap".
> 
> For any given script and language, the font might provide a different
> localized glyph than the default one. Only hb_shape[_full]() will apply
> such localization.

Ah, okay.  Well, as you know, Emacs currently doesn't know the script
of a character at all, and only knows the global session-wide value of
the language, not the language of the text from which the character
came.  So in practice it seems the nominal glyph will do for now.

> Then hb_shape() is the right tool here. HarfBuzz will also automatically
> insert dotted circle for combining marks that are at the start of the
> text string if HB_BUFFER_FLAG_BOT is set on the buffer. You can safely
> set HB_BUFFER_FLAG_BOT and HB_BUFFER_FLAG_EOT on any buffer as long as
> the text passed to hb_buffer_add* functions is the full paragraph text
> not just a chunk of it (that is another reason why one should pass the
> full paragraph and the item offset and length to these function instead
> of just the substring).

Thanks, I will look into this later.  Right now I have a more urgent
issue: the glyph metrics seem to be wrong (width too large or
somesuch, not sure yet).

In general, though, Emacs never lays out entire paragraphs of text, I
think we pass at most a single screen line to the shaper.  Changing
that would probably need a significant redesign of the display code.

Re: [HarfBuzz] How to get a glyph code for a character?

2019-05-25 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Sat, 25 May 2019 11:01:31 -0400
> Cc: "harfbuzz@lists.freedesktop.org" 
> 
>  What is the best way of providing such a method with HarfBuzz on
>  MS-Windows?  One possibility is obviously to call hb_shape, but maybe
>  there's a simpler way for a single codepoint?
> 
> hb_font_get_nominal_glyph().
> 
> Use of such facilities in an application is quite suspect though.
>  
>  Btw, what does hb_font_get_glyph() return?
> 
> Boolean indicating whether the font supports that character.

Great, thank you very much.

Re: [HarfBuzz] How to get a glyph code for a character?

2019-05-25 Thread Eli Zaretskii
> Date: Sat, 25 May 2019 15:50:38 +0100
> From: Richard Wordingham 
> 
> I presume you're after the glyph indicated by the raw cmap, e.g.
> without localisation.

Not sure what kind of localisation are you alluding to here.  I must
confess that I'm relatively ignorant about fonts, glyphs, and shaping,
so I'm probably missing a lot here.  For example, I have no idea what
is a "raw cmap".

> Using hb_shape could very well result in the addition of a dotted
> circle for a combining mark - is that what you want?

AFAIK, this method is only called in Emacs for a combining mark when
we indeed want it displayed as a separate character, with the dotted
circle.  It is normally called for base (non-combining) characters.

Thanks.

[HarfBuzz] How to get a glyph code for a character?

2019-05-25 Thread Eli Zaretskii
One of the methods an Emacs font-backend should provide is the
encode_char method, which returns the glyph code of the selected font
for a character given by its Unicode codepoint.  For example, the XFT
backend uses the XftCharIndex function for that purpose, and the
Freetype backend uses FT_Get_Char_Index.

What is the best way of providing such a method with HarfBuzz on
MS-Windows?  One possibility is obviously to call hb_shape, but maybe
there's a simpler way for a single codepoint?

Btw, what does hb_font_get_glyph() return?

TIA

Re: [HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-24 Thread Eli Zaretskii
> From: Konstantin Ritt 
> Date: Fri, 24 May 2019 19:16:24 +0300
> Cc: Ebrahim Byagowi , Harfbuzz 
> 
> 
> hb_blob_t *my_reference_table(hb_face_t * /*face*/, hb_tag_t tag, void *user_data)
> {
>     HDC hdc = (HDC)user_data;
>     /* hfont is not defined in this snippet; it is assumed to be the
>        HFONT of the font being queried, available from elsewhere. */
>     SelectObject(hdc, hfont);
> 
>     char *buffer = NULL;
>     DWORD length = 0;
> 
>     /* First call: ask for the size of the table. */
>     length = GetFontData(hdc, byte_swap(tag), 0, buffer, length);
>     if (length == GDI_ERROR)
>         return hb_blob_get_empty();
> 
>     /* Second call: read the table into a freshly allocated buffer. */
>     buffer = (char *)::malloc(length);
>     length = GetFontData(hdc, byte_swap(tag), 0, buffer, length);
>     if (length == GDI_ERROR)
>         length = 0;
> 
>     return hb_blob_create((const char *)buffer, length,
>                           HB_MEMORY_MODE_READONLY, buffer, ::free);
> }
> 
> hb_face_t *my_face_create_from_hdc(HDC hdc)
> {
>     return hb_face_create_for_tables(my_reference_table, (void *)hdc, NULL);
> }

Thanks, I think how to manage the memory of a blob is now clear to me.
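
For anyone else reading the quoted code: as far as I understand, the
byte_swap(tag) call is there because a HarfBuzz tag packs its first
character into the most significant byte, whereas GetFontData expects
the table name with its first character in the least significant
byte.  A sketch of that conversion, with make_tag and byte_swap32 as
names I made up (make_tag mirrors what the HB_TAG macro computes):

```c
#include <stdint.h>

/* Pack four characters with the first one in the most significant
   byte, as the HB_TAG macro does.  */
uint32_t
make_tag (char c1, char c2, char c3, char c4)
{
  return ((uint32_t) (uint8_t) c1 << 24)
         | ((uint32_t) (uint8_t) c2 << 16)
         | ((uint32_t) (uint8_t) c3 << 8)
         | (uint32_t) (uint8_t) c4;
}

/* Reverse the order of the four bytes of V.  */
uint32_t
byte_swap32 (uint32_t v)
{
  return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}
```

So the 'cmap' tag, 0x636D6170 on the HarfBuzz side, becomes
0x70616D63 when passed as the dwTable argument.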

But the question about hb_face_t management is still not entirely
clear.  I don't really need hb_face_t, I only create it as an
intermediate step towards hb_font_t.  So my question is: once I have
hb_font_t, can I destroy the hb_face_t I used to create hb_font_t?  If
not, how do I arrange for hb_face_t to be destroyed when the
corresponding hb_font_t is destroyed?

TIA

Re: [HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-24 Thread Eli Zaretskii
> From: Ebrahim Byagowi 
> Date: Fri, 24 May 2019 20:13:43 +0430
> Cc: Harfbuzz 
> 
> Pardon me for the possibly inaccurate answer below, which I have to write quickly,

Thanks for your help.

> > Also, does HarfBuzz support TrueType Collection (TTC) files, and if so,
> > does it want the data only for the currently selected font or all
> > of the data?
> 
> It does.  If you want harfbuzz to handle it for you, you should give it
> the full blob and set the index you like in the second argument of
> hb_face_create; otherwise you should handle it yourself.

OK, this brings me to another question: what should I in general pass
as the 2nd argument of hb_face_create?  Suppose I'm using a TTF or OTF
font file, should I always pass zero as the 2nd argument?  What is the
semantics of that argument?

> > I'm now working on the HarfBuzz font driver for Emacs on Windows using
> > GetFontData with the dwTable argument zero, to get the entire data of
> > the font.
> 
> Is it DirectWrite?  Have you seen the helpers we have in hb-directwrite.h
> and hb-uniscribe.h?  They can be very useful.

I'm not using DirectWrite, nor am I using Uniscribe.  My HarfBuzz is
built without these two, as I understand building with these back-ends
is only needed for comparison.  I want to use the HarfBuzz shaper, and
only it (Emacs already has support for Uniscribe).

But yes, I do consult these files to figure out answers to my
questions.

> > does their memory need to be freed in some manner after I have the
> > hb_font_t object, or do I have to keep them as long as hb_font_t is in
> > use?
> 
> Don't free it yourself, especially if it is in use; you can use the
> harfbuzz destroy callback so harfbuzz can handle it for you.

Sorry, I don't think I understand: what do you mean by "harfbuzz
destroy callback"?  If you mean the 'destroy' argument of
hb_blob_create, then AFAIU this is called only to destroy user_data,
and I don't have user_data, I pass NULL as the 4th argument of
hb_blob_create.  And hb_face_create doesn't have any callback argument
at all.

I see in the few programs in util/ that both the blob and the face are
destroyed as soon as hb_font_t object is created, which is why I
thought I could do the same.  But now you seem to say I shouldn't?

For that matter, what should I use as the 'mode' argument of
hb_blob_create?

This page:

  https://harfbuzz.github.io/object-model-blobs.html

shows an example of calling hb_blob_create with 'free' (in my case,
'xfree') as the 'destroy' callback, so I guess my interpretation of
that argument as being pertinent to user_data was incorrect?  Still,
the questions about memory management for hb_face_t and about the
semantics of the hb_memory_mode_t enum values are left unanswered.

> > I see that hb_blob_create, hb_face_create etc. return empty objects when
> > they fail.  But I see no "is-empty" function or macro in the docs, did I
> > miss something?
> 
> Comparing against the empty object may work for some of the objects, but
> not for a broken face (https://github.com/harfbuzz/harfbuzz/issues/1572);
> something that does it very accurately is hb_face_get_glyph_count

AFAIU, you are saying that if hb_face_get_glyph_count returns zero,
the face is empty and shouldn't be used, is that right?

> > Where do those 64.0 factors come from?
> 
> Subpixel accuracy: harfbuzz works with integers, but as subpixel accuracy
> is needed we have to do some scaling.  Scaling is not the pixels but
> _set_ppem and _set_ptem is (this is very inaccurate, but I hope it is
> useful)

Does this mean I should use the factor of 64 in my code as well?  Or
does that value depend on some properties of the font?

> 
> > Or point me to the documentation where that is described, if I missed it?
> 
> https://harfbuzz.github.io/ may address some of your issues

Thanks again for your help.

Re: [HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-24 Thread Eli Zaretskii
Ping!  Could someone please help me understand how the memory for the
various HarfBuzz objects should be handled?  Or point me to the
documentation where that is described, if I missed it?  Please??

> Date: Sat, 18 May 2019 14:33:45 +0300
> From: Eli Zaretskii 
> Cc: harfbuzz@lists.freedesktop.org
> 
> > Cc: beh...@behdad.org, harfbuzz@lists.freedesktop.org
> > From: Jonathan Kew 
> > Date: Sat, 11 May 2019 22:15:46 +0100
> > 
> > >> If you've got access to the font as a file or as a single buffer in
> > >> memory, then wrapping the entire thing as a blob and handing it to
> > >> hb_face_create will be simplest.
> > > 
> > > Would wrapping in a blob the buffer returned by GetFontData be enough?
> > 
> > If you use GetFontData to get the complete font as a single buffer (i.e. 
> > pass zero for the dwTable parameter), yes.
> 
> I'm now working on the HarfBuzz font driver for Emacs on Windows using
> GetFontData with the dwTable argument zero, to get the entire data of
> the font.  The question for which I cannot find an answer is regarding
> the memory management of the font data I get from GetFontData.  The
> buffer into which I get the font data is malloc'ed.  Then I create a
> blob from that buffer using hb_blob_create, use that blob to create a
> face with hb_face_create, and finally use the face to create a font
> with hb_font_create.  The result of hb_font_create I cache and use
> thereafter each time I need to call hb_shape_full.  But what about the
> hb_blob_t and the hb_face_t objects created in the process -- does
> their memory need to be freed in some manner after I have the
> hb_font_t object, or do I have to keep them as long as hb_font_t is in
> use?  The question about the blob also directly affects whether I need
> to keep around the buffer allocated for the GetFontData call, or can
> it be freed once I have the hb_font_t object.
> 
> Another question is about error handling.  I see that hb_blob_create,
> hb_face_create etc. return empty objects when they fail.  But I see no
> "is-empty" function or macro in the docs, did I miss something?  If
> not, how does one test for errors in a C program?  I assumed that any
> errors cause subsequent calls to fail, and so only checked the last
> call to hb_font_create for errors -- is that correct?
> 
> Thanks in advance for any help.

Re: [HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-18 Thread Eli Zaretskii
> Date: Sat, 11 May 2019 22:44:49 +0300
> From: Eli Zaretskii 
> Cc: harfbuzz@lists.freedesktop.org
> 
> > From: Behdad Esfahbod 
> > Date: Sat, 11 May 2019 12:25:57 -0700
> > Cc: Jonathan Kew , 
> > "harfbuzz@lists.freedesktop.org" 
> > 
> > you can even implement Windows-backed font-funcs.  Several projects
> > do that.  Say, look at Qt maybe? 
> 
> I looked at XeTeX, but it goes the Freetype way.  I'll look at Qt,
> thanks.

For the record: Qt seems to use GetFontData for individual OTF
tables.  See qwindowsfontdatabase.cpp in the Qt sources.

Re: [HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-18 Thread Eli Zaretskii
> Cc: beh...@behdad.org, harfbuzz@lists.freedesktop.org
> From: Jonathan Kew 
> Date: Sat, 11 May 2019 22:15:46 +0100
> 
> >> If you've got access to the font as a file or as a single buffer in
> >> memory, then wrapping the entire thing as a blob and handing it to
> >> hb_face_create will be simplest.
> > 
> > Would wrapping in a blob the buffer returned by GetFontData be enough?
> 
> If you use GetFontData to get the complete font as a single buffer (i.e. 
> pass zero for the dwTable parameter), yes.

I'm now working on the HarfBuzz font driver for Emacs on Windows using
GetFontData with the dwTable argument zero, to get the entire data of
the font.  The question for which I cannot find an answer is regarding
the memory management of the font data I get from GetFontData.  The
buffer into which I get the font data is malloc'ed.  Then I create a
blob from that buffer using hb_blob_create, use that blob to create a
face with hb_face_create, and finally use the face to create a font
with hb_font_create.  The result of hb_font_create I cache and use
thereafter each time I need to call hb_shape_full.  But what about the
hb_blob_t and the hb_face_t objects created in the process -- does
their memory need to be freed in some manner after I have the
hb_font_t object, or do I have to keep them as long as hb_font_t is in
use?  The question about the blob also directly affects whether I need
to keep around the buffer allocated for the GetFontData call, or can
it be freed once I have the hb_font_t object.
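
The first half of that sequence, getting the whole font into a
malloc'ed buffer, can be sketched as a two-call pattern: ask for the
size, allocate, then fetch the bytes.  To keep the sketch independent
of Windows headers, the fetch step is a callback modelled on the
contract of GetFontData (hdc, 0, 0, buffer, size); fetch_func_t,
read_whole_font and fake_fetch are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FETCH_ERROR 0xFFFFFFFFu  /* plays the role of GDI_ERROR */

/* Hypothetical callback with the same contract as
   GetFontData (hdc, 0, 0, buffer, size):
   - buffer == NULL: return the total size of the font data;
   - otherwise: copy up to SIZE bytes and return the number copied;
   - on failure: return FETCH_ERROR.  */
typedef uint32_t (*fetch_func_t) (void *ctx, void *buffer, uint32_t size);

/* Return a malloc'ed copy of the whole font, or NULL on failure.
   The caller owns the buffer and must eventually free it.  */
void *
read_whole_font (fetch_func_t fetch, void *ctx, uint32_t *length_out)
{
  assert (fetch != NULL && length_out != NULL);

  uint32_t size = fetch (ctx, NULL, 0);   /* first call: size only */
  if (size == FETCH_ERROR || size == 0)
    return NULL;

  void *buffer = malloc (size);
  if (buffer == NULL)
    return NULL;

  if (fetch (ctx, buffer, size) != size)  /* second call: the bytes */
    {
      free (buffer);
      return NULL;
    }
  *length_out = size;
  return buffer;
}

/* In-memory stand-in for the Windows call, for trying the code out.  */
typedef struct { const uint8_t *data; uint32_t length; } fake_font_t;

uint32_t
fake_fetch (void *ctx, void *buffer, uint32_t size)
{
  const fake_font_t *font = ctx;
  if (buffer == NULL)
    return font->length;
  uint32_t n = size < font->length ? size : font->length;
  memcpy (buffer, font->data, n);
  return n;
}
```

The buffer and length returned by read_whole_font are the pair that
would then be wrapped in a blob; when that buffer may be freed is
exactly the open question above.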

Another question is about error handling.  I see that hb_blob_create,
hb_face_create etc. return empty objects when they fail.  But I see no
"is-empty" function or macro in the docs, did I miss something?  If
not, how does one test for errors in a C program?  I assumed that any
errors cause subsequent calls to fail, and so only checked the last
call to hb_font_create for errors -- is that correct?

Thanks in advance for any help.

Re: [HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-11 Thread Eli Zaretskii
> Cc: beh...@behdad.org, harfbuzz@lists.freedesktop.org
> From: Jonathan Kew 
> Date: Sat, 11 May 2019 22:15:46 +0100
> 
> > Would wrapping in a blob the buffer returned by GetFontData be enough?
> 
> If you use GetFontData to get the complete font as a single buffer (i.e. 
> pass zero for the dwTable parameter), yes.
> 
> Alternatively, you could use hb_face_create_for_tables, with a 
> reference_table_func that uses GetFontData to read individual tables 
> when harfbuzz asks for them.

OK, thanks.  I think this is a large chunk of the solution to my
problem.

Assuming that I want to use GetFontData, what factors and aspects
should I consider when deciding whether to create a single blob with
the entire font's data or to go for the hb_face_create_for_tables
variety?

Also, does HarfBuzz support TrueType Collection (TTC) files, and if
so, does it want the data only for the currently selected font or all
of the data?
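
As background for the TTC question: a TrueType Collection is a single
file whose header lists the offsets of several faces, and
hb_face_create takes a face index alongside the blob.  Below is a
minimal sketch of reading that header, following the TTC layout in the
OpenType specification.  The function names are invented for this
example, and nothing here is HarfBuzz code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value, as used throughout OpenType.  */
uint32_t
read_u32_be (const uint8_t *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* TTC header: 'ttcf' tag, 16-bit major and minor version, 32-bit
   numFonts, then numFonts 32-bit offsets, one per face.  */
#define TTC_TAG 0x74746366u  /* 't' 't' 'c' 'f' */

/* Number of faces in DATA: numFonts for a collection, 1 for anything
   else long enough to hold a header (no further validation), and 0 if
   the data is too short.  */
uint32_t
font_face_count (const uint8_t *data, size_t length)
{
  assert (data != NULL);
  if (length < 12)
    return 0;
  if (read_u32_be (data) != TTC_TAG)
    return 1;
  return read_u32_be (data + 8);
}

/* Store the offset of face INDEX's table directory in *OFFSET_OUT.
   Return 1 on success, 0 if INDEX is out of range or the header is
   truncated.  */
int
font_face_offset (const uint8_t *data, size_t length, uint32_t index,
                  uint32_t *offset_out)
{
  assert (offset_out != NULL);
  if (index >= font_face_count (data, length))
    return 0;
  if (read_u32_be (data) != TTC_TAG)
    {
      *offset_out = 0;  /* single font: directory at the start */
      return 1;
    }
  if (12 + 4 * ((size_t) index + 1) > length)
    return 0;           /* offset array is truncated */
  *offset_out = read_u32_be (data + 12 + 4 * (size_t) index);
  return 1;
}
```

The per-face offsets point into the same file, which is why a face
index only makes sense together with the data of the whole collection.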

Re: [HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-11 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Sat, 11 May 2019 12:25:57 -0700
> Cc: Jonathan Kew , 
>   "harfbuzz@lists.freedesktop.org" 
> 
> you can even implement Windows-backed font-funcs.  Several projects
> do that.  Say, look at Qt maybe? 

I looked at XeTeX, but it goes the Freetype way.  I'll look at Qt,
thanks.

Re: [HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-11 Thread Eli Zaretskii
> Cc: Behdad Esfahbod ,
>  "harfbuzz@lists.freedesktop.org" 
> From: Jonathan Kew 
> Date: Sat, 11 May 2019 20:11:17 +0100
> 
> > Yes. The font file.  Maybe describe what you are trying to do?
> > 
> 
> If you've got access to the font as a file or as a single buffer in 
> memory, then wrapping the entire thing as a blob and handing it to 
> hb_face_create will be simplest.

Would wrapping in a blob the buffer returned by GetFontData be enough?

> In a case where you don't necessarily have easy access to the complete 
> font file, but have platform APIs that you can use to retrieve specific 
> font tables (like IDWriteFontFace::TryGetFontTable on Windows, or 
> CGFontCopyTableForTag on macOS), that's where you might prefer to use 
> hb_face_create_for_tables (like Firefox does). This expects you to 
> provide a reference_table_func that will return a blob containing the 
> data of any given font table (identified by its 32-bit OpenType table tag).

So there should be a function for each OpenType table tag, each
function returning a pointer to the table's data?
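
Going by the description quoted above, the callback receives the tag
as an argument, so a single function can serve every table rather than
one function per tag.  Here is a sketch of the lookup such a callback
might perform when the whole font is already in memory, following the
sfnt table-directory layout from the OpenType specification.
find_table, table_t and MAKE_TAG are names made up for this example,
and wrapping the result in a blob is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four characters into a 32-bit OpenType tag.  */
#define MAKE_TAG(a, b, c, d) \
  (((uint32_t) (a) << 24) | ((uint32_t) (b) << 16) \
   | ((uint32_t) (c) << 8) | (uint32_t) (d))

typedef struct { const uint8_t *data; uint32_t length; } table_t;

/* Read a big-endian 32-bit value.  */
uint32_t
read_u32_be (const uint8_t *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* sfnt layout: a 12-byte header with numTables at offset 4, then one
   16-byte record per table holding tag, checksum, offset and length.
   Return the table with tag TAG, or { NULL, 0 } if it is absent or the
   directory is inconsistent.  */
table_t
find_table (const uint8_t *font, size_t font_length, uint32_t tag)
{
  table_t none = { NULL, 0 };
  assert (font != NULL);
  if (font_length < 12)
    return none;

  size_t num_tables = ((size_t) font[4] << 8) | font[5];
  if (12 + 16 * num_tables > font_length)
    return none;

  for (size_t i = 0; i < num_tables; i++)
    {
      const uint8_t *rec = font + 12 + 16 * i;
      if (read_u32_be (rec) != tag)
        continue;
      uint32_t offset = read_u32_be (rec + 8);
      uint32_t length = read_u32_be (rec + 12);
      if (offset > font_length || length > font_length - offset)
        return none;  /* record points outside the font data */
      table_t found = { font + offset, length };
      return found;
    }
  return none;
}
```

A call such as find_table (font, length, MAKE_TAG ('G','S','U','B'))
then works for any tag; a tag that is not in the directory yields a
NULL data pointer.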

Re: [HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-11 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Sat, 11 May 2019 11:51:16 -0700
> Cc: Jonathan Kew , 
>   "harfbuzz@lists.freedesktop.org" 
> 
> > Not sure yet.  What is a "font" for this purpose?  Does it have to be
> > the full contents of a font file on disk?
> 
> Yes. The font file.  Maybe describe what you are trying to do?

I'm trying to use on MS-Windows the HarfBuzz shaping function for
Emacs, which Khaled wrote.  The code as written uses Freetype-specific
data (FT_Face), and I'm trying to provide it with the Windows
equivalents instead.

As for passing the font file's data to hb_blob_create: it is quite
unusual to manipulate physical font files on MS-Windows, the usual
paradigm is to use a "logical font", which is a specification for a
font, and then retrieve the metrics of the font using dedicated APIs.
So I wonder whether there's an alternative to accessing the physical
font files.

Thanks.

Re: [HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-11 Thread Eli Zaretskii
> From: Behdad Esfahbod 
> Date: Sat, 11 May 2019 11:26:29 -0700
> Cc: Jonathan Kew , 
>   "harfbuzz@lists.freedesktop.org" 
> 
> Or just use hb_face_create() and hb_font_create(). 

Thanks, that's what I thought I needed to do to begin with.  However,
hb_face_create needs a 'blob' argument, and I couldn't understand how
to call hb_blob_create to get a suitable blob.  Then I looked at how
hb_ft_face_create does it, and saw that it makes a blob out of FT_Face
structure.  But I couldn't see how that blob is used by HarfBuzz, and
so couldn't decide how to call hb_blob_create in my case.

Was I asking myself the right questions?  If so, how can I find the
answers?

TIA

Re: [HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-11 Thread Eli Zaretskii
> From: Jonathan Kew 
> Date: Sat, 11 May 2019 11:08:42 +0100
> 
> You can use hb_face_create_for_tables, passing it a function that can 
> retrieve font tables (as hb_blobs) when requested by harfbuzz.
> 
> This is what Firefox does, to use harfbuzz with Windows or MacOS font 
> APIs; a starting point to explore the Firefox code would be [1], where 
> we call hb_face_create_for_tables and pass it HBGetTable as the 
> reference_table_func. This calls down to the GetFontTable() method, 
> which has separate implementations for the various platforms.

Thanks, this gets me a notch forward, but I'm afraid there's still a
lot of fog.  Specifically, what does HarfBuzz expect from the
hb_blob's it retrieves this way, and where and how does it use those
blobs?

[HarfBuzz] How to get hb_face_t and hb_font_t without Freetype?

2019-05-11 Thread Eli Zaretskii
Is it possible to create a hb_face_t without going through Freetype?
If so, could someone please tell what that would entail?

The tutorial only shows how to create hb_font_t using Freetype, and I
found no other documentation related to this, except the functions'
signatures.  The implementations of hb_ft_face_create and
hb_ft_font_create look deceptively simple, so maybe it wouldn't be
hard to implement something similar without going through Freetype.
But the question is what is needed from the data stashed away by
hb_blob_create, and where is that data used?  I guess there are some
callbacks specific to Freetype which the HarfBuzz shaper needs, and
those callbacks need to access the blob data?  But none of that seems
to be documented.

Could someone please post some information about these issues, or
point me to existing documentation if I missed it?

The context for these questions is WIP to add HarfBuzz shaping
capabilities to Emacs on MS-Windows.  The existing HarfBuzz
integration, for Posix platforms, uses Freetype, because Freetype is
already used by Emacs on Posix systems to access font capabilities.
But on Windows Emacs uses native Windows interfaces to access and
utilize font and text metrics data, so going through Freetype would
probably add interfaces whose equivalents already exist.  The question
is how to use those equivalents to give HarfBuzz what it needs.

TIA

Re: [HarfBuzz] Building and testing HarfBuzz 2.3.0 on MinGW

2019-02-08 Thread Eli Zaretskii
> From: Ebrahim Byagowi 
> Date: Fri, 8 Feb 2019 16:02:46 +0330
> Cc: Nathan Willis , Harfbuzz 
> 
> 
> > My conclusion was that ICU is not needed, but maybe it has some advantages,
> 
> If someone ships ICU anyway, it is a good idea for them to use their ICU
> (or glib, which can also provide Unicode callbacks) instead of carrying
> HarfBuzz's built-in UCDN as an extra, at least for size-reduction reasons.
> [...]
> > Glib is needed for running a large part of the test suite
> 
> It can also provide Unicode callbacks, as I just said.

Thanks, but I still don't think I understand: given that Unicode
character properties are all derived from the same UCD database, what
would be the motivation to use Glib or ICU for these purposes, even if
these libraries are already linked into a program?  Do ICU/Glib
support some extensions that UCDN doesn't, or are more likely to
support the latest Unicode Standard?

> > It is not clear to me what GObject and Introspection are needed for; it
> > would be good to clarify that.
> 
> Roughly, it is the GNOME way of writing language bindings, i.e. it lets
> users of languages other than C/C++ interact with the library through
> GNOME-provided facilities.  It is not needed for C/C++, or for users who
> don't use GObject Introspection anyway.

Thanks, this part is now clear, I think.

> > Btw, the information about "Building on Windows" is IMO outdated:
> > nowadays one can use the "normal" Unix configure/make steps assuming
> > one has MSYS and MinGW installed.  That's what I did.  There should be
> > no need anymore for any Windows-specific build procedures.
> 
> Not everyone will agree with you on that, I guess; maybe they have
> different use cases.  As you can see, the vcpkg project
> (https://github.com/Microsoft/vcpkg/graphs/contributors) is still pretty
> busy, which is why I suggest vcpkg for non-MSYS Windows users, even instead
> of using our cmake directly on Windows.  Vcpkg itself uses our cmake but
> can switch to meson if needed, and it can target Linux in addition to
> Windows, for use cases I am not aware of.

So maybe that section should be extended to mention both methods?
People who build HarfBuzz on Windows are likely to have MSYS installed
anyway, because building the dependencies mostly does require it.  And
even if they don't have it already installed, it's good to mention
that, just so that the reader would know such a method is supported.
When I first read that, I was left wondering whether the normal
configure && make paradigm will get me a port as functional as the
method described on that page.

Thanks.


Re: [HarfBuzz] Building and testing HarfBuzz 2.3.0 on MinGW

2019-02-08 Thread Eli Zaretskii
> From: Nathan Willis 
> Date: Mon, 4 Feb 2019 12:28:02 +
> Cc: harfbuzz@lists.freedesktop.org
> 
> On Sat, Jan 26, 2019 at 5:35 PM Eli Zaretskii  wrote:
> 
> > 1) It would be good to have some guidance in some README or in the
> > HTML docs regarding the optional dependencies and configuration
> > options, and their significance.  For example, it turns out Glib is
> > needed to run a large portion of the test suite, something that wasn't
> > clear (I initially concluded that I didn't need Glib at all).  Also,
> > hb-shape is not built if Glib isn't available.  Similarly, hb-view is
> > not built unless both Cairo and cairo-ft are available.
> >
> >
> I added https://harfbuzz.github.io/building.html#configuration a few weeks
> ago; would you mind elaborating on what is missing there from your POV?

Thanks for adding this, and sorry for the long delay in responding.

The information you added tells when to use the optional configure
switches.  That is important, but there's a more general issue of what
optional dependencies are needed for which parts of HarfBuzz's
functionalities.  This is important for someone who wants to build
HarfBuzz with the minimal set of dependencies, but without losing any
functionality important for one's use case.  Without a good
understanding of these issues, one cannot easily decide on which of
the configure switches to use, and more importantly what packages need
to be installed before building HarfBuzz.

In response to my questions, Khaled once provided some of the
information about that.  I now combine that below with what I learned
while building HarfBuzz:

  . ICU is needed for accessing Unicode character properties; UCDN is
    the built-in alternative to that which has no external
    dependencies.  My conclusion was that ICU is not needed, but maybe
    it has some advantages, in which case it would be good to describe
    them.

  . Cairo is needed for command-line tools (so can be skipped if one
    only wants the library).  Note that Cairo alone is not enough for
    building the command-line tools, you also need cairo-ft, and for
    hb-shape one also needs Glib.

  . Freetype is one of two font callbacks; the other is built-in and
    has no external dependencies.  The decision whether to use
    Freetype largely depends on whether the program(s) to be linked
    against HarfBuzz already use Freetype.

  . Fontconfig is only needed for command-line tools.

  . Graphite2 is becoming less and less important, as fonts which
    require that are rare, and their importance for minority scripts
    is diminishing with recent OpenType developments.

  . Glib is needed for running a large part of the test suite, so if
    one decides not to build with Glib, a separate build with Glib
    just for running the test suite is a good idea.

  . Python is required (and should be on PATH) for most of the test
    suite.

  . It is not clear to me what GObject and Introspection are needed
    for; it would be good to clarify that.

Btw, the information about "Building on Windows" is IMO outdated:
nowadays one can use the "normal" Unix configure/make steps assuming
one has MSYS and MinGW installed.  That's what I did.  There should be
no need anymore for any Windows-specific build procedures.

Thanks, and let me know if I can help more with this documentation
effort.


Re: [HarfBuzz] Building and testing HarfBuzz 2.3.0 on MinGW

2019-02-05 Thread Eli Zaretskii
> From: Nathan Willis 
> Date: Mon, 4 Feb 2019 12:28:02 +
> Cc: harfbuzz@lists.freedesktop.org
> 
> On Sat, Jan 26, 2019 at 5:35 PM Eli Zaretskii  wrote:
> 
> > 1) It would be good to have some guidance in some README or in the
> > HTML docs regarding the optional dependencies and configuration
> > options, and their significance.  For example, it turns out Glib is
> > needed to run a large portion of the test suite, something that wasn't
> > clear (I initially concluded that I didn't need Glib at all).  Also,
> > hb-shape is not built if Glib isn't available.  Similarly, hb-view is
> > not built unless both Cairo and cairo-ft are available.
> 
> I added https://harfbuzz.github.io/building.html#configuration a few weeks 
> ago; would you mind elaborating on
> what is missing there from your POV?

Hi,

I didn't forget, I just have my plate full.  Will respond in a few
days.


Re: [HarfBuzz] Building and testing HarfBuzz 2.3.0 on MinGW

2019-01-27 Thread Eli Zaretskii
> From: Ebrahim Byagowi 
> Date: Sun, 27 Jan 2019 01:02:01 +0330
> Cc: Harfbuzz 
> 
> 1) Agreed

Btw, one other prerequisite for running the test suite is Python.  I
suggest mentioning that as well.  In my case, Python was not on
PATH, and most tests failed.

> 2) Something feels wrong, as we already compile all of these in our msys2 CI
> and that shouldn't be that different from your setup

I saw that similar failures were reported here:

  https://github.com/harfbuzz/harfbuzz/issues/1560

So I upgraded my Freetype 2.5.0.1 to the latest 2.9.1, and then all
the tests passed.  Therefore, I suggest that the oldest version of
Freetype that is considered "good enough" for the test suite be
referenced in the documentation of prerequisites for running the
tests.

> 3) The Uniscribe and DirectWrite backends, and now CoreText, are mostly for
> comparison during development, so that developers can check what the
> expected behavior is.  They are not used in the test suite, which tends to
> be platform agnostic, so don't use them at all if you can.

Got it, thanks.


[HarfBuzz] Building and testing HarfBuzz 2.3.0 on MinGW

2019-01-26 Thread Eli Zaretskii
I'm resending this after subscribing to the list, since my original
message, sent a month ago, only got an automated response that it's
waiting for a moderator.  (Does someone actually tend to the
moderator's tasks of this list?)

I've built HarfBuzz 2.3.0 on MS-Windows using mingw.org's MinGW
(https://osdn.net/projects/mingw/, different from MinGW64).  In
general, the build was successful, with a small number of changes that
I will soon report to the issue tracker.

I have a few questions/suggestions as a result of this experience, which
I'd like to voice.  Thanks in advance for any responses.

1) It would be good to have some guidance in some README or in the
HTML docs regarding the optional dependencies and configuration
options, and their significance.  For example, it turns out Glib is
needed to run a large portion of the test suite, something that wasn't
clear (I initially concluded that I didn't need Glib at all).  Also,
hb-shape is not built if Glib isn't available.  Similarly, hb-view is
not built unless both Cairo and cairo-ft are available.

2) Several tests fail.  For example, "indic-joiners" and "use" in
shaping/data/in-house, CVAR-1 and CVAR-2 in
shaping/data/text-rendering-tests, most of the gpos_* tests in
shaping/data/aots, etc.  I also built HarfBuzz on GNU/Linux, and I
see failures in almost the same tests.  The failures are due to
differences between the expected and the actual outputs.  Are
these real problems, for which you'd like me to report issues, or is
this a known problem?  Has anyone succeeded in running the entire test
suite without a single failure?

3) I'm uncertain about the use of Uniscribe in the Windows build.  I
was told that it was only used "for comparison", which I interpreted
to mean it was used in the test suite.  But I don't think it's the
case, since the Uniscribe dependent functions of HarfBuzz are in the
library, so it seems like Uniscribe is used by the library itself.
What is the purpose of using Uniscribe (and DirectWrite, when that is
compiled in)?