Re: [HarfBuzz] HB_CLOSURE_MAX_STAGES (was: harfbuzz: Branch 'master')

2018-08-03 Thread Behdad Esfahbod
On Thu, Aug 2, 2018 at 12:46 AM, Richard Wordingham <
richard.wording...@ntlworld.com> wrote:

> On Wed, 1 Aug 2018 17:31:06 -0700
> Behdad Esfahbod  wrote:
>
> > On Mon, Jul 30, 2018 at 6:21 PM, Richard Wordingham <
> > richard.wording...@ntlworld.com> wrote:
> >
> > > On Mon, 30 Jul 2018 17:04:42 -0700
> > > Behdad Esfahbod  wrote:
>
> > > > It's impossible to hit that limit...  Ok, it would be impossible
> > > > if we increase it to 32.  I'll do that.
>
> > > That'll probably work, but I'm now intrigued.  Why have a limit that
> > > will never be hit?  Are you just catering for HarfBuzz's logic
> > > simply going badly wrong in very unusual circumstances?
>
> > Yes, simply as defense against malicious fonts and how the subsetter's
> > glyph-closure routine can be tricked to collect (way) more glyphs than
> > shaper can actually reach.
>
> But if the limit is never hit, then the defence is never used, and so
> it may as well not be there.  Or is it meant to initimidate
> designers of malicious fonts who study Harfbuzz?
>

The limit is never hit **during shaping**.  But our current glyph_closure()
can get tricked by malicious fonts without this limit. Hence the limit.



-- 
behdad
http://behdad.org/
___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/harfbuzz


Re: [HarfBuzz] HB_CLOSURE_MAX_STAGES (was: harfbuzz: Branch 'master')

2018-08-02 Thread Richard Wordingham
On Wed, 1 Aug 2018 17:31:06 -0700
Behdad Esfahbod  wrote:

> On Mon, Jul 30, 2018 at 6:21 PM, Richard Wordingham <
> richard.wording...@ntlworld.com> wrote:  
> 
> > On Mon, 30 Jul 2018 17:04:42 -0700
> > Behdad Esfahbod  wrote:

> > > It's impossible to hit that limit...  Ok, it would be impossible
> > > if we increase it to 32.  I'll do that.  

> > That'll probably work, but I'm now intrigued.  Why have a limit that
> > will never be hit?  Are you just catering for HarfBuzz's logic
> > simply going badly wrong in very unusual circumstances?

> Yes, simply as defense against malicious fonts and how the subsetter's
> glyph-closure routine can be tricked to collect (way) more glyphs than
> shaper can actually reach.

But if the limit is never hit, then the defence is never used, and so
it may as well not be there.  Or is it meant to initimidate
designers of malicious fonts who study Harfbuzz?

Richard.
___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/harfbuzz


Re: [HarfBuzz] HB_CLOSURE_MAX_STAGES (was: harfbuzz: Branch 'master')

2018-08-01 Thread Behdad Esfahbod
On Mon, Jul 30, 2018 at 6:21 PM, Richard Wordingham <
richard.wording...@ntlworld.com> wrote:

> On Mon, 30 Jul 2018 17:04:42 -0700
> Behdad Esfahbod  wrote:
>
> > On Thu, Jul 26, 2018 at 12:06 AM, Richard Wordingham <
> > richard.wording...@ntlworld.com> wrote:
> >
> > > On Tue, 24 Jul 2018 16:31:50 + (UTC)
> > > beh...@kemper.freedesktop.org (Behdad Esfahbod) wrote:
> > >
> > > The following change bothers me:
> > >
> > > >  src/hb-ot-layout-common-private.hh |7 +++
> > > >  src/hb-ot-layout.cc|5 -
> > > >  2 files changed, 11 insertions(+), 1 deletion(-)
> > > >
> > > > New commits:
> > > > commit 85646fdadb2f102333485e07425361795b4e0412
> > > > Author: Garret Rieger 
> > > > Date:   Mon Jul 23 15:37:18 2018 -0700
> > > >
> > > > [subset] Limit the iterations of the closure algorithm.
> > > > Prevents O(n^2) run times.
> > > >
> > > > diff --git a/src/hb-ot-layout-common-private.hh
> > > > b/src/hb-ot-layout-common-private.hh index 21caf9e9..7ff0dbeb
> > > > 100644 --- a/src/hb-ot-layout-common-private.hh
> > > > +++ b/src/hb-ot-layout-common-private.hh
> > > > @@ -41,6 +41,13 @@
> > > >  #ifndef HB_MAX_CONTEXT_LENGTH
> > > >  #define HB_MAX_CONTEXT_LENGTH64
> > > >  #endif
> > > > +#ifndef HB_CLOSURE_MAX_STAGES
> > > > +/*
> > > > + * The maximum number of times a lookup can be applied during
> > > > shaping.
> > > > + * Used to limit the number of iterations of the closure
> > > > algorithm.
> > > > + */
> > > > +#define HB_CLOSURE_MAX_STAGES8
> > > > +#endif
> > >
> > > I presume that this is intended to prevent a denial of service
> > > attack,
> >
> > Correct.
> >
> >
> > > at the cost of trashing a subset font.
> > >
> >
> > Not really.
> >
> >
> > > In non-malicious use, how is the victim supposed to detect that and
> > > then how he needs to change HarfBuzz or his font?  Does he have to
> > > read all the text using the subset font simply to detect a
> > > problem?  How does one test that a font does not hit this limit?
> >
> >
> > It's impossible to hit that limit...  Ok, it would be impossible if we
> > increase it to 32.  I'll do that.
>
> That'll probably work, but I'm now intrigued.  Why have a limit that
> will never be hit?  Are you just catering for HarfBuzz's logic simply
> going badly wrong in very unusual circumstances?
>

Yes, simply as defense against malicious fonts and how the subsetter's
glyph-closure routine can be tricked to collect (way) more glyphs than
shaper can actually reach.



>
> The further points is just nit-picking and can be safely ignored.
>
> > >   Does one have to
> > > iterate over the power set of the supported characters for each
> > > script?  That's O(2^n) - impossible to do!
> > >
> > > The description of HB_CLOSURE_MAX_STAGES is completely wrong.  I was
> > > initially alarmed because I have lookups that are invoked in more
> > > than 8 places in substitution subtables.  A more accurate, but
> > > still not perfect, definition, would be 'the maximum number of
> > > times lookup can change a bit of text'.
> > >
> >
> > Nope.  Stage is a technical term in HarfBuzz GSUB processing.
> >
> > According to OpenType spec, lookups are processed in increasing order
> > of their indices.  This implies that each lookup is processed one.
> > But then the script shaping specs say some features are applied
> > separately.  Each of those separated list of features/lookups applied
> > are called one stage.  The total number of stages in any shaper is
> > the total number of times a lookup can be applied in theory.
>
> That applies to lookups that are always formally unconditionally
> applied. It doesn't apply to lookups invoked in response to context or
> chaincontext lookups.
>
> > Note
> > that this does NOT limit recursion through Context and ChainContext
> > lookups.
>
> Richard.
> ___
> HarfBuzz mailing list
> HarfBuzz@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/harfbuzz
>



-- 
behdad
http://behdad.org/
___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/harfbuzz


Re: [HarfBuzz] HB_CLOSURE_MAX_STAGES (was: harfbuzz: Branch 'master')

2018-07-30 Thread Richard Wordingham
On Mon, 30 Jul 2018 17:04:42 -0700
Behdad Esfahbod  wrote:

> On Thu, Jul 26, 2018 at 12:06 AM, Richard Wordingham <
> richard.wording...@ntlworld.com> wrote:  
> 
> > On Tue, 24 Jul 2018 16:31:50 + (UTC)
> > beh...@kemper.freedesktop.org (Behdad Esfahbod) wrote:
> >
> > The following change bothers me:
> >  
> > >  src/hb-ot-layout-common-private.hh |7 +++
> > >  src/hb-ot-layout.cc|5 -
> > >  2 files changed, 11 insertions(+), 1 deletion(-)
> > >
> > > New commits:
> > > commit 85646fdadb2f102333485e07425361795b4e0412
> > > Author: Garret Rieger 
> > > Date:   Mon Jul 23 15:37:18 2018 -0700
> > >
> > > [subset] Limit the iterations of the closure algorithm.
> > > Prevents O(n^2) run times.
> > >
> > > diff --git a/src/hb-ot-layout-common-private.hh
> > > b/src/hb-ot-layout-common-private.hh index 21caf9e9..7ff0dbeb
> > > 100644 --- a/src/hb-ot-layout-common-private.hh
> > > +++ b/src/hb-ot-layout-common-private.hh
> > > @@ -41,6 +41,13 @@
> > >  #ifndef HB_MAX_CONTEXT_LENGTH
> > >  #define HB_MAX_CONTEXT_LENGTH64
> > >  #endif
> > > +#ifndef HB_CLOSURE_MAX_STAGES
> > > +/*
> > > + * The maximum number of times a lookup can be applied during
> > > shaping.
> > > + * Used to limit the number of iterations of the closure
> > > algorithm.
> > > + */
> > > +#define HB_CLOSURE_MAX_STAGES8
> > > +#endif  
> >
> > I presume that this is intended to prevent a denial of service
> > attack, 
> 
> Correct.
> 
> 
> > at the cost of trashing a subset font.
> >  
> 
> Not really.
> 
> 
> > In non-malicious use, how is the victim supposed to detect that and
> > then how he needs to change HarfBuzz or his font?  Does he have to
> > read all the text using the subset font simply to detect a
> > problem?  How does one test that a font does not hit this limit?  
> 
> 
> It's impossible to hit that limit...  Ok, it would be impossible if we
> increase it to 32.  I'll do that.

That'll probably work, but I'm now intrigued.  Why have a limit that
will never be hit?  Are you just catering for HarfBuzz's logic simply
going badly wrong in very unusual circumstances?


The further points is just nit-picking and can be safely ignored.

> >   Does one have to
> > iterate over the power set of the supported characters for each
> > script?  That's O(2^n) - impossible to do!
> >
> > The description of HB_CLOSURE_MAX_STAGES is completely wrong.  I was
> > initially alarmed because I have lookups that are invoked in more
> > than 8 places in substitution subtables.  A more accurate, but
> > still not perfect, definition, would be 'the maximum number of
> > times lookup can change a bit of text'.
> >  
> 
> Nope.  Stage is a technical term in HarfBuzz GSUB processing.
> 
> According to OpenType spec, lookups are processed in increasing order
> of their indices.  This implies that each lookup is processed one.
> But then the script shaping specs say some features are applied
> separately.  Each of those separated list of features/lookups applied
> are called one stage.  The total number of stages in any shaper is
> the total number of times a lookup can be applied in theory.

That applies to lookups that are always formally unconditionally
applied. It doesn't apply to lookups invoked in response to context or
chaincontext lookups.

> Note
> that this does NOT limit recursion through Context and ChainContext
> lookups.

Richard.
___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/harfbuzz


Re: [HarfBuzz] HB_CLOSURE_MAX_STAGES (was: harfbuzz: Branch 'master')

2018-07-30 Thread Behdad Esfahbod
On Thu, Jul 26, 2018 at 7:39 AM, Petr Kobalíček 
wrote:

> I initially thought that this was to prevent an infinite recursion of
> contextual lookups.
>
> I'm working with OpenType myself (not harfbuzz) and this is something that
> I think is not clarified in the specification. Can a contextual
> substitution invoke another contextual substitution and recurse?
>

Yes it can.



> Is HB_CLOSURE_MAX_STAGES here to enforce hard limit?
>

No.  But neighboring HB_MAX_NESTING_LEVEL does that.  Currently set to 6.

The font Noto Nastaliq Urdu uses nested recursive lookups extensively.


> To write a bit more about it. I thought that contextual lookup has
> basically 3 parts:
>
>   - backtrack sequence
>   - input sequence
>   - lookahead sequence
>
> I would imagine that "input" sequence would be pretty short, like one
> character most of the time, and the lookup applied if we have a match would
> only consist of "input sequence". So the question is, does it make sense to
> apply another contextual lookup to just the isolated "input sequence" in
> case we had a match?
>

Yes.


> Do you guys here know any material that would further explain how such
> cases of GSUB should be correctly handled?
>

How HarfBuzz does it is my best understanding of how it should be done (not
necessarily how Microsoft does it, but compatible-enough).


> Best,
> Petr.
>
>
> On Thu, Jul 26, 2018 at 9:06 AM, Richard Wordingham <
> richard.wording...@ntlworld.com> wrote:
>
>> On Tue, 24 Jul 2018 16:31:50 + (UTC)
>> beh...@kemper.freedesktop.org (Behdad Esfahbod) wrote:
>>
>> The following change bothers me:
>>
>> >  src/hb-ot-layout-common-private.hh |7 +++
>> >  src/hb-ot-layout.cc|5 -
>> >  2 files changed, 11 insertions(+), 1 deletion(-)
>> >
>> > New commits:
>> > commit 85646fdadb2f102333485e07425361795b4e0412
>> > Author: Garret Rieger 
>> > Date:   Mon Jul 23 15:37:18 2018 -0700
>> >
>> > [subset] Limit the iterations of the closure algorithm.
>> > Prevents O(n^2) run times.
>> >
>> > diff --git a/src/hb-ot-layout-common-private.hh
>> > b/src/hb-ot-layout-common-private.hh index 21caf9e9..7ff0dbeb 100644
>> > --- a/src/hb-ot-layout-common-private.hh
>> > +++ b/src/hb-ot-layout-common-private.hh
>> > @@ -41,6 +41,13 @@
>> >  #ifndef HB_MAX_CONTEXT_LENGTH
>> >  #define HB_MAX_CONTEXT_LENGTH64
>> >  #endif
>> > +#ifndef HB_CLOSURE_MAX_STAGES
>> > +/*
>> > + * The maximum number of times a lookup can be applied during
>> > shaping.
>> > + * Used to limit the number of iterations of the closure algorithm.
>> > + */
>> > +#define HB_CLOSURE_MAX_STAGES8
>> > +#endif
>>
>> I presume that this is intended to prevent a denial of service attack,
>> at the cost of trashing a subset font.
>>
>> In non-malicious use, how is the victim supposed to detect that and
>> then how he needs to change HarfBuzz or his font?  Does he have to read
>> all the text using the subset font simply to detect a problem?  How
>> does one test that a font does not hit this limit?  Does one have to
>> iterate over the power set of the supported characters for each
>> script?  That's O(2^n) - impossible to do!
>>
>> The description of HB_CLOSURE_MAX_STAGES is completely wrong.  I was
>> initially alarmed because I have lookups that are invoked in more than
>> 8 places in substitution subtables.  A more accurate, but still not
>> perfect, definition, would be 'the maximum number of times lookup can
>> change a bit of text'.
>>
>> A limit of 8 does not strike me as obviously generous.  Some contextual
>> changes can ripple through a string, and I would not be totally
>> surprised to find that 8+1 or more lookups act on some irreducible
>> strings in my Da Lekh font.  The consolations are that there are
>> probably shorter paths to create the resultant glyphs from the input
>> set, and one iteration will often process several lookups in the
>> correct sequence.
>>
>> Richard.
>> ___
>> HarfBuzz mailing list
>> HarfBuzz@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/harfbuzz
>>
>
>
> ___
> HarfBuzz mailing list
> HarfBuzz@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/harfbuzz
>
>


-- 
behdad
http://behdad.org/
___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/harfbuzz


Re: [HarfBuzz] HB_CLOSURE_MAX_STAGES (was: harfbuzz: Branch 'master')

2018-07-30 Thread Behdad Esfahbod
On Thu, Jul 26, 2018 at 12:06 AM, Richard Wordingham <
richard.wording...@ntlworld.com> wrote:

> On Tue, 24 Jul 2018 16:31:50 + (UTC)
> beh...@kemper.freedesktop.org (Behdad Esfahbod) wrote:
>
> The following change bothers me:
>
> >  src/hb-ot-layout-common-private.hh |7 +++
> >  src/hb-ot-layout.cc|5 -
> >  2 files changed, 11 insertions(+), 1 deletion(-)
> >
> > New commits:
> > commit 85646fdadb2f102333485e07425361795b4e0412
> > Author: Garret Rieger 
> > Date:   Mon Jul 23 15:37:18 2018 -0700
> >
> > [subset] Limit the iterations of the closure algorithm.
> > Prevents O(n^2) run times.
> >
> > diff --git a/src/hb-ot-layout-common-private.hh
> > b/src/hb-ot-layout-common-private.hh index 21caf9e9..7ff0dbeb 100644
> > --- a/src/hb-ot-layout-common-private.hh
> > +++ b/src/hb-ot-layout-common-private.hh
> > @@ -41,6 +41,13 @@
> >  #ifndef HB_MAX_CONTEXT_LENGTH
> >  #define HB_MAX_CONTEXT_LENGTH64
> >  #endif
> > +#ifndef HB_CLOSURE_MAX_STAGES
> > +/*
> > + * The maximum number of times a lookup can be applied during
> > shaping.
> > + * Used to limit the number of iterations of the closure algorithm.
> > + */
> > +#define HB_CLOSURE_MAX_STAGES8
> > +#endif
>
> I presume that this is intended to prevent a denial of service attack,
>

Correct.


> at the cost of trashing a subset font.
>

Not really.


> In non-malicious use, how is the victim supposed to detect that and
> then how he needs to change HarfBuzz or his font?  Does he have to read
> all the text using the subset font simply to detect a problem?  How
> does one test that a font does not hit this limit?


It's impossible to hit that limit...  Ok, it would be impossible if we
increase it to 32.  I'll do that.


>   Does one have to
> iterate over the power set of the supported characters for each
> script?  That's O(2^n) - impossible to do!
>
> The description of HB_CLOSURE_MAX_STAGES is completely wrong.  I was
> initially alarmed because I have lookups that are invoked in more than
> 8 places in substitution subtables.  A more accurate, but still not
> perfect, definition, would be 'the maximum number of times lookup can
> change a bit of text'.
>

Nope.  Stage is a technical term in HarfBuzz GSUB processing.

According to OpenType spec, lookups are processed in increasing order of
their indices.  This implies that each lookup is processed one.  But then
the script shaping specs say some features are applied separately.  Each of
those separated list of features/lookups applied are called one stage.  The
total number of stages in any shaper is the total number of times a lookup
can be applied in theory.  Note that this does NOT limit recursion through
Context and ChainContext lookups.

commit 66ccd8ac405c9c25b37de9eb467a7382880dda35 (HEAD -> master,
origin/master)
Author: Behdad Esfahbod 
Date:   Mon Jul 30 17:03:06 2018 -0700

[serialize] Increase stage count from 8 to 32

Indic shaper uses many stages.  Now we are provably not limiting
functionality whereas the previous limit of 8 was assuming real-world
practices.

diff --git a/src/hb-ot-layout-common-private.hh
b/src/hb-ot-layout-common-private.hh
index dec237c8..1cf530ea 100644
--- a/src/hb-ot-layout-common-private.hh
+++ b/src/hb-ot-layout-common-private.hh
@@ -45,8 +45,10 @@
 /*
  * The maximum number of times a lookup can be applied during shaping.
  * Used to limit the number of iterations of the closure algorithm.
+ * This must be larger than the number of times add_pause() is
+ * called in a collect_features call of any shaper.
  */
-#define HB_CLOSURE_MAX_STAGES  8
+#define HB_CLOSURE_MAX_STAGES  32
 #endif





> A limit of 8 does not strike me as obviously generous.  Some contextual
> changes can ripple through a string, and I would not be totally
> surprised to find that 8+1 or more lookups act on some irreducible
> strings in my Da Lekh font.  The consolations are that there are
> probably shorter paths to create the resultant glyphs from the input
> set, and one iteration will often process several lookups in the
> correct sequence.
>
> Richard.
>

-- 
behdad
http://behdad.org/
___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/harfbuzz


Re: [HarfBuzz] HB_CLOSURE_MAX_STAGES (was: harfbuzz: Branch 'master')

2018-07-26 Thread Petr Kobalíček
I initially thought that this was to prevent an infinite recursion of
contextual lookups.

I'm working with OpenType myself (not harfbuzz) and this is something that
I think is not clarified in the specification. Can a contextual
substitution invoke another contextual substitution and recurse? Is
HB_CLOSURE_MAX_STAGES
here to enforce hard limit?

To write a bit more about it. I thought that contextual lookup has
basically 3 parts:

  - backtrack sequence
  - input sequence
  - lookahead sequence

I would imagine that "input" sequence would be pretty short, like one
character most of the time, and the lookup applied if we have a match would
only consist of "input sequence". So the question is, does it make sense to
apply another contextual lookup to just the isolated "input sequence" in
case we had a match?

Do you guys here know any material that would further explain how such
cases of GSUB should be correctly handled?

Best,
Petr.

On Thu, Jul 26, 2018 at 9:06 AM, Richard Wordingham <
richard.wording...@ntlworld.com> wrote:

> On Tue, 24 Jul 2018 16:31:50 + (UTC)
> beh...@kemper.freedesktop.org (Behdad Esfahbod) wrote:
>
> The following change bothers me:
>
> >  src/hb-ot-layout-common-private.hh |7 +++
> >  src/hb-ot-layout.cc|5 -
> >  2 files changed, 11 insertions(+), 1 deletion(-)
> >
> > New commits:
> > commit 85646fdadb2f102333485e07425361795b4e0412
> > Author: Garret Rieger 
> > Date:   Mon Jul 23 15:37:18 2018 -0700
> >
> > [subset] Limit the iterations of the closure algorithm.
> > Prevents O(n^2) run times.
> >
> > diff --git a/src/hb-ot-layout-common-private.hh
> > b/src/hb-ot-layout-common-private.hh index 21caf9e9..7ff0dbeb 100644
> > --- a/src/hb-ot-layout-common-private.hh
> > +++ b/src/hb-ot-layout-common-private.hh
> > @@ -41,6 +41,13 @@
> >  #ifndef HB_MAX_CONTEXT_LENGTH
> >  #define HB_MAX_CONTEXT_LENGTH64
> >  #endif
> > +#ifndef HB_CLOSURE_MAX_STAGES
> > +/*
> > + * The maximum number of times a lookup can be applied during
> > shaping.
> > + * Used to limit the number of iterations of the closure algorithm.
> > + */
> > +#define HB_CLOSURE_MAX_STAGES8
> > +#endif
>
> I presume that this is intended to prevent a denial of service attack,
> at the cost of trashing a subset font.
>
> In non-malicious use, how is the victim supposed to detect that and
> then how he needs to change HarfBuzz or his font?  Does he have to read
> all the text using the subset font simply to detect a problem?  How
> does one test that a font does not hit this limit?  Does one have to
> iterate over the power set of the supported characters for each
> script?  That's O(2^n) - impossible to do!
>
> The description of HB_CLOSURE_MAX_STAGES is completely wrong.  I was
> initially alarmed because I have lookups that are invoked in more than
> 8 places in substitution subtables.  A more accurate, but still not
> perfect, definition, would be 'the maximum number of times lookup can
> change a bit of text'.
>
> A limit of 8 does not strike me as obviously generous.  Some contextual
> changes can ripple through a string, and I would not be totally
> surprised to find that 8+1 or more lookups act on some irreducible
> strings in my Da Lekh font.  The consolations are that there are
> probably shorter paths to create the resultant glyphs from the input
> set, and one iteration will often process several lookups in the
> correct sequence.
>
> Richard.
> ___
> HarfBuzz mailing list
> HarfBuzz@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/harfbuzz
>
___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/harfbuzz