Re: ReactOS: stack vs heap

2023-09-05 Thread Werner LEMBERG


> Thanks. One thing I don't understand is the "+ 2" in the code below:
>
>   AF_LatinBlue  blue_sorted[AF_BLUE_STRINGSET_MAX_LEN + 2];
>
>   for ( i = 0; i < axis->blue_count; i++ )
>     blue_sorted[i] = &axis->blues[i];
>
> If that + 2 is correct to be there, then we need a similar + 2 in
> aflatin.h for blues. But I cannot see why it's needed.

Good catch!  This is due to a synchronization error between FreeType
and ttfautohint: I originally developed the code for ttfautohint,
which adds two artificial blue zones, then imported it back into
FreeType, where I forgot to adjust the array size accordingly.

Now fixed in git, thanks.


   Werner



Re: ReactOS: stack vs heap

2023-09-04 Thread Behdad Esfahbod
On Mon, Sep 4, 2023 at 8:27 PM Werner LEMBERG  wrote:

>
> > Upon further investigation, I think my proposed change is correct.
>
> Thanks a lot, committed!
>

Thanks. One thing I don't understand is the "+ 2" in the code below:

  AF_LatinBlue  blue_sorted[AF_BLUE_STRINGSET_MAX_LEN + 2];

  for ( i = 0; i < axis->blue_count; i++ )
    blue_sorted[i] = &axis->blues[i];

If that + 2 is correct to be there, then we need a similar + 2 in aflatin.h
for blues. But I cannot see why it's needed.


Re: ReactOS: stack vs heap

2023-09-04 Thread Werner LEMBERG


> Upon further investigation, I think my proposed change is correct.

Thanks a lot, committed!


Werner



Re: ReactOS: stack vs heap

2023-09-04 Thread Behdad Esfahbod
Upon further investigation, I think my proposed change is correct.

behdad
http://behdad.org/


On Mon, Sep 4, 2023 at 4:26 PM Behdad Esfahbod  wrote:

> What I said is wrong. But the blues array should be dynamically allocated,
> and use an embedded version for small values. I'll work on it.
>
> behdad
> http://behdad.org/
>
>
> On Mon, Sep 4, 2023 at 3:47 PM Behdad Esfahbod  wrote:
>
>> On Mon, Sep 4, 2023 at 3:39 PM Behdad Esfahbod  wrote:
>>
>>> On Sat, Sep 2, 2023 at 12:31 AM Alexei Podtelezhnikov <
>>> apodt...@gmail.com> wrote:
>>>
>>>> >
>>>> > Wanted to point out that compiling with gcc and adding
>>>> "-Wstack-usage=2000" to get reports about stacks larger than 2000 bytes is
>>>> probably the easiest way to track down large stacks at the moment. Note
>>>> that af_cjk_metrics_init_widths (44480 bytes) and
>>>> af_latin_metrics_init_widths (52992 bytes) are by far the largest.
>>>> cf2_interpT2CharString (27520 bytes) is also surprisingly large. There are
>>>> a few others like the rasterizer stacks that are between 10-20 KB which one
>>>> may also want to look into, but these have been less problematic in my
>>>> experience (though that may have been due to the even larger stacks being
>>>> allocated first). Just wanted to point out how to measure and that the
>>>> rasterizer might not be the first place to look.
>>>>
>>>> That is surprisingly large. Someone should examine how much of it is
>>>> actually used. The rendering pool of "visited cells" (pixels) is
>>>> rather predictable for a given outline. That is why it is easy for me.

>>>
>>> I took a look at the autohinter one. The problem comes from:
>>>
>>> AF_LatinBlueRec  blues[AF_BLUE_STRINGSET_MAX];
>>>
>>> in struct  AF_LatinAxisRec_. The value of AF_BLUE_STRINGSET_MAX is ~260.
>>>
>>> My gut feeling is that that's a typo and should be:
>>>
>>>  AF_LatinBlueRec  blues[AF_BLUE_STRINGSET_MAX_LEN];
>>>
>>> where AF_BLUE_STRINGSET_MAX_LEN is 8. I haven't tested that.
>>>
>>
>> Same thing in afcjk.h.
>>
>> I'm finding these using:
>>
>> $ make CPPFLAGS=-Wframe-larger-than=4096
>>
>


Re: ReactOS: stack vs heap

2023-09-04 Thread Behdad Esfahbod
What I said is wrong. But the blues array should be dynamically allocated,
and use an embedded version for small values. I'll work on it.
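
A minimal sketch of the "embedded version for small values" idea, assuming
a record type `Blue`, an inline capacity `EMBEDDED_MAX` of 8, and the
`blue_array_*` helpers, all of which are hypothetical names for
illustration and not FreeType code. The array lives inside the struct
until it outgrows the inline storage, and only then moves to the heap:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define EMBEDDED_MAX  8   /* hypothetical inline capacity */

/* Hypothetical record; not FreeType's AF_LatinBlueRec. */
typedef struct Blue_
{
  long  ref;
  long  shoot;

} Blue;

typedef struct BlueArray_
{
  size_t  count;
  size_t  capacity;
  Blue*   items;                   /* `embedded` or heap memory */
  Blue    embedded[EMBEDDED_MAX];

} BlueArray;

void
blue_array_init( BlueArray*  a )
{
  assert( a != NULL );
  a->count    = 0;
  a->capacity = EMBEDDED_MAX;
  a->items    = a->embedded;
}

int
blue_array_is_embedded( const BlueArray*  a )
{
  return a->items == a->embedded;
}

/* Appends one record; returns 0 on success, -1 if allocation fails. */
int
blue_array_push( BlueArray*  a,
                 Blue        b )
{
  assert( a != NULL );
  if ( a->count == a->capacity )
  {
    size_t  new_cap = a->capacity * 2;
    Blue*   p;

    if ( blue_array_is_embedded( a ) )
    {
      /* first spill: copy the inline records to the heap */
      p = malloc( new_cap * sizeof ( Blue ) );
      if ( !p )
        return -1;
      memcpy( p, a->embedded, a->count * sizeof ( Blue ) );
    }
    else
    {
      p = realloc( a->items, new_cap * sizeof ( Blue ) );
      if ( !p )
        return -1;
    }
    a->items    = p;
    a->capacity = new_cap;
  }
  a->items[a->count++] = b;
  return 0;
}

/* Releases heap memory, if any, and resets to the inline storage. */
void
blue_array_done( BlueArray*  a )
{
  assert( a != NULL );
  if ( !blue_array_is_embedded( a ) )
    free( a->items );
  blue_array_init( a );
}
```

Because `items` may point into the struct itself, a `BlueArray` in this
sketch must not be copied by value; call `blue_array_init` on the
destination and push the records again instead.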

behdad
http://behdad.org/


On Mon, Sep 4, 2023 at 3:47 PM Behdad Esfahbod  wrote:

> On Mon, Sep 4, 2023 at 3:39 PM Behdad Esfahbod  wrote:
>
>> On Sat, Sep 2, 2023 at 12:31 AM Alexei Podtelezhnikov 
>> wrote:
>>
>>> >
>>> > Wanted to point out that compiling with gcc and adding
>>> "-Wstack-usage=2000" to get reports about stacks larger than 2000 bytes is
>>> probably the easiest way to track down large stacks at the moment. Note
>>> that af_cjk_metrics_init_widths (44480 bytes) and
>>> af_latin_metrics_init_widths (52992 bytes) are by far the largest.
>>> cf2_interpT2CharString (27520 bytes) is also surprisingly large. There are
>>> a few others like the rasterizer stacks that are between 10-20 KB which one
>>> may also want to look into, but these have been less problematic in my
>>> experience (though that may have been due to the even larger stacks being
>>> allocated first). Just wanted to point out how to measure and that the
>>> rasterizer might not be the first place to look.
>>>
>>> That is surprisingly large. Someone should examine how much of it is
>>> actually used. The rendering pool of "visited cells" (pixels) is
>>> rather predictable for a given outline. That is why it is easy for me.
>>>
>>
>> I took a look at the autohinter one. The problem comes from:
>>
>> AF_LatinBlueRec  blues[AF_BLUE_STRINGSET_MAX];
>>
>> in struct  AF_LatinAxisRec_. The value of AF_BLUE_STRINGSET_MAX is ~260.
>>
>> My gut feeling is that that's a typo and should be:
>>
>>  AF_LatinBlueRec  blues[AF_BLUE_STRINGSET_MAX_LEN];
>>
>> where AF_BLUE_STRINGSET_MAX_LEN is 8. I haven't tested that.
>>
>
> Same thing in afcjk.h.
>
> I'm finding these using:
>
> $ make CPPFLAGS=-Wframe-larger-than=4096
>


Re: ReactOS: stack vs heap

2023-09-04 Thread Behdad Esfahbod
On Mon, Sep 4, 2023 at 3:39 PM Behdad Esfahbod  wrote:

> On Sat, Sep 2, 2023 at 12:31 AM Alexei Podtelezhnikov 
> wrote:
>
>> >
>> > Wanted to point out that compiling with gcc and adding
>> "-Wstack-usage=2000" to get reports about stacks larger than 2000 bytes is
>> probably the easiest way to track down large stacks at the moment. Note
>> that af_cjk_metrics_init_widths (44480 bytes) and
>> af_latin_metrics_init_widths (52992 bytes) are by far the largest.
>> cf2_interpT2CharString (27520 bytes) is also surprisingly large. There are
>> a few others like the rasterizer stacks that are between 10-20 KB which one
>> may also want to look into, but these have been less problematic in my
>> experience (though that may have been due to the even larger stacks being
>> allocated first). Just wanted to point out how to measure and that the
>> rasterizer might not be the first place to look.
>>
>> That is surprisingly large. Someone should examine how much of it is
>> actually used. The rendering pool of "visited cells" (pixels) is
>> rather predictable for a given outline. That is why it is easy for me.
>>
>
> I took a look at the autohinter one. The problem comes from:
>
> AF_LatinBlueRec  blues[AF_BLUE_STRINGSET_MAX];
>
> in struct  AF_LatinAxisRec_. The value of AF_BLUE_STRINGSET_MAX is ~260.
>
> My gut feeling is that that's a typo and should be:
>
>  AF_LatinBlueRec  blues[AF_BLUE_STRINGSET_MAX_LEN];
>
> where AF_BLUE_STRINGSET_MAX_LEN is 8. I haven't tested that.
>

Same thing in afcjk.h.

I'm finding these using:

$ make CPPFLAGS=-Wframe-larger-than=4096


Re: ReactOS: stack vs heap

2023-09-04 Thread Behdad Esfahbod
On Sat, Sep 2, 2023 at 12:31 AM Alexei Podtelezhnikov 
wrote:

> >
> > Wanted to point out that compiling with gcc and adding
> "-Wstack-usage=2000" to get reports about stacks larger than 2000 bytes is
> probably the easiest way to track down large stacks at the moment. Note
> that af_cjk_metrics_init_widths (44480 bytes) and
> af_latin_metrics_init_widths (52992 bytes) are by far the largest.
> cf2_interpT2CharString (27520 bytes) is also surprisingly large. There are
> a few others like the rasterizer stacks that are between 10-20 KB which one
> may also want to look into, but these have been less problematic in my
> experience (though that may have been due to the even larger stacks being
> allocated first). Just wanted to point out how to measure and that the
> rasterizer might not be the first place to look.
>
> That is surprisingly large. Someone should examine how much of it is
> actually used. The rendering pool of "visited cells" (pixels) is
> rather predictable for a given outline. That is why it is easy for me.
>

I took a look at the autohinter one. The problem comes from:

AF_LatinBlueRec  blues[AF_BLUE_STRINGSET_MAX];

in struct  AF_LatinAxisRec_. The value of AF_BLUE_STRINGSET_MAX is ~260.

My gut feeling is that that's a typo and should be:

 AF_LatinBlueRec  blues[AF_BLUE_STRINGSET_MAX_LEN];

where AF_BLUE_STRINGSET_MAX_LEN is 8. I haven't tested that.
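
To give a feel for why the array dimension matters, here is a rough
sketch. `BlueRec`, `AxisBig`, and `AxisSmall` are hypothetical stand-ins
(the real AF_LatinBlueRec and AF_LatinAxisRec_ have different layouts);
only the two dimensions, roughly 260 and 8, come from the figures above:

```c
#include <assert.h>
#include <stddef.h>

#define STRINGSET_MAX      260  /* approximate value quoted above */
#define STRINGSET_MAX_LEN    8  /* value quoted above */

/* Hypothetical stand-in for one blue-zone record. */
typedef struct BlueRec_
{
  long          ref[3];
  long          shoot[3];
  long          ascender;
  long          descender;
  unsigned int  flags;

} BlueRec;

/* Same axis record, differing only in the array dimension. */
typedef struct AxisBig_
{
  unsigned int  blue_count;
  BlueRec       blues[STRINGSET_MAX];

} AxisBig;

typedef struct AxisSmall_
{
  unsigned int  blue_count;
  BlueRec       blues[STRINGSET_MAX_LEN];

} AxisSmall;

/* Bytes saved per axis record by using the smaller dimension. */
size_t
axis_bytes_saved( void )
{
  /* shrinking the dimension can only reduce the size */
  assert( sizeof ( AxisBig ) >= sizeof ( AxisSmall ) );
  return sizeof ( AxisBig ) - sizeof ( AxisSmall );
}
```

In this sketch the saving is 252 records per axis; any function that
keeps such a structure in a local variable carries that difference on
its stack frame.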


Re: ReactOS: stack vs heap

2023-09-01 Thread Alexei Podtelezhnikov
>
> Wanted to point out that compiling with gcc and adding "-Wstack-usage=2000"
> to get reports about stacks larger than 2000 bytes is probably the easiest
> way to track down large stacks at the moment. Note that
> af_cjk_metrics_init_widths (44480 bytes) and af_latin_metrics_init_widths
> (52992 bytes) are by far the largest. cf2_interpT2CharString (27520 bytes)
> is also surprisingly large. There are a few others like the rasterizer
> stacks that are between 10-20 KB which one may also want to look into, but
> these have been less problematic in my experience (though that may have
> been due to the even larger stacks being allocated first). Just wanted to
> point out how to measure and that the rasterizer might not be the first
> place to look.

That is surprisingly large. Someone should examine how much of it is
actually used. The rendering pool of "visited cells" (pixels) is
rather predictable for a given outline. That is why it is easy for me.



Re: ReactOS: stack vs heap

2023-09-01 Thread Ben Wagner
On Fri, Sep 1, 2023, 9:45 AM Alexei Podtelezhnikov 
wrote:

>
> 
> >> I will try the dynamic heap allocations for the rendering
> >> buffer. This might be the largest of them, I think. In addition,
> >> this should help with the rendering speed when rendering complex
> >> shapes like
> >> https://fonts.google.com/specimen/Cabin+Sketch. Currently, FreeType
> >> makes several attempts until a sub-band can fit into a static stack
> >> buffer. We should be able to fit it into a dynamic buffer easily. I
> >> wonder if CabinSketch should be about as complex as we can tolerate
> >> and refuse anything much more complex than this. A lot of time-outs
> >> will be resolved...
> >
> > Perhaps a hybrid approach is the right one: Use the current
> > infrastructure up to a certain size, being as fast as possible because
> > dynamic allocation overhead can be avoided, and resort to dynamic
> > allocation otherwise.
>
> Werner,
>
> FreeType is not shy about allocating buffers to load a glyph. This is just
> one more; I highly doubt that it matters even at small sizes. We always
> allocate an FT_Bitmap for rendering, too. As a matter of fact, FreeType
> loses to the dense renderers when rendering complex glyphs precisely
> because of multiple restarts to fit the small buffer.
>
> Alexei


Wanted to point out that compiling with gcc and adding "-Wstack-usage=2000"
to get reports about stacks larger than 2000 bytes is probably the easiest
way to track down large stacks at the moment. Note that
af_cjk_metrics_init_widths (44480 bytes) and af_latin_metrics_init_widths
(52992 bytes) are by far the largest. cf2_interpT2CharString (27520 bytes)
is also surprisingly large. There are a few others like the rasterizer
stacks that are between 10-20 KB which one may also want to look into, but
these have been less problematic in my experience (though that may have
been due to the even larger stacks being allocated first). Just wanted to
point out how to measure and that the rasterizer might not be the first
place to look.
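
For anyone who wants to try the measurement on something small first,
here is a toy translation unit, not taken from FreeType. The function
name and the 50000-byte scratch size are made up for illustration. GCC
spells the warning option `-Wstack-usage=<bytes>`; the related
`-fstack-usage` option writes per-function figures to a `.su` file
instead of warning:

```c
/* Try, for example:
 *
 *   gcc -c -Wstack-usage=2000 big_frame.c
 *   gcc -c -fstack-usage big_frame.c
 *
 * An optimizing build may shrink or drop the local array, so the
 * reported figure depends on the optimization level.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCRATCH_BYTES  50000  /* hypothetical, similar to the sizes above */

/* Sums up to SCRATCH_BYTES bytes of `data` via a large local buffer. */
unsigned long
sum_via_big_frame( const unsigned char*  data,
                   size_t                len )
{
  unsigned char  scratch[SCRATCH_BYTES];  /* this is what gets reported */
  unsigned long  sum = 0;
  size_t         i;

  assert( data != NULL );

  if ( len > SCRATCH_BYTES )
    len = SCRATCH_BYTES;

  memcpy( scratch, data, len );
  for ( i = 0; i < len; i++ )
    sum += scratch[i];

  return sum;
}
```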


Re: ReactOS: stack vs heap

2023-09-01 Thread Werner LEMBERG
>> Perhaps a hybrid approach is the right one: Use the current
>> infrastructure up to a certain size, being as fast as possible because
>> dynamic allocation overhead can be avoided, and resort to dynamic
>> allocation otherwise.
>
> FreeType is not shy about allocating buffers to load a glyph. This
> is just one more; I highly doubt that it matters even at small
> sizes. We always allocate an FT_Bitmap for rendering, too.  As a
> matter of fact, FreeType loses to the dense renderers when rendering
> complex glyphs precisely because of multiple restarts to fit the
> small buffer.

Well, I thought of a solution similar to Behdad's approach in commits
6f16b10019d7 and 56ddafa01ce2...

However, I leave that to you, since you've worked much more
intensively on the rendering code than me.


   Werner



Re: ReactOS: stack vs heap

2023-09-01 Thread Alexei Podtelezhnikov



>> I will try the dynamic heap allocations for the rendering
>> buffer. This might be the largest of them, I think. In addition,
>> this should help with the rendering speed when rendering complex
>> shapes like
>> https://fonts.google.com/specimen/Cabin+Sketch. Currently, FreeType
>> makes several attempts until a sub-band can fit into a static stack
>> buffer. We should be able to fit it into a dynamic buffer easily. I
>> wonder if CabinSketch should be about as complex as we can tolerate
>> and refuse anything much more complex than this. A lot of time-outs
>> will be resolved...
> 
> Perhaps a hybrid approach is the right one: Use the current
> infrastructure up to a certain size, being as fast as possible because
> dynamic allocation overhead can be avoided, and resort to dynamic
> allocation otherwise.

Werner, 

FreeType is not shy about allocating buffers to load a glyph. This is just
one more; I highly doubt that it matters even at small sizes. We always
allocate an FT_Bitmap for rendering, too. As a matter of fact, FreeType
loses to the dense renderers when rendering complex glyphs precisely
because of multiple restarts to fit the small buffer.

Alexei


Re: ReactOS: stack vs heap

2023-09-01 Thread Oliver Brunner
Hi Alexei,

Alexei Podtelezhnikov schrieb:
> Do you understand why they are so averse to large stack allocations?

I am a long time lurker on this list just for this reason ;-). AROS (an
AmigaOS3 API compatible OS) has inherited the stack of AmigaOS, which
does not grow automatically. As AROS mainly uses freetype for rendering
fonts (thanks for that!), the problem with large stack allocations also
exists there. The problem is that the freetype library must live within
the stack size of the calling executable, and the application might not
expect such large allocations from a library.

I decided not to send in any patches as AROS is not so widely used and my
patches are sufficient for AROS. I assumed this problem is not an issue
for anybody else. My patches use the native AROS memory allocation
functions so can't be integrated without rewriting, but they are trivial
(more or less just malloc/free).

I patched the following functions in freetype-2.10.4 which caused problems
on AROS:

src/autofit/afcjk.c:
In af_cjk_metrics_init_widths() hints[1] and dummy are quite big.

src/autofit/aflatin.c
In af_latin_metrics_init_widths() hints[1] and dummy are quite big.

src/autofit/aflatin2.c
In af_latin2_metrics_init_widths() dummy[1] is quite big.

src/autofit/afmodule.c
In af_autofitter_load_glyph() hints[1] and dummy are quite big.

I patched the AROS compiler code to monitor the stack usage of each
freetype function while running executables, and the rest of the freetype
code does not seem to use much stack, but of course I might not have
covered every code path while testing.

Regards

o1i
-- 
| Oliver Brunner | a...@oliver-brunner.de
| "o1i"  |




Re: ReactOS: stack vs heap

2023-08-31 Thread Werner LEMBERG


> I will try the dynamic heap allocations for the rendering
> buffer. This might be the largest of them, I think. In addition,
> this should help with the rendering speed when rendering complex
> shapes like
> https://fonts.google.com/specimen/Cabin+Sketch. Currently, FreeType
> makes several attempts until a sub-band can fit into a static stack
> buffer. We should be able to fit it into a dynamic buffer easily. I
> wonder if CabinSketch should be about as complex as we can tolerate
> and refuse anything much more complex than this. A lot of time-outs
> will be resolved...

Perhaps a hybrid approach is the right one: Use the current
infrastructure up to a certain size, being as fast as possible because
dynamic allocation overhead can be avoided, and resort to dynamic
allocation otherwise.
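
A rough sketch of what such a hybrid could look like, assuming a
hypothetical threshold `STACK_CELLS` and a toy routine `sum_via_scratch`
that merely copies and sums integers. This is not the rasterizer's code,
only the shape of the idea: small jobs use a fixed local buffer, larger
ones fall back to the heap.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STACK_CELLS  256  /* hypothetical threshold, in elements */

/* Copies `n` values into a scratch buffer and sums them.  Returns 0 on
 * success and -1 if the heap fallback cannot be allocated.            */
int
sum_via_scratch( const int*  in,
                 size_t      n,
                 long*       out_sum,
                 int*        used_heap )
{
  int     local[STACK_CELLS];
  int*    buf = local;
  long    sum = 0;
  size_t  i;

  assert( in != NULL && out_sum != NULL && used_heap != NULL );

  *used_heap = 0;
  if ( n > STACK_CELLS )
  {
    /* too big for the fixed buffer: pay for one allocation */
    buf = malloc( n * sizeof ( int ) );
    if ( !buf )
      return -1;
    *used_heap = 1;
  }

  memcpy( buf, in, n * sizeof ( int ) );
  for ( i = 0; i < n; i++ )
    sum += buf[i];

  if ( buf != local )
    free( buf );

  *out_sum = sum;
  return 0;
}
```

The trade-off is that the local buffer still costs stack space on every
call, so the threshold has to stay small enough for callers with tight
stacks.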


Werner



Re: ReactOS: stack vs heap

2023-08-31 Thread Alexei Podtelezhnikov
Hi Ben,

I will try the dynamic heap allocations for the rendering buffer. This
might be the largest of them, I think. In addition, this should help
with the rendering speed when rendering complex shapes like
https://fonts.google.com/specimen/Cabin+Sketch. Currently, FreeType
makes several attempts until a sub-band can fit into a static stack
buffer. We should be able to fit it into a dynamic buffer easily. I
wonder if CabinSketch should be about as complex as we can tolerate
and refuse anything much more complex than this. A lot of time-outs
will be resolved...

Alexei




On Thu, Aug 31, 2023 at 7:40 AM Ben Wagner  wrote:
>
> I've been meaning for a long time to propose something like this. One (or
> more) of those stacks is bigger than 50KB. While most desktop apps have
> threads with large stacks (~xMBs), there are users that run many threads in
> one process and force the thread stacks to be small (~xxKBs) so that the
> stack memory stays hot. On macOS, secondary threads are also pretty small
> at 512KB. Calls to FreeType often happen deep in some application layout
> code, which is often taking up a bunch of stack frames already, so a
> sudden deep 50KB stack frame can be too big.
>
> In any event, I have seen these really big stack frames be an issue in
> practice. I've so far generally been able to get around this issue but it
> would be nice to not need the workarounds. I think a lot of people would be
> happy if FreeType reduced the size of these stack frames.
>
> On Wed, Aug 30, 2023, 11:20 PM Alexei Podtelezhnikov  
> wrote:
>>
>> Hi folks,
>>
>> Found this patch from ReactOS
>> https://git.reactos.org/?p=reactos.git;a=blob;f=sdk/lib/3rdparty/freetype/freetype_ros.diff
>>
>> Do you understand why they are so averse to large stack allocations?
>>
>> Thanks,
>> Alexei
>>


--
Alexei A. Podtelezhnikov, PhD



Re: ReactOS: stack vs heap

2023-08-31 Thread Ben Wagner
I've been meaning for a long time to propose something like this. One (or
more) of those stacks is bigger than 50KB. While most desktop apps have
threads with large stacks (~xMBs), there are users that run many threads in
one process and force the thread stacks to be small (~xxKBs) so that the
stack memory stays hot. On macOS, secondary threads are also pretty small
at 512KB. Calls to FreeType often happen deep in some application layout
code, which is often taking up a bunch of stack frames already, so a
sudden deep 50KB stack frame can be too big.

In any event, I have seen these really big stack frames be an issue in
practice. I've so far generally been able to get around this issue but it
would be nice to not need the workarounds. I think a lot of people would
be happy if FreeType reduced the size of these stack frames.

On Wed, Aug 30, 2023, 11:20 PM Alexei Podtelezhnikov 
wrote:

> Hi folks,
>
> Found this patch from ReactOS
>
> https://git.reactos.org/?p=reactos.git;a=blob;f=sdk/lib/3rdparty/freetype/freetype_ros.diff
>
> Do you understand why they are so averse to large stack allocations?
>
> Thanks,
> Alexei
>
>


ReactOS: stack vs heap

2023-08-30 Thread Alexei Podtelezhnikov
Hi folks,

Found this patch from ReactOS
https://git.reactos.org/?p=reactos.git;a=blob;f=sdk/lib/3rdparty/freetype/freetype_ros.diff

Do you understand why they are so averse to large stack allocations?

Thanks,
Alexei