Re: ReactOS: stack vs heap
> Thanks. One thing I don't understand is the "+ 2" in the code below:
>
>   AF_LatinBlue  blue_sorted[AF_BLUE_STRINGSET_MAX_LEN + 2];
>
>   for ( i = 0; i < axis->blue_count; i++ )
>     blue_sorted[i] = &axis->blues[i];
>
> If that + 2 is correct to be there, then we need a similar + 2 in
> aflatin.h for blues. But I cannot see why it's needed.

Good catch! This is due to a synchronization error between FreeType
and ttfautohint: I originally developed the code for ttfautohint,
which adds two artificial blue zones, then imported it back into
FreeType, where I forgot to adjust the array size accordingly.

Now fixed in git, thanks.

    Werner
Re: ReactOS: stack vs heap
On Mon, Sep 4, 2023 at 8:27 PM Werner LEMBERG wrote:

>> Upon further investigation, I think my proposed change is correct.
>
> Thanks a lot, committed!

Thanks. One thing I don't understand is the "+ 2" in the code below:

  AF_LatinBlue  blue_sorted[AF_BLUE_STRINGSET_MAX_LEN + 2];

  for ( i = 0; i < axis->blue_count; i++ )
    blue_sorted[i] = &axis->blues[i];

If that + 2 is correct to be there, then we need a similar + 2 in
aflatin.h for blues. But I cannot see why it's needed.
Re: ReactOS: stack vs heap
> Upon further investigation, I think my proposed change is correct.

Thanks a lot, committed!

    Werner
Re: ReactOS: stack vs heap
Upon further investigation, I think my proposed change is correct.

behdad
http://behdad.org/

On Mon, Sep 4, 2023 at 4:26 PM Behdad Esfahbod wrote:

> What I said is wrong. But the blues array should be dynamically
> allocated, and use an embedded version for small values. I'll work
> on it.
>
> behdad
> http://behdad.org/
>
> On Mon, Sep 4, 2023 at 3:47 PM Behdad Esfahbod wrote:
>
>> On Mon, Sep 4, 2023 at 3:39 PM Behdad Esfahbod wrote:
>>
>>> On Sat, Sep 2, 2023 at 12:31 AM Alexei Podtelezhnikov
>>> <apodt...@gmail.com> wrote:
>>>
>>>>> Wanted to point out that compiling with gcc and adding
>>>>> "-Wstack-usage=2000" to get reports about stacks larger than
>>>>> 2000 bytes is probably the easiest way to track down large
>>>>> stacks at the moment. Note that af_cjk_metrics_init_widths
>>>>> (44480 bytes) and af_latin_metrics_init_widths (52992 bytes)
>>>>> are by far the largest. cf2_interpT2CharString (27520 bytes)
>>>>> is also surprisingly large. There are a few others like the
>>>>> rasterizer stacks that are between 10-20 KB which one may also
>>>>> want to look into, but these have been less problematic in my
>>>>> experience (though that may have been due to the even larger
>>>>> stacks being allocated first). Just wanted to point out how to
>>>>> measure and that the rasterizer might not be the first place
>>>>> to look.
>>>>
>>>> That is surprisingly large. Someone should examine how much of
>>>> it is actually used. The rendering pool of "visited cells"
>>>> (pixels) is rather predictable for a given outline. That is why
>>>> it is easy for me.
>>>
>>> I took a look at the autohinter one. The problem comes from:
>>>
>>>   AF_LatinBlueRec blues[AF_BLUE_STRINGSET_MAX];
>>>
>>> in struct AF_LatinAxisRec_. The value of AF_BLUE_STRINGSET_MAX
>>> is ~260.
>>>
>>> My gut feeling is that that's a typo and should be:
>>>
>>>   AF_LatinBlueRec blues[AF_BLUE_STRINGSET_MAX_LEN];
>>>
>>> where AF_BLUE_STRINGSET_MAX_LEN is 8. I haven't tested that.
>>
>> Same thing in afcjk.h.
>>
>> I'm finding these using:
>>
>>   $ make CPPFLAGS=-Wframe-larger-than=4096
Re: ReactOS: stack vs heap
What I said is wrong. But the blues array should be dynamically
allocated, and use an embedded version for small values. I'll work
on it.

behdad
http://behdad.org/

On Mon, Sep 4, 2023 at 3:47 PM Behdad Esfahbod wrote:

> On Mon, Sep 4, 2023 at 3:39 PM Behdad Esfahbod wrote:
>
>> On Sat, Sep 2, 2023 at 12:31 AM Alexei Podtelezhnikov wrote:
>>
>>>> Wanted to point out that compiling with gcc and adding
>>>> "-Wstack-usage=2000" to get reports about stacks larger than
>>>> 2000 bytes is probably the easiest way to track down large
>>>> stacks at the moment. Note that af_cjk_metrics_init_widths
>>>> (44480 bytes) and af_latin_metrics_init_widths (52992 bytes)
>>>> are by far the largest. cf2_interpT2CharString (27520 bytes) is
>>>> also surprisingly large. There are a few others like the
>>>> rasterizer stacks that are between 10-20 KB which one may also
>>>> want to look into, but these have been less problematic in my
>>>> experience (though that may have been due to the even larger
>>>> stacks being allocated first). Just wanted to point out how to
>>>> measure and that the rasterizer might not be the first place to
>>>> look.
>>>
>>> That is surprisingly large. Someone should examine how much of
>>> it is actually used. The rendering pool of "visited cells"
>>> (pixels) is rather predictable for a given outline. That is why
>>> it is easy for me.
>>
>> I took a look at the autohinter one. The problem comes from:
>>
>>   AF_LatinBlueRec blues[AF_BLUE_STRINGSET_MAX];
>>
>> in struct AF_LatinAxisRec_. The value of AF_BLUE_STRINGSET_MAX is
>> ~260.
>>
>> My gut feeling is that that's a typo and should be:
>>
>>   AF_LatinBlueRec blues[AF_BLUE_STRINGSET_MAX_LEN];
>>
>> where AF_BLUE_STRINGSET_MAX_LEN is 8. I haven't tested that.
>
> Same thing in afcjk.h.
>
> I'm finding these using:
>
>   $ make CPPFLAGS=-Wframe-larger-than=4096
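For readers following along, the "embedded version for small values" idea can be sketched roughly as below. This is only an illustration of the general technique (a small inline array with a heap fallback), not the code that was committed to FreeType. `BlueRec`, `BlueArray`, `BLUES_EMBEDDED_MAX` and the `blue_array_*` functions are hypothetical names, and plain `malloc`/`realloc` stand in for whatever allocator the library would actually use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLUES_EMBEDDED_MAX  8   /* assumed inline capacity */

/* Hypothetical stand-in for a blue zone record. */
typedef struct BlueRec_
{
  int  ref;
  int  shoot;

} BlueRec;

/* Array that lives inline while small and moves to the heap when it grows. */
/* Note: because `items` may point into the struct itself, a BlueArray      */
/* must not be copied by value.                                             */
typedef struct BlueArray_
{
  BlueRec   embedded[BLUES_EMBEDDED_MAX];
  BlueRec*  items;      /* points at `embedded` or at a heap block */
  size_t    count;
  size_t    capacity;

} BlueArray;

static void
blue_array_init( BlueArray*  a )
{
  assert( a != NULL );

  a->items    = a->embedded;
  a->count    = 0;
  a->capacity = BLUES_EMBEDDED_MAX;
}

/* Append one record; returns 0 on success, -1 if allocation fails. */
static int
blue_array_push( BlueArray*  a,
                 BlueRec     rec )
{
  if ( a->count == a->capacity )
  {
    size_t    new_cap = a->capacity * 2;
    BlueRec*  p;

    if ( a->items == a->embedded )
    {
      /* first overflow: move the inline records to the heap */
      p = (BlueRec*)malloc( new_cap * sizeof ( BlueRec ) );
      if ( !p )
        return -1;
      memcpy( p, a->embedded, a->count * sizeof ( BlueRec ) );
    }
    else
    {
      p = (BlueRec*)realloc( a->items, new_cap * sizeof ( BlueRec ) );
      if ( !p )
        return -1;
    }

    a->items    = p;
    a->capacity = new_cap;
  }

  a->items[a->count++] = rec;
  return 0;
}

/* Release heap storage (if any) and reset to the empty inline state. */
static void
blue_array_done( BlueArray*  a )
{
  if ( a->items != a->embedded )
    free( a->items );
  blue_array_init( a );
}
```

The common case (a handful of blue zones) then costs no allocation at all, while the rare large case no longer dictates the size of every structure that embeds the array.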
Re: ReactOS: stack vs heap
On Mon, Sep 4, 2023 at 3:39 PM Behdad Esfahbod wrote:

> On Sat, Sep 2, 2023 at 12:31 AM Alexei Podtelezhnikov wrote:
>
>>> Wanted to point out that compiling with gcc and adding
>>> "-Wstack-usage=2000" to get reports about stacks larger than 2000
>>> bytes is probably the easiest way to track down large stacks at
>>> the moment. Note that af_cjk_metrics_init_widths (44480 bytes)
>>> and af_latin_metrics_init_widths (52992 bytes) are by far the
>>> largest. cf2_interpT2CharString (27520 bytes) is also
>>> surprisingly large. There are a few others like the rasterizer
>>> stacks that are between 10-20 KB which one may also want to look
>>> into, but these have been less problematic in my experience
>>> (though that may have been due to the even larger stacks being
>>> allocated first). Just wanted to point out how to measure and
>>> that the rasterizer might not be the first place to look.
>>
>> That is surprisingly large. Someone should examine how much of it
>> is actually used. The rendering pool of "visited cells" (pixels)
>> is rather predictable for a given outline. That is why it is easy
>> for me.
>
> I took a look at the autohinter one. The problem comes from:
>
>   AF_LatinBlueRec blues[AF_BLUE_STRINGSET_MAX];
>
> in struct AF_LatinAxisRec_. The value of AF_BLUE_STRINGSET_MAX is
> ~260.
>
> My gut feeling is that that's a typo and should be:
>
>   AF_LatinBlueRec blues[AF_BLUE_STRINGSET_MAX_LEN];
>
> where AF_BLUE_STRINGSET_MAX_LEN is 8. I haven't tested that.

Same thing in afcjk.h.

I'm finding these using:

  $ make CPPFLAGS=-Wframe-larger-than=4096
Re: ReactOS: stack vs heap
On Sat, Sep 2, 2023 at 12:31 AM Alexei Podtelezhnikov wrote:

>> Wanted to point out that compiling with gcc and adding
>> "-Wstack-usage=2000" to get reports about stacks larger than 2000
>> bytes is probably the easiest way to track down large stacks at
>> the moment. Note that af_cjk_metrics_init_widths (44480 bytes) and
>> af_latin_metrics_init_widths (52992 bytes) are by far the largest.
>> cf2_interpT2CharString (27520 bytes) is also surprisingly large.
>> There are a few others like the rasterizer stacks that are between
>> 10-20 KB which one may also want to look into, but these have been
>> less problematic in my experience (though that may have been due
>> to the even larger stacks being allocated first). Just wanted to
>> point out how to measure and that the rasterizer might not be the
>> first place to look.
>
> That is surprisingly large. Someone should examine how much of it
> is actually used. The rendering pool of "visited cells" (pixels) is
> rather predictable for a given outline. That is why it is easy for
> me.

I took a look at the autohinter one. The problem comes from:

  AF_LatinBlueRec blues[AF_BLUE_STRINGSET_MAX];

in struct AF_LatinAxisRec_. The value of AF_BLUE_STRINGSET_MAX is ~260.

My gut feeling is that that's a typo and should be:

  AF_LatinBlueRec blues[AF_BLUE_STRINGSET_MAX_LEN];

where AF_BLUE_STRINGSET_MAX_LEN is 8. I haven't tested that.
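To see why the array bound matters so much for the frame size, here is a rough sketch comparing two otherwise identical structures. The record layout is invented for illustration and is not the real AF_LatinBlueRec; the value 260 is taken from the "~260" estimate above, so the exact byte counts are assumptions. Only the ratio is the point.

```c
#include <assert.h>
#include <stddef.h>

#define STRINGSET_MAX      260  /* "~260" per the message above; assumed */
#define STRINGSET_MAX_LEN  8

/* Hypothetical record layout, roughly the shape of a blue zone entry. */
typedef struct BlueRec_
{
  long      ref[3];
  long      shoot[3];
  long      ascender;
  long      descender;
  unsigned  flags;

} BlueRec;

/* Axis record sized by the larger bound. */
typedef struct AxisWide_
{
  unsigned  blue_count;
  BlueRec   blues[STRINGSET_MAX];

} AxisWide;

/* Same record sized by the smaller bound. */
typedef struct AxisNarrow_
{
  unsigned  blue_count;
  BlueRec   blues[STRINGSET_MAX_LEN];

} AxisNarrow;

static size_t
axis_wide_size( void )
{
  return sizeof ( AxisWide );
}

static size_t
axis_narrow_size( void )
{
  return sizeof ( AxisNarrow );
}

/* Bytes saved per axis record by using the smaller bound. */
static size_t
axis_bytes_saved( void )
{
  assert( STRINGSET_MAX > STRINGSET_MAX_LEN );

  return axis_wide_size() - axis_narrow_size();
}
```

Any function that keeps such a record (or a metrics structure containing several of them) as a local variable pays that size on the stack, which is consistent with the roughly 50 KB frames reported earlier in the thread.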
Re: ReactOS: stack vs heap
> Wanted to point out that compiling with gcc and adding
> "-Wstack-usage=2000" to get reports about stacks larger than 2000
> bytes is probably the easiest way to track down large stacks at the
> moment. Note that af_cjk_metrics_init_widths (44480 bytes) and
> af_latin_metrics_init_widths (52992 bytes) are by far the largest.
> cf2_interpT2CharString (27520 bytes) is also surprisingly large.
> There are a few others like the rasterizer stacks that are between
> 10-20 KB which one may also want to look into, but these have been
> less problematic in my experience (though that may have been due to
> the even larger stacks being allocated first). Just wanted to point
> out how to measure and that the rasterizer might not be the first
> place to look.

That is surprisingly large. Someone should examine how much of it is
actually used. The rendering pool of "visited cells" (pixels) is
rather predictable for a given outline. That is why it is easy for me.
Re: ReactOS: stack vs heap
On Fri, Sep 1, 2023, 9:45 AM Alexei Podtelezhnikov wrote:

>>> I will try the dynamic heap allocations for the rendering
>>> buffer. This might be the largest of them, I think. In addition,
>>> this should help with the rendering speed when rendering complex
>>> shapes like https://fonts.google.com/specimen/Cabin+Sketch.
>>> Currently, FreeType makes several attempts until a sub-band can
>>> fit into a static stack buffer. We should be able to fit it into
>>> a dynamic buffer easily. I wonder if CabinSketch should be about
>>> as complex as we can tolerate and refuse anything much more
>>> complex than this. A lot of time-outs will be resolved...
>>
>> Perhaps a hybrid approach is the right one: Use the current
>> infrastructure up to a certain size, being as fast as possible
>> because dynamic allocation overhead can be avoided, and resort to
>> dynamic allocation otherwise.
>
> Werner,
>
> FreeType is not shy about allocating buffers to load a glyph. This
> is just one more. I highly doubt that it matters even at small
> sizes. We always allocate FT_Bitmap even for rendering too. As a
> matter of fact, FreeType loses to the dense renderers when rendering
> complex glyphs precisely because of multiple restarts to fit the
> small buffer.
>
> Alexei

Wanted to point out that compiling with gcc and adding
"-Wstack-usage=2000" to get reports about stacks larger than 2000
bytes is probably the easiest way to track down large stacks at the
moment. Note that af_cjk_metrics_init_widths (44480 bytes) and
af_latin_metrics_init_widths (52992 bytes) are by far the largest.
cf2_interpT2CharString (27520 bytes) is also surprisingly large.
There are a few others like the rasterizer stacks that are between
10-20 KB which one may also want to look into, but these have been
less problematic in my experience (though that may have been due to
the even larger stacks being allocated first).

Just wanted to point out how to measure and that the rasterizer might
not be the first place to look.
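A small way to try the measurement outside of FreeType is sketched below. The file name `big_frame.c` and the function are made up for the experiment. Compiling it with something like `gcc -c -Wstack-usage=2000 big_frame.c` should produce a warning for the function, and adding `-fstack-usage` should additionally write a `.su` file with per-function figures. The exact numbers depend on compiler version, target, and optimization level, so none are given here.

```c
#include <assert.h>
#include <string.h>

/* Deliberately large local buffer so that the frame exceeds a       */
/* 2000-byte threshold; the size 8192 is arbitrary.                   */
unsigned
big_frame_checksum( const unsigned char*  data,
                    size_t                len )
{
  unsigned char  scratch[8192];
  unsigned       sum = 0;
  size_t         i;

  assert( data != NULL );

  /* never copy more than the scratch buffer can hold */
  if ( len > sizeof ( scratch ) )
    len = sizeof ( scratch );

  memcpy( scratch, data, len );

  for ( i = 0; i < len; i++ )
    sum += scratch[i];

  return sum;
}
```

The same flags can be passed through a project's build system, for example via CFLAGS, to get a report for every function in the library.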
Re: ReactOS: stack vs heap
>> Perhaps a hybrid approach is the right one: Use the current
>> infrastructure up to a certain size, being as fast as possible
>> because dynamic allocation overhead can be avoided, and resort to
>> dynamic allocation otherwise.
>
> FreeType is not shy about allocating buffers to load a glyph. This
> is just one more. I highly doubt that it matters even at small
> sizes. We always allocate FT_Bitmap even for rendering too. As a
> matter of fact, FreeType loses to the dense renderers when rendering
> complex glyphs precisely because of multiple restarts to fit the
> small buffer.

Well, I thought of a solution similar to Behdad's approach in commits
6f16b10019d7 and 56ddafa01ce2... However, I leave that to you, since
you've worked much more intensively on the rendering code than me.

    Werner
Re: ReactOS: stack vs heap
>> I will try the dynamic heap allocations for the rendering
>> buffer. This might be the largest of them, I think. In addition,
>> this should help with the rendering speed when rendering complex
>> shapes like https://fonts.google.com/specimen/Cabin+Sketch.
>> Currently, FreeType makes several attempts until a sub-band can
>> fit into a static stack buffer. We should be able to fit it into a
>> dynamic buffer easily. I wonder if CabinSketch should be about as
>> complex as we can tolerate and refuse anything much more complex
>> than this. A lot of time-outs will be resolved...
>
> Perhaps a hybrid approach is the right one: Use the current
> infrastructure up to a certain size, being as fast as possible
> because dynamic allocation overhead can be avoided, and resort to
> dynamic allocation otherwise.

Werner,

FreeType is not shy about allocating buffers to load a glyph. This is
just one more. I highly doubt that it matters even at small sizes. We
always allocate FT_Bitmap even for rendering too. As a matter of fact,
FreeType loses to the dense renderers when rendering complex glyphs
precisely because of multiple restarts to fit the small buffer.

Alexei
Re: ReactOS: stack vs heap
Hi Alexei,

Alexei Podtelezhnikov wrote:

> Do you understand why they are so averse to large stack allocations?

I am a long-time lurker on this list just for this reason ;-).

AROS (an AmigaOS3 API compatible OS) has inherited the not
automatically growing stack of AmigaOS. As AROS mainly uses freetype
for rendering of fonts (thanks for that!), the problem with large
stack allocations exists there as well. The problem is that the
freetype library must live within the stack size of the calling
executable, and the application might not expect such large
allocations from a library.

I decided not to send in any patches as AROS is not so widely used and
my patches are sufficient for AROS. I assumed this problem is not an
issue for anybody else. My patches use the native AROS memory
allocation functions so can't be integrated without rewriting, but
they are trivial (more or less just malloc/free).

I patched the following functions in freetype-2.10.4 which caused
problems on AROS:

src/autofit/afcjk.c:
  In af_cjk_metrics_init_widths(), hints[1] and dummy are quite big.

src/autofit/aflatin.c:
  In af_latin_metrics_init_widths(), hints[1] and dummy are quite big.

src/autofit/aflatin2.c:
  In af_latin2_metrics_init_widths(), dummy[1] is quite big.

src/autofit/afmodule.c:
  In af_autofitter_load_glyph(), hints[1] and dummy are quite big.

I patched the AROS compiler code to monitor the stack usage of each
freetype function while running executables. The rest of the freetype
code seems not to use much stack, but I might not have covered every
code path while testing, of course.

Regards
o1i

--
| Oliver Brunner | a...@oliver-brunner.de | "o1i" |
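The kind of change described here ("more or less just malloc/free") might look roughly like the sketch below. It is not the AROS patch, which is not shown in this thread. `BigScratch` and `init_widths_heap` are hypothetical names standing in for the large `hints`/`dummy` locals and the functions listed above, and the standard C allocator is used where a real patch would presumably use the platform's or the library's own memory routines.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large temporary such as `dummy`. */
typedef struct BigScratch_
{
  unsigned char  bytes[40000];

} BigScratch;

/* Instead of declaring `BigScratch scratch[1];` as a local variable, */
/* which would put 40000 bytes on the caller's stack, the temporary   */
/* is taken from the heap and released before returning.              */
/* Returns -1 if the allocation fails.                                 */
int
init_widths_heap( int  seed )
{
  BigScratch*  scratch = (BigScratch*)calloc( 1, sizeof ( *scratch ) );
  int          result;

  if ( !scratch )
    return -1;

  /* calloc returns zero-filled memory */
  assert( scratch->bytes[1] == 0 );

  /* placeholder for the real work done with the temporary */
  scratch->bytes[0] = (unsigned char)seed;
  result            = scratch->bytes[0] + 1;

  free( scratch );
  return result;
}
```

The price is one allocation per call and an extra failure path, which is why the thread goes on to weigh this against keeping a small buffer inline.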
Re: ReactOS: stack vs heap
> I will try the dynamic heap allocations for the rendering
> buffer. This might be the largest of them, I think. In addition,
> this should help with the rendering speed when rendering complex
> shapes like https://fonts.google.com/specimen/Cabin+Sketch.
> Currently, FreeType makes several attempts until a sub-band can fit
> into a static stack buffer. We should be able to fit it into a
> dynamic buffer easily. I wonder if CabinSketch should be about as
> complex as we can tolerate and refuse anything much more complex
> than this. A lot of time-outs will be resolved...

Perhaps a hybrid approach is the right one: Use the current
infrastructure up to a certain size, being as fast as possible because
dynamic allocation overhead can be avoided, and resort to dynamic
allocation otherwise.

    Werner
Re: ReactOS: stack vs heap
Hi Ben,

I will try the dynamic heap allocations for the rendering buffer. This
might be the largest of them, I think. In addition, this should help
with the rendering speed when rendering complex shapes like
https://fonts.google.com/specimen/Cabin+Sketch. Currently, FreeType
makes several attempts until a sub-band can fit into a static stack
buffer. We should be able to fit it into a dynamic buffer easily. I
wonder if CabinSketch should be about as complex as we can tolerate
and refuse anything much more complex than this. A lot of time-outs
will be resolved...

Alexei

On Thu, Aug 31, 2023 at 7:40 AM Ben Wagner wrote:
>
> I've been meaning for a long time to propose something like this.
> One (or more) of those stacks is bigger than 50KB. While most
> desktop apps have threads with large stacks (~xMBs) there are users
> that run many threads in one process and force the stack size of the
> threads to be small (~xxKBs) to enforce small stacks so that the
> stack memory stays hot. On macOS secondary threads are also pretty
> small at 512KB. Calls to FreeType often happen deep in some
> application layout code which often is taking up a bunch of stack
> frames, so a sudden deep 50KB stack frame can be too big.
>
> In any event, I have seen these really big stack frames be an issue
> in practice. I've so far generally been able to get around this
> issue but it would be nice to not need the workarounds. I think a
> lot of people would be happy if FreeType reduced the size of these
> stack frames.
>
> On Wed, Aug 30, 2023, 11:20 PM Alexei Podtelezhnikov wrote:
>>
>> Hi folks,
>>
>> Found this patch from ReactOS
>> https://git.reactos.org/?p=reactos.git;a=blob;f=sdk/lib/3rdparty/freetype/freetype_ros.diff
>>
>> Do you understand why they are so averse to large stack allocations?
>>
>> Thanks,
>> Alexei

--
Alexei A. Podtelezhnikov, PhD
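The "several attempts until a sub-band can fit" behaviour can be pictured with the toy model below. It is not FreeType's rasterizer code: the cost model (cells needed = rows times cells per row), the pool size `POOL_CELLS`, and both function names are assumptions made up for the illustration. A band that overflows the fixed pool is split in half and each half is retried, and every failed attempt is wasted work.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_CELLS  64   /* assumed capacity of the fixed cell pool */

/* Toy cost model: a band needs rows * cells_per_row cells.  */
/* Returns 0 if the band fits into the pool, -1 on overflow. */
static int
render_band( int  y_min,
             int  y_max,
             int  cells_per_row )
{
  long  needed = (long)( y_max - y_min ) * cells_per_row;

  return needed <= POOL_CELLS ? 0 : -1;
}

/* Render rows [y_min, y_max); on overflow, bisect the band and retry. */
/* `attempts` counts every call to render_band, including failed ones. */
/* Returns -1 if even a single row does not fit.                       */
static int
render_with_bisection( int   y_min,
                       int   y_max,
                       int   cells_per_row,
                       int*  attempts )
{
  int  mid;

  assert( attempts != NULL );

  ( *attempts )++;
  if ( render_band( y_min, y_max, cells_per_row ) == 0 )
    return 0;

  if ( y_max - y_min <= 1 )
    return -1;

  mid = y_min + ( y_max - y_min ) / 2;

  if ( render_with_bisection( y_min, mid, cells_per_row, attempts ) != 0 )
    return -1;

  return render_with_bisection( mid, y_max, cells_per_row, attempts );
}
```

With these toy numbers, a 16-row band at 16 cells per row takes 7 attempts, 3 of which fail, whereas a pool large enough for all 256 cells would succeed on the first try. That is the restart overhead a dynamically sized buffer is meant to avoid.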
Re: ReactOS: stack vs heap
I've been meaning for a long time to propose something like this. One
(or more) of those stacks is bigger than 50KB. While most desktop apps
have threads with large stacks (~xMBs) there are users that run many
threads in one process and force the stack size of the threads to be
small (~xxKBs) to enforce small stacks so that the stack memory stays
hot. On macOS secondary threads are also pretty small at 512KB. Calls
to FreeType often happen deep in some application layout code which
often is taking up a bunch of stack frames, so a sudden deep 50KB
stack frame can be too big.

In any event, I have seen these really big stack frames be an issue in
practice. I've so far generally been able to get around this issue but
it would be nice to not need the workarounds. I think a lot of people
would be happy if FreeType reduced the size of these stack frames.

On Wed, Aug 30, 2023, 11:20 PM Alexei Podtelezhnikov wrote:

> Hi folks,
>
> Found this patch from ReactOS
> https://git.reactos.org/?p=reactos.git;a=blob;f=sdk/lib/3rdparty/freetype/freetype_ros.diff
>
> Do you understand why they are so averse to large stack allocations?
>
> Thanks,
> Alexei
ReactOS: stack vs heap
Hi folks,

Found this patch from ReactOS
https://git.reactos.org/?p=reactos.git;a=blob;f=sdk/lib/3rdparty/freetype/freetype_ros.diff

Do you understand why they are so averse to large stack allocations?

Thanks,
Alexei