Sounds good! +1 And interesting numbers - yeah, often with ctor evalling I've seen function size shrink while data size increases, so that makes sense. And a 2.7% reduction in function size is definitely useful.
(Btw, I suspect that even if ctor evalling increases the total size or total gzipped size by a little, it might be worth it for startup time, but I haven't measured that carefully.)

On Tue, Aug 28, 2018 at 10:47 AM Charles Vaughn <[email protected]> wrote:

> I thought about using malloc without reserving first, but as you say not being able to free safely makes that a non-starter. One option, if dlmalloc can guarantee realloc returning the original pointer when shrinking, is to malloc at ctor-eval time a reasonable upper limit for the static/stack area. Then everything after the data segment is managed by malloc. While looking into this it seems to me the size of the static area is fixed during the initial script load, as staticSealed = true; is a top level expression. There can be differences between the node and the browser (possibly also worker version) but it should be possible to compute a maximum that will work in all cases. The scary situation with this though is calling allocate before the runtime has initialized. It's exported by default, so integrating code is free to call it, meaning there's no way to statically determine it won't be called.
>
> Here's my proposal, I'm happy to add this to a binaryen/emscripten github issue as well.
>
> Add a new config switch, -s EVAL_CTOR_MALLOC, defaulting to false. When it's on, the following behaviors will be added:
>
>    - The static area will use a default fixed value of 10K (with later work to compute this value directly). Allocating past it will abort.
>    - Calling allocate after the static area is sealed, but before the runtime has initialized, will raise an exception
>    - Specifying a different stack or total memory size via Module will abort
>
> I've prototyped these changes with a local binaryen on my project, and the size reduction is modest, and in some cases might be counter-intuitive. Uncompressed, ctor_eval is 1.5% smaller, 1.1% with gzip-9.
>
> However, looking closely with Bloaty McBloatface:
>
> Without ctor-eval:
>
>      VM SIZE                         FILE SIZE
>  --------------                   --------------
>    NAN%       0 [14051 Others]     1.82Mi  82.4%
>    NAN%       0 [section Data]      399Ki  17.6%
>  100.0%       0 TOTAL              2.22Mi 100.0%
>
> With:
>
>      VM SIZE                         FILE SIZE
>  --------------                   --------------
>    NAN%       0 [13207 Others]     1.77Mi  81.2%
>    NAN%       0 [section Data]      418Ki  18.8%
>  100.0%       0 TOTAL              2.18Mi 100.0%
>
> Making a reduction of 2.7% of function size, which is more expensive browser side than the data section, especially considering WASM engines are doing less upfront optimizations (init functions will never take advantage of optimized execution). It may also be helpful to have free zero out the memory during ctor-eval. Once there's some support for init function reordering, it might be worth adding a memory cost pass, and consider not optimizing initializers that are smaller than the memory they take up.
>
> On Monday, August 27, 2018 at 2:45:25 PM UTC-7, Alon Zakai wrote:
>>
>> I think that's right.
>>
>> STATIC_PREALLOC_SIZE looks like it could work. I worry though about asking the user to set that value. But if I understand correctly, in that idea you'd be able to do a normal malloc at eval-ctor time? That would be very good if so.
>>
>> I'm not sure, but I think we can do it without a user-specified value. We know where the data segment ends, and in eval-ctors we can place mallocs starting there. Then at the end we know how much mallocing we did, and either increase the data segment or do a static alloc at runtime (before any others). However, even that seems tricky, as those mallocs would not be freeable. So if STATIC_PREALLOC_SIZE allows a normal malloc to be done, so things are freeable normally, that would be better.
>>
>> On Tue, Aug 21, 2018 at 4:01 PM Charles Vaughn <[email protected]> wrote:
>>
>>> Thanks for that info.
>>> With that and digging in the code, I think I've got a better understanding of why malloc support would be challenging.
>>>
>>> It looks like an asm.js/WASM memory is laid out like (data segment) + (staticAlloc space) + (stack) + (dynamicAlloc space) + (mallocable heap). The dynamicAlloc space seems to be for data that comes in before the runtime initializes, like the filesystem stuff you mentioned. I believe other code calling allocate or getMemory before initialization would also hit this. Stack space is configurable via the module object.
>>>
>>> If we added something like STATIC_PREALLOC_SIZE, that would fix the size of the staticAlloc space, as well as use it for any dynamicAllocs (possibly introducing the chance that dynamicAllocs fail). If a user built with both TOTAL_STACK and STATIC_PREALLOC_SIZE, then binaryen would be able to compute the same DYNAMIC_TOP as the actual invocation.
>>>
>>> On Tuesday, August 21, 2018 at 1:02:03 PM UTC-7, Alon Zakai wrote:
>>>>
>>>> I think malloc is something neither the asm.js nor wasm ctor evallers support currently (asm.js code looks like it allows using DYNAMICTOP_PTR, but if the value there changes, we fail to eval that ctor). In both cases, the tricky thing is to turn the malloc into a fully static allocation, which needs some care as the location of dynamically allocated memory is not always set at compile time (we allow static allocations during startup, like for the filesystem - this is something we could reconsider).
>>>>
>>>> Re-ordering should work, yeah. In asm2wasm we currently just have the list of constructors, so we'd need to also preserve their priorities after LLVM, but that doesn't seem too hard. For the wasm backend, I believe they are all collapsed into a single ctor anyhow, so that model would need to change to allow such optimization.
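
To make the layout from the Aug 21 message above concrete, here is a minimal Python sketch of how fixed sizes for the static area and the stack would let a build tool compute the start of the dynamic area ahead of time. Only the region ordering and the names STATIC_PREALLOC_SIZE, TOTAL_STACK and DYNAMIC_TOP come from the thread. The function names, the 16-byte alignment and the example sizes are assumptions for illustration, and this is not how emscripten or binaryen actually compute these values.

```python
def align_up(value: int, alignment: int = 16) -> int:
    """Round value up to the next multiple of alignment (16 is an assumption)."""
    return (value + alignment - 1) // alignment * alignment


def compute_layout(data_end: int, static_prealloc_size: int, total_stack: int) -> dict:
    """Place regions in the order described in the thread:
    (data segment) + (staticAlloc space) + (stack) + (dynamic area).

    With both sizes fixed at build time, every boundary is known statically.
    """
    static_base = align_up(data_end)
    stack_base = align_up(static_base + static_prealloc_size)
    dynamic_top = align_up(stack_base + total_stack)
    return {
        "static_base": static_base,
        "stack_base": stack_base,
        "dynamic_top": dynamic_top,
    }


# Hypothetical inputs: a 418 KiB data segment, a 10 KiB static area,
# and a 5 MiB stack.
layout = compute_layout(
    data_end=418 * 1024,
    static_prealloc_size=10 * 1024,
    total_stack=5 * 1024 * 1024,
)
print(layout)
```

The point of the sketch is only that nothing in the computation depends on runtime state, which is what would allow an evaluator to agree with the real startup code on where dynamic memory begins.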
>>>>
>>>> On Mon, Aug 20, 2018 at 5:58 PM Charles Vaughn <[email protected]> wrote:
>>>>
>>>>> Looking into why EVAL_CTORs isn't helping with my project, I've come across a limitation that seems to only exist for WASM, not asm.js. Notably some of the initializers in my project invoke malloc (I believe by way of shared pointer initialization). It looks like malloc invokes sbrk (which is explicitly disallowed by the asm.js ctor_evaller). In the case of binaryen this fails by way of sbrk trying to access DYNAMICTOP_PTR, which ends up as a '...stopping since could not eval: tried to access a dangerous (import-initialized) global: global$0'
>>>>>
>>>>> It does seem like something that could be handled, and would be a big win for more dynamic initialization type scenarios. I believe it works when targeting asm.js as that handles its memory allocation differently.
>>>>>
>>>>> Another point is that constructor evaluation order is flexible. I know there is some machinery to control initializer ordering, which may limit this approach in cases where it does matter, but it's possible for the constructor evaluator to re-order constructors so that eval-able ones are moved to the front of the execution list.
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google Groups "emscripten-discuss" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>>> For more options, visit https://groups.google.com/d/optout.
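
The re-ordering idea from the original message can be sketched as a stable partition. The `Ctor` type and its `priority` and `evalable` fields are hypothetical and do not correspond to any structure in binaryen. The sketch assumes the input list is already in execution order and only moves eval-able constructors forward within a run of equal priority, which is one possible way to respect the ordering machinery mentioned in the thread.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Ctor:
    name: str
    priority: int   # hypothetical: lower values run first
    evalable: bool  # hypothetical: True if it can be evaluated at build time


def reorder_ctors(ctors):
    """Move eval-able constructors to the front of each priority group.

    The input is assumed to be in execution order. Relative order is kept
    among eval-able constructors and among non-eval-able ones, and no
    constructor crosses a priority boundary.
    """
    result = []
    for _, group in groupby(ctors, key=lambda c: c.priority):
        members = list(group)
        result.extend(c for c in members if c.evalable)
        result.extend(c for c in members if not c.evalable)
    return result


ctors = [
    Ctor("a", 0, False),
    Ctor("b", 0, True),
    Ctor("c", 1, False),
    Ctor("d", 1, True),
    Ctor("e", 1, True),
]
print([c.name for c in reorder_ctors(ctors)])
```

Whether moving a constructor past another of the same priority is actually safe depends on the program, so a real implementation would need more information than this sketch uses.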
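
As a quick check of the Bloaty figures quoted in the thread, the following sketch recomputes the relative changes from the rounded values printed in the two tables. The helper name is made up for illustration. The "Others" row is used as function size, as the thread does, and because Bloaty has already rounded the inputs, the results are approximate.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Rounded values from the two Bloaty tables in the thread.
functions_mib_before, functions_mib_after = 1.82, 1.77  # "Others" rows
data_kib_before, data_kib_after = 399, 418              # "section Data" rows

function_change = percent_change(functions_mib_before, functions_mib_after)
data_change = percent_change(data_kib_before, data_kib_after)

print(f"functions: {function_change:+.1f}%")
print(f"data:      {data_change:+.1f}%")
```

With these inputs, function size drops by about 2.7%, matching the figure in the thread, while the data section grows by about 4.8%.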
