Re: [v8-dev] Overhead of WasmStackCheck

2025-06-13 Thread Sam Parker-Haynes
Hi Emanuel, I actually enabled loop unrolling for Wasm, in LLVM, for this same reason. The Turboshaft unroller doesn't appear to be triggering for the one loop I looked at, as I think it's too big. I have tweaked the TS unroller heuristics to generally unroll less, as added the compilation time

Re: [v8-dev] Overhead of WasmStackCheck

2025-06-13 Thread 'Darius Mercadier' via v8-dev
> This sounds like the easiest option, so the first thing to try..?
Sure, go for it :)
> With the caveat that yesterday I looked at one loop, on one machine, *the* loop is still doing a reasonable amount of vector number crunching, perf reports the overhead of load and compare as 18% and 10%,

Re: [v8-dev] Overhead of WasmStackCheck

2025-06-13 Thread Sam Parker-Haynes
> Do you mean a load from a memory region that we would protect when we want an interrupt to happen so that the load traps then and we would rely on the trap handler to process the interrupt request? If so: maybe, but that's still a load, which isn't great.
Yes, but because the subsequent instr

Re: [v8-dev] Overhead of WasmStackCheck

2025-06-13 Thread 'Darius Mercadier' via v8-dev
Hi,
> what is the upper bound for interrupt latency?
There is no clear official upper bound. LoopStackCheckElisionReducer removes stack checks from loops that have fewer than kMaxIterForStackCheckRemoval (= 5000) iterations, but I chose that number fairly arbitrarily, using the "meh, sounds vaguel
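
The iteration-count heuristic mentioned in this message can be pictured roughly as follows. This is a minimal sketch, not V8's actual LoopStackCheckElisionReducer: only the constant name kMaxIterForStackCheckRemoval and its value 5000 come from the message, while the LoopInfo struct, its field, and CanRemoveStackCheck are hypothetical names used for illustration.

```cpp
#include <cstdint>
#include <optional>

// Value taken from the message; described there as chosen fairly arbitrarily.
constexpr uint64_t kMaxIterForStackCheckRemoval = 5000;

// Hypothetical summary of what a compiler might know about a loop.
struct LoopInfo {
  // Statically proven upper bound on the iteration count, if there is one.
  std::optional<uint64_t> max_iterations;
};

// Hypothetical helper: returns true when the per-iteration check could be
// dropped, i.e. the loop provably runs fewer than the threshold iterations.
inline bool CanRemoveStackCheck(const LoopInfo& loop) {
  return loop.max_iterations.has_value() &&
         *loop.max_iterations < kMaxIterForStackCheckRemoval;
}
```

A loop with no provable bound keeps its check in this sketch, since nothing then limits how long an interrupt request could go unnoticed.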

Re: [v8-dev] Overhead of WasmStackCheck

2025-06-13 Thread Jakob Kummerow
> > what is the upper bound for interrupt latency? If we have an inner loop, without calls, I would assume we could have a reasonable number of instructions and iterations before that latency would be adversely affected?
Yes, interrupting doesn't need to be immediate. There are situations wh

Re: [v8-dev] Overhead of WasmStackCheck

2025-06-13 Thread Sam Parker-Haynes
Hi, It seems my previous reply to Emanuel didn't send... Basically, I have enabled wasm unrolling in LLVM to help mitigate this already, and I have modified the TS unroller to help improve compile time, and performance, so I'd like to avoid increasing code size. With respect to figuring out wh

Re: [v8-dev] Overhead of WasmStackCheck

2025-06-13 Thread 'Darius Mercadier' via v8-dev
Hi Sam, Just so that everybody is on the same page: these stack checks are really interrupt checks rather than stack-overflow checks or something like that. And yea, performing them on every loop iteration is a bit of a waste of time, but we kinda need them in every loop where we can't prove that

Re: [v8-dev] Overhead of WasmStackCheck

2025-06-12 Thread 'Emanuel Ziegler' via v8-dev
Hi Sam, In principle, loop unrolling should already reduce the number of stack checks, but it could be that it's insufficient or that for whatever reason this optimization does not get applied here. Did you take a look at the generated code? Cheers, Emanuel On Thu, Jun 12, 2025 at 4:44 PM Sa

[v8-dev] Overhead of WasmStackCheck

2025-06-12 Thread Sam Parker-Haynes
Hi! While running some AI code, the loop header WasmStackCheck was appearing quite heavily in the profile. Disabling the checks results in a ~1.5% speedup. So, is it necessary to execute these for every iteration? Or could we wrap inner loops, devoid of a stack check, in a new loop with one so t
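
The wrapping idea raised in this message resembles loop strip-mining: an outer loop carries the check and a check-free inner loop does the work. The sketch below illustrates that shape in plain C++ under stated assumptions; it is not V8 or Turboshaft code. The interrupt flag, the check counter, InterruptCheck, and both Sum functions are hypothetical names, and the chunk size is an arbitrary parameter.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the engine's interrupt request flag.
std::atomic<bool> g_interrupt_requested{false};
// Counts executed checks, only so the two variants can be compared.
std::size_t g_checks_performed = 0;

// Hypothetical stand-in for a loop-header check: one load and one compare.
inline void InterruptCheck() {
  ++g_checks_performed;
  if (g_interrupt_requested.load(std::memory_order_relaxed)) {
    // A real engine would service the interrupt here; the sketch just clears it.
    g_interrupt_requested.store(false, std::memory_order_relaxed);
  }
}

// Baseline: the check runs on every iteration.
float SumCheckedEveryIteration(const float* data, std::size_t n) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    InterruptCheck();
    sum += data[i];
  }
  return sum;
}

// Wrapped: the outer loop carries the check, the inner loop has none.
float SumCheckedPerChunk(const float* data, std::size_t n, std::size_t chunk) {
  assert(chunk > 0);
  float sum = 0.0f;
  for (std::size_t begin = 0; begin < n; begin += chunk) {
    InterruptCheck();
    const std::size_t end = std::min(n, begin + chunk);
    for (std::size_t i = begin; i < end; ++i) sum += data[i];
  }
  return sum;
}
```

In this sketch the chunk size bounds how many iterations can pass between checks, so it trades interrupt latency against check overhead; the extra outer loop also adds some code size.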