Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
> On Sep 21, 2018, at 9:44 AM, Guillaume Emont wrote:
>
> Quoting Yusuke Suzuki (2018-09-21 10:10:59)
>> Yeah, I'm not planning to enable LLInt ASM interpreter on 32bit architectures
>> since no buildbot exists for this configuration.
>
> I'm confused. Do you mean you don't want to enable LLInt instead of
> CLoop, for the case when JIT is disabled on 32-bit architectures?

If you guys want to take responsibility for 32-bit then you can enable whatever LLInt config you want on 32-bit.

> FTR, the configuration LLInt(with offlineasm)+jit+dfg is tested in
> 32-bit testbots for at least mips, armv7 and x86.
>
>> And we should make 32bit architectures JSVALUE64, so LLInt JSVALUE32_64
>> should be removed in the future.
>
> See what Filip and Michael were saying. We believe that we need
> JSVALUE32_64, and we are willing to maintain it, as the performance gap
> between LLInt or CLoop and JIT+DFG on 32-bit architectures is
> significant.

I’m saying we should remove JSVALUE32_64. That is my preference. I’m letting it stay in tree so long as someone maintains it, but honestly I’d prefer it if it wasn’t maintained and if we could let it die.

I’d like to see the majority of JSC development move to 64-bit. I’d prefer if new features or enhancements were 64-bit only, since that means that it will take less time to develop and test them. I think that folks doing JSC development should be encouraged to land changes only for 64-bit since that’s our focus as a project.

-Filip

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
Quoting Yusuke Suzuki (2018-09-21 10:10:59)
> Yeah, I'm not planning to enable LLInt ASM interpreter on 32bit architectures
> since no buildbot exists for this configuration.

I'm confused. Do you mean you don't want to enable LLInt instead of CLoop, for the case when JIT is disabled on 32-bit architectures?

FTR, the configuration LLInt(with offlineasm)+jit+dfg is tested in 32-bit testbots for at least mips, armv7 and x86.

> And we should make 32bit architectures JSVALUE64, so LLInt JSVALUE32_64 should
> be removed in the future.

See what Filip and Michael were saying. We believe that we need JSVALUE32_64, and we are willing to maintain it, as the performance gap between LLInt or CLoop and JIT+DFG on 32-bit architectures is significant.

Guillaume
Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
Yeah, I'm not planning to enable LLInt ASM interpreter on 32bit architectures since no buildbot exists for this configuration.

And we should make 32bit architectures JSVALUE64, so LLInt JSVALUE32_64 should be removed in the future.

--
Best regards,
Yusuke Suzuki
Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
On Thu, Sep 20, 2018 at 12:02 PM, Filip Pizlo wrote:
> - Enable cloop/JSVALUE64 to work on 32-bit. I don’t think it does
>   right now, but that’s probably trivial to fix.
> - Switch Darwin ports to that configuration for 32-bit.
> - When changes land to support new features, make it mandatory to
>   support JSVALUE64 and optional to support JSVALUE32_64. Such changes
>   should include whoever volunteers to maintain JSVALUE32_64 in CC.
>
> If you guys consider JSVALUE32_64 to be critical, then you can go
> ahead and maintain it. We’ll let JSVALUE32_64 stay in the tree so
> long as someone is maintaining it.

Yes that's fine with us. I think that's the previous agreement, anyway. :)

Michael
Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
Most JSC development focuses on JSVALUE64. JSVALUE32_64 is currently years behind JSVALUE64 - it has no concurrent JIT, no concurrent GC, no FTL. We regularly do tuning that ends up affecting both JSVALUE32_64 and JSVALUE64 without even testing its impact on JSVALUE32_64. JSVALUE32_64 is a second-class citizen in JSC.

I propose this:

- Enable cloop/JSVALUE64 to work on 32-bit. I don’t think it does right now, but that’s probably trivial to fix.
- Switch Darwin ports to that configuration for 32-bit.
- When changes land to support new features, make it mandatory to support JSVALUE64 and optional to support JSVALUE32_64. Such changes should include whoever volunteers to maintain JSVALUE32_64 in CC.

If you guys consider JSVALUE32_64 to be critical, then you can go ahead and maintain it. We’ll let JSVALUE32_64 stay in the tree so long as someone is maintaining it.

-Filip

> On Sep 20, 2018, at 9:07 AM, Michael Catanzaro wrote:
>
> I believe Guillaume has previously established that results in a substantial
> performance regression for WPE. It is currently running in production on tens
> of millions of consumer set top boxes. I think that's substantial testing. :)
>
> Michael
Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
On Thu, Sep 20, 2018 at 11:49 AM, Geoffrey Garen wrote:
> Is there something Linux-specific about CLoop or LLInt, or is this a
> compiler difference?

No clue. I'll refer you to the results of Guillaume's investigation:

https://lists.webkit.org/pipermail/webkit-dev/2018-February/029877.html

Michael
Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
> Interestingly, the improvement is not so large. In Linux box, it was 2x. But
> in macOS, it is 15%.

Is there something Linux-specific about CLoop or LLInt, or is this a compiler difference?

Thanks,
Geoff
Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
I believe Guillaume has previously established that results in a substantial performance regression for WPE. It is currently running in production on tens of millions of consumer set top boxes. I think that's substantial testing. :)

Michael
Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
I think that we should move to removing JSVALUE32_64, since it doesn’t get significant testing or maintenance anymore. I’d love it if 32-bit targets used the cloop with JSVALUE64, so that we can rip out the 32-bit jit and offlineasm backends, and remove the 32-bit representation code from the runtime.

I’m fine with using asm llint on 64-bit platforms, but using it on 32-bit platforms seems like it’ll be short lived.

-Filip
Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
I've just set up a MacBook Pro to measure the effect on macOS. The results are as follows.

VMs tested:
"baseline" at /Users/yusukesuzuki/dev/WebKit/WebKitBuild/nojit/Release/jsc
"patched" at /Users/yusukesuzuki/dev/WebKit/WebKitBuild/nojit-llint/Release/jsc

Collected 2 samples per benchmark/VM, with 2 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime()
function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in
milliseconds.

                                         baseline                  patched

ai-astar                           1738.056+-49.666    ^     1568.904+-44.535     ^ definitely 1.1078x faster
audio-beat-detection               1127.677+-15.749    ^      972.323+-23.908     ^ definitely 1.1598x faster
audio-dft                           942.952+-107.209          919.933+-310.247      might be 1.0250x faster
audio-fft                           985.489+-47.414    ^      796.955+-25.476     ^ definitely 1.2366x faster
audio-oscillator                    967.891+-34.854    ^      801.778+-18.226     ^ definitely 1.2072x faster
imaging-darkroom                   1265.340+-114.464   ^     1099.233+-2.372      ^ definitely 1.1511x faster
imaging-desaturate                 1737.826+-40.791    ?     1749.010+-167.969    ?
imaging-gaussian-blur              7846.369+-52.165    ^     6392.379+-1025.168   ^ definitely 1.2275x faster
json-parse-financial                 33.141+-0.473             33.054+-1.058
json-stringify-tinderbox             20.803+-0.901             20.664+-0.717
stanford-crypto-aes                 401.589+-39.750           376.622+-12.111       might be 1.0663x faster
stanford-crypto-ccm                 245.629+-45.322           228.013+-8.976        might be 1.0773x faster
stanford-crypto-pbkdf2              941.178+-28.744           864.462+-60.083       might be 1.0887x faster
stanford-crypto-sha256-iterative    299.988+-47.729           270.849+-32.356       might be 1.1076x faster

                                   1325.281+-2.613     ^     1149.584+-75.875     ^ definitely 1.1528x faster

Interestingly, the improvement is not so large. On the Linux box, it was 2x. But on macOS, it is 15%.
But I think it is very nice if we can get a 15% boost without any drawbacks.

On Thu, Sep 20, 2018 at 3:08 PM Saam Barati wrote:
> Interesting! I must have not run this experiment correctly when I did it.
>
> - Saam
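A note on reading the "definitely" and "might be" labels in the report above. The sketch below assumes, without having checked the benchmark harness itself, that a row is marked "definitely" when the two 95% confidence intervals do not overlap and "might be" when they do. That assumption agrees with the rows it is tried on here, but it does not model how the harness treats rows it leaves unlabeled (such as json-parse-financial). The `Sample` type and both function names are made up for this illustration.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    mean: float        # mean execution time in milliseconds
    half_width: float  # the "+-" value printed next to the mean


def intervals_overlap(a: Sample, b: Sample) -> bool:
    """True when [mean - half_width, mean + half_width] of a and b intersect."""
    return (a.mean - a.half_width <= b.mean + b.half_width
            and b.mean - b.half_width <= a.mean + a.half_width)


def certainty(baseline: Sample, patched: Sample) -> str:
    # Assumed rule, not taken from the harness source.
    return "might be" if intervals_overlap(baseline, patched) else "definitely"


# (baseline, patched) pairs copied from the macOS report above.
rows = {
    "ai-astar": (Sample(1738.056, 49.666), Sample(1568.904, 44.535)),
    "audio-dft": (Sample(942.952, 107.209), Sample(919.933, 310.247)),
    "imaging-gaussian-blur": (Sample(7846.369, 52.165), Sample(6392.379, 1025.168)),
    "stanford-crypto-aes": (Sample(401.589, 39.750), Sample(376.622, 12.111)),
}

for name, (baseline, patched) in rows.items():
    ratio = baseline.mean / patched.mean
    print(f"{name}: {certainty(baseline, patched)} {ratio:.4f}x faster")
```

With only 2 samples per benchmark the intervals are wide, which is one reason several macOS rows end up as "might be" even though the means differ by 7 to 10%.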
Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
Interesting! I must have not run this experiment correctly when I did it.

- Saam
Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
On Thu, Sep 20, 2018 at 12:54 AM Saam Barati wrote:
> To elaborate: I ran this same experiment before. And I forgot to turn off
> the RegExp JIT and got results similar to what you got. Once I turned off
> the RegExp JIT, I saw no perf difference.

Yeah, I disabled JIT and RegExpJIT explicitly by using

    export JSC_useJIT=false
    export JSC_useRegExpJIT=false

and I checked that no JIT code is generated by running dumpDisassembly. I also put `CRASH()` in ExecutableAllocator::singleton() to ensure no executable memory is allocated.
The result is the same. I think `useJIT=false` disables RegExp JIT too.

                                         baseline                  patched

ai-astar                           3499.046+-14.772    ^     1897.624+-234.517    ^ definitely 1.8439x faster
audio-beat-detection               1803.466+-491.965          970.636+-428.051      might be 1.8580x faster
audio-dft                          1756.985+-68.710    ^      954.312+-528.406    ^ definitely 1.8411x faster
audio-fft                          1637.969+-458.129          850.083+-449.228      might be 1.9268x faster
audio-oscillator                   1866.006+-569.581   ^      967.194+-82.521     ^ definitely 1.9293x faster
imaging-darkroom                   2156.526+-591.042   ^     1231.318+-187.297    ^ definitely 1.7514x faster
imaging-desaturate                 3059.335+-284.740   ^     1754.128+-339.941    ^ definitely 1.7441x faster
imaging-gaussian-blur             16034.828+-1930.938  ^     7389.919+-2228.020   ^ definitely 2.1698x faster
json-parse-financial                 60.273+-4.143             53.935+-28.957       might be 1.1175x faster
json-stringify-tinderbox             39.497+-3.915             38.146+-9.652        might be 1.0354x faster
stanford-crypto-aes                 873.623+-208.225   ^      486.350+-132.379    ^ definitely 1.7963x faster
stanford-crypto-ccm                 538.707+-33.979    ^      285.944+-41.570     ^ definitely 1.8840x faster
stanford-crypto-pbkdf2             1929.960+-649.861   ^     1044.320+-1.182      ^ definitely 1.8481x faster
stanford-crypto-sha256-iterative    614.344+-200.228          342.574+-123.524      might be 1.7933x faster

                                   2562.183+-207.456   ^     1304.749+-312.963    ^ definitely 1.9637x faster

I think this result is not related to RegExp JIT since ai-astar is not using RegExp.

Best regards,
Yusuke Suzuki
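For anyone who wants to repeat this run from a script rather than an interactive shell, here is a minimal sketch. The two environment variables are the ones named in the message above; the helper names `no_jit_env` and `run_jsc`, the `jsc` path and the script name are hypothetical placeholders, not anything taken from the WebKit tree.

```python
import os
import subprocess
from typing import Mapping, Optional


def no_jit_env(base: Optional[Mapping[str, str]] = None) -> dict:
    """Copy an environment and add the two JSC options used in this thread."""
    env = dict(os.environ if base is None else base)
    env["JSC_useJIT"] = "false"
    env["JSC_useRegExpJIT"] = "false"
    return env


def run_jsc(jsc_path: str, script: str) -> subprocess.CompletedProcess:
    """Run one script in the jsc shell with both JITs disabled.

    jsc_path and script are placeholders supplied by the caller.
    """
    return subprocess.run([jsc_path, script], env=no_jit_env(), check=True)


if __name__ == "__main__":
    # Only show what would be set; nothing is executed here.
    env = no_jit_env({"PATH": "/usr/bin"})
    print({key: value for key, value in env.items() if key.startswith("JSC_")})
```

Setting the options on a copied environment keeps them scoped to the child process, so a later JIT-enabled run in the same session is not affected by a leftover `export`.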
Re: [webkit-dev] [jsc-dev] Proposal: Using LLInt Asm in major architectures even if JIT is disabled
To elaborate: I ran this same experiment before. And I forgot to turn off the RegExp JIT and got results similar to what you got. Once I turned off the RegExp JIT, I saw no perf difference.

- Saam

> On Sep 19, 2018, at 8:53 AM, Saam Barati wrote:
>
> Did you turn off the RegExp JIT?
>
> - Saam
>
>> On Sep 18, 2018, at 11:23 PM, Yusuke Suzuki wrote:
>>
>> Hi WebKittens!
>>
>> Recently, node-jsc was announced[1]. When I read the documents of that project,
>> I found that they use the LLInt ASM interpreter instead of CLoop in non-JIT
>> environments.
>> So I had one question in my mind: how fast is the LLInt ASM interpreter
>> compared to CLoop?
>>
>> I've set up two builds. One is a CLoop build (-DENABLE_JIT=OFF) and the other is
>> a JIT build of JSC run with `JSC_useJIT=false`.
>> And I've run the Kraken benchmarks with these two builds on an x64 Linux machine.
>> The results are as follows.
>>
>> Benchmark report for Kraken on sakura-trick.
>>
>> VMs tested:
>> "baseline" at /home/yusukesuzuki/dev/WebKit/WebKitBuild/nojit/Release/bin/jsc
>> "patched" at /home/yusukesuzuki/dev/WebKit/WebKitBuild/nojit-llint/Release/bin/jsc
>>
>> Collected 10 samples per benchmark/VM, with 10 VM invocations per benchmark.
>> Emitted a call to gc() between sample
>> measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used
>> the jsc-specific preciseTime()
>> function to get microsecond-level timing. Reporting benchmark execution
>> times with 95% confidence intervals in
>> milliseconds.
>>
>>                                          baseline                  patched
>>
>> ai-astar                           3619.974+-57.095    ^     2014.835+-59.016     ^ definitely 1.7967x faster
>> audio-beat-detection               1762.085+-24.853    ^     1030.902+-19.743     ^ definitely 1.7093x faster
>> audio-dft                          1822.426+-28.704    ^      909.262+-16.640     ^ definitely 2.0043x faster
>> audio-fft                          1651.070+-9.994     ^      865.203+-7.912      ^ definitely 1.9083x faster
>> audio-oscillator                   1853.697+-26.539    ^      992.406+-12.811     ^ definitely 1.8679x faster
>> imaging-darkroom                   2118.737+-23.219    ^     1303.729+-8.071      ^ definitely 1.6251x faster
>> imaging-desaturate                 3133.654+-28.545    ^     1759.738+-18.182     ^ definitely 1.7808x faster
>> imaging-gaussian-blur             16321.090+-154.893   ^     7228.017+-58.508     ^ definitely 2.2580x faster
>> json-parse-financial                 57.256+-2.876             56.101+-4.265        might be 1.0206x faster
>> json-stringify-tinderbox             38.470+-2.788     ?       38.771+-0.935      ?
>> stanford-crypto-aes                 851.341+-7.738     ^      485.438+-13.904     ^ definitely 1.7538x faster
>> stanford-crypto-ccm                 556.133+-6.606     ^      264.161+-3.970      ^ definitely 2.1053x faster
>> stanford-crypto-pbkdf2             1945.718+-15.968    ^     1075.013+-13.337     ^ definitely 1.8099x faster
>> stanford-crypto-sha256-iterative    623.203+-7.604     ^      349.782+-12.810     ^ definitely 1.7817x faster
>>
>>                                    2596.775+-14.857    ^     1312.383+-8.840      ^ definitely 1.9787x faster
>>
>> Surprisingly, the LLInt ASM interpreter is significantly faster than CLoop. I
>> expected it to be faster, but only by around a 10% performance win.
>> In reality it is 2x faster. That is a large enough number for me to
>> consider enabling the LLInt ASM interpreter for the non-JIT build configuration.
>> As a bonus, the LLInt ASM interpreter offers sampling profiler support even in
>> a non-JIT environment.
>>
>> So my proposal is: how about enabling the LLInt ASM interpreter in the non-JIT
>> configuration on major architectures (x64 and ARM64)?
>>
>> Best regards,
>> Yusuke Suzuki
>>
>> [1]: https://lists.webkit.org/pipermail/webkit-dev/2018-September/030140.html
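The "Nx faster" figures in the original report can be re-derived from the mean times, which also helps when comparing this Linux run with later runs in the thread. The sketch below uses only numbers copied from the Linux report in the original proposal. The function name `speedup` is made up for this note. The observation that the unlabeled summary row matches the arithmetic mean of the 14 benchmark means is something checked here against the data, not something the report states.

```python
from statistics import mean


def speedup(baseline_ms: float, patched_ms: float) -> float:
    """Ratio of mean execution times: how many times faster 'patched' is."""
    return baseline_ms / patched_ms


# Baseline (CLoop) mean times in ms for the 14 Kraken tests, x64 Linux report.
linux_baseline_means = [
    3619.974, 1762.085, 1822.426, 1651.070, 1853.697, 2118.737, 3133.654,
    16321.090, 57.256, 38.470, 851.341, 556.133, 1945.718, 623.203,
]

# Per-test example and the unlabeled summary row of the same report.
ai_astar = speedup(3619.974, 2014.835)
summary = speedup(2596.775, 1312.383)

# The summary row's baseline value agrees with the arithmetic mean of the rows.
baseline_average = mean(linux_baseline_means)

print(f"ai-astar: {ai_astar:.4f}x, summary row: {summary:.4f}x")
print(f"arithmetic mean of baseline rows: {baseline_average:.3f} ms")
```

Because the summary is an arithmetic mean, it is dominated by imaging-gaussian-blur, whose baseline time is several times larger than that of any other test.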