[+vegorov] Sorry for the delay. As I see, there are two optimizations here:
1. Optimizing operations where the result is truncated to int32 (used in a bitwise-operation) (1) is a little tricky to preserve the correct result in case of overflowing the precise integer range and get the corner cases right. Vyacheslav (cc'ed) and I also did some experiments but did not yet come up with a satisfying fast solution. I think your solution is missing the propagation through Phi-instructions which is needed in some of the crypto-like benchmarks. 2. Optimizing Math.floor(x/y) I think (2) is definitely a good idea. It is a very common pattern used by programmers. I would appreciate if you could separate the Math.floor optimization into a separate CL for review. Thanks, Florian Den 28. feb. 2012 12.10 skrev Alexandre Rames <[email protected]>: > Any comments or ideas about this? > > Alexandre > > > On Fri, Feb 3, 2012 at 1:08 PM, Alexandre Rames <[email protected] > > wrote: > >> Hello, >> >> Sorry for the late answer. >> >> Although the patch locally improves a few code segments, there are no >> observable performance gain on the usual benchmarks. >> The main reason seems to be that benchmarks are designed to avoid (or >> just don't) going through these bad cases. >> >> However this extends the trick (var | 0) well known from JS >> programmers (example >> here<http://code.google.com/p/v8/issues/detail?id=1840>). >> Once it lands it will allow to generate more optimized code. So I think >> there is a good potential use for this. >> >> On the very following very dummy benchmark below, I get the following >> timings on my Tegra2: >> >> $ time ./d8.10579 test/local/bitwise.js >> >> real 0m4.852s >> user 0m4.780s >> sys 0m0.030s >> >> $ time ./d8.bitwise test/local/bitwise.js >> >> real 0m2.092s >> user 0m2.060s >> sys 0m0.030s >> >> >> // bitwise.js ----------------------------------------------------------- >> function main() { >> for(var i = 0; i < 30000000; i++) { >> test(i, i*3); >> } >> } >> function test(a, b) { >> var v1 = a - 5; >> var v2 = b >> 1; >> return ((a / b) | 0) + ((a * (b + 1)) >> 5) + Math.floor(b / 5); >> } >> main(); >> >> >> And that it is without a SDIV instruction... >> I admit here we benefit greatly from the optimization, especially for the >> divisions. >> >> I think it would be interesting to rewrite the crypto benchmarks using >> this optimization... >> >> Alexandre >> >> >> >> >> On Tue, Jan 31, 2012 at 3:51 PM, Daniel Clifford <[email protected]>wrote: >> >>> I'll add Florian and Kevin to your CL, they are probably the best >>> qualified to review what you've done. Can you please remind me which of the >>> benchmarks you found this to help on? >>> >>> Thanks, >>> Danno >>> >>> >>> On Tue, Jan 31, 2012 at 4:42 PM, Alexandre Rames < >>> [email protected]> wrote: >>> >>>> Hello, >>>> >>>> A few details to help the reviewers. >>>> >>>> - The hydrogen level optimization to optimize HBitwise instructions >>>> away when canonicalizing has been moved down to Lithium level. >>>> Removing the HBitwise instruction loses information on how the inputs >>>> are used and prevents from recognizing patterns like (expr | 0) when >>>> inserting representation changes (see below). >>>> The problem with moving this down to Lithium is that the code is >>>> duplicated for all architectures. >>>> - The first step, triggered during the range inference, visits the >>>> Hydrogen graph to mark operations only used by bitwise operations. This >>>> allows to discard for example overflow checks or bailout on minus zero. >>>> We may want to move this somewhere else. >>>> - The second steps occurs when representation changes are inserted >>>> before a bitwise operation. We try to recognize patterns like ((a / b) | 0) >>>> or Math.floor(a / b) and optimize the generated code. >>>> - Basic support has been added for other architectures than ARM, but >>>> they still lack the optimized code part (implementation should be easy on >>>> ia32 and x64). >>>> >>>> The optimization can be extended to support more cases. I think this >>>> handles enough for a first step. >>>> >>>> Please don't hesitate to ask me for details (email or chat). >>>> >>>> Alexandre >>>> >>>> On Tue, Jan 31, 2012 at 3:26 PM, <[email protected]> wrote: >>>> >>>>> Reviewers: danno, >>>>> >>>>> Description: >>>>> Optimise code for bitwise usage >>>>> >>>>> BUG=none >>>>> TEST=Added bitwise-use.js, math-floor-div.js >>>>> >>>>> Please review this at >>>>> http://codereview.chromium.**org/9301041/<http://codereview.chromium.org/9301041/> >>>>> >>>>> SVN Base: >>>>> http://v8.googlecode.com/svn/**branches/bleeding_edge/<http://v8.googlecode.com/svn/branches/bleeding_edge/> >>>>> >>>>> Affected files: >>>>> M src/arm/lithium-arm.h >>>>> M src/arm/lithium-arm.cc >>>>> M src/arm/lithium-codegen-arm.h >>>>> M src/arm/lithium-codegen-arm.cc >>>>> M src/arm/macro-assembler-arm.h >>>>> M src/arm/macro-assembler-arm.cc >>>>> M src/hydrogen-instructions.h >>>>> M src/hydrogen-instructions.cc >>>>> M src/hydrogen.cc >>>>> M src/ia32/lithium-codegen-ia32.**cc >>>>> M src/ia32/lithium-ia32.h >>>>> M src/ia32/lithium-ia32.cc >>>>> M src/mips/lithium-codegen-mips.**cc >>>>> M src/mips/lithium-mips.h >>>>> M src/mips/lithium-mips.cc >>>>> M src/utils.h >>>>> M src/utils.cc >>>>> M src/x64/lithium-codegen-x64.cc >>>>> >>>>> >>>>> >>>>> -- >>>>> v8-dev mailing list >>>>> [email protected] >>>>> http://groups.google.com/**group/v8-dev<http://groups.google.com/group/v8-dev> >>>> >>>> >>>> >>> >> > -- > v8-dev mailing list > [email protected] > http://groups.google.com/group/v8-dev > -- v8-dev mailing list [email protected] http://groups.google.com/group/v8-dev
