Hi Alexey! I had updated the JIRA with new patch [1], please take a look.
There seem to be no way to fix the callers of shiftLeftOneBit, so I leaved fast-path for zeroes intact. Thanks, Aleksey. [1] https://issues.apache.org/jira/browse/HARMONY-5825 On Thu, May 22, 2008 at 7:02 PM, Alexei Fedotov <[EMAIL PROTECTED]> wrote: > Aleksey, > Thank you for thorough reply. I like and support your suggestion about > commenting the changes. Let me address the most mysterious thing I > still don't understand: adding a fast path for zeroes in the shift > function. > > If this optimization helps increasing performance, some code sends > zeroes to the shift function more frequently than other numbers. Do > you know where your shift is called with zeroes as a typical argument? > What if we place the check in the specific caller instead of the > shift? > > Thanks! > > > > On Thu, May 22, 2008 at 4:07 PM, Aleksey Shipilev > <[EMAIL PROTECTED]> wrote: >> Hi, Alexei! >> >> Thanks for the review! Here is some light on your questions: >> >> On Thu, May 22, 2008 at 11:27 AM, Alexei Fedotov >> <[EMAIL PROTECTED]> wrote: >>> 1. I have noticed that you have replaced shiftLeft(1) with >>> shiftLeftOneBit() in the following checks which appear . Is it >>> performance critical part or just cleaning? >> This is performance-wise change. "sign == 0" guarantees that shifting >> is not needed. >> >>> // Checking if: remainder * 2 >= scaledDivisor >>> remainder.abs().shiftLeftOneBit().compareTo(scaledDivisor.abs()) >>> >>> If this is critical, then can one imagine a way of checking that a >>> number is twice bigger than another number without copying and >>> shifting the number? For cases when one number has more binary digits >>> than another this should be easy. >> IMO, this way is also fast, though we can think up about something >> with peak efficiency. I see we will need structural changes to >> accommodate another design which would allow us to use other checking >> schemes. >> >>> 2. What is the meaning of 2 in the new function name unsignedMultAdd2? >>> Why not unsignedMultAdd4? >> That mean "unsigned multiplication and addition of 2 arguments", hence >> the "2" in signature. We can change it to another if this is >> confusing. >> >>> 2a. What is the purpose of the following fix? How a function call >>> (even for a magic function), could be quicker than multiplication >>> inlined by JIT? Does it mean our JIT poorly inlines multiplication? >>> - long res = (aDigits[0] & 0xFFFFFFFFL) * (factor); >>> + long res = unsignedMultAdd2(aDigits[0], factor, 0, 0); >> Well, the original purpose of this was code cleanup (see how original >> version look like, bloated with 0xF...F constants), so we came up with >> idea of extracting unsigned operations of (a*b), (a*b + c), (a*b + c + >> d) as general method. Then you can imagine two situations: >> 1. JIT compiler inlines the method and eliminated unneeded additions. >> 2. JIT compiler generates the magic for this method and substituting >> it with optimal native code (HARMONY-5826). >> >> *sigh* I guess I need to write the javadoc for this method. >> >>> 3. You added a check against shifting zero values. Does adding this >>> check on a common path add to performance? If zeroes come to shift >>> function often enough to give a benefit, which caller spams so much >>> zeroes? Does it worth to patch the caller instead? >> Nope, the benefit of 0-path there is much more than penalty on non-0-path. >> >>> 4. I have an abstract concern with no idea how to address it. An >>> integer array in memory is very similar to a long array (if endings >>> are correct). Is there any way to use 64-bit operations for speed up? >> We thought about such the thing. This would require severe changes to >> fit into existing code. More details would be later :) >> >> Thanks, >> Aleksey. >> > > > > -- > With best regards, > Alexei >
