On Thu, Oct 10, 2013 at 11:53 PM, Matthew Ahrens <[email protected]> wrote:
> On Thu, Oct 10, 2013 at 8:44 PM, Richard Yao <[email protected]> wrote:
>>
>> On 10/10/2013 11:38 PM, Richard Yao wrote:
>> > On 10/10/2013 11:29 PM, Xin Li wrote:
>> >> On 10/10/13 20:18, Richard Yao wrote:
>> >>> Thanks for letting us know about this. I have a few comments:
>> >>
>> >>> 1. We could eliminate a branch entirely by doing this:
>> >>
>> >>> mlen = MIN(d_end - dst, mlen); while (--mlen >= 0) *dst++ = *cpy++
>> >>
>> >> I don't think this eliminates the branching as MIN is usually a macro
>> >> that expands to a > b ? b : a.
>> >
>> > My mistake. I was thinking of generic swap routines. I do think that
>> > using the MIN() macro is more readable though.
>>
>> On second thought, I was right the first time. It is possible to do this
>> without branching:
>>
>> #define MIN(x, y) ((y) ^ (((x) ^ (y)) & -((x) < (y))))
>> #define MIN(x, y) ((x) ^ (((x) ^ (y)) & -((x) < (y))))
>>
>> http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
>>
>> This makes MIN(d_end - dst, mlen) look inefficient, but a proper
>> optimizing compiler should store the result of d_end - dst in a register
>> to avoid doing the subtraction 3 times.
>
>
> This is complicated enough that I would definitely want to see the
> performance analysis before introducing this trick. Even according to the
> webpage: "On some rare machines ... [it] might be faster than the obvious
> approach ... On some machines ... there may be no advantage. ... gcc
> produced the same code on a Pentium as the obvious solution "
Measured on a Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz"
Units are rdtsc diffs of 10000000 iterations across random arrays
(different in each attempt).
#define MIN_A(x, y) ((y) ^ (((x) ^ (y)) & -((x) < (y))))
#define MIN_B(a, b) a > b ? b : a
x mina
+ minb
N Min Max Median Avg Stddev
x 656 58 760 87 112.24543 67.992472
+ 656 58 254 87 84.39939 12.143
Difference at 95.0% confidence
-27.846 +/- 5.28546
-24.8082% +/- 4.70884%
(Student's t, pooled s = 48.8387)
I probably made some mistake with my benchmarking but I don't think
the numbers lean toward the bit twiddly one.
--
Eitan Adler
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer