Giuseppe Scrivano wrote:
> Hi Pádraig,
> I tried to reproduce your results but I wasn't able to do it.  The
> biggest difference on a 300MB file I noticed was approximately 15% using
> on both implementations -O2, and 5% using -O3.
> My GCC version is "gcc (Debian 4.3.3-14) 4.3.3" and the CPU is: Intel(R)
> Pentium(R) D CPU 3.20GHz.
> I also spent some time trying to improve the gnulib SHA1 implementation
> and it seems a lookup table can improve things a bit.
> Can you please try the patch that I have attached and tell me which
> performance difference (if any) you get?

Thanks for looking at this, Giuseppe,
and sorry for not mentioning my GCC and CPU.

Note the binaries below are compiled with
$(rpm -q --qf="%{OPTFLAGS}\n" coreutils)
for consistency, which on my F11 machines is:

  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
  -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i586
  -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE=1

Testing on 2 machines I have here:

$ rpm -q gcc
$ grep "model name" /proc/cpuinfo | head -n1 | tr -s '[:blank:]' ' '
model name : Intel(R) Pentium(R) M processor 1.70GHz
$ truncate -s300MB sha1.test
$ time sha1sum sha1.test
real    0m3.540s
$ time linus-sha1 sha1.test
real    0m2.319s (-34%)
$ time giuseppe-sha1sum sha1.test
real    0m3.513s (-.8%)

$ rpm -q gcc
$ grep "model name" /proc/cpuinfo | head -n1 | tr -s '[:blank:]' ' '
model name : Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
$ truncate -s300MB sha1.test
$ time sha1sum sha1.test
real    0m1.857s
$ time linus-sha1 sha1.test
real    0m1.102s (-40%)
$ time giuseppe-sha1sum sha1.test
real    0m1.932s (+ 4%)
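
The percentages in parentheses are relative to the stock sha1sum time on the
same machine. A minimal sketch of that arithmetic follows, with the "real"
times copied from the runs above; the helper name pct_change and the dict
layout are only illustrative and not part of any of the tools tested.

```python
def pct_change(baseline, candidate):
    """Relative change of candidate vs. baseline wall-clock time, in percent.

    Negative means the candidate was faster than the baseline.
    """
    return (candidate - baseline) / baseline * 100.0


# "real" times in seconds, copied from the transcript above.
runs = {
    "Pentium M 1.70GHz": {
        "sha1sum": 3.540,
        "linus-sha1": 2.319,
        "giuseppe-sha1sum": 3.513,
    },
    "Core i7 920 2.67GHz": {
        "sha1sum": 1.857,
        "linus-sha1": 1.102,
        "giuseppe-sha1sum": 1.932,
    },
}

for cpu, times in runs.items():
    base = times["sha1sum"]  # stock coreutils binary is the baseline
    for name, seconds in times.items():
        if name != "sha1sum":
            print(f"{cpu}: {name} {pct_change(base, seconds):+.1f}%")
```

These are single runs, so differences of a few percent (such as the
giuseppe-sha1sum figures) may well be within measurement noise.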

