Re: [PATCH] Improved strlen for ARM, around 29% faster

Rich Felker Wed, 03 Oct 2012 11:03:13 -0700

On Sat, Sep 29, 2012 at 02:48:48AM +0200, Gabriel Gonzalez wrote:
> This version for ARM improves performance mainly unrolling the loop for
> iterations and reducing the instructions need to look for the null
> character.
> A deeper analysis of this can be found at
> http://www.gabrielgonzalezgarcia.com/2012/10/02/mystrlen-vs-android-bionics-strlen-on-arm-cpu/
> where you can find some data which back up the performance improvement.
> I have only tested it on a little endian CPU so the BIG ENDIAN chunk might
> need some testing


I suspect this code is still considerably slower than a good C
implementation, which looks something like:

http://git.musl-libc.org/cgit/musl/tree/src/string/strlen.c

(glibc uses a similar C implementation, but theirs seems to have a bug
whereby it drops out of the fast loop whenever it hits high bytes.)

Basically, the idea is that you can test all bytes of a machine word
for a null byte in parallel rather than branching on each byte. I
don't have access to real ARM hardware to test it on (just qemu), so
I'd be happy to hear which is actually faster.
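
For reference, here is a minimal sketch of that word-at-a-time test. It
is written for illustration and is not copied from the file linked
above; the names sketch_strlen, ONES, HIGHS and HASZERO are my own
choices for this example. It assumes that reading a whole aligned word
that extends past the terminating null is harmless on the target, which
holds in practice because an aligned word never crosses a page boundary,
but which ISO C does not guarantee.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative helpers (hypothetical names, not from any library). */
#define ONES  ((size_t)-1 / UCHAR_MAX)      /* 0x0101...01 */
#define HIGHS (ONES * (UCHAR_MAX / 2 + 1))  /* 0x8080...80 */
/* Nonzero if and only if at least one byte of x is zero. */
#define HASZERO(x) (((x) - ONES) & ~(x) & HIGHS)

size_t sketch_strlen(const char *s)
{
    const char *a = s;
    size_t w;

    /* Go byte by byte until s is word-aligned. */
    for (; (uintptr_t)s % sizeof w; s++)
        if (!*s) return (size_t)(s - a);

    /* One test per word instead of one branch per byte.
     * Assumption: the aligned over-read past the null is safe here. */
    for (;; s += sizeof w) {
        memcpy(&w, s, sizeof w);
        if (HASZERO(w)) break;
    }

    /* The null is somewhere in this word; find its exact position. */
    for (; *s; s++);
    return (size_t)(s - a);
}
```

The "& ~(x)" term is what keeps bytes with the high bit set from
producing a false positive, so the fast loop is left only when a null
byte is really present.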

Rich
_______________________________________________
uClibc mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/uclibc
