On Sat, Sep 29, 2012 at 02:48:48AM +0200, Gabriel Gonzalez wrote:
> This version for ARM improves performance mainly unrolling the loop for
> iterations
> and reducing the instructions needed to look for the null character.
> A deeper analysis of this can be found at
> http://www.gabrielgonzalezgarcia.com/2012/10/02/mystrlen-vs-android-bionics-strlen-on-arm-cpu/
>
> where you can find some data which back up the performance improvement.
> I have only tested it on a little endian CPU so the BIG ENDIAN chunk might
> need some testing

I suspect this code is still considerably slower than a good C
implementation, which looks something like:

http://git.musl-libc.org/cgit/musl/tree/src/string/strlen.c

(glibc uses a similar C implementation, but theirs seems to have a bug
whereby it drops out of the fast loop whenever it hits high bytes.)

Basically, the idea is that you can test all bytes of a machine word
for a null byte in parallel rather than branching on each byte.

I don't have access to real ARM hardware to test it on (just qemu), so
I'd be happy to hear which is actually faster.

Rich
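
For reference, the word-at-a-time test described above can be sketched
roughly as follows. This is a minimal illustration of the idea, not the
musl or glibc source: the names sketch_strlen, ONES, HIGHS and HASZERO
are chosen for this example. It assumes that reading the whole aligned
word that contains the terminator is safe on the target. That holds in
practice, since an aligned word never crosses a page boundary, but ISO C
does not guarantee it.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 0x0101...01: a 1 in the low bit of every byte of a word. */
#define ONES  ((size_t)-1 / UCHAR_MAX)
/* 0x8080...80: the high bit of every byte of a word. */
#define HIGHS (ONES * (UCHAR_MAX / 2 + 1))
/* Nonzero if and only if some byte of x is zero. Bytes with the high
 * bit set are masked out by ~(x), so they do not trigger it. */
#define HASZERO(x) (((x) - ONES) & ~(x) & HIGHS)

/* Hypothetical name; a sketch of the technique, not a libc drop-in. */
size_t sketch_strlen(const char *s)
{
    const char *start = s;
    size_t w;

    /* Step byte by byte until s is aligned to the word size. */
    for (; (uintptr_t)s % sizeof w; s++)
        if (!*s)
            return (size_t)(s - start);

    /* Fast loop: one test and one branch per word, not per byte.
     * ASSUMPTION: the aligned load may read bytes past the
     * terminator, which is outside what ISO C strictly allows. */
    for (;;) {
        memcpy(&w, s, sizeof w);   /* aligned load, no aliasing issue */
        if (HASZERO(w))
            break;
        s += sizeof w;
    }

    /* The terminator is somewhere in this word; find it bytewise. */
    for (; *s; s++)
        ;
    return (size_t)(s - start);
}
```

Only the final bytewise scan depends on where in the word the zero byte
sits, so the fast loop itself is endian-neutral. Whether this beats the
hand-written ARM assembly would still need measuring on real hardware.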
