On Sat, 22 Jul 2023 13:35:56 +0200 "Roberto A. Foglietta" <[email protected]> wrote:
> On Sat, 22 Jul 2023 at 08:02, tito <[email protected]> wrote:
> >
> > Hi,
> > I just adopted the test in the PERFORMANCE section of your source
> >
> > ** PERFORMANCES
> > ***************************************************************
> > *
> > * gcc -Wall -O3 strings.orig.c -o strings && strip strings
> > * rm -f [12].txt
> > * time strings /usr/bin/busybox >1.txt
> > * real 0m0.035s
> > * time ./strings /usr/bin/busybox >2.txt
> > * real 0m1.843s
> > *
> > * gcc -Wall -O3 strings.c -o strings && strip strings
> > * rm -f [12].txt
> > * time strings /usr/bin/busybox >1.txt
> > * real 0m0.033s
> > * time ./strings /usr/bin/busybox >2.txt
> > * real 0m0.011s
> > *
> > ** FOOTPRINT
> > ******************************************************************
>
> Sorry Tito,
>
> I do not want to be pedantic, but that is not a benchmark suite. It
> is just a presentation of cherry-picked results that are meaningful
> for performance.
>
> After introducing the use of tmpfs, the "real 0m0.033s" should be
> changed to "real 0m0.022s" (or, better, the 11 changed into 16) so
> as to keep a reasonable proportion with the original code. But who
> cares about the original code anymore?
>
> Now, we can ask why I did not share the entire benchmark suite in
> the header from the beginning. Well, after all, the benchmark suite
> is essentially "for i in 1..100 do time ...". This means that the
> "core engine" behind the measured completion time is the same (in
> good and in bad shape, both). Everything else is just a matter of
> simple math and presentation.
>
> Your three repetitions might seem very similar to my presentation of
> results, with one small but significant difference: a single result
> is NOT a statistic. It does not even consider the possibility of
> variance in performance, so it is a statement (a source of truth!),
> while three repetitions make me think that something about
> statistics has been overlooked, as I overlooked it too. I
> appreciated your sense of humor in suggesting that I should present
> statistics instead of a statement in the header.
>
> Finally, mktemp on tmpfs is a must. Otherwise we end up testing the
> performance of our HDD/SSD whenever the execution time is shorter
> than the time the disk I/O takes. I have one of the fastest
> commercial SSDs mounted in my laptop, so I do not care very much,
> but an embedded/mobile system can have 8 CPU pipelines like my
> laptop and a much slower flash on the SoC.
>
> This is just to say that a benchmark should take caching and disk
> I/O into account. Obviously, disk I/O also gets into the picture
> when we read the input file for strings, unless the file is cached
> (and a cache system exists and works efficiently), but the input is
> part of the benchmark. How the file/stdin is read is part of the way
> strings works, obviously. We probably also need to run something
> like "sync; echo 3 > /proc/sys/vm/drop_caches" before every
> execution, and to feed strings something other than a busybox
> binary, something independent from both strings and bb.
>
> Also, the benchmark suite should do a single warm-up run before all
> the tests, just to load the cache and let the CPU reach its maximum
> clock, and it should set the CPU governor to "performance" instead
> of anything else, just to quickly make it work at its best.
>
> The benchmark.sh in the attachment does such things and can run with
> drop-the-caches or not; for this reason, it requires root privileges
> to run.
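> In shell terms, the core of such a harness could look like the
> sketch below (illustrative only and much simplified with respect to
> the attached benchmark.sh; the run count, paths and test input are
> arbitrary):
>
> #!/bin/bash
> # Illustrative sketch, not the attached benchmark.sh itself.
> # Needs root: writing the governor and drop_caches requires it.
> for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
>     echo performance > "$g"      # pin the CPU to its fastest governor
> done
> tmp=$(mktemp -p /dev/shm)        # work on tmpfs, not on the disk
> cp /usr/bin/busybox "$tmp"       # any test input file will do
> ./strings "$tmp" > /dev/null     # single warm-up run before measuring
> for i in $(seq 1 100); do
>     # sync; echo 3 > /proc/sys/vm/drop_caches  # uncomment for cold-cache runs
>     time ./strings "$tmp" > /dev/null
> done 2>&1 | grep real            # keep only the "real" timings
> rm -f "$tmp"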
> Running it at my home shows that 33 / 11 is in fact a very good
> estimate; another one could be 32 / 12, or anything in between.
> Therefore my statement in the header was a source of truth, within
> the limits of the tests I did, obviously.
>
> In the code I have replaced the static inline function with a macro
>
> #define isPrintable(c) ((c) == 0x09 || ((c) >= 0x20 && (c) <= 0x7e))
>
> because inlining is not always guaranteed and the function is used
> in a single place only. Probably the gcc -O3 optimization caught it
> and inlined it by default anyway.
>
> == QUESTION ==
>
> Is this preliminary work for the integration task, or just an
> educated academic ping-pong e-mail exchange?
>
> Best regards, R-

Hi,
I'm not the maintainer, so I can say nothing about integration; I can
only point out things that look strange to me and to my limited
knowledge. When I read that some code is faster than other code, as a
curious person I just try to see how much faster it is and why, since
there is always something to learn on the busybox mailing list. If in
my little tests it is not faster, then I think I'm entitled to ask
questions about it, as scientific results should be reproducible.

For simple benchmarking, maybe reading a big enough file into memory
and feeding it to strings for a few thousand iterations should be
enough to avoid bias from hdd/ssd and system load. A single shot
shows:

ramtmp="$(mktemp -p /dev/shm/)"
dd if=vmlinux.o of=$ramtmp
echo $ramtmp
/dev/shm/tmp.ll3G2kzKE1

1) coreutils strings:

time strings $ramtmp > /dev/null

real	0m0.473s
user	0m0.460s
sys	0m0.013s

2) busybox strings:

time ./busybox strings $ramtmp > /dev/null

real	0m0.285s
user	0m0.276s
sys	0m0.008s

3) new strings:

time ./strings $ramtmp > /dev/null

real	0m0.349s
user	0m0.337s
sys	0m0.012s

Of course, a few more iterations would give statistically better
results.
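For example, something along these lines (an illustrative loop, not
run here; dividing the total by the iteration count gives a per-run
average that smooths out load spikes):

time for i in $(seq 1 1000); do ./strings "$ramtmp" > /dev/null; done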
After a few more tests of the output, it seems to me that it is not
the same as that of coreutils strings nor of busybox strings. A
simple test:

list=`find /usr/bin`

1) busybox strings vs coreutils strings:

for i in $list; do if test -f $i; then strings $i > out1.txt; ./Desktop/busybox strings $i > out2.txt; diff -q out1.txt out2.txt; fi; done

No output here, so no differences.

2) busybox strings vs new strings:

for i in $list; do if test -f $i; then ./Desktop/strings $i > out1.txt; ./Desktop/busybox strings $i > out2.txt; diff -q out1.txt out2.txt; fi; done

Files out1.txt and out2.txt differ
[the same line repeated, 35 times in total]

3) coreutils strings vs new strings:

for i in $list; do if test -f $i; then ./Desktop/strings $i > out1.txt; strings $i > out2.txt; diff -q out1.txt out2.txt; fi; done | wc -l

35
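To see what actually differs, one could drop the -q flag and look at
the first mismatches, e.g. (an illustrative follow-up; output not
shown here):

for i in $list; do if test -f $i; then ./Desktop/strings $i > out1.txt; strings $i > out2.txt; diff out1.txt out2.txt; fi; done | head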
I suspect this could be a problem for integration; the size of the
code after integration is also relevant.

Ciao,
Tito

_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox