Re: Proposal for a new applet: strings

Roberto A. Foglietta Sat, 22 Jul 2023 04:36:51 -0700

On Sat, 22 Jul 2023 at 08:02, tito <farmat...@tiscali.it> wrote:
>
> Hi,
> I just adopted the test in the PERFORMANCE section of your source
>
> ** PERFORMANCES 
> ***************************************************************
>  *
>  * gcc -Wall -O3 strings.orig.c -o strings && strip strings
>  * rm -f [12].txt
>  * time   strings /usr/bin/busybox >1.txt
>  * real 0m0.035s
>  * time ./strings /usr/bin/busybox >2.txt
>  * real 0m1.843s
>  *
>  * gcc -Wall -O3 strings.c -o strings && strip strings
>  * rm -f [12].txt
>  * time   strings /usr/bin/busybox >1.txt
>  * real 0m0.033s
>  * time ./strings /usr/bin/busybox >2.txt
>  * real 0m0.011s
>  *
>  ** FOOTPRINT 
> ******************************************************************
>


Sorry Tito,

I do not want to be pedantic, but that is not a benchmark suite. It is
just a presentation of cherry-picked results that are meaningful with
respect to performance.

Now that tmpfs is in use, the "real 0m0.033s" should be changed to
"real 0m0.022s" or, better, the 11 should become 16, so as to keep a
reasonable proportion with the original code. But who cares about the
original code anymore?

Now, one may ask why I did not share the entire benchmark suite in
the header from the beginning. Well, after all, the benchmark suite is
something like "for i in 1..100 do time". This means that the "core
engine" producing the completion times is the same (for better and
for worse). Everything else is just a matter of simple math and
presentation.

Your three repetitions might seem very similar to my presentation of
results, with one small but significant difference: a single result is
NOT a statistic. It does not even take into consideration the
possibility of variance in performance, and therefore it is a statement
(a source of truth!). Three repetitions, on the other hand, make me
think either that something about statistics has been overlooked or,
as I chose to read it, that you were being humorous in suggesting that
I should present statistics instead of a statement in the header.
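
To make the "statement versus statistics" point concrete, here is a
minimal sketch in C of the simple math involved: mean, sample variance,
minimum and maximum over repeated timings. The names timing_stats and
summarize_timings are hypothetical and are not taken from strings.c or
benchmark.sh.

```c
#include <stddef.h>

/* Summary of repeated timing samples (hypothetical helper type). */
struct timing_stats {
    double mean;      /* arithmetic mean, same unit as the samples */
    double variance;  /* sample variance (n - 1 denominator), 0 if n == 1 */
    double min;
    double max;
};

/* Fills *out from n samples. Returns 0 on success, -1 on bad input. */
int summarize_timings(const double *samples, size_t n, struct timing_stats *out)
{
    if (samples == NULL || out == NULL || n == 0)
        return -1;

    double sum = 0.0, min = samples[0], max = samples[0];
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
        if (samples[i] < min) min = samples[i];
        if (samples[i] > max) max = samples[i];
    }
    double mean = sum / (double)n;

    /* Second pass: sum of squared deviations from the mean. */
    double sq = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = samples[i] - mean;
        sq += d * d;
    }

    out->mean = mean;
    out->variance = (n > 1) ? sq / (double)(n - 1) : 0.0;
    out->min = min;
    out->max = max;
    return 0;
}
```

With a single sample the variance is reported as 0, which is exactly
why one number cannot say anything about the spread.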

Also, using mktemp on tmpfs is a must. Otherwise we end up testing
the performance of our HDD/SSD whenever the execution is faster than
the I/O. I have one of the fastest commercial SSDs in my laptop, so I
do not care very much, but an embedded/mobile system can have 8 CPU
pipes like my laptop and yet a much slower flash on its SoC.

This is just to say that a benchmark should take caching and disk I/O
into account. Obviously, disk I/O also comes into the picture when we
read the file passed to strings, unless the file is cached (and a
cache system exists and works efficiently), but the input is part of
the benchmark: how the file/stdin is read is part of the way strings
works, obviously. Probably we also need to try something like "sync;
echo 3 > /proc/sys/vm/drop_caches" before every execution, and to feed
strings something other than busybox, something which is independent
from both strings and bb.
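
For a harness written in C rather than shell, the "sync; echo 3 >
/proc/sys/vm/drop_caches" step could look roughly like the sketch
below. The function name drop_caches and its path parameter are
hypothetical. The parameter exists only so the function can be tried
on an ordinary file without root.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <unistd.h>

/* Path quoted in the message above; writing to it requires root. */
#define DROP_CACHES_PATH "/proc/sys/vm/drop_caches"

/*
 * Hypothetical helper: flush dirty pages, then write "3" to `path`.
 * On Linux, 3 asks the kernel to drop the page cache plus dentries
 * and inodes. Returns 0 on success, -1 on failure (e.g. not root).
 */
int drop_caches(const char *path)
{
    sync();                       /* same role as the leading "sync;" */

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = (fputs("3\n", f) != EOF);
    if (fclose(f) != 0)           /* the write error may only show up here */
        ok = 0;
    return ok ? 0 : -1;
}
```

A benchmark would call drop_caches(DROP_CACHES_PATH) before each timed
run and treat a -1 return as "cannot run in drop-the-cache mode".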

Finally, the benchmark suite should also do a single run before all
the tests, just to warm the cache and bring the CPU up to its maximum
performance. In addition, the CPU should be set to performance rather
than anything else, so that it quickly works at its best.
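
The warm-up idea can be sketched in C as follows. This is not the
attached benchmark.sh, only an illustration: time_command is a
hypothetical name, and because it goes through system() every sample
also includes the cost of starting a shell, which matters at the
10 ms scale discussed here.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

/* Wall-clock seconds read from a monotonic clock. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Hypothetical harness: run `cmd` once untimed as a warm-up, then
 * `runs` more times, storing each elapsed time in elapsed[0..runs-1].
 * Returns 0 on success, -1 on bad input or if any run fails.
 */
int time_command(const char *cmd, int runs, double *elapsed)
{
    if (cmd == NULL || runs <= 0 || elapsed == NULL)
        return -1;

    if (system(cmd) != 0)         /* warm-up run, timing discarded */
        return -1;

    for (int i = 0; i < runs; i++) {
        double start = now_seconds();
        int status = system(cmd);
        elapsed[i] = now_seconds() - start;
        if (status != 0)
            return -1;
    }
    return 0;
}
```

For example, time_command("./strings /usr/bin/busybox >/dev/null",
100, samples) would fill a 100-element array of doubles with one
wall-clock time per run.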

The attached benchmark.sh does this kind of thing and can run with or
without dropping the cache. For this reason, it requires root
privileges to run. On my machine it shows that 33 / 11 is in fact a
very good estimate, and another one could be 32 / 12 or anything in
between. Therefore my statement in the header was a source of truth
within the limitations of the tests I did, obviously.

In the code I have replaced the static inline function with a macro

#define isPrintable(c) ((c) == 0x09 || ((c) >= 0x20 && (c) <= 0x7e))

because inlining is not always guaranteed and the code is used only
once. Probably gcc's -O3 optimization picks it up and inlines it by
default anyway.
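
For comparison, here is the macro next to a static inline function of
the kind it replaced. The function body is an assumption, since the
original is not shown in this message, and printable_run is a
hypothetical caller added only to show the macro in use.

```c
#include <stddef.h>

/* The macro exactly as quoted above. */
#define isPrintable(c) ((c) == 0x09 || ((c) >= 0x20 && (c) <= 0x7e))

/* Assumed shape of the replaced function (not shown in the message). */
static inline int is_printable_fn(int c)
{
    return c == 0x09 || (c >= 0x20 && c <= 0x7e);
}

/*
 * Hypothetical caller: length of the run of printable bytes at the
 * start of buf. The argument buf[n] has no side effects, so the fact
 * that the macro evaluates (c) up to three times is harmless here.
 * An argument such as *p++ would not be safe with the macro.
 */
size_t printable_run(const unsigned char *buf, size_t len)
{
    size_t n = 0;
    while (n < len && isPrintable(buf[n]))
        n++;
    return n;
}
```

The trade-off is the usual one: the macro is always expanded in place,
while the function evaluates its argument exactly once and is type
checked.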

== QUESTION ==

Is this preliminary work for the integration task, or just an
educated academic ping-pong e-mail exchange?

Best regards, R-

benchmark.sh
Description: application/shellscript

