On Sun, 23 Jul 2023 12:00:56 +0200 "Roberto A. Foglietta" <[email protected]> wrote:
> On Sun, 23 Jul 2023 at 11:42, tito <[email protected]> wrote:
> >
> > On Sun, 23 Jul 2023 00:36:09 +0200
> > "Roberto A. Foglietta" <[email protected]> wrote:
> >
> > > On Sat, 22 Jul 2023 at 21:29, tito <[email protected]> wrote:
> > > >
> > > > On Sat, 22 Jul 2023 19:31:28 +0200
> > > > "Roberto A. Foglietta" <[email protected]> wrote:
> > > >
> > > > > On Sat, 22 Jul 2023 at 15:40, tito <[email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'm not the maintainer so I can say nothing about integration,
> > > > > > I can just point out things that look strange to me and my limited
> > > > > > knowledge.
> > > > > > When I read that this code is faster vs other code as I'm a curious
> > > > > > person I just try to see how much faster it is and why as there
> > > > > > is always something to learn on the busybox mailing list.
> > > > > > If in my little tests it is not faster then I think I'm entitled
> > > > > > to ask questions about it as science results should be reproducible.
> > > > > >
> > > > > > For simple benchmarking maybe reading a big enough file
> > > > > > into memory and feeding it to strings in a few 1000 iterations
> > > > > > should do to avoid bias from hdd/ssd and system load, one shot
> > > > > > shows:
> > > > > >
> > > > > > ramtmp="$(mktemp -p /dev/shm/)"
> > > > > > dd if=vmlinux.o of=$ramtmp
> > > > > > echo $ramtmp
> > > > > > /dev/shm/tmp.ll3G2kzKE1
> > > > > >
> > > > > > 1) binutils strings
> > > > > > time strings $ramtmp > /dev/null
> > > > >
> > > > > This is not correct because you are reading a file in tmpfs while the
> > > >
> > > > Yes, this was exactly the purpose of the test to eliminate all
> > > > factors connected to underlying block devices and time
> > > > the speed of code of the different implementations.
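
For reference, the looped variant of the tmpfs benchmark mentioned above
("a few 1000 iterations") could be sketched roughly as follows. This is
only an illustration: bench_loop is a hypothetical helper name, and the
sketch assumes /dev/shm is a tmpfs and that the file fits into it.

```shell
# Sketch only: run a command N times over one file, discarding its output,
# so that the total time is dominated by the code under test rather than
# by the block device. bench_loop is a hypothetical helper, not part of
# busybox or binutils.
bench_loop() {
    runs=$1 file=$2
    shift 2                      # the remaining words are the command to run
    n=0
    while [ "$n" -lt "$runs" ]; do
        "$@" "$file" > /dev/null || return 1
        n=$((n + 1))
    done
    echo "$n"                    # number of completed runs
}

# Possible usage (assumes bash, where `time` can wrap a shell function):
#   ramtmp="$(mktemp -p /dev/shm/)"
#   cat vmlinux.o > "$ramtmp"
#   time bench_loop 1000 "$ramtmp" strings
#   time bench_loop 1000 "$ramtmp" ./strings
#   rm -f "$ramtmp"
```

Timing the whole loop instead of a single run averages out scheduler
noise, at the price of measuring a warm-cache scenario only.
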
> > > >
> > >
> > > Which is wrong because you made a hypothesis which is far away from the
> > > typical usage and in some cases you cannot even use it because strings
> > > over a 4GB ISO image would not necessarily fit into a tmpfs in every
> > > system. Abstract benchmarks can be funny but do not depict/measure the
> > > reality as usual. Extending this logic, we can trash Ohm's law
> > > because we can reach in the laboratory a near zero temperature!
> >
> > I see but dropping the caches etc doesn't seem to be a typical use case
> > either.
>
> Dropping the cache is a trick to bring the system in its state after
> the boot or as much as possible at that point. It is indispensable for
> a comparison with the normal functioning which has a larger
> variance in completion time for each run.
>
> >
> > Using the same optimization flag -O3 the busybox applet in a real life
> > system gives close empirical results, which is the results most
> > people in their normal life use cases (one shot, no loops running,
> > no files in memory, no dropped caches, no giant multi-GB files)
> > will see so the performance increase is swallowed by the system
> > or by other bottlenecks.
> >
>
> This is correct, AFAIK my busybox has been compiled with -O2. I have to check.
>
> > I think the size will rather increase as there are a bunch of features
> > missing that the original bb implementation already has:
> >
> > 1) multiple file handling (a must i would dare to say)
>
> Which is not such a problem, after all
>
> for i in "$@"; do simply-strings "$i" | sed -e "s/^/$i:/"; done
>
> the sed will include also the file name in front of the string which
> is useful for grepping. However, the single-file limitation leads to
> customizing the approach:
>
> for i in "$@"; do simply-strings "$i" | grep -w "word" && break; done; echo $i

Don't cheat, this change would break other people's scripts.

> For example. However, I admit that you are right about multiple-files
> input. Personally, I do not need it at all and if I need it, I do it with a
> custom for.
>
> > 2) -a -f -o -n -t command line options
> > The options are:
> >   -a - --all            Scan the entire file, not just the data section
> >                         [default]
> >   -f --print-file-name  Print the name of the file before each string
> >   -n --bytes=[number]   Locate & print any NUL-terminated sequence of at
> >                         least [number] characters (default 4).
> >   -t --radix={o,d,x}    Print the location of the string in base 8, 10
> >                         or 16
> >   -o                    An alias for --radix=o
> >
>
> Yes, strings has a lot of options and also busybox has several
> options. This is the best criticism about proceeding with an integration.
> I will check if I can put an optimization into bb strings, just for my
> own curiosity. This would be far better than reinventing the wheel.
>
> > 3) output compatible with original gnu strings
> >
> > > In attachment the new version with the test suite and the benchmark
> > > suite in the header. The benchmark suite did not change with respect
> > > to the script file I just sent.
> > >
> > > Best regards, R-
> >
> > BTW: there still seem to be corner-cases:
> > list=`find /usr`
> > for i in $list; do if test -f $i; then ./strings $i > out1.txt; strings $i > out2.txt; diff -q out1.txt out2.txt; fi; done
> > Files out1.txt and out2.txt differ
> > Files out1.txt and out2.txt differ
> > Files out1.txt and out2.txt differ
> > Files out1.txt and out2.txt differ
> >
> > test is still running....
>
> ok, I will do a run. Can you please echo the filenames, instead?
>
> for i in $list; do if test -f $i; then ./strings $i > out1.txt; strings $i > out2.txt; diff -q out1.txt out2.txt >/dev/null || echo $i; fi; done
>
> Thanks, R-

if you hire me as beta tester....at least you owe me a beer
if we ever meet in person.

root@devuan:/home/tito/Desktop# for i in $list; do if test -f $i; then ./strings $i > out1.txt; strings $i > out2.txt; diff -q out1.txt out2.txt; if test $? -eq 1 ; then echo $i; fi; fi; done
Files out1.txt and out2.txt differ
/usr/share/themes/Adapta-Nokto-Eta/gtk-3.24/gtk.gresource
Files out1.txt and out2.txt differ
/usr/share/themes/Adapta/gtk-3.24/gtk.gresource
Files out1.txt and out2.txt differ
/usr/share/themes/Adapta-Nokto/gtk-3.24/gtk.gresource
Files out1.txt and out2.txt differ
/usr/share/themes/Adapta-Eta/gtk-3.24/gtk.gresource
Files out1.txt and out2.txt differ
/usr/lib/x86_64-linux-gnu/libkomsooxml.so.17.0.0
Files out1.txt and out2.txt differ
/usr/lib/x86_64-linux-gnu/libkomsooxml.so.17

Ciao,
Tito
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox
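
P.S. The comparison loop above splits the output of find on whitespace, so
paths containing spaces would be skipped or mangled. A more defensive
variant might look like the sketch below. compare_outputs is a hypothetical
helper name, and the two commands are assumed to take the file as their
only argument (for example ./strings and strings).

```shell
# Sketch only: print every regular file under DIR for which two commands
# produce different output. compare_outputs is a hypothetical helper name.
compare_outputs() {
    dir=$1 cmd_a=$2 cmd_b=$3
    tmp_a=$(mktemp) && tmp_b=$(mktemp) || return 1
    # find passes the names as separate arguments, so spaces and other odd
    # characters survive, unlike with `for i in $list`.
    find "$dir" -exec sh -c '
        cmd_a=$1 cmd_b=$2 tmp_a=$3 tmp_b=$4
        shift 4
        for f do
            [ -f "$f" ] || continue      # same filter as `test -f` above
            "$cmd_a" "$f" > "$tmp_a" 2>/dev/null
            "$cmd_b" "$f" > "$tmp_b" 2>/dev/null
            # cmp -s is silent and returns non-zero when the files differ
            cmp -s "$tmp_a" "$tmp_b" || printf "%s\n" "$f"
        done
    ' sh "$cmd_a" "$cmd_b" "$tmp_a" "$tmp_b" {} +
    rm -f "$tmp_a" "$tmp_b"
}

# Possible usage:
#   compare_outputs /usr ./strings strings
```

Using cmp -s instead of diff -q avoids the "Files ... differ" lines, so
only the file names are printed.
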
