[ksh93-integration-discuss] ksh93 benchmarking

Milan Jurik Fri, 10 Jul 2009 01:31:07 +0200

Hi Roland,

Roland Mainz wrote on Fri, 10 Jul 2009 at 00:58 +0200:
> Milan Jurik wrote:
> > On Thu, 09 Jul 2009 at 15:21, Sean McGrath wrote:
> > >  With the coming ksh93 update 2 replacing several commands
> > >   like wc, tail, head, join etc., there's a need for a benchmark
> > >   to measure performance at least before and after the update 2 change.
> > >
> > >  Roland and I were talking on irc last night about this.  We'll need
> > >   to figure out a decent method of benchmarking these commands.
> > 
> > How is it possible that Roland discovers the responsible people
> > every time? :-)
> 
> Well... part of the secret is that I use a komodo dragon (preferably a
> hungry one), a whip (wet, with salt) and a small egg... that way you
> can get any information out of people (yes, yes, it's cruel&&unusual)
> ... =:-)
>


Your latest source informed me about your magic already ;-)

> > >  So within the next few days we hope to work out a method for
> > >   benchmarking ksh93.
> > >  This hopefully is the start of that discussion, rather than blindly writing
> > >   ad hoc timing scripts.
> > >
> > >  One way, suggested by Roland could be:
> > >
> > >    cmd = mkdir:
> > >
> > >     timex ksh93 -c 'rmdir "xyz" 2>/dev/null ; \
> > >         for ((i=0 ; i < 1000 ; i++)) ; do /bin/mkdir -p "xyz" ; done'
> > >
> > >    That would benchmark the on-disk mkdir. To use ksh93's builtin mkdir,
> > >    just remove the '/bin/':
> > >
> > >     timex ksh93 -c 'rmdir "xyz" 2>/dev/null ; \
> > >         for ((i=0 ; i < 1000 ; i++)) ; do mkdir -p "xyz" ; done'
> > 
> > Do not test it as a ksh93 command, but through the wrapper. So not ksh93
> > -c 'tail', but /usr/bin/tail. That is the real impact.
> 
> Erm... that's not 100% correct. The test matrix should look like this:
> [ old-version, new-version, ksh93-builtin-command ] * [ C-locale,
> multibyte-locale ]
> 
> Explanation of terms:
> - "old-version" means the old versions of the commands
> - "new-version" means the new versions of the commands
> - "ksh93-builtin-command" means running the loop within a ksh93 shell
> using plain command names [1] [2]
> - "C-locale" means something like $ LC_ALL=C ./test-script #
> - "multibyte-locale" means something like $ LC_ALL=en_US.UTF-8
> ./test-script # - this is needed since the tools sometimes have
> different codepaths for single-byte locales (like "C") and multibyte
> locales (like "en_US.UTF-8" or "ja_JP.PCK")
> 
> [1]=(this is important to measure the impact for OpenSolaris/Indiana
> where the default system shell is ksh93 (e.g. /usr/bin/sh, /sbin/sh,
> /usr/bin/ksh, /usr/bin/ksh93 are all ksh93))
> [2]=Note that a POSIX-conformant shell (like ksh93) will only use
> builtin commands if you use the command name (e.g. "mkdir") and not the
> full path (e.g. "/usr/bin/mkdir"). Or better: Using the full path makes
> sure the shell always uses the non-builtin command from /usr/bin/
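
The matrix described above could be driven by a small loop like the one below. This is only a sketch under assumptions: ./test-script is the name used in the mail above, but the VARIANT environment variable, the TEST_SCRIPT override and the function names are made up for illustration, and the test script is assumed to pick its command variant from VARIANT.

```shell
#!/bin/sh
# Sketch: enumerate [variant] x [locale] and run the test script once per cell.
# Assumption: the test script reads VARIANT to decide which command to loop
# over (old binary, new binary, or the plain name so ksh93 uses its builtin).

VARIANTS="old-version new-version ksh93-builtin-command"
LOCALES="C en_US.UTF-8"

# Print one "locale variant" pair per line.
matrix() {
    for loc in $LOCALES; do
        for var in $VARIANTS; do
            printf '%s %s\n' "$loc" "$var"
        done
    done
}

# Run every cell of the matrix. TEST_SCRIPT is a hypothetical override for
# the script path; stdin is redirected so the script cannot eat the pair list.
run_matrix() {
    matrix | while read -r loc var; do
        printf '== LC_ALL=%s VARIANT=%s\n' "$loc" "$var"
        LC_ALL="$loc" VARIANT="$var" "${TEST_SCRIPT:-./test-script}" </dev/null
    done
}
```

Calling run_matrix then runs six cells (three variants in two locales); a further multibyte locale such as ja_JP.PCK could be appended to LOCALES.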

For the C-team review I believe the most important thing is performance
regression of the replaced/updated commands, because the "builtin-commands"
can be bypassed and are not the most important part of update 2. Also, a
comparison of builtin-command vs. old-version has nothing to do with
performance regression testing; it is more of a benchmark project
(important to have, but not a show stopper for update 2).
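
A regression check of that kind (old binary vs. new binary, same number of invocations, repeated several times) might look roughly like this. It is a sketch, not an agreed method: the function names and the TIMER variable are hypothetical, timex is used as the default timer only because it appears earlier in the thread, and the commands are always called by full path so that no builtin is involved.

```shell
#!/bin/sh
# bench_batch COUNT CMD [ARGS...]: time COUNT invocations of CMD as one batch.
# TIMER is a hypothetical knob; it defaults to timex as used in the thread.
bench_batch() {
    ${TIMER:-timex} sh -c '
        count=$1; shift
        i=0
        while [ "$i" -lt "$count" ]; do
            "$@" >/dev/null 2>&1
            i=$((i + 1))
        done
    ' sh "$@"
}

# compare RUNS COUNT OLD NEW [ARGS...]: run RUNS batches of COUNT invocations
# for the old binary, then for the new one, passing the same ARGS to both.
compare() {
    runs=$1 count=$2 old=$3 new=$4; shift 4
    for bin in "$old" "$new"; do
        r=0
        while [ "$r" -lt "$runs" ]; do
            printf '== %s run %d\n' "$bin" "$((r + 1))" >&2
            bench_batch "$count" "$bin" "$@"
            r=$((r + 1))
        done
    done
}
```

For example, compare 5 1000 /old/usr/bin/tail /usr/bin/tail -n 10 /tmp/data would time five batches of 1000 calls for each binary; /old/usr/bin/tail and /tmp/data are placeholder paths for wherever the pre-update binary and the test file are kept.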

> 
> > >   Another method, using the above example could be to see how many times
> > >   mkdir got called in a given time period.
> > 
> > The same number of commands is good enough. Probably run several times.
> > 
> > >   Beyond basic benchmarking, the environment can be measured too, i.e.
> > >    the locale can have an impact, e.g. LC_ALL=C and LC_ALL=en_US.UTF-8
> > 
> > +1
> 
> Right - see test matrix above...
> 
> > >   Also to be looked at is the data size used with commands, e.g.
> > >    tail -X on a large or small file.  Small being about 256k or so and
> > >    large being at least 1GB.
> > 
> > +1
> > 
> > A file bigger than RAM should be good.
> 
> BTW: Some notes:
> - "tail" _may_ now be a bit slower since it no longer uses |mmap()|
> (which was one of the root causes for crashes (e.g. if the underlying
> file shrinks while "tail" reads it))
> - some commands like "join" should be faster now since it uses |mmap()|
> (but we have an option to turn this behaviour off to avoid running into
> the issue described with "tail" above)
> - command startup time may be slightly higher since we now depend on two
> more libraries (libcmd, libast) which need to be looked-up&&loaded.
> This should be compensated a bit by the fact that the AST tools are
> tuned more for large amounts of data

That is good to know and to document as part of the performance results. But
apart from the "mmap" change the impact should not be critical.

> - please use tmpfs (e.g. /tmp) for reading/writing from/to files to
> avoid getting noise from the disk I/O system
> 

I think Sean's team is very good at performance testing ;-)
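
For the data sizes mentioned earlier in the thread (about 256k for "small", at least 1GB for "large") and the advice to keep the files on tmpfs, test input could be generated along these lines. This is a sketch only: make_testfile is a hypothetical helper, and the fixed 64-byte line length is an arbitrary choice made here so that the resulting size is exact and tools like tail have real lines to work on.

```shell
#!/bin/sh
# make_testfile PATH SIZE_KB: write SIZE_KB kilobytes of text to PATH.
# Each line is 63 zero-padded digits plus a newline (64 bytes), so one
# kilobyte is exactly 16 lines.
make_testfile() {
    awk -v n="$(( $2 * 16 ))" 'BEGIN {
        for (i = 1; i <= n; i++) printf "%063d\n", i
    }' > "$1"
}
```

With that, make_testfile /tmp/small.txt 256 would create the small file and make_testfile /tmp/large.txt 1048576 the 1GB one, assuming /tmp is tmpfs and has enough room for it.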

> > >  For starters, is there a definitive list of the commands we'd want to
> > >   look at? i.e. those being replaced by ksh93.
> > 
> > I think the list is definitive and you can find it here (in Notes):
> > 
> > http://www.opensolaris.org/os/project/ksh93-integration/downloads/2009-07-02/
> > 
> > The optimal thing would be to test not only those which are replaced now,
> > but also those which were already replaced and are updated by this update.
> > 
> > Only usr/bin/print is a new command, so we do not need to test it.
> > 
> > For testing all internal ksh93 commands, I would say no for now. It can
> > be a separate project, to do complete ksh93 benchmarking. But we should
> > concentrate on update 2 for now.
> 
> Well, it's not thought of as a "ksh93 benchmark" (since it doesn't cover any
> special shell features like string processing, math, array operations
> etc.) - the idea was to figure out the impact on OpenSolaris/Indiana
> where the use of builtin commands in the default system shell has direct
> impact on system performance (e.g. at _least_ fewer |fork()|+|exec()|
> calls).
> 

Yes. But then you are comparing only the old and new ksh93. That is not the
core topic of update 2 for now, because update 2 concentrates on bug fixing
and new commands.

Best regards,

Milan
