I have been investigating a performance issue on OpenSolaris/x86 (it
will probably affect other architectures such as SPARC and ARM too) that
seems related to memory management when working with huge files with 'ar'.

The investigation shows that GNU ar is about 34 times slower than the
Solaris one in wall-clock time (about 40 times in user CPU time) on a
native Solaris system; a virtualized system gets better results. The
benchmarks were run on a Sun Ultra 20 (dual Opteron 2.8GHz with
OpenSolaris 2009.06), on Ubuntu 8.10 with kernel 2.6.27, and on a
virtualized OpenSolaris 2009.05 under VirtualBox on that Ubuntu system,
which has a Core 2 2.13GHz CPU. No swap was used for these tests.

The main reason why we cannot use the Solaris 'ar' is that it doesn't
support the 'N' flag, which is present in the GNU one. This flag makes
it possible to select, by index, one of several files with the same name
inside an archive. The Solaris one only permits -a, -b and -i to append
or prepend, but not to index (if I understand the manpage correctly).
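
For readers who have not used the 'N' flag, here is a minimal sketch of
what it does with GNU ar. The function name demo_ar_count and every file
and directory name in it are made up for illustration; it assumes GNU ar
stores only the basename of each member, which is its default behaviour.

```shell
#!/bin/sh
# Build an archive holding two different members that share the name foo.o,
# then extract only the second one with the N (count) modifier of GNU ar.
# All file and directory names here are placeholders.
demo_ar_count() {
    work=$(mktemp -d) || return 1
    (
        cd "$work" || exit 1
        mkdir a b out
        echo first  > a/foo.o
        echo second > b/foo.o
        # q = quick append (no replacement), c = stay quiet when creating
        ar qc libdemo.a a/foo.o b/foo.o
        cd out || exit 1
        # x = extract, N 2 = pick the second instance of the given name
        ar xN 2 ../libdemo.a foo.o
        cat foo.o
    )
    status=$?
    rm -rf "$work"
    return $status
}

# Only meaningful with GNU ar; the Solaris ar does not accept N.
if ar --version 2>/dev/null | grep -q 'GNU ar'; then
    demo_ar_count
fi
```

With GNU ar this is expected to print the contents of b/foo.o, that is,
the second member named foo.o.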

On the other hand, it would be nice to investigate this huge loss of
performance of GNU ar compared with the Solaris one.

The problem can be reproduced with the 'ar' tool. I have been able to
test the issue with GNU ar 2.15 (shipped with OpenSolaris) and GNU ar
2.18 (compiled by me). The test file is a 61MB archive.

$ time ar t libtest.a > /dev/null
real    0m47.793s
user    0m43.318s
sys     0m4.032s

Using Solaris's 'ar' the result is quite different:

$ time /usr/bin/ar -t libtest.a > /dev/null
real    0m1.412s
user    0m1.077s
sys     0m0.252s

$ time /usr/bin/ar -t libtest.a > /dev/null
real    0m1.341s
user    0m1.076s
sys     0m0.211s
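
To put a number on the gap, the ratio can be computed directly from the
'time' output. This is only a sketch; the helper names to_seconds and
slowdown are invented here, and the figures are copied from the GNU ar
run and the first Solaris ar run above.

```shell
#!/bin/sh
# Convert a time(1)-style value such as "0m47.793s" into plain seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times slower the first timing is than the second.
slowdown() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

# GNU ar versus Solaris ar on the native OpenSolaris box.
echo "real: $(slowdown 0m47.793s 0m1.412s)x"   # prints "real: 33.8x"
echo "user: $(slowdown 0m43.318s 0m1.077s)x"   # prints "user: 40.2x"
```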

On GNU/Linux (Ubuntu 8.10 with kernel 2.6.27) I get much better times
with GNU ar.

$ time ar t libtest.a > /dev/null
real    0m1.450s
user    0m0.112s
sys     0m0.412s

$ time ar t libtest.a > /dev/null
real    0m0.773s
user    0m0.108s
sys     0m0.400s

I have installed OpenSolaris 2009.05 on VirtualBox on the above
GNU/Linux box, and the resulting times are better than on the native box
with OpenSolaris 2009.06. This makes me think that the problem is
related to memory management, which performs better in the
Linux->VirtualBox->Solaris case, probably because of the kernel's
caching system.

Here are the results for Linux->VirtualBox->OpenSolaris 2009.05 with GNU ar 2.15:

$ time ar t libtest.a > /dev/null
real    0m23.660s
user    0m17.627s
sys     0m5.407s

$ time ar t libtest.a > /dev/null
real    0m23.537s
user    0m18.070s
sys     0m5.064s

And using the OpenSolaris one:

$ time /usr/bin/ar -t libtest.a > /dev/null
real    0m2.054s
user    0m0.873s
sys     0m1.111s

$ time /usr/bin/ar -t libtest.a > /dev/null
real    0m1.802s
user    0m0.878s
sys     0m0.863s

The 'top' trace of the running process gives this output:

$ while : ; do top | grep ' ar' ; done | tee ar-trace.txt
  24665 pancake     1  32    0   49M   46M cpu/0    0:00 31.29% ar t libtest.a
  24665 pancake     1  20    0   80M   77M cpu/1    0:01 49.80% ar t libtest.a
  24665 pancake     1  20    0   94M   91M cpu/1    0:02 49.85% ar t libtest.a
  24665 pancake     1  22    0  104M  102M cpu/0    0:03 49.81% ar t libtest.a
  24665 pancake     1  21    0  112M  109M cpu/0    0:04 49.85% ar t libtest.a
  24665 pancake     1  21    0  119M  116M cpu/1    0:05 49.79% ar t libtest.a
  24665 pancake     1  21    0  125M  122M cpu/0    0:06 49.77% ar t libtest.a
  24665 pancake     1  10    0  131M  128M cpu/0    0:07 49.84% ar t libtest.a
  24665 pancake     1  20    0  136M  134M cpu/0    0:08 49.82% ar t libtest.a
  24665 pancake     1  21    0  142M  139M cpu/0    0:09 49.83% ar t libtest.a
  24665 pancake     1  22    0  147M  144M cpu/1    0:10 49.80% ar t libtest.a
  24665 pancake     1   1    0  152M  149M cpu/0    0:11 49.82% ar t libtest.a
  24665 pancake     1  10    0  157M  154M cpu/1    0:12 49.78% ar t libtest.a
  24665 pancake     1   0    0  161M  159M cpu/0    0:13 49.86% ar t libtest.a
  24665 pancake     1   0    0  166M  163M cpu/1    0:14 49.75% ar t libtest.a
  24665 pancake     1  10    0  170M  168M cpu/0    0:16 49.82% ar t libtest.a
  24665 pancake     1  11    0  175M  172M cpu/0    0:17 49.81% ar t libtest.a
  24665 pancake     1  11    0  179M  176M cpu/0    0:18 49.82% ar t libtest.a
  24665 pancake     1   1    0  183M  180M cpu/1    0:19 49.87% ar t libtest.a
  24665 pancake     1   0    0  187M  184M cpu/1    0:20 49.24% ar t libtest.a
  24665 pancake     1  50    0  191M  188M cpu/0    0:21 49.76% ar t libtest.a
  24665 pancake     1  50    0  195M  192M cpu/0    0:22 49.84% ar t libtest.a
  24665 pancake     1  41    0  198M  196M cpu/1    0:23 49.85% ar t libtest.a
  24665 pancake     1  30    0  202M  200M cpu/0    0:24 49.78% ar t libtest.a
  24665 pancake     1  30    0  206M  203M cpu/0    0:25 49.70% ar t libtest.a
  24665 pancake     1  30    0  210M  207M cpu/0    0:26 49.72% ar t libtest.a
  24665 pancake     1  31    0  213M  211M cpu/0    0:27 49.80% ar t libtest.a
  24665 pancake     1  20    0  217M  214M cpu/1    0:28 49.84% ar t libtest.a
  24665 pancake     1  30    0  220M  218M cpu/1    0:29 49.74% ar t libtest.a
  24665 pancake     1  20    0  223M  221M cpu/0    0:30 49.86% ar t libtest.a
  24665 pancake     1  20    0  227M  224M cpu/0    0:31 49.75% ar t libtest.a
  24665 pancake     1  10    0  230M  228M cpu/0    0:32 49.81% ar t libtest.a
  24665 pancake     1  11    0  233M  231M cpu/0    0:33 49.82% ar t libtest.a
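
The memory growth in a trace like this can be summarised with a few
lines of awk. This is a rough sketch: the function name summarise_trace
is invented, and it assumes the column order seen in the trace (PID,
USER, NLWP, PRI, NICE, SIZE, RES, STATE, TIME, CPU, COMMAND) with sizes
reported in megabytes.

```shell
#!/bin/sh
# Report the first and last SIZE and RES values found in a top trace.
# Reads the files given as arguments, or standard input if there are none.
summarise_trace() {
    awk '
        # Skip wrapped continuation lines such as a lone "libtest.a".
        NF < 10 || $1 !~ /^[0-9]+$/ { next }
        {
            size = $6; res = $7
            sub(/M$/, "", size); sub(/M$/, "", res)
            if (!seen) { first_size = size; first_res = res; seen = 1 }
            last_size = size; last_res = res; n++
        }
        END {
            if (seen)
                printf "samples=%d size=%dM->%dM res=%dM->%dM\n",
                       n, first_size, last_size, first_res, last_res
        }
    ' "$@"
}
```

It would be called as 'summarise_trace ar-trace.txt'. For the trace
above, the process grows from 49M to 233M while listing a 61MB archive.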

It looks strange to me that a single process keeps moving from one CPU
to another. Is this normal? Can this be tuned? How?

The only significant difference I can see between the OpenSolaris 'ar'
and GNU 'ar' is the check for alignment while allocating and handling
memory. But I don't see any system-related difference that could make
the same source run that fast on Linux and that slow on Solaris.

Thanks  :)

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: solaris-report.txt
URL: <http://mail.opensolaris.org/pipermail/desktop-discuss/attachments/20090702/0935d329/attachment.txt>