I have been investigating a performance issue on OpenSolaris/x86, which probably also affects other architectures (SPARC, ARM, ...), and which seems related to memory management when working with huge files with 'ar'.
The investigation shows that GNU ar is roughly 35 times slower in wall-clock time (about 40 times in user CPU time) than the Solaris one on a native Solaris system (the virtualized system gets better results). The benchmarks were run on a Sun Ultra 20 (dual Opteron 2.8GHz with OpenSolaris 2009.06), on Ubuntu 8.10 with kernel 2.6.27, and on a virtualized OpenSolaris 2009.05 under VirtualBox on that Ubuntu box, which has a Core 2 2.13GHz CPU. No swap was used for these tests.

The main reason we cannot use the Solaris 'ar' is that it does not support the 'N' flag, which is present in the GNU one. This flag allows indexing into a list of files with the same name inside an archive. The Solaris one only permits -a, -b and -i to append or prepend, but not to index (if I understand the manpage correctly).

On the other hand, it would be nice to investigate this huge loss of performance of GNU ar compared with the Solaris one. The problem can be reproduced with the 'ar' tool alone. I have been able to reproduce the issue with GNU ar 2.15 (shipped with OpenSolaris) and GNU ar 2.18 (compiled by me).

The test file is a 61MB archive.

$ time ar t libtest.a > /dev/null
real    0m47.793s
user    0m43.318s
sys     0m4.032s

Using the Solaris 'ar' the result is quite different:

$ time /usr/bin/ar -t libtest.a > /dev/null
real    0m1.412s
user    0m1.077s
sys     0m0.252s

$ time /usr/bin/ar -t libtest.a > /dev/null
real    0m1.341s
user    0m1.076s
sys     0m0.211s

On GNU/Linux (Ubuntu 8.10 with kernel 2.6.27) I get much better times with GNU ar:

$ time ar t libtest.a > /dev/null
real    0m1.450s
user    0m0.112s
sys     0m0.412s

$ time ar t libtest.a > /dev/null
real    0m0.773s
user    0m0.108s
sys     0m0.400s

I have installed OpenSolaris 2009.05 on VirtualBox on the above GNU/Linux box, and the resulting times are better than on the native box with OpenSolaris 2009.06. This makes me think the problem is related to memory management, which in the Linux->VirtualBox->Solaris case performs better, probably because of the caching system of the host kernel.
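The slowdown factor can be checked directly from the timings above; it differs depending on whether the "real" or the "user" figures are compared. The following is only a minimal sketch: the helper names `to_seconds` and `slowdown` are made up for illustration, and it assumes the `0mSS.sss` format printed by the shell's `time` with less than an hour of run time.

```shell
# Convert a time(1) value such as "0m47.793s" into seconds.
# Assumption: input always looks like "<minutes>m<seconds>s".
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times slower the first timing is than the second.
slowdown() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

slowdown 0m47.793s 0m1.412s   # real: GNU ar vs Solaris ar on the native box
slowdown 0m43.318s 0m1.077s   # user: GNU ar vs Solaris ar on the native box
```

With the native OpenSolaris numbers this prints 33.8 for wall-clock time and 40.2 for user time.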
Here are the Linux->VirtualBox->OpenSolaris 2009.05 results with GNU ar 2.15:

$ time ar t libtest.a > /dev/null
real    0m23.660s
user    0m17.627s
sys     0m5.407s

$ time ar t libtest.a > /dev/null
real    0m23.537s
user    0m18.070s
sys     0m5.064s

And using the OpenSolaris one:

$ time /usr/bin/ar -t libtest.a > /dev/null
real    0m2.054s
user    0m0.873s
sys     0m1.111s

$ time /usr/bin/ar -t libtest.a > /dev/null
real    0m1.802s
user    0m0.878s
sys     0m0.863s

The 'top' trace of the running process gives this output:

$ while : ; do top | grep ' ar' ; done | tee ar-trace.txt
24665 pancake 1 32 0 49M 46M cpu/0 0:00 31.29% ar t libtest.a
24665 pancake 1 20 0 80M 77M cpu/1 0:01 49.80% ar t libtest.a
24665 pancake 1 20 0 94M 91M cpu/1 0:02 49.85% ar t libtest.a
24665 pancake 1 22 0 104M 102M cpu/0 0:03 49.81% ar t libtest.a
24665 pancake 1 21 0 112M 109M cpu/0 0:04 49.85% ar t libtest.a
24665 pancake 1 21 0 119M 116M cpu/1 0:05 49.79% ar t libtest.a
24665 pancake 1 21 0 125M 122M cpu/0 0:06 49.77% ar t libtest.a
24665 pancake 1 10 0 131M 128M cpu/0 0:07 49.84% ar t libtest.a
24665 pancake 1 20 0 136M 134M cpu/0 0:08 49.82% ar t libtest.a
24665 pancake 1 21 0 142M 139M cpu/0 0:09 49.83% ar t libtest.a
24665 pancake 1 22 0 147M 144M cpu/1 0:10 49.80% ar t libtest.a
24665 pancake 1 1 0 152M 149M cpu/0 0:11 49.82% ar t libtest.a
24665 pancake 1 10 0 157M 154M cpu/1 0:12 49.78% ar t libtest.a
24665 pancake 1 0 0 161M 159M cpu/0 0:13 49.86% ar t libtest.a
24665 pancake 1 0 0 166M 163M cpu/1 0:14 49.75% ar t libtest.a
24665 pancake 1 10 0 170M 168M cpu/0 0:16 49.82% ar t libtest.a
24665 pancake 1 11 0 175M 172M cpu/0 0:17 49.81% ar t libtest.a
24665 pancake 1 11 0 179M 176M cpu/0 0:18 49.82% ar t libtest.a
24665 pancake 1 1 0 183M 180M cpu/1 0:19 49.87% ar t libtest.a
24665 pancake 1 0 0 187M 184M cpu/1 0:20 49.24% ar t libtest.a
24665 pancake 1 50 0 191M 188M cpu/0 0:21 49.76% ar t libtest.a
24665 pancake 1 50 0 195M 192M cpu/0 0:22 49.84% ar t libtest.a
24665 pancake 1 41 0 198M 196M cpu/1 0:23 49.85% ar t libtest.a
24665 pancake 1 30 0 202M 200M cpu/0 0:24 49.78% ar t libtest.a
24665 pancake 1 30 0 206M 203M cpu/0 0:25 49.70% ar t libtest.a
24665 pancake 1 30 0 210M 207M cpu/0 0:26 49.72% ar t libtest.a
24665 pancake 1 31 0 213M 211M cpu/0 0:27 49.80% ar t libtest.a
24665 pancake 1 20 0 217M 214M cpu/1 0:28 49.84% ar t libtest.a
24665 pancake 1 30 0 220M 218M cpu/1 0:29 49.74% ar t libtest.a
24665 pancake 1 20 0 223M 221M cpu/0 0:30 49.86% ar t libtest.a
24665 pancake 1 20 0 227M 224M cpu/0 0:31 49.75% ar t libtest.a
24665 pancake 1 10 0 230M 228M cpu/0 0:32 49.81% ar t libtest.a
24665 pancake 1 11 0 233M 231M cpu/0 0:33 49.82% ar t libtest.a

Note that the memory columns climb steadily from 49M to 233M during the run, for a 61MB archive.

It looks strange to me that a single process keeps moving from one CPU to the other all the time. Is this normal? Can it be tuned? How?

The only significant difference I can see between the OpenSolaris 'ar' and GNU 'ar' is the check for alignment while allocating and handling memory. But I don't see any system-related difference that would make the same source run that fast on Linux and that slow on Solaris.

Thanks :)

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: solaris-report.txt
URL: <http://mail.opensolaris.org/pipermail/desktop-discuss/attachments/20090702/0935d329/attachment.txt>
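For anyone who wants to compare traces between machines, the 'top' trace captured in ar-trace.txt can be condensed into memory growth and a count of CPU changes. This is only a sketch: `summarize_trace` is a hypothetical helper, and it assumes the column order seen in the trace (PID, user, LWP, priority, nice, size, resident size, state, time, CPU%, command) with sizes always reported in megabytes.

```shell
# Summarise a top trace with one sample per line, as captured in ar-trace.txt.
# Assumed columns: PID USER LWP PRI NICE SIZE RES STATE TIME CPU COMMAND.
summarize_trace() {
    awk '
        $6 ~ /^[0-9]+M$/ {
            size = $6; sub(/M$/, "", size)        # SIZE column, in megabytes
            if (n == 0) first = size              # remember the first sample
            last = size
            if (n > 0 && $8 != prev_cpu) switches++   # state column changed CPU
            prev_cpu = $8
            n++
        }
        END {
            if (n == 0) { print "no samples"; exit }
            printf "samples=%d size_first=%dM size_last=%dM growth=%dM cpu_switches=%d\n",
                   n, first, last, last - first, switches
        }
    ' "$@"
}
```

Usage would be `summarize_trace ar-trace.txt`, or piping the samples on standard input. Since the samples are about one second apart, the count only shows changes between samples, not every migration the scheduler performed.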
