Since pkgtool spec files make it easy to build and install apps, I decided to
run some benchmarks to see whether building platform-optimized libraries gives
a worthwhile increase in speed. Sun Studio 11 has some interesting options to
turn on SSE2, AMD 3DNow! and similar instructions, so I wanted to enable these
and see how much they improve performance over the default options supplied
in the JDS build.
The test platform is an Acer Ferrari 4005 running Solaris 11 Nevada build 38.
The app is lame, built from FWlame.spec. For each build I did 2 runs, one with
the CPU set to 800 MHz and one at 2000 MHz, using the powernowadm utility.
The first build was an unoptimized baseline, made by setting CFLAGS="".
The second build used the JDS optimized flags, CFLAGS="%optflags".
The last build targeted the "pentium_pro+mmx" platform, using the following
settings:
CFLAGS="`echo %optflags | sed -e 's/-xpentium//'` -xarch=sse2a -xchip=opteron
-xcache=64/64/2:1024/64/16 -nofstore -xvector -xbuiltin=%all -xdepend -xlibmil
-xlibmopt"
LD_OPTIONS="-R%{_libdir}/\$ISALIST:%{_libdir} -lmvec"
After running configure, I added #include <sunmedia_intrin.h> to config.h.
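For anyone unfamiliar with the sed expression in the CFLAGS line, here is a
minimal Python sketch of what it does: it removes the first occurrence of
-xpentium from the expanded %optflags before the Opteron-specific options are
appended. The optflags value below is a made-up placeholder (the real value
comes from the JDS build macros and is not shown in this post), and
strip_first is a hypothetical helper name.

```python
def strip_first(flags: str, unwanted: str) -> str:
    """Remove the first occurrence of `unwanted`, like sed 's/unwanted//'."""
    return flags.replace(unwanted, "", 1)


# Hypothetical stand-in for the expansion of %optflags (assumption, not
# taken from the JDS build).
optflags = "-xO4 -xpentium -xspace"

# A shortened subset of the extra options from the CFLAGS line in this post.
extra = "-xarch=sse2a -xchip=opteron"

cflags = f"{strip_first(optflags, '-xpentium')} {extra}"

# split() collapses the double space left behind by the removal,
# just as the shell does when it word-splits the flags.
print(cflags.split())
```

The point of the removal is to stop the generic -xpentium setting from being
passed alongside the explicit -xarch/-xchip options.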
To summarize the results: the JDS options give a 95% increase in performance
over the unoptimized baseline build. The pentium_pro+mmx build gave a 151%
increase over the baseline, which is an increase of 28% over the JDS options.
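The percentages can be reproduced from the play/CPU column of the attached
results. The sketch below assumes the summary figures were derived from the
2000 MHz play/CPU values, which is the column that matches them; the
dictionary keys and the pct_increase helper are names chosen for illustration.

```python
# play/CPU figures at 2000 MHz, copied from the attached results table.
play_per_cpu = {
    "baseline": 3.8750,
    "jds": 7.5750,
    "pentium_pro+mmx": 9.7358,
}


def pct_increase(new: float, old: float) -> float:
    """Percentage increase of `new` over `old`."""
    return (new / old - 1.0) * 100.0


jds_vs_base = pct_increase(play_per_cpu["jds"], play_per_cpu["baseline"])
opt_vs_base = pct_increase(play_per_cpu["pentium_pro+mmx"], play_per_cpu["baseline"])
opt_vs_jds = pct_increase(play_per_cpu["pentium_pro+mmx"], play_per_cpu["jds"])

print(f"JDS vs baseline:             {jds_vs_base:.1f}%")
print(f"pentium_pro+mmx vs baseline: {opt_vs_base:.1f}%")
print(f"pentium_pro+mmx vs JDS:      {opt_vs_jds:.1f}%")
```

Truncated to whole percentages, these come out at 95%, 151% and 28%.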
I have attached the raw results and the spec file for the pentium_pro+mmx build.
Doug
This message posted from opensolaris.org
-------------- next part --------------
Results
======================================================================
CPU speed (MHz)  Version         play/CPU   Real time (m:ss)  CPU time (m:ss)
----------------------------------------------------------------------
2000             standard        3.8750x    2:10              0:55
 800             standard        1.5445x    5:15              2:18
2000             JDS optimised   7.5750x    1:06              0:28
 800             JDS optimised   3.0210x    2:44              1:11
2000             pent_pro+mmx    9.7358x    0:50              0:22
 800             pent_pro+mmx    3.8834x    2:08              0:55
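
As a rough cross-check of the table, the sketch below compares the play/CPU
figures at the two clock speeds against the 2.5x clock ratio (2000/800). The
numbers are copied from the table; reading the result as "the encode scales
with clock speed for every build" is an interpretation, not something
measured separately.

```python
# (play/CPU at 800 MHz, play/CPU at 2000 MHz), copied from the table above.
rows = {
    "standard": (1.5445, 3.8750),
    "JDS optimised": (3.0210, 7.5750),
    "pent_pro+mmx": (3.8834, 9.7358),
}

clock_ratio = 2000 / 800  # 2.5

# Speedup from 800 MHz to 2000 MHz for each build.
scaling = {name: fast / slow for name, (slow, fast) in rows.items()}

for name, ratio in scaling.items():
    print(f"{name:14s} {ratio:.3f}x (clock ratio {clock_ratio:.1f}x)")
```

All three builds land within about 0.01 of the clock ratio, so the relative
gains from the compiler flags are essentially the same at both speeds.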
Test run for pentium_pro+mmx version -
=======================================
doug@bangkok> file /usr/bin/lame
/usr/bin/lame: ELF 32-bit LSB executable 80386 Version 1 [FPU], dynamically linked, not stripped
doug@bangkok> ldd /usr/bin/lame
        libmp3lame.so.0 => /usr/lib/pentium_pro+mmx/libmp3lame.so.0
        libcurses.so.1 => /usr/lib/libcurses.so.1
        libm.so.2 => /usr/lib/libm.so.2
        libsocket.so.1 => /usr/lib/libsocket.so.1
        libnsl.so.1 => /usr/lib/libnsl.so.1
        libc.so.1 => /usr/lib/libc.so.1
        libmvec.so.1 => /usr/lib/libmvec.so.1
        libmp.so.2 => /lib/libmp.so.2
        libmd.so.1 => /lib/libmd.so.1
        libscf.so.1 => /lib/libscf.so.1
        libuutil.so.1 => /lib/libuutil.so.1
        /lib/libmvec/libmvec_hwcap1.so.1
/usr/lib/pentium_pro+mmx/libmp3lame.so.0: ELF 32-bit LSB dynamic lib 80386 Version 1 [SSE2 SSE CMOV FPU], dynamically linked, not stripped
doug@bangkok> file /lib/libmvec/libmvec_hwcap1.so.1
/lib/libmvec/libmvec_hwcap1.so.1: ELF 32-bit LSB dynamic lib 80386 Version 1 [SSE2 SSE CMOV FPU], dynamically linked, not stripped, no debugging information available
doug@bangkok> lame x.wav x.mp3
LAME 3.97 (beta 2, May 15 2006) 32bits (http://www.mp3dev.org/)
Using polyphase lowpass filter, transition band: 16538 Hz - 17071 Hz
Encoding x.wav to x.mp3
Encoding as 44.1 kHz 128 kbps j-stereo MPEG-1 Layer III (11x) qval=3
    Frame          |  CPU time/estim | REAL time/estim | play/CPU |    ETA
  8218/8218  (100%)|    0:22/    0:22|    0:51/    0:51|   9.7402x|    0:00
-------------------------------------------------------------------------------
   kbps        LR    MS  %     long switch short %
  128.0        3.3  96.7        96.4   2.0   1.6
Writing LAME Tag...done
ReplayGain: -8.5dB
-------------- next part --------------
A non-text attachment was scrubbed...
Name: FWlame.spec
Type: application/octet-stream
Size: 3572 bytes
Desc: not available
URL:
<http://mail.opensolaris.org/pipermail/desktop-discuss/attachments/20060514/c9dff642/attachment.obj>