Hi Brian,
   lame itself gives you all the stats as it is processing. I agree with
Laca that the optimizations are mostly floating point (though MMX is
integer), and will probably have less of an effect on most desktop apps.
That said, many multimedia apps such as gstreamer (and plugins) can use
floating point heavily.
Using er_src you can see what percentage of a compiled object file's
instructions are MMX/SSE (a very rough estimate):

doug at bangkok> er_src -disasm quantize_lines_xrpow /usr/lib/pentium_pro+mmx/libmp3lame.so | grep '^[ ]*\[' | wc -l
    4422
doug at bangkok> er_src -disasm quantize_lines_xrpow /usr/lib/pentium_pro+mmx/libmp3lame.so | grep '^[ ]*\[' | grep xmm | wc -l
     213
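
The two counts above give the rough proportion directly: 213 of 4422
disassembled lines mention an xmm register, or about 4.8%. As a minimal
sketch (not part of er_src), the same filter could be applied in Python to
disassembly text saved to a file. The function name simd_share is
hypothetical, and the regular expression simply mirrors the grep pattern
used above.

```python
import re
from typing import Tuple

# Mirrors grep '^[ ]*\[': optional leading spaces followed by '['.
INSTRUCTION_LINE = re.compile(r"^ *\[")


def simd_share(disassembly: str, marker: str = "xmm") -> Tuple[int, int, float]:
    """Return (instruction lines, lines containing marker, percentage)."""
    lines = [ln for ln in disassembly.splitlines() if INSTRUCTION_LINE.match(ln)]
    hits = sum(1 for ln in lines if marker in ln)
    pct = 100.0 * hits / len(lines) if lines else 0.0
    return len(lines), hits, pct


if __name__ == "__main__":
    # Counts reported in the transcript above.
    print(f"{100.0 * 213 / 4422:.1f}%")  # prints 4.8%
```

Like the grep version, this only counts lines that name an xmm register, so
it remains a very rough estimate.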

Other than gtkperf, I don't know of a really good way to quantify
performance benefits for the GNOME libraries. Most of the focus seems to
be on desktop startup times, which are mostly an I/O issue. I have used
gtkperf in the past to see the difference between a 32-bit X server and
X client (gtkperf) and 64-bit ones.

You will probably note that I use $ISALIST to search for the optimized
libraries. I think this variable is a bit broad: when a library has no
optimized version, or the optimized version is not accessible, every
directory in the list still gets probed:

   xstat(2, "/usr/lib/amd64/libmp3lame.so.0", 0x08047228) Err#2 ENOENT
   xstat(2, "/usr/lib/pentium_pro+mmx/libmp3lame.so.0", 0x08047228) Err#13 EACCES [file_dac_search]
   xstat(2, "/usr/lib/pentium_pro/libmp3lame.so.0", 0x08047228) Err#2 ENOENT
   xstat(2, "/usr/lib/pentium+mmx/libmp3lame.so.0", 0x08047228) Err#2 ENOENT
   xstat(2, "/usr/lib/pentium/libmp3lame.so.0", 0x08047228) Err#2 ENOENT
   xstat(2, "/usr/lib/i486/libmp3lame.so.0", 0x08047228) Err#2 ENOENT
   xstat(2, "/usr/lib/i386/libmp3lame.so.0", 0x08047228) Err#2 ENOENT
   xstat(2, "/usr/lib/i86/libmp3lame.so.0", 0x08047228) Err#2 ENOENT
   xstat(2, "/usr/lib/libmp3lame.so.0", 0x08047228) = 0

That's a lot of 'stats' for platforms that don't exist.
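
The probe order in the trace follows from the runpath: each entry of the
instruction-set list is substituted for $ISALIST in turn, and the plain
library directory is tried last. Below is a simplified model in Python. It
assumes the list is the one implied by the trace (on a real system
isalist(1) prints it), and candidate_paths is a hypothetical helper. The
real runtime linker does considerably more than this.

```python
from typing import List

# Instruction-set names in the order they appear in the trace above.
ISALIST = ["amd64", "pentium_pro+mmx", "pentium_pro", "pentium+mmx",
           "pentium", "i486", "i386", "i86"]


def candidate_paths(runpath: str, soname: str, isalist: List[str]) -> List[str]:
    """Expand $ISALIST in a colon-separated runpath into ordered candidates."""
    paths: List[str] = []
    for entry in runpath.split(":"):
        if "$ISALIST" in entry:
            # One candidate directory per instruction set, in list order.
            dirs = [entry.replace("$ISALIST", isa) for isa in isalist]
        else:
            dirs = [entry]
        paths.extend(f"{d}/{soname}" for d in dirs)
    return paths


if __name__ == "__main__":
    for path in candidate_paths("/usr/lib/$ISALIST:/usr/lib",
                                "libmp3lame.so.0", ISALIST):
        print(path)
```

With eight names in the list, a library with no usable optimized build costs
eight failed lookups before the ninth succeeds, which matches the nine xstat
calls in the trace.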

Doug
P.S. er_src, collect and analyzer from Sun Studio really rock. After many
years using Sun compilers, I have only just sat down and had a look at
them. Being able to look at the C code and the assembler, with comments,
all together is really cool. It even told me that I had requested "not" to
inline some functions (don't know how I did that); now they are inlined
:-) Another nail in the gcc coffin.

Brian Nitz wrote:
> Doug,
>
>  Very interesting results, what benchmark did you use?
>
> Laszlo (Laca) Peter wrote:
>> Hi Doug,
>>
>> Sorry for the slow response, we're really busy these days...
>>
>> I think this is awesome.  Although most of these options are floating
>> point related, which is not typical for desktop apps, some low level
>> libs could probably be optimised better and we should also re-evaluate
>> our default optimisation options.
>>
>> Thanks,
>> Laca
>>
>> On Sun, 2006-05-14 at 21:58 -0700, Doug Scott wrote:
>>  
>>> Since pkgtool spec files make it easy to build/install apps, I 
>>> decided to do some benchmarks to see if you get a decent increase in 
>>> speed, if you build platform optimized libraries. Sun Studio 11 has 
>>> some interesting options to turn on SSE2, AMD 3DNOW etc instructions 
>>> so I wanted to turn these on and see if they give a good increase in 
>>> performance over the default options supplied in the JDS build.
>>> The test platform I am using is an Acer Ferrari 4005, running Solaris 
>>> 11 Nevada build 38. The app is lame from FWlame.spec. For each 
>>> build, I did 2 runs with the CPU set to 800MHz and 2000MHz using 
>>> the powernowadm utility.
>>>
>>> The first thing I did was to do an un-optimized baseline build by 
>>> setting CFLAGS="".
>>>
>>> The second build was using the JDS optimized flags CFLAGS="%optflags".
>>>
>>> The last build was built for the platform "pentium_pro+mmx" using 
>>> the following -
>>>
>>> CFLAGS="`echo %optflags | sed -e 's/-xpentium//'` -xarch=sse2a 
>>> -xchip=opteron -xcache=64/64/2:1024/64/16 -nofstore -xvector 
>>> -xbuiltin=%all -xdepend -xlibmil -xlibmopt"
>>>
>>> LD_OPTIONS="-R%{_libdir}/\$ISALIST:%{_libdir} -lmvec"
>>> After running configure, I added #include <sunmedia_intrin.h> to config.h.
>>>
>>> To summarize the result, the JDS options give a 95% increase 
>>> in performance over the un-optimized baseline build. The 
>>> pentium_pro+mmx build gave a 151% increase over the baseline, which 
>>> is an increase of 28% over the JDS options.
>>>
>>> I have attached the raw results and the spec file for the 
>>> pentium_pro+mmx build.
>>>
>>> Doug
>>>  
>>>
>>> This message posted from opensolaris.org
>>> _______________________________________________
>>> desktop-discuss mailing list
>>> desktop-discuss at opensolaris.org
>>>     
