Markus Dittrich wrote:
> On Fri, 25 Aug 2006, Adam Pityszek wrote:
>
>>> Dear Markus, gentoo-science guys,
>>>
>>> Please find below the reply from Clint to my yesterday's email related to
>>> our work on ATLAS shared libraries in Gentoo.
>>>
>>> Markus, I think we can help with answering the questions (2) and (3). Of
>>> course, volunteers from gentoo-science are welcome as well.
>>>
>>> BR,
>>> /ediap
>>>
>>> (1) Is it true that the extra pointer may still be used if we restore it
>>> at end of assembly routine?
>>> (2) Does throwing the -fpic or other required compiler flag changes
>>> change the best cases (thus necessitating doubling the arch defaults)?
>>> (3) What is the overall performance effect when using .so?
>>>
>>> I've tried to answer (1) by looking at some docs, but never got convinced
>>> either way. I've been meaning to write a register stress-test to see if
>>> I can make gcc use the reserved register in a function w/o global data.
>>> Perhaps you know?
>>>
>>> You guys could help with (2) & (3) if you like. You could build
>>> out-of-box to .a on whatever machines you can, and then build it to .so
>>> using your gentoo harness, and post some head-to-head timings . . . If,
>>> as we suspect, the difference is essentially zero, that makes .so a lot
>>> more attractive . . .
>>>
>
> Hi Adam,
>
> Thanks for talking to upstream about this and Clint's response
> sounds encouraging. We could definitely help out with 2) and 3);
> it would be good to know anyway how well we do with our shared libs.
> In doing so we should also test the impact of using the 387 floating
> point unit versus the sse instruction set. According to Clint, the
> former can give a significant performance gain on some CPUs. If that
> is the case it might be worth a note in the ebuild to make our users
> aware of it.
>
> We should get a hold of a nice benchmark suite for this purpose; Clint
> has posted one on this gcc bug
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827
> which we might be able to use. I'll have a look at it.
>
> Best,
> Markus
>
>
> -- Markus Dittrich (markusle)
> Gentoo Linux Developer
> Scientific applications
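
For the head-to-head .a versus .so timings Clint asks for, a small harness along these lines might be a starting point. It is only a sketch under stated assumptions: the names best_time, gemm_mflops and percent_slowdown are made up for illustration, the pure-Python naive_matmul is a hypothetical stand-in for a DGEMM call into each build, and the 2*n^3 flop count is the usual convention for a square matrix multiply.

```python
import time


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time (seconds) over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def gemm_mflops(n, seconds):
    """MFLOPS for an n x n matrix multiply, counted as 2*n^3 flops."""
    return 2.0 * n ** 3 / seconds / 1e6


def percent_slowdown(static_mflops, shared_mflops):
    """Positive result means the shared build is slower than the static one."""
    return 100.0 * (static_mflops - shared_mflops) / static_mflops


def naive_matmul(a, b):
    """Stand-in kernel (hypothetical); swap in a DGEMM call for each build."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    n = 40
    a = [[1.0] * n for _ in range(n)]
    b = [[2.0] * n for _ in range(n)]
    t = best_time(lambda: naive_matmul(a, b))
    print("%d x %d: %.1f MFLOPS" % (n, n, gemm_mflops(n, t)))
```

Taking the best of several runs rather than the mean is a deliberate choice here, since it tends to filter out interference from other processes. Running the same sizes against the static and the shared build and feeding the two MFLOPS figures to percent_slowdown would give the single number the thread is asking about.
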
If you have the time, you can turn off all of the preconceived notions ATLAS has about your architecture and let it benchmark itself. In fact, for the hard-core number crunchers, you might want to put a USE flag in the ebuild that does a "brute-force", assume-nothing compile, with a warning that it takes a long time and is best run after an "emerge -f" with Linux in single-user mode. My recollection is that it used to take about 8 hours on a 1.3 GHz Athlon Thunderbird.

--
[email protected] mailing list
