Hi All,

Sure, I can do some performance tests. The tests supplied with the software are designed specifically to measure performance, so I can run those with various build flags and see what we get. Enabling link-time optimisation might also help here, though I've not really played with that myself yet.

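A rough sketch of how such a comparison could be scripted, in case it is useful. The helper names (flag_sets, best_time) are hypothetical, the -O3 baseline is an assumption rather than something taken from the package, and the actual rebuild and upstream test invocation are left out because they depend on the packaging:

```python
import itertools
import subprocess
import time

# Flags mentioned in this thread. The -O3 baseline is an assumption.
BASELINE = ["-O3", "-mtune=generic"]
OPTIONAL = ["-funroll-loops", "-ftree-vectorize", "-flto"]


def flag_sets(baseline, optional):
    """Return the baseline plus every combination of the optional flags."""
    sets = []
    for n in range(len(optional) + 1):
        for combo in itertools.combinations(optional, n):
            sets.append(list(baseline) + list(combo))
    return sets


def best_time(cmd, repeats=3):
    """Run cmd `repeats` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Only lists the flag strings to try; a real run would rebuild with each
    # one and pass the upstream test command to best_time().
    for flags in flag_sets(BASELINE, OPTIONAL):
        print(" ".join(flags))
```

Taking the fastest of several runs is only one way to reduce noise; a median over more runs would be an equally reasonable choice.
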
The documentation says the software has a "64-bit design" so quite probably we shouldn't be building this for i386.

Cheers,

TIM

On Mon, Feb 2, 2015, at 10:53 AM, Andreas Tille wrote:
> Hi Gert,
>
> thanks for your helpful comments.
>
> On Mon, Feb 02, 2015 at 11:38:20AM +0100, Gert Wollny wrote:
> > Hello,
> >
> > On Mon, 2015-02-02 at 07:51 +0100, Andreas Tille wrote:
> > > Hi Mentors,
> >
> > > It is very important to build vsearch with the maximum optimisation
> > > for speed
> > > and thus I wonder whether dropping this option is a good idea or whether
> > > I should enable it on i386 and amd64 (the question extends also to
> > > freebsd-i386/freebsd-amd64 once an other issue in freebsd with this
> > > package is solved).
> >
> > On amd64 sse/sse2 is enabled by default.
> >
> > Tuning the code for a specific processor (i.e. core2) might not be such
> > a good idea, according to the GCC man page one should use -mtune=generic
> > instead:
> >
> > "generic:
> >
> > Produce code optimized for the most common IA32/AMD64/EM64T processors.
> > If you know the CPU on which your code will run, then you should use the
> > corresponding -mtune or -march option instead of -mtune=generic. But,
> > if you do not know exactly what CPU users of your application will have,
> > then you should use this option.
> > As new processors are deployed in the marketplace, the behavior of this
> > option will change. Therefore, if you upgrade to a newer version of
> > GCC, code generation controlled by this option will change to reflect
> > the processors that are most common at the time that version of GCC is
> > released. "
>
> Tim, could you clarify with upstream if they agree that -mtune=generic is
> the option that should be used? In this case my patch in svn I prepared
> in advance (x86_spezific_opts.patch) should be dropped.
>
> > In addition, with itksnap I saw that -funroll-loops and -ftree-vectorize
> > improved performance a lot, and these are options that do not depend on
> > the architecture, but are also not enabled by default.
> >
> > -funroll-loops may also slow down the code, you should check this. It is
> > especially effective if there are many small loops of fixed size (like
> > it is the case with ITK's types that are templated over dimensions).
> >
> > -ftree-vectorize may be useless on x86 without SSE but on amd64 it could
> > give some speedups.
>
> Tim, could you do some performance checks? I have no idea whether the
> usual upstream test suite is a proper check for this.
>
> Kind regards
>
> Andreas.
>
> --
> http://fam-tille.de

--
Of course I'm a technophobe; I program computers for a living!

