Jacques Bouchard wrote:
> Hi all,
>
> I just ran speed tests on a Compaq XP1000 (CPU 21264) that has been lent to
> me (I will buy one if it suits me). I installed beforehand Redhat 6.0, which
> I downloaded from a redhat mirror. 2 problems arose:
>
> 1) The math library seems *very* slow:
Indeed. Get either the Compaq Portable Math Library (CPML) or, if you prefer
free software and/or your code can make use of vectorized sqrt/sin/cos (e.g.
perform sin on these 100 numbers), libffm. Both are available from
http://www.alphalinux.org/ (ALO).
> So I downloaded the glibc sources from GNU and compiled with different
> options (I added '-O2 -mcpu=21264', and removed -mieee), but the speed
> increase was only 7%.
My, you're brave! And you call yourself a "beginner"? :-)
I think the optimizations you want are '-O3 -ffast-math -mcpu=ev6'; 21264 and
ev6 should do the same thing. As far as I know, gcc treats any level above -O3
as -O3 and no -O level turns on -ffast-math, which I think is needed to get the
hardware sqrt on 21264, so pass it explicitly. But CPML and libffm are
hand-coded assembly, which should give you significantly better performance
than optimized glibc, and they avoid some of the floating exceptions such as
the exp (-large) underflow.
> 2) I tried to run a very large program, but I failed: it crashes with this
> error message:
>
> Floating point exception (core dumped)
Yup, a common problem on Alpha. Here's a recent post from the high-performance
list (instructions also on ALO):
Richard Gorton wrote:
> Martin Kahlert <[EMAIL PROTECTED]> writes:
>
> > Why is -mieee compiled code so slow? I thought, only when an exception is
> > generated, the mieee routines should provide values like inf, NaN.
> > Otherwise the FPU could proceed as usual?
>
> Since "exceptions" are exceptional, they are, by definition, very infrequent.
> If they were meant to be commonplace, they would be called "commons" or
> some such. Unfortunately, there _is_ a fair amount of sloppy code in the
> world where specific integer values are used to trigger termination
> conditions, and are freely operated on as if they are floating point
> values. Some processors/architectures silently convert NaNs/denorms
> into zero results, or ignore them. In my opinion, encouraging sloppy
> programming habits by masking such effects is bad. This silent
> behavior can hide nastiness like memory leaks elsewhere in the code.
>
> In Alpha implementations prior to the 21264, floating point exceptions
> were imprecise. That is, they were not necessarily reported until after
> the instruction which may have caused them has been retired. The trade-off
> here is that fp-intensive code which operates with 'normal' values
> can really scream. But fixing up exceptions is going to be more complex,
> and will consume a bunch of cycles.
>
> In the Alpha Architecture Reference Manual, there is a section about
> arithmetic traps & trap shadows, etc. which goes into a lot of detail
> about the code sequences to fix these up. One of the requirements of
> code to handle such behavior is (on alpha prior to the 21264) to insert
> 'trapb' (trap barrier) instructions in lots of places. The trapb
> instruction forces the processor to wait (stalling it!) until all
> exceptions have been reported before continuing. There are also some
> rules about register re-use among instructions between trap shadows.
>
> On the 21264, exceptions are precise, and trapb instructions are
> effectively nops, which will significantly improve things (assuming
> you eventually get one).
>
> The net requirement of -mieee is to generate more conservative code,
> and to insert trapb instructions.
>
> If your code generates lots of exceptions at a small number of locations,
> you could put that code into a single file, and only compile that file
> with -mieee. If you really care about floating point performance,
> it's probably worth the effort to debug and modify your code to not
> do NaN/denorm operations. You might inadvertently fix some bugs as well.
>
> I hope this helps some,
>
> Rick
>
> Richard Gorton http://www.digital.com/amt
> Compaq Computer Corporation All standard disclaimers apply.
> Reply-to: [EMAIL PROTECTED]
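As a rough illustration of Rick's last suggestion (track down the NaN/denorm
operations instead of paying for -mieee everywhere), here is a minimal sketch
in C. The helper names count_suspect_values and flush_denormals are made up
for this example and come from no library mentioned in this thread. Whether
flushing denormals to zero is acceptable is an assumption about your
program's numerics. It uses plain comparisons so it needs no math library.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* NaN is the only value that compares unequal to itself.
 * Caveat: -ffast-math may let the compiler optimize this test away. */
static int is_nan(double x) { return x != x; }

/* Nonzero magnitude below the smallest normal double => denormal. */
static int is_denormal(double x)
{
    double a = x < 0.0 ? -x : x;
    return a > 0.0 && a < DBL_MIN;
}

/* Magnitude above the largest finite double => infinity. */
static int is_infinite(double x)
{
    double a = x < 0.0 ? -x : x;
    return a > DBL_MAX;
}

/* Hypothetical helper: count entries of buf[0..n) that are NaN, infinite,
 * or denormal, i.e. inputs likely to trap without -mieee. */
size_t count_suspect_values(const double *buf, size_t n)
{
    size_t suspect = 0;
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (is_nan(buf[i]) || is_infinite(buf[i]) || is_denormal(buf[i]))
            suspect++;
    }
    return suspect;
}

/* Hypothetical helper: replace denormals with zero in place and return
 * how many entries were changed. NaN and infinity are left untouched,
 * since those usually point at a real bug upstream. */
size_t flush_denormals(double *buf, size_t n)
{
    size_t flushed = 0;
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (is_denormal(buf[i])) {
            buf[i] = 0.0;
            flushed++;
        }
    }
    return flushed;
}
```

Calling count_suspect_values on the arrays entering your hot loops (in a
debug build) should narrow down where the bad values originate, which is
probably cheaper in the long run than compiling everything with -mieee.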
> I get the same errors on Intel, but the program doesn't stop, and that suits
> me very well.
>
> So I tried to catch the exception with the signal function (signal(SIGFPE,
> handler)): the handler function is actually called, but that doesn't prevent
> the program from crashing with the same message.
>
> Is there a way to prevent the crash (and if possible, without reducing the
> program speed) ?
You can prevent the crash by compiling with -mieee, but at the cost of a
significant performance hit; see above.
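On the signal() attempt: POSIX leaves the behavior undefined when a handler
returns normally from a SIGFPE that the hardware generated, which may be why
your handler ran and the program died anyway. One common workaround, sketched
below under assumptions, is to leave the handler with siglongjmp() and abandon
the failed computation. The names run_guarded, simulate_trap and mark_done are
hypothetical, and raise(SIGFPE) merely stands in for a real floating point
trap. Unlike the behavior you see on Intel, this does not continue the
calculation with inf/NaN; it only lets the program survive.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Recovery point that the SIGFPE handler jumps back to. */
static sigjmp_buf fpe_recovery;

/* Never returns normally: jumps straight back to the recovery point. */
static void on_fpe(int signo)
{
    (void)signo;
    siglongjmp(fpe_recovery, 1);
}

/* Hypothetical wrapper: runs compute(arg) with SIGFPE caught.
 * Returns 0 on normal completion, -1 if SIGFPE interrupted the work,
 * -2 if the handler could not be installed. */
int run_guarded(void (*compute)(void *), void *arg)
{
    struct sigaction sa, old;
    int status;

    assert(compute != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fpe;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGFPE, &sa, &old) != 0)
        return -2;

    /* Second argument 1: save the signal mask, so SIGFPE is unblocked
     * again once siglongjmp() brings us back here. */
    if (sigsetjmp(fpe_recovery, 1) == 0) {
        compute(arg);
        status = 0;
    } else {
        status = -1;    /* reached via siglongjmp() from on_fpe() */
    }

    sigaction(SIGFPE, &old, NULL);   /* restore the previous handler */
    return status;
}

/* Stand-in for code that traps; a real trap would come from the FPU. */
void simulate_trap(void *arg)
{
    (void)arg;
    raise(SIGFPE);
}

/* Stand-in for well-behaved work: sets the int that arg points to. */
void mark_done(void *arg)
{
    *(int *)arg = 1;
}
```

Keep in mind that whatever the interrupted computation had half-written is
left as it was, so this suits work that can be discarded or redone as a unit.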
I'm cc'ing this reply to [EMAIL PROTECTED], because for some reason that list
has a much greater population and AFAICT more knowledgeable subscribers than any
other I have found, and because you're using RedHat.
Zeen,
-Adam `Cold Fusion' Powell, IV http://www.ctcms.nist.gov/~powell/ ____
USDoC, National Institute of Standards & Technology (NIST) |\ ||< |
Center for Theoretical and Computational Materials Science | \||_> |