> From [EMAIL PROTECTED] Tue Nov  9 12:10:09 1999
> 
> >On my Ultra-5 with a small 256Kb L2 cache I get 0.58 secs/iter
> >for MLU against 0.78 secs/iter for Mlucas at 512K FFT.
> 
> These timings suggest that it's more than a cache size issue - after
> all, Mlucas has a smaller memory footprint irrespective of the CPU,
> and one would expect some benefit from this even in cases where
> both codes significantly exceed the L2 cache size. I wonder if
> the code itself (larger because of all the routines needed for
> non-power-of-2 FFT lengths, which MLU doesn't support) might be
> causing a slowdown here, by competing for space in the L2 cache
> with FFT data? I've little experience with this aspect of performance,
> but perhaps conditional compilation, with each binary incorporating
> only the routines it needs for that length, could reduce the code
> footprint and help performance - would any of the computer science
> experts care to comment?

These low-end machines do seem to have a bottleneck feeding the
processor from memory, even though an Ultra-5 is not an old machine.
When I first started trying different compiler options I found that
the timed tests supplied with MLU would show very significant speed
differences, but these almost never translated into a speed
difference on a real exponent. I had a chance to talk very briefly
with a Sun engineer about this, and he said these low-end machines
are built to be cost-competitive with PCs, and the CPU probably
spends a lot of its time "spinning its wheels" while waiting for data
to be fed to it. Once I got the iteration time down to 0.6 secs/iter,
which was achieved with minimal compiler options, I didn't get any
further speed improvement at the 512K FFT level until I used
profiling. With that I got it down to the present 0.58 secs/iter.
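
As a rough back-of-envelope check on the memory bottleneck, the sketch
below compares the size of the main FFT array with the 256 KB L2 cache
quoted above. It assumes one 8-byte double per FFT element and ignores
all auxiliary tables and scratch space, so the real working set would
be larger. The names are illustrative and not taken from either program.

```python
# Rough working-set estimate for a 512K-point FFT.
# Assumption (not from the timings above): one 8-byte double per
# element, with trig tables and scratch arrays ignored.
FFT_LENGTH = 512 * 1024      # 512K elements
BYTES_PER_ELEMENT = 8        # double precision (assumed)
L2_CACHE_BYTES = 256 * 1024  # L2 cache size quoted for the Ultra-5


def fft_array_bytes(length, elem_bytes=BYTES_PER_ELEMENT):
    """Size in bytes of the main FFT data array alone."""
    return length * elem_bytes


data_bytes = fft_array_bytes(FFT_LENGTH)
cache_ratio = data_bytes / L2_CACHE_BYTES

print(f"FFT array: {data_bytes} bytes")
print(f"FFT array is {cache_ratio:.0f}x the L2 cache")
```

Under these assumptions the data array alone is 4 MB, 16 times the
cache, so most of each pass over the data has to come from main memory
whatever the compiler options.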


> Or perhaps the compiler and OS already do a good job at keeping only
> needed program segments in cache, in which case the problem lies yet
> elsewhere.

The executable sizes are quite different:

113216 bytes for MLU
340048 bytes for Mlucas

I don't know enough about how the operating system handles code
in the cache to know whether this is significant. I would guess
that with a little 256 KB cache it could make a difference to the
speed of execution. For a large 4 MB cache this probably isn't much
to worry about.
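
To put the two file sizes in proportion, here is a small sketch that
expresses each executable as a fraction of a 256 KB and a 4 MB cache.
It assumes the whole file counts as code, which overstates the hot code
footprint since headers, data, and rarely used routines are included,
so treat the figures as upper bounds only.

```python
# Executable sizes reported above, compared with two L2 cache sizes.
# Assumption: the entire file is treated as code competing for cache,
# which is an upper bound on the real instruction footprint.
EXECUTABLES = {"MLU": 113216, "Mlucas": 340048}  # bytes
CACHES = {"256 KB": 256 * 1024, "4 MB": 4 * 1024 * 1024}  # bytes


def cache_fraction(exe_bytes, cache_bytes):
    """Executable size as a fraction of the cache size."""
    return exe_bytes / cache_bytes


fractions = {
    (name, label): cache_fraction(size, cache)
    for name, size in EXECUTABLES.items()
    for label, cache in CACHES.items()
}

for (name, label), frac in fractions.items():
    print(f"{name:7s} is {frac:6.1%} of a {label} cache")
```

On these numbers MLU is about 43% of the small cache and Mlucas about
130%, while on a 4 MB cache they come to roughly 3% and 8%.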


[snip] 
> We also could use help from any SPARC employees familiar with the
> compiler technology to tell us why the f90 -xprefetch option is
> so unpredictable - it speeds some runlengths by up to 10%, but more
> often causes a 10-20% slowdown (or no change).

This would be very helpful. I'm sure Ernst would be happy to let
the compiler writers use his code to improve the compiler. (Any
Sun engineers on this list?) 

> Well, I didn't really expect it to make much difference, which is why
> I expressed surprise when you said that it did. Was the slowdown you
> mentioned for MLU in 64-bit mode also spurious?

It wasn't spurious when I reported it, but the slowdown has since
disappeared. For systems with small caches I now think that the
compiler option -xspace is very important. This tells the compiler
to do no optimizations which would increase code size. Compiling MLU
as 64-bit initially resulted in a much bigger executable. With the
restart capability it's fairly easy to compile a new binary and know
within a couple of hours whether you've improved the speed, but the
save files of 32-bit and 64-bit MLU are not compatible, so you have
to opt for one or the other on a particular exponent and stay with
it to the end. After several recompiles and restarts there is now no
noticeable difference in speed between 32-bit and 64-bit MLU, and
I've also managed to reduce the executable size of the 64-bit
version to where it's virtually the same as the 32-bit.

Bill Rea, Information Technology Services, University of Canterbury  \_ 
E-Mail b dot rea at its dot canterbury dot ac dot nz                 </   New 
Phone   64-3-364-2331, Fax     64-3-364-2332                        /)  Zealand 
Unix Systems Administrator                                         (/' 


 
 
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
