My question is a bit off topic, but since I know from the conference in Milan that there are a lot of programming specialists among us, I hope you can help me.
In my real-time application, an ISR has to perform a matrix-vector multiplication. This is done by the following code:
===
for (j=0; j<Np; j++)
{
    Element=0.0;
    k=(j+1)*Np;    /* offset of the row of P used for DU[j] */
    for (i=0; i<Np; i++)
        Element+= *(P+k+i) * (*(W+i) - *(f+i));
    *(DU+j)=Element;
}
===
*DU, *W and *f are declared as double.
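For anyone who wants to reproduce the timing outside the kernel module, the routine can be wrapped into a self-contained function. This is only a sketch: the function name compute_du and its parameter list are hypothetical, and it assumes that P holds at least (Np+1)*Np doubles, because the loop above starts reading at offset (j+1)*Np.

```c
/* Sketch only: compute_du and its signature are hypothetical.
 * Assumes P holds at least (Np + 1) * Np doubles, because the
 * original loop starts reading at offset (j + 1) * Np. */
void compute_du(double *DU, const double *P, const double *W,
                const double *f, int Np)
{
    int i, j, k;
    double Element;

    for (j = 0; j < Np; j++) {
        Element = 0.0;
        k = (j + 1) * Np;               /* offset of the row used for DU[j] */
        for (i = 0; i < Np; i++)
            Element += P[k + i] * (W[i] - f[i]);
        DU[j] = Element;                /* the store in question */
    }
}
```

Index notation is used here instead of pointer arithmetic; both forms denote the same memory accesses.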
Compiler options: -O2 -D__KERNEL__ -DMODULE -c -I/usr/src/rtai-1.3/include -funroll-loops
With Np=100 the calculation time is 51 µs. This seems quite high for a 900 MHz AMD Duron processor, so I tried to find out which part of the routine consumes most of the time. I found that deleting the statement "*(DU+j)=Element" reduces the processing time to 9 µs.
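To put the two timings in perspective, they can be converted into clock cycles per inner-loop iteration. The helper below is a hypothetical name introduced only for this calculation; the inputs are the figures from this post (900 MHz clock, Np=100, hence 10,000 inner iterations).

```c
/* Hypothetical helper: converts a measured run time into clock cycles
 * per inner-loop iteration, assuming Np * Np iterations in total. */
double cycles_per_iteration(double time_us, double clock_mhz, int Np)
{
    double cycles = time_us * clock_mhz;    /* microseconds * MHz = cycles */
    return cycles / ((double)Np * (double)Np);
}
```

With these inputs, 51 µs corresponds to about 4.59 cycles per iteration and 9 µs to about 0.81 cycles per iteration. Less than one cycle for three loads, a subtraction, a multiplication and an addition would be surprisingly little, which may hint that the 9 µs variant does not execute the loop in full.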
My first thought was that cache misses when accessing *DU cause this huge difference, but changing "*(DU+j)=Element" into "Element=*(DU+j)" also resulted in a processing time of 9 µs. (Of course, the program did not give the desired results in that case.)
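One thing that might be worth ruling out: if the result of the inner loop is never stored or otherwise used, the optimizer is allowed to discard the computation, in which case the 9 µs figure would say little about the store itself. The following sketch drops the store to DU but keeps the sums observable through a volatile variable. The names sink and compute_without_store are hypothetical, and the layout of P is assumed to be the same as in the routine above.

```c
/* Hypothetical sink: a write to a volatile object may not be optimized
 * away, so the value written to it has to be computed. */
volatile double sink;

void compute_without_store(const double *P, const double *W,
                           const double *f, int Np)
{
    int i, j, k;
    double Element, total = 0.0;

    for (j = 0; j < Np; j++) {
        Element = 0.0;
        k = (j + 1) * Np;
        for (i = 0; i < Np; i++)
            Element += P[k + i] * (W[i] - f[i]);
        total += Element;       /* no store to DU[j], but the value is used */
    }
    sink = total;               /* single observable write after the loops */
}
```

If this variant takes about as long as the original routine, that would point to dead-code elimination, rather than the store or the loop unrolling, as the source of the measured difference.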
In my opinion, the gcc compiler (version 2.95.2 19991024) fails to unroll the loops correctly when the line "*(DU+j)=Element;" is present in the outer for-loop. Does anyone have an idea how to solve this problem?
Greetings from Wuppertal
Arne Linder
Dipl.-Ing. Arne Linder Labor fuer elektrische Maschinen und Antriebe Fachbereich 13 Bergische Universitaet - Gesamthochschule Wuppertal D-42097 Wuppertal e-mail: [EMAIL PROTECTED]
