Although this problem is not related to OMPI *at all*, I think it is worth
telling everyone what was going on. I finally caught the illegal
instruction :)

Briefly, I built the serial version of Siesta on the frontend and ran it
directly on the compute node. Fortunately, "x/i $pc" in GDB showed that
the illegal instruction was an FMA3 instruction. More detail is available at
https://gcc.gnu.org/ml/gcc-help/2016-09/msg00084.html

According to Wikipedia:


   - *FMA4* is supported in AMD
   <https://en.wikipedia.org/wiki/Advanced_Micro_Devices> processors
   starting with the Bulldozer
   <https://en.wikipedia.org/wiki/Bulldozer_%28microarchitecture%29>
   architecture. FMA4 was realized in hardware before FMA3.
   - *FMA3* is supported in AMD processors starting with the Piledriver
   <https://en.wikipedia.org/wiki/Piledriver_%28microarchitecture%29>
   architecture and Intel <https://en.wikipedia.org/wiki/Intel_Corporation>
   starting with Haswell processors
   <https://en.wikipedia.org/wiki/Haswell_%28microarchitecture%29> and
   Broadwell processors
   <https://en.wikipedia.org/wiki/Broadwell_%28microarchitecture%29> since
   2014.

Therefore, a binary built on the frontend (Piledriver) contains an FMA3
instruction, which the compute node (Bulldozer) doesn't recognize.
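
For anyone who wants to compare two nodes before building, here is a minimal
sketch (not something I ran at the time) that reads the Linux CPU flags. It
assumes /proc/cpuinfo is available and relies on the kernel's flag names,
where "fma" means FMA3 and "fma4" means FMA4. The function names are made up
for illustration.

```python
def fma_support(cpuinfo_text):
    """Report which FMA flavours the first CPU entry advertises.

    In /proc/cpuinfo the flag "fma" denotes FMA3 and "fma4" denotes FMA4.
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            return {"fma3": "fma" in flags, "fma4": "fma4" in flags}
    # No "flags" line found (e.g. not an x86 Linux cpuinfo dump).
    return {"fma3": False, "fma4": False}


def local_fma_support(path="/proc/cpuinfo"):
    """Hypothetical convenience wrapper: inspect the node this runs on."""
    with open(path) as f:
        return fma_support(f.read())
```

Running local_fma_support() on both the frontend and a compute node should
show whether one side lacks "fma3"; if so, a binary tuned for the frontend
may not run on the other.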

The solution was (as others here suggested) to build Siesta on the compute
node. I should add that I tested all the related programs (OMPI, ScaLAPACK,
OpenBLAS) one by one on the compute node in order to find which one
generated the illegal instruction.
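
That one-by-one check could be scripted roughly as below. This is only a
sketch under assumptions: it expects a POSIX system, where Python's
subprocess reports a child killed by signal N as return code -N, and the
command lists you pass in are placeholders for whatever test programs each
library ships.

```python
import signal
import subprocess


def died_of_sigill(returncode):
    """True if a child process was terminated by SIGILL (POSIX only)."""
    return returncode == -signal.SIGILL


def find_sigill(commands):
    """Run each command in turn and return those killed by SIGILL.

    `commands` is a list of argv lists; the caller supplies them.
    """
    offenders = []
    for cmd in commands:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if died_of_sigill(result.returncode):
            offenders.append(cmd)
    return offenders
```

Any command returned by find_sigill() is a candidate for "x/i $pc" in GDB to
see which instruction the CPU rejected.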

Anyway... thanks a lot for your comments. Hope this helps others in the
future.

Regards,
Mahmood
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
