Hi,
After upgrading OpenMPI (from 1.6.5 to 2.0.0) and my program (from 3.2 to
4.0), the parallel run still aborts with an "Illegal instruction" error in
the middle of the run.

I wonder why this happens and how I can debug it further. How can I tell
whether this error comes from the program itself, MPI, or the system libraries?

Gilles suggested using ulimit to create a core file
(https://mail-archive.com/users@lists.open-mpi.org/msg29919.html). Please
see the following:

mahmood@cluster:tran$ cat sc.sh
#!/bin/bash
ulimit -c unlimited
exec /share/apps/siesta/siesta-4.0/tpar/transiesta < trans-cc.fdf
mahmood@cluster:tran$ cat hosts.txt
compute-0-1
mahmood@cluster:tran$ hostname
cluster
mahmood@cluster:tran$ /share/apps/siesta/openmpi-2.0.0/bin/mpirun -hostfile hosts.txt -np 15 sc.sh
....
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 5383 on node compute-0-1 exited
on signal 4 (Illegal instruction).
--------------------------------------------------------------------------



Now I see a file named core.5383.
It is a very large file (1290018816 bytes, about 1.2 GiB).
How can I analyze it?
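
For reference, here is a minimal sketch of how such a core file is commonly inspected. It assumes gdb is installed on the node where the core file lives, and that the binary was built with debug symbols (without -g the backtrace will mostly show raw addresses). The paths are the ones from the session above; nothing here is specific to Open MPI or siesta.

```shell
#!/bin/bash
# Sketch only: print a backtrace and the faulting instruction from a core file.
# Assumption: gdb is available; exe and core are the paths from the session above.
exe=/share/apps/siesta/siesta-4.0/tpar/transiesta
core=core.5383

# Look up the name of signal 4 as this shell reports it.
signame=$(kill -l 4)
echo "signal 4 = SIG${signame}"

if command -v gdb >/dev/null 2>&1 && [ -r "$core" ] && [ -x "$exe" ]; then
    # -batch runs the -ex commands and exits without an interactive prompt.
    # "bt" prints the call stack; 'x/i $pc' disassembles the instruction
    # at the program counter, i.e. the one that raised the signal.
    gdb -batch -ex "bt" -ex 'x/i $pc' "$exe" "$core"
else
    echo "gdb, $core, or $exe not available here; skipping"
fi
```

If the faulting instruction turns out to be a vector instruction, comparing the "flags" line of /proc/cpuinfo on the head node and on compute-0-1 may show whether the binary was built for a CPU feature the compute node lacks. That is only one possible cause, not a diagnosis.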

Regards,
Mahmood