Re: [OMPI users] Still "illegal instruction"

2016-09-15 Thread Mahmood Naderan
>gdb --pid=core.5383

Are you sure about the syntax?
--pid must refer to a running process. I see a --core option, which seems to
be the relevant one here.

Both OpenMPI and Siesta were compiled with optimization (-O) flags, which is
not ideal for gdb. Should I recompile both of them with debug symbols?

>Btw, did you compile lapack and friends by yourself ?
I use ScaLAPACK, which needs BLAS. I use OpenBLAS instead of netlib's BLAS.


$ gdb --core=core.5383

Try: yum --enablerepo='*-debug*' install /usr/lib/debug/.build-id/e1/ddc85f7caa9f2571545a58479d64ba676217dd
[New Thread 5383]
[New Thread 5416]
[New Thread 5401]
[New Thread 5388]
[New Thread 5407]
[New Thread 5406]
[New Thread 5418]
[New Thread 5393]
[New Thread 5391]
[New Thread 5387]
[New Thread 5405]
[New Thread 5389]
[New Thread 5408]
[New Thread 5417]
[New Thread 5394]
[New Thread 5506]
[New Thread 5404]
[New Thread 5392]
[New Thread 5410]
[New Thread 5411]
[New Thread 5395]
[New Thread 5409]
[New Thread 5403]
[New Thread 5414]
[New Thread 5396]
[New Thread 5412]
[New Thread 5419]
[New Thread 5413]
[New Thread 5509]
[New Thread 5415]
[New Thread 5397]
[New Thread 5420]
[New Thread 5398]
[New Thread 5399]
Core was generated by `/share/apps/siesta/siesta-4.0/tpar/transiesta'.
Program terminated with signal 4, Illegal instruction.
#0  0x008da76e in ?? ()
(gdb) bt
#0  0x008da76e in ?? ()
#1  0x008da970 in ?? ()
#2  0x00bfe9f8 in ?? ()
#3  0x in ?? ()
(gdb)
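
A note on the `?? ()` frames above: gdb was started with only the core file, and without the executable it generally has no symbol table to map addresses to function names. A minimal sketch of loading both together follows. `core_bt_cmd` is a hypothetical helper name introduced here, and the script only prints the command rather than running gdb. The `-batch` and `-ex` options, `x/i $pc` and `info sharedlibrary` are standard gdb, but how much they reveal depends on the gdb version and on whether debug symbols are available.

```shell
#!/bin/sh
# core_bt_cmd EXE CORE  (hypothetical helper, not part of gdb or Open MPI)
# Prints a gdb command line that loads BOTH the executable and the core file.
core_bt_cmd() {
    # -batch               run the -ex commands, then exit
    # bt                   backtrace of the crashed thread
    # x/i $pc              disassemble the instruction at the faulting address
    # info sharedlibrary   address ranges of the loaded shared libraries
    printf "gdb -batch -ex bt -ex 'x/i \$pc' -ex 'info sharedlibrary' %s %s\n" "$1" "$2"
}

# Dry run only: print the command for the executable and core named in this thread.
core_bt_cmd /share/apps/siesta/siesta-4.0/tpar/transiesta core.5383
```

The printed command is meant to be run on a machine that has both the core file and the same binaries and libraries as compute-0-1. Comparing the faulting address with the ranges listed by `info sharedlibrary` may indicate whether the illegal instruction sits in the program itself or in one of its libraries.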

Regards,
Mahmood
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Still "illegal instruction"

2016-09-15 Thread Gilles Gouaillardet
Mahmood,

You can

gdb --pid=core.5383
And then
bt
And then
disas
And "scroll" until the current instruction
IIRC, there is a star at the beginning of this line
You can also try
show maps
Or
info maps
(I cannot remember the syntax...)

Btw, did you compile lapack and friends by yourself ?
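
The question matters because one frequent cause of SIGILL (not confirmed anywhere in this thread) is a library built on one machine with instructions, for example AVX, that the CPU of the run host lacks. A rough sketch for comparing the CPU feature flags of two hosts follows. `missing_flags` is a hypothetical helper, and the sample flag strings are illustrative, not taken from the cluster.

```shell
#!/bin/sh
# missing_flags BUILD_FLAGS RUN_FLAGS  (hypothetical helper)
# Prints each CPU flag present in BUILD_FLAGS but absent from RUN_FLAGS, one per line.
missing_flags() {
    for f in $1; do
        case " $2 " in
            *" $f "*) ;;                 # flag is also present on the run host
            *) printf '%s\n' "$f" ;;     # flag is missing on the run host
        esac
    done
}

# Real input would come from /proc/cpuinfo on each host, for example:
#   build=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
#   run=$(ssh compute-0-1 "grep -m1 '^flags' /proc/cpuinfo" | cut -d: -f2)
#   missing_flags "$build" "$run"
# Illustrative values only:
missing_flags "sse2 avx avx2" "sse2 avx"
```

With the illustrative values this prints `avx2`. Any flag reported for the real hosts would be a hint that code built on the head node may use instructions the compute node cannot execute; an empty result does not rule out other causes.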

Mahmood Naderan  wrote:
>Hi,
>
>After upgrading OpenMPI (from 1.6.5 to 2.0.0) and my program (from 3.2 to
>4.0), the parallel run still aborts with the "Illegal instruction" error in
>the middle of the run.
>
>
>I wonder why this happens and how I can debug it further. How can I find out
>whether this error is related to the program itself, MPI, or system libraries?
>
>
>Gilles gave a suggestion about using ulimit to create a core file 
>(https://mail-archive.com/users@lists.open-mpi.org/msg29919.html). Please see 
>the following:
>
>
>mahmood@cluster:tran$ cat sc.sh
>#!/bin/bash
>ulimit -c unlimited
>exec /share/apps/siesta/siesta-4.0/tpar/transiesta < trans-cc.fdf
>mahmood@cluster:tran$ cat hosts.txt
>compute-0-1
>mahmood@cluster:tran$ hostname
>cluster
>mahmood@cluster:tran$ #/share/apps/siesta/openmpi-2.0.0/bin/mpirun -hostfile hosts.txt -np 15 sc.sh
>
>
>
>--
>mpirun noticed that process rank 0 with PID 5383 on node compute-0-1 exited on 
>signal 4 (Illegal instruction).
>--
>
>
>
>Now I see a file core.5383
>
>It is a huge file (1290018816 bytes)!
>
>How can I process that?
>
>
>Regards,
>Mahmood
>
>