Re: [OMPI users] Still "illegal instruction"
>gdb --pid=core.5383

Are you sure about the syntax? --pid must refer to a running process. I see a --core option, which seems to be the relevant one here.

Both OpenMPI and Siesta were compiled with optimization (-O) flags, which is not appropriate for gdb. Should I compile both of them with debug symbols?

>Btw, did you compile lapack and friends by yourself ?

I use ScaLAPACK, which needs BLAS. I use OpenBLAS instead of netlib's BLAS.

$ gdb --core=core.5383
Try: yum --enablerepo='*-debug*' install /usr/lib/debug/.build-id/e1/ddc85f7caa9f2571545a58479d64ba676217dd
[New Thread 5383]
[New Thread 5416]
[New Thread 5401]
[New Thread 5388]
[New Thread 5407]
[New Thread 5406]
[New Thread 5418]
[New Thread 5393]
[New Thread 5391]
[New Thread 5387]
[New Thread 5405]
[New Thread 5389]
[New Thread 5408]
[New Thread 5417]
[New Thread 5394]
[New Thread 5506]
[New Thread 5404]
[New Thread 5392]
[New Thread 5410]
[New Thread 5411]
[New Thread 5395]
[New Thread 5409]
[New Thread 5403]
[New Thread 5414]
[New Thread 5396]
[New Thread 5412]
[New Thread 5419]
[New Thread 5413]
[New Thread 5509]
[New Thread 5415]
[New Thread 5397]
[New Thread 5420]
[New Thread 5398]
[New Thread 5399]
Core was generated by `/share/apps/siesta/siesta-4.0/tpar/transiesta'.
Program terminated with signal 4, Illegal instruction.
#0  0x008da76e in ?? ()
(gdb) bt
#0  0x008da76e in ?? ()
#1  0x008da970 in ?? ()
#2  0x00bfe9f8 in ?? ()
#3  0x in ?? ()
(gdb)

Regards,
Mahmood
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
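A backtrace made of `??` frames is what gdb typically prints when it has no symbols for the addresses in the core. Loading the core together with the executable named in the "Core was generated by" line often resolves at least the function names, even for an optimized build. The following is a minimal sketch of that invocation, not a tested recipe for this cluster: `core_report` and `DRY_RUN` are hypothetical names introduced here for illustration, while `-batch`, `-ex`, `bt`, `x/i $pc` and `info sharedlibrary` are standard gdb options and commands.

```shell
# Hypothetical helper: core_report and DRY_RUN are illustration names,
# not part of gdb or OpenMPI.
core_report() {
    exe=$1
    core=$2
    # -batch: run the -ex commands in order, then exit.
    # bt                 : backtrace of the crashing thread
    # x/i $pc            : disassemble the instruction at the faulting address
    # info sharedlibrary : loaded libraries and their address ranges
    set -- gdb -batch -ex bt -ex 'x/i $pc' -ex 'info sharedlibrary' "$exe" "$core"
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"    # only show the command line
    else
        "$@"                  # actually run gdb
    fi
}

# Example with the paths from this thread:
# core_report /share/apps/siesta/siesta-4.0/tpar/transiesta core.5383
```

The address ranges printed by `info sharedlibrary` can then be compared with the frame addresses to see whether the faulting instruction sits in the program, in an MPI library, or in a math library.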
Re: [OMPI users] Still "illegal instruction"
Mahmood,

You can run

  gdb --pid=core.5383

and then

  bt

and then

  disas

and "scroll" until the current instruction. IIRC, there is a star at the beginning of this line.

You can also try

  show maps

or

  info maps

(I cannot remember the syntax...)

Btw, did you compile lapack and friends by yourself ?

Mahmood Naderan wrote:
>Hi,
>
>After upgrading OpenMPI (from 1.6.5 to 2.0.0) and my program (from 3.2 to
>4.0), the parallel run still aborts with the "Illegal instruction" error in
>the middle of the run.
>
>I wonder why this happens and how I can debug it further. How can I find out
>whether this error is related to the program itself, MPI, or the system libraries?
>
>Gilles gave a suggestion about using ulimit to create a core file
>(https://mail-archive.com/users@lists.open-mpi.org/msg29919.html). Please see
>the following:
>
>mahmood@cluster:tran$ cat sc.sh
>#!/bin/bash
>ulimit -c unlimited
>exec /share/apps/siesta/siesta-4.0/tpar/transiesta < trans-cc.fdf
>mahmood@cluster:tran$ cat hosts.txt
>compute-0-1
>mahmood@cluster:tran$ hostname
>cluster
>mahmood@cluster:tran$ #/share/apps/siesta/openmpi-2.0.0/bin/mpirun -hostfile hosts.txt -np 15 sc.sh
>
>--
>mpirun noticed that process rank 0 with PID 5383 on node compute-0-1 exited on
>signal 4 (Illegal instruction).
>--
>
>Now I see a file core.5383
>
>It is a very huge file (1290018816 bytes)!!!
>
>How can I process that?
>
>Regards,
>Mahmood
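The question about self-compiled LAPACK/BLAS matters because signal 4 can occur when a library was built on one machine for instruction-set extensions that the machine running it lacks. Nothing in this thread confirms that this is the cause here, so treat the following as a sketch for ruling it in or out. It assumes Linux hosts with `/proc/cpuinfo` and ssh access from the head node to `compute-0-1`; `missing_flags` is a hypothetical helper name.

```shell
# Hypothetical helper: print every CPU flag in list $1 (build host)
# that is absent from list $2 (host where the job runs).
missing_flags() {
    for f in $1; do
        case " $2 " in
            *" $f "*) ;;                # flag is also present on the run host
            *) printf '%s\n' "$f" ;;    # build-host flag the run host lacks
        esac
    done
}

# Sketch of use (assumption: the libraries were built on the head node):
# build=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
# node=$(ssh compute-0-1 "grep -m1 '^flags' /proc/cpuinfo" | cut -d: -f2)
# missing_flags "$build" "$node"
```

An empty result means the compute node supports every extension the build host has. A non-empty result only lists candidates: it does not show that the binary actually uses those instructions, which is what disassembling the faulting instruction in gdb would tell.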