Re: [OMPI users] OMPI users] Still "illegal instruction"

2016-09-15 Thread Gilles Gouaillardet
Mahmood, note that you have to compile the source file that contains the snippet with '-g -O0', and also link with '-g -O0'. There was a typo in the gdb command: please read "frame 1" instead of "frame #1". Cheers, Gilles On Fri, Sep 16, 2016 at 12:53 PM, Gilles Gouaillardet

Re: [OMPI users] OMPI users] Still "illegal instruction"

2016-09-15 Thread Gilles Gouaillardet
Mahmood, -march=bdver1 should be OK on your nodes. From the gcc command line, I was expecting -march=xxx, but it is missing (your gcc might be a bit too old for that). Note that you have to recompile all your libs (OpenBLAS and friends) with -march=bdver1. I guess your gdb is also a bit too old to

Re: [OMPI users] static linking MPI libraries with applications

2016-09-15 Thread Jeff Squyres (jsquyres)
If you want to build statically with verbs support, it's tricky. Per the FAQ: 37. I get bizarre linker warnings / errors / run-time faults when I try to compile my OpenFabrics MPI application statically. How do I fix this? Fully static linking is not for the weak, and is not recommended. But

Re: [OMPI users] OMPI users] Still "illegal instruction"

2016-09-15 Thread Matthieu Brucher
I don't think there is anything OpenMPI can do for you here. The issue is clearly in how you are compiling your application. To start, you can try to compile without the --march=generic and use something as generic as possible (i.e. only SSE2). Then if this doesn't work for your app, do the same

Re: [OMPI users] OMPI users] Still "illegal instruction"

2016-09-15 Thread Reuti
On 15.09.2016 at 19:54, Mahmood Naderan wrote:
> The differences are very very minor
>
> root@cluster:tpar# echo | gcc -v -E - 2>&1 | grep cc1
> /usr/libexec/gcc/x86_64-redhat-linux/4.4.7/cc1 -E -quiet -v - -mtune=generic
>
> [root@compute-0-1 ~]# echo | gcc -v -E - 2>&1 | grep cc1
>

Re: [OMPI users] Java-OpenMPI returns with SIGSEGV

2016-09-15 Thread Graham, Nathaniel Richard
Both issues have been fixed. The trouble with CReqops.java was a problem with the test. A fixed version has been pushed to the ompi-java-tests repo. The issue with compare_and_swap is merged on master, and should be in the 2.0.2 release. Let me know if you have any other issues. -Nathan

Re: [OMPI users] OMPI users] Still "illegal instruction"

2016-09-15 Thread Mahmood Naderan
Excuse me, which option is most suitable for finding the name of the illegal instruction? --verbose, --debug-level, --debug-daemons, or --debug-daemons-file? Regards, Mahmood

Re: [OMPI users] OMPI users] Still "illegal instruction"

2016-09-15 Thread Mahmood Naderan
The differences are very minor:
root@cluster:tpar# echo | gcc -v -E - 2>&1 | grep cc1
/usr/libexec/gcc/x86_64-redhat-linux/4.4.7/cc1 -E -quiet -v - -mtune=generic
[root@compute-0-1 ~]# echo | gcc -v -E - 2>&1 | grep cc1
/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/cc1 -E -quiet -v -

Re: [OMPI users] OMPI users] Still "illegal instruction"

2016-09-15 Thread Gilles Gouaillardet
If gcc is installed on your compute node, you can run
echo | gcc -v -E - 2>&1 | grep cc1
and look for the -march=xxx parameter /* you might want to compare that with your frontend */
And/or you can run
grep family /proc/cpuinfo
on your compute node. Then
man gcc
on your front end node. From my
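The check described above (looking for -march=xxx in the cc1 line) can be sketched as a small parser. This is a minimal sketch, not part of the original thread: the function name `arch_options` is hypothetical, and it only assumes that the options appear as whitespace-separated `-march=VALUE` or `-mtune=VALUE` tokens, as in the cc1 lines quoted in this thread.

```python
import re


def arch_options(cc1_line):
    """Collect the -march=... and -mtune=... values from a cc1 command line.

    Returns a dict such as {"mtune": "generic"}; a key is absent when the
    corresponding option does not appear on the line.
    """
    # (?<!\S) makes sure the option starts a whitespace-separated token.
    return dict(re.findall(r"(?<!\S)-(march|mtune)=(\S+)", cc1_line))


# The frontend line quoted in this thread: -mtune is present, -march is not.
frontend_line = (
    "/usr/libexec/gcc/x86_64-redhat-linux/4.4.7/cc1 -E -quiet -v - -mtune=generic"
)
print(arch_options(frontend_line))
```

Feeding it the output of `echo | gcc -v -E - 2>&1 | grep cc1` from each node makes it easy to see whether the two compilers default to different architectures.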

Re: [OMPI users] OMPI users] Still "illegal instruction"

2016-09-15 Thread Mahmood Naderan
Although the CPUs are nearly the same, the CPU flags are different. I noticed that the frontend has fma, f16c, tch, tce, tbm and bmi1 while the compute nodes don't have them. I guess that since the programs were compiled on the frontend (6380), there are some special instructions in the
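The flag comparison described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the inputs are the text of /proc/cpuinfo from each machine, and the sample strings are shortened, made-up stand-ins (real "flags" lines are much longer).

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags on the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def flags_missing_on_compute(frontend_text, compute_text):
    """Flags the frontend CPU advertises that the compute node CPU lacks."""
    return sorted(cpu_flags(frontend_text) - cpu_flags(compute_text))


# Hypothetical, shortened inputs for illustration only.
frontend = "model name\t: example\nflags\t\t: fpu sse2 avx fma f16c tbm bmi1\n"
compute = "model name\t: example\nflags\t\t: fpu sse2 avx\n"
print(flags_missing_on_compute(frontend, compute))
```

Any flag in the result is a candidate source of an illegal instruction when a binary built for the frontend runs on a compute node.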

Re: [OMPI users] OMPI users] Still "illegal instruction"

2016-09-15 Thread Gilles Gouaillardet
OK, you can try this under gdb:
info proc mapping
info registers
x /100x $rip
x /100x $eip
I remember you are running on AMD CPUs, which is why Intel-only instructions must be avoided. Cheers, Gilles On Thursday, September 15, 2016, Mahmood Naderan wrote: > disas

Re: [OMPI users] OMPI users] Still "illegal instruction"

2016-09-15 Thread Mahmood Naderan
The disas command fails.
Program terminated with signal 4, Illegal instruction.
#0 0x008da76e in ?? ()
(gdb) bt
#0 0x008da76e in ?? ()
#1 0x008da970 in ?? ()
#2 0x00bfe9f8 in ?? ()
#3 0x in ?? ()
(gdb) disas
No function contains program counter for

Re: [OMPI users] OMPI users] Still "illegal instruction"

2016-09-15 Thread Gilles Gouaillardet
--core=... is the right syntax, sorry about that. No need to recompile with -g; the binary is good enough here. Then you need to run disas in gdb to disassemble the instruction at 0x08da76e. And then, still in gdb, info maps or show maps to find out the library this instruction is coming from. OpenBLAS
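Mapping a faulting address back to the file it came from can be sketched as below. This is a minimal sketch, assuming the mappings are available in the Linux /proc/<pid>/maps layout (start-end, permissions, offset, device, inode, path), which may differ from what gdb prints for a core file. The function name `find_mapping` and all paths and address ranges in the sample are hypothetical; only the address 0x008da76e comes from the backtrace in this thread.

```python
def find_mapping(address, maps_text):
    """Return the path of the mapping that contains address, or None.

    maps_text is expected in /proc/<pid>/maps layout:
    start-end perms offset dev inode [path]
    """
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # not a mapping line
        start_text, sep, end_text = fields[0].partition("-")
        try:
            start, end = int(start_text, 16), int(end_text, 16)
        except ValueError:
            continue  # malformed address range
        if start <= address < end:
            # Anonymous mappings have no path column.
            return fields[5].strip() if len(fields) > 5 else "[anonymous]"
    return None


# Hypothetical mappings for illustration only.
sample_maps = """\
00400000-00c00000 r-xp 00000000 08:01 1234 /hypothetical/path/siesta
7f0000000000-7f0000200000 r-xp 00000000 08:01 5678 /hypothetical/path/libopenblas.so
"""
print(find_mapping(0x008DA76E, sample_maps))
```

With these made-up ranges the address falls inside the main executable rather than a shared library, which would point at code linked statically into the program.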

Re: [OMPI users] Still "illegal instruction"

2016-09-15 Thread Mahmood Naderan
> gdb --pid=core.5383
Are you sure about the syntax? PID must be a running process. I see --core which seems to be relevant here. Both OpenMPI and Siesta were compiled with O flags. This is not appropriate for gdb. Should I compile both of them with debug symbols?
> Btw, did you compile lapack

Re: [OMPI users] Still "illegal instruction"

2016-09-15 Thread Gilles Gouaillardet
Mahmood, you can run
gdb --pid=core.5383
and then
bt
and then
disas
and "scroll" until the current instruction. IIRC, there is a star at the beginning of this line. You can also try
show maps
or
info maps
(I cannot remember the syntax...) Btw, did you compile lapack and friends by yourself? Mahmood

[OMPI users] Still "illegal instruction"

2016-09-15 Thread Mahmood Naderan
Hi, After upgrading OpenMPI (from 1.6.5 to 2.0.0) and my program (from 3.2 to 4.0), the parallel run still aborts with the "Illegal instruction" error in the middle of the run. I wonder why this happens and how I can debug it further. How can I find out whether this error is related to the program itself, mpi