Bug#725013: [Support] [Debichem-devel] Bug#725013: gromacs-openmpi: grompp crashes with invalid opcode
On 10/01/2013 09:39 PM, Nicholas Breen wrote:
> Could you please check if the i5 machines where it works include avx in the
> flags line of /proc/cpuinfo?

Yes, it does. This is my i5. It is indeed newer than Sandy Bridge.

bill@beyonder:~$ grep avx /proc/cpuinfo
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms

> I built 4.6.3-4 on a different machine than usual, and I think it
> accidentally picked up a CPU optimization it should not have had. AVX
> extensions were only added on the Sandy Bridge and newer model Intel CPUs,
> and the Xeon you provided the information for doesn't have it.
>
> If that's the case, I will ask for the package to be rebuilt on a different
> machine where that problem won't occur.

One question though. When I built with dpkg-buildpackage, it was crashing. Shouldn't it have picked the correct flags when I created the packages?

I just retested and I know why: I had the packaged shared libraries installed. So when I tried to run build/src/kernel/grompp, it crashed because it was using the system's libraries and not the compiled ones.

OK. So I recompiled and installed my debs, and it is working now.

Waiting for your update.

--
Vassilis Virvilis Ph.D.
Head of IT
Biovista Inc.
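The library mix-up described above (a freshly built grompp loading the installed libgmx instead of the one from the build tree) can be checked with ldd before running the binary. The following is a minimal sketch, not something taken from the thread: the helper name resolved_lib is hypothetical, and it only parses the usual "name => path (address)" lines that ldd prints.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print the path that
# the given soname resolves to. Prints nothing if the soname is not listed.
# $1: exact soname to look for, e.g. libgmx.so.8
resolved_lib() {
    awk -v name="$1" '$1 == name && $2 == "=>" { print $3 }'
}

# Typical use against the binary from the build tree (path as in the thread):
#   ldd build/src/kernel/grompp | resolved_lib libgmx.so.8
# A result under /usr/lib means the installed package's library is loaded,
# not the one that was just compiled.
```

If the printed path points at /usr/lib/libgmx.so.8, the test run exercises the packaged library, so a crash says nothing about the flags used for the local build.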
Bug#725013: [Support] [Debichem-devel] Bug#725013: gromacs-openmpi: grompp crashes with invalid opcode
On 09/30/2013 07:27 PM, Nicholas Breen wrote:
> reassign 725013 gromacs
> tags 725013 moreinfo
> thanks
>
> On Mon, Sep 30, 2013 at 04:39:48PM +0300, Vassilis Virvilis wrote:
>> Trying to run grompp grompp_d
>>
>> * What exactly did you do (or not do) that was effective (or ineffective)?
>>
>> It crashes
>>
>> dmesg output:
>> [ 1699.966132] traps: grompp_d[9667] trap invalid opcode ip:7fb9311ac95d sp:77700ee8 error:0 in libgmx_d.so.8[7fb9310d+4e9000]
>> [ 1728.255893] traps: grompp[9684] trap invalid opcode ip:7f6807c2c65d sp:7fff560ed648 error:0 in libgmx.so.8[7f6807b51000+51b000]
>
> I can't reproduce this crash with my test data, and my system runs a similar
> Intel CPU (i5-2x00 series). Could you please attach a file that it crashes
> on (or a pdb2gmx/genbox/etc. sequence that creates one) and the exact
> command line that causes it to fail?

There is no need for any test data. It crashes just by running it, before it even prints the help.

Let me reiterate, because I took some steps to pinpoint the bug and, now that I reread my bug report, I can see I wasn't clear enough.

The story so far:

1) apt-get update; apt-get dist-upgrade (30/9/2013)

2) reboot (since we now have a new kernel)

3) Let's run stuff

bill@odin:~$ grompp_d
:-) G R O M A C S (-:
Illegal instruction
bill@odin:~$ grompp
:-) G R O M A C S (-:
Illegal instruction

Here is the dmesg output:

[ 1699.966132] traps: grompp_d[9667] trap invalid opcode ip:7fb9311ac95d sp:77700ee8 error:0 in libgmx_d.so.8[7fb9310d+4e9000]
[ 1728.255893] traps: grompp[9684] trap invalid opcode ip:7f6807c2c65d sp:7fff560ed648 error:0 in libgmx.so.8[7f6807b51000+51b000]

4) OK, let's see what the debugger says

bill@odin:~$ gdb grompp
GNU gdb (GDB) 7.6 (Debian 7.6-5)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type show copying
and show warranty for details.
This GDB was configured as x86_64-linux-gnu.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /usr/bin/grompp...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/bin/grompp
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need set solib-search-path or set sysroot?
[Thread debugging using libthread_db enabled]
Using host libthread_db library /lib/x86_64-linux-gnu/libthread_db.so.1.
:-) G R O M A C S (-:

Program received signal SIGILL, Illegal instruction.
0x76efe65d in rando () from /usr/lib/libgmx.so.8
(gdb) bt
#0 0x76efe65d in rando () from /usr/lib/libgmx.so.8
#1 0x76f6a14f in bromacs () from /usr/lib/libgmx.so.8
#2 0x76f6ad0c in CopyRight () from /usr/lib/libgmx.so.8
#3 0xb3ab in cmain ()
#4 0x7657e995 in __libc_start_main (main=0x6f50 main, argc=1, ubp_av=0x7fffe1d8, init=optimized out, fini=optimized out, rtld_fini=optimized out, stack_end=0x7fffe1c8) at libc-start.c:260
#5 0x6f7e in _start ()
(gdb)

5) Does it happen if we build it ourselves? At least we could get line information in the backtrace.

$ apt-get source gromacs-openmpi
$ sudo apt-get build-dep gromacs-openmpi
$ cd gromacs-4.6.3/
$ cmake .
$ make
$ find -name grompp
$ ./src/kernel/grompp

It works (prints the help). No crash.

6) OK. Let's build a Debian package

$ apt-get source gromacs-openmpi
$ sudo apt-get build-dep gromacs-openmpi
$ cd gromacs-4.6.3/
$ dpkg-buildpackage
$ cd ..
$ dpkg -i ../gromacs_4.6.3-4_amd64.deb
$ grompp

It crashes the same way as the original package.

7) Now I am installing it on an i5

It works on my i5. It looks like the problem occurs only on the i7. I have tested on the two machines of the cluster. These are Xeons, and they have the problem.

Here is an excerpt from /proc/cpuinfo:

processor : 23
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU X5660 @ 2.80GHz
stepping : 2
microcode : 0x15
cpu MHz : 1600.000
cache size : 12288 KB
physical id : 1
siblings : 12
core id : 10
cpu cores : 6
apicid : 53
initial apicid : 53
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm ida arat epb dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 5600.18
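The flags line in the excerpt above is what separates a machine that runs the package from one that traps: the Xeon X5660 lists sse4_1 and sse4_2 but no avx token. A small sketch of that check follows, assuming a Linux-style cpuinfo file; the function name has_cpu_flag is hypothetical and not part of any tool mentioned in the thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a cpuinfo-style file lists the given CPU
# flag as a whole word on a "flags" line.
# $1: flag name (e.g. avx)
# $2: file to read (defaults to /proc/cpuinfo)
has_cpu_flag() {
    grep '^flags' "${2:-/proc/cpuinfo}" | grep -qw -- "$1"
}

# Typical use on the machine being tested:
#   if has_cpu_flag avx; then echo "AVX available"; else echo "no AVX"; fi
```

The whole-word match matters here: a plain substring search for "sse4" would match sse4_1, and a search for "avx" would also match avx2 on newer CPUs.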
Bug#725013: [Support] [Debichem-devel] Bug#725013: gromacs-openmpi: grompp crashes with invalid opcode
Thank you, I think that information will lead to a solution. One last question:

On Tue, Oct 01, 2013 at 11:00:26AM +0300, Vassilis Virvilis wrote:
> 7) Now I am installing in i5
>
> It works in my i5. Looks like the problem is only in i7. I have tested in
> the two machines of the cluster. These are xeons that they have the problem.
>
> Here is an excerpt from /proc/cpuinfo

Could you please check if the i5 machines where it works include avx in the flags line of /proc/cpuinfo?

I built 4.6.3-4 on a different machine than usual, and I think it accidentally picked up a CPU optimization it should not have had. AVX extensions were only added on the Sandy Bridge and newer model Intel CPUs, and the Xeon you provided the information for doesn't have it.

If that's the case, I will ask for the package to be rebuilt on a different machine where that problem won't occur.

--
Nicholas Breen
nbr...@debian.org
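One rough way to test the hypothesis above (that the packaged library was compiled with AVX enabled) is to look for ymm registers in its disassembly, since 256-bit AVX instructions operate on them. This is only a heuristic sketch under stated assumptions: the helper name count_ymm_lines is hypothetical, it expects objdump's default AT&T syntax, and it misses VEX-encoded 128-bit instructions, which use xmm registers.

```shell
#!/bin/sh
# Hypothetical helper: count disassembly lines (read from stdin) that
# reference a ymm register. Always prints a number, including 0.
count_ymm_lines() {
    # grep -c exits non-zero when the count is 0; keep the status clean.
    grep -c '%ymm[0-9]' || true
}

# Typical use against the installed library (path as in the backtrace):
#   objdump -d /usr/lib/libgmx.so.8 | count_ymm_lines
```

A non-zero count in a package meant to run on any amd64 CPU would be consistent with the build having picked up AVX, although a library that selects code paths at run time could also contain such instructions without ever executing them on an older CPU.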