I added --disable-cpuid; it will be in hwloc v1.10.
Brice

On 06/05/2014 00:44, Friedley, Andrew wrote:
> Actually, is there any way to make HWLOC_COMPONENTS=-x86 the default or
> otherwise disable or compile without the x86 backend, so that I get that
> behavior by default?
>
> Thanks,
>
> Andrew
>
>> -----Original Message-----
>> From: Brice Goglin [mailto:brice.gog...@inria.fr]
>> Sent: Monday, May 5, 2014 1:03 PM
>> To: Friedley, Andrew
>> Subject: Re: [hwloc-users] divide by zero error?
>>
>> Thanks.
>> The simulator returns buggy cpuid information. It may be possible to
>> work around this specific problem, but I am afraid there could be others.
>> I think you should just disable the hwloc x86 backend by setting
>> HWLOC_COMPONENTS=-x86 in the environment. Does this look like an
>> acceptable workaround?
>> Brice
>>
>>
>>
>> On 05/05/2014 20:21, Friedley, Andrew wrote:
>>> Back from vacation -- Is this what you're after?
>>>
>>> [root@viper0 bin]# ./lstopo
>>>
>>>
>>> * Topology extraction from /proc/cpuinfo *
>>>
>>> processor 0
>>> found 1 cpu topologies, cpuset 0x00000001
>>> os socket 0 has cpuset 0x00000001
>>> os core 0 has cpuset 0x00000001
>>> thread 0 has cpuset 0x00000001
>>> cache depth 0 has cpuset 0x00000001
>>> cache depth 0 has cpuset 0x00000001
>>> cache depth 1 has cpuset 0x00000001
>>> cache depth 2 has cpuset 0x00000001
>>> found DMIProductName 'Bochs'
>>> found DMIProductVersion ''
>>> found DMIProductSerial ''
>>> found DMIChassisVendor 'Bochs'
>>> found DMIChassisType '1'
>>> found DMIChassisVersion ''
>>> found DMIChassisSerial ''
>>> found DMIChassisAssetTag ''
>>> found DMIBIOSVendor 'Bochs'
>>> found DMIBIOSVersion 'Bochs'
>>> found DMIBIOSDate '01/01/2007'
>>> found DMISysVendor 'Bochs'
>>> Machine#0(local=2055580KB total=0KB DMIProductName=Bochs DMIProductVersion= DMIProductSerial= DMIChassisVendor=Bochs DMIChassisType=1 DMI) cpuset 0xf...f complete 0x00000001 online 0xf...f allowed 0xf...f nodeset 0x0 completeN 0x0 allowedN 0xf...f
>>> Socket#0(CPUVendor=GenuineIntel CPUFamilyNumber=6 CPUModelNumber=26 CPUModel="Intel(R) Core(TM) i7 CPU @ 2.00GHz") cpuset 0x00000001
>>> L3Cache(size=8192KB linesize=64 ways=16) cpuset 0x00000001
>>> L2Cache(size=256KB linesize=64 ways=8) cpuset 0x00000001
>>> L1dCache(size=32KB linesize=64 ways=8) cpuset 0x00000001
>>> L1iCache(size=32KB linesize=64 ways=4) cpuset 0x00000001
>>> Core#0 cpuset 0x00000001
>>> PU#0 cpuset 0x00000001
>>> Backend x86 forcing a reconnect of levels
>>> --- Socket level has number 1
>>>
>>> --- Cache level depth 3 has number 2
>>>
>>> --- Cache level depth 2 has number 3
>>>
>>> --- Cache level depth 1 has number 4
>>>
>>> --- Cache level depth 1 has number 5
>>>
>>> --- Core level has number 6
>>>
>>> --- PU level has number 7
>>>
>>> highest cpuid b, cpuid type 0
>>> highest extended cpuid 80000008
>>> possible CPUs are 0x00000001
>>> binding to CPU0
>>> APIC ID 0x00 max_log_proc 1
>>> phys 0 thread 0
>>> cache 0 type 1
>>> cache 1 type 2
>>> cache 2 type 3
>>> cache 3 type 3
>>> cache 4 type 0
>>> cache 0 type 1 L1 t2 c8 linesize 64 linepart 1 ways 8 sets 64, size 32KB thus 0 threads
>>> Floating point exception (core dumped)
>>>
>>>> -----Original Message-----
>>>> From: Brice Goglin [mailto:brice.gog...@inria.fr]
>>>> Sent: Wednesday, April 30, 2014 2:30 AM
>>>> To: Friedley, Andrew
>>>> Subject: Re: [hwloc-users] divide by zero error?
>>>>
>>>> Thanks.
>>>> The Linux backend works well so the bug is indeed in the x86 backend only.
>>>> Could you rebuild with --enable-debug and send the entire
>>>> stdout+stderr output of lstopo?
>>>>
>>>> Thanks
>>>> Brice
>>>>
>>>>
>>>>
>>>> On 29/04/2014 17:01, Friedley, Andrew wrote:
>>>>> Attached, off list.
>>>>>
>>>>> Andrew
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: hwloc-users [mailto:hwloc-users-boun...@open-mpi.org] On Behalf Of Brice Goglin
>>>>>> Sent: Monday, April 28, 2014 10:37 PM
>>>>>> To: hwloc-us...@open-mpi.org
>>>>>> Subject: Re: [hwloc-users] divide by zero error?
>>>>>>
>>>>>> Please run "hwloc-gather-topology simics" and send the resulting
>>>>>> simics.tar.bz2 that it will create. However, I assume that the
>>>>>> simulator returns buggy x86 cpuid information, so we'll see if we
>>>>>> want/can easily work around the bug or just let simics developers fix it.
>>>>>> Brice
>>>>>>
>>>>>>
>>>>>> On 29/04/2014 01:15, Friedley, Andrew wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I ran into a problem when running OMPI v1.8.1 -- a divide by zero
>>>>>>> crash deep in the hwloc code called by OMPI. The system I'm running
>>>>>>> is a simics x86_64 emulator and RHEL 6.3. I can reproduce the error
>>>>>>> running lstopo from hwloc v1.9:
>>>>>>> [root@viper0 bin]# LD_LIBRARY_PATH=/root/hwloc/lib ./lstopo -v
>>>>>>> Floating point exception (core dumped)
>>>>>>>
>>>>>>>
>>>>>>> Hwloc v1.1rc6, already installed on the system, and a
>>>>>>> corresponding OMPI 1.6.5 build, work with no problems:
>>>>>>> [root@viper0 bin]# lstopo --version
>>>>>>> lstopo 1.1rc6
>>>>>>> [root@viper0 bin]# lstopo -v
>>>>>>> Machine (P#0 local=2055580KB total=2055580KB DMIProductName=Bochs DMIProductVersion= DMIProductSerial= DMIChassisVendor=Bochs DMIChassisType=1 DMIChassisVersion= DMIChassisSerial= DMIChassisAssetTag= DMIBIOSVendor=Bochs DMIBIOSVersion=Bochs DMIBIOSDate=01/01/2007 DMIS)
>>>>>>> Socket L#0 (P#0)
>>>>>>> L3Cache L#0 (8192KB line=64)
>>>>>>> L2Cache L#0 (256KB line=64)
>>>>>>> L1Cache L#0 (32KB line=64)
>>>>>>> Core L#0 (P#0)
>>>>>>> PU L#0 (P#0)
>>>>>>> depth 0: 1 Machine (type #1)
>>>>>>> depth 1: 1 Socket (type #3)
>>>>>>> depth 2: 1 Cache (type #4)
>>>>>>> depth 3: 1 Cache (type #4)
>>>>>>> depth 4: 1 Cache (type #4)
>>>>>>> depth 5: 1 Core (type #5)
>>>>>>> depth 6: 1 PU (type #6)
>>>>>>>
>>>>>>>
>>>>>>> Here's the output from a GDB session on hwloc v1.9:
>>>>>>>
>>>>>>> [root@viper0 bin]# LD_LIBRARY_PATH=/root/hwloc/lib gdb ./lstopo
>>>>>>> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
>>>>>>> Copyright (C) 2010 Free Software Foundation, Inc.
>>>>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>>>>> This is free software: you are free to change and redistribute it.
>>>>>>> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>>>>>>> and "show warranty" for details.
>>>>>>> This GDB was configured as "x86_64-redhat-linux-gnu".
>>>>>>> For bug reporting instructions, please see:
>>>>>>> <http://www.gnu.org/software/gdb/bugs/>...
>>>>>>> Reading symbols from /root/hwloc/bin/lstopo...done.
>>>>>>> (gdb) r -v
>>>>>>> Starting program: /root/hwloc/bin/lstopo -v
>>>>>>> warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffd000
>>>>>>>
>>>>>>> Program received signal SIGFPE, Arithmetic exception.
>>>>>>> 0x00007ffff7df0558 in look_proc (infos=0x61b6a0, highest_cpuid=11, highest_ext_cpuid=<value optimized out>, features=<value optimized out>, cpuid_type=intel)
>>>>>>> at topology-x86.c:323
>>>>>>> 323 infos->threadid = infos->logprocid % infos->max_nbthreads;
>>>>>>> Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6.x86_64
>>>>>>> (gdb) bt
>>>>>>> #0 0x00007ffff7df0558 in look_proc (infos=0x61b6a0, highest_cpuid=11, highest_ext_cpuid=<value optimized out>, features=<value optimized out>, cpuid_type=intel) at topology-x86.c:323
>>>>>>> #1 0x00007ffff7df165a in look_procs (topology=0x619100, nbprocs=1, fulldiscovery=0) at topology-x86.c:741
>>>>>>> #2 hwloc_look_x86 (topology=0x619100, nbprocs=1, fulldiscovery=0) at topology-x86.c:886
>>>>>>> #3 0x00007ffff7df17f9 in hwloc_x86_discover (backend=<value optimized out>) at topology-x86.c:934
>>>>>>> #4 0x00007ffff7dd6568 in hwloc_discover (topology=0x619100) at topology.c:2452
>>>>>>> #5 hwloc_topology_load (topology=0x619100) at topology.c:2925
>>>>>>> #6 0x0000000000402cf0 in main (argc=<value optimized out>, argv=<value optimized out>) at lstopo.c:581
>>>>>>> (gdb) print infos->logprocid
>>>>>>> $1 = 0
>>>>>>> (gdb) print infos->max_nbthreads
>>>>>>> $2 = 0
>>>>>>>
>>>>>>>
>>>>>>> Any ideas? Any other info I should provide?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Andrew
>>>>>>> _______________________________________________
>>>>>>> hwloc-users mailing list
>>>>>>> hwloc-us...@open-mpi.org
>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
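
For readers following the gdb session: the SIGFPE comes from the modulo at topology-x86.c:323 when max_nbthreads is 0. The sketch below only illustrates that kind of failure and one way a caller might guard against it. The struct proc_ids type and the compute_threadid() function are hypothetical names made up for this note, the unsigned field types are an assumption, and this is not a claim about how hwloc handles the case.

```c
/* Hypothetical stand-in for the three fields that appear in the gdb
 * session (logprocid, max_nbthreads, threadid). This is NOT hwloc's
 * real data structure; the unsigned types are an assumption. */
struct proc_ids {
    unsigned logprocid;
    unsigned max_nbthreads;
    unsigned threadid;
};

/* Compute threadid = logprocid % max_nbthreads, but refuse to divide
 * when the thread count is zero (as with the buggy cpuid data above).
 * Returns 0 on success, -1 if the thread count is unusable; on failure
 * threadid is left untouched. */
static int compute_threadid(struct proc_ids *p)
{
    if (p->max_nbthreads == 0) {
        /* Integer modulo by zero is undefined behavior in C and raises
         * SIGFPE on x86, so report the problem instead. */
        return -1;
    }
    p->threadid = p->logprocid % p->max_nbthreads;
    return 0;
}
```

With the values printed in the session (logprocid = 0, max_nbthreads = 0) this function would return -1 rather than crash. A caller could then skip the cpuid-derived data, which is roughly the effect that HWLOC_COMPONENTS=-x86 has for the whole backend.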