Thanks. I pushed the fix to git branches. It will be included in future
releases (but 1.11.3 isn't planned anytime soon).

It might be good to report a bug to VMware. I don't think they are
supposed to advertise the x2APIC CPU feature unless they also support
the CPUID 0xb leaf.
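
For illustration, here is a minimal sketch of the guard described above: only query CPUID leaf 0xb when the CPU reports it as a supported leaf and also advertises x2APIC (CPUID.1:ECX bit 21). It is not the hwloc code. The helper names `leaf_0xb_usable` and `count_0xb_levels` and the `MAX_LEVELS` cap are hypothetical, and it uses the GCC/Clang `<cpuid.h>` helpers instead of hwloc's own `hwloc_x86_cpuid`.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>        /* GCC/Clang helper header, not part of hwloc */
#endif

#define X2APIC_BIT 21u    /* CPUID.1:ECX bit 21 advertises x2APIC */
#define MAX_LEVELS 8u     /* arbitrary safety cap chosen for this sketch */

/* Hypothetical helper: leaf 0xb may be queried only if it is within the
 * range of supported basic leaves AND the x2APIC feature bit is set. */
static int leaf_0xb_usable(unsigned highest_leaf, unsigned leaf1_ecx)
{
  return highest_leaf >= 0x0bu && ((leaf1_ecx >> X2APIC_BIT) & 1u);
}

/* Returns the number of levels enumerated through leaf 0xb,
 * or 0 when the leaf is unusable (or on a non-x86 build). */
static unsigned count_0xb_levels(void)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned eax, ebx, ecx, edx, level;
  unsigned highest = __get_cpuid_max(0, NULL);  /* highest basic leaf */

  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
    return 0;
  if (!leaf_0xb_usable(highest, ecx))
    return 0;

  /* Same termination test as the loop in the thread, plus a hard cap so
   * that a misbehaving CPUID emulation cannot cause an endless loop. */
  for (level = 0; level < MAX_LEVELS; level++) {
    __cpuid_count(0x0b, level, eax, ebx, ecx, edx);
    if (!eax && !ebx)
      break;
  }
  assert(level <= MAX_LEVELS);
  return level;
#else
  return 0;
#endif
}
```

On a virtual CPU that sets the x2APIC bit but reports a highest basic leaf below 0xb, `leaf_0xb_usable` returns 0 and the enumeration loop is never entered.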

Brice





On 03/02/2016 05:45, Jianjun Wen wrote:
> Confirmed!
> This patch fixes the problem.
>
> Thanks a lot!
> Jianjun
>
> On Tue, Feb 2, 2016 at 9:05 AM, Brice Goglin <brice.gog...@inria.fr> wrote:
>
>     Does this patch help?
>
>     diff --git a/src/topology-x86.c b/src/topology-x86.c
>     index efd4300..a602121 100644
>     --- a/src/topology-x86.c
>     +++ b/src/topology-x86.c
>     @@ -403,7 +403,7 @@ static void look_proc(struct hwloc_backend *backend, struct procinfo *infos, uns
>        /* Get package/core/thread information from cpuid 0x0b
>         * (Intel x2APIC)
>         */
>     -  if (cpuid_type == intel && has_x2apic(features)) {
>     +  if (cpuid_type == intel && highest_cpuid >= 0x0b && has_x2apic(features)) {
>          unsigned level, apic_nextshift, apic_number, apic_type, apic_id = 0, apic_shift = 0, id;
>          for (level = 0; ; level++) {
>            ecx = level;
>
>     It looks like VMware reports that the virtual CPU has the x2APIC
>     feature without supporting the 0xb CPUID leaf. This looks buggy, but
>     can be worked around.
>
>     Brice
>
>
>
>
>
>     On 02/02/2016 05:50, Jianjun Wen wrote:
>>     Hi Brice,
>>     Oh, I didn't realize that. Only master has hwloc-gather-cpuid.
>>
>>     Attached.
>>
>>     BTW, /proc/cpuinfo contains a field called flags. If it is a VM,
>>     hypervisor will be listed there.
>>     sudo  dmidecode -s system-product-name
>>     will output 
>>     VMware Virtual Platform
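
The two hints above could be checked from C roughly as follows. This is a sketch under stated assumptions, not something hwloc does: the helper names are hypothetical, and reading `/sys/class/dmi/id/product_name` is assumed here as an unprivileged stand-in for the `dmidecode -s system-product-name` call.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if `flag` appears as a whole whitespace-separated word in `line`. */
static int line_has_flag(const char *line, const char *flag)
{
  const char *p = line;
  size_t len;

  assert(line != NULL && flag != NULL);
  len = strlen(flag);
  if (len == 0)
    return 0;
  while ((p = strstr(p, flag)) != NULL) {
    int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
    int ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\t' || p[len] == '\n';
    if (starts && ends)
      return 1;
    p += len;
  }
  return 0;
}

/* Returns 1 if /proc/cpuinfo lists the "hypervisor" flag, 0 if it does not,
 * and -1 if the file cannot be opened (e.g. not on Linux). */
static int running_under_hypervisor(void)
{
  char line[8192];
  int found = 0;
  FILE *f = fopen("/proc/cpuinfo", "r");

  if (!f)
    return -1;
  while (fgets(line, sizeof(line), f)) {
    if (strncmp(line, "flags", 5) == 0 && line_has_flag(line, "hypervisor")) {
      found = 1;
      break;
    }
  }
  fclose(f);
  return found;
}

/* Returns 1 if the DMI product name mentions VMware, 0 if not, -1 if unreadable.
 * The sysfs path is an assumption of this sketch. */
static int product_is_vmware(void)
{
  char name[256];
  int result = -1;
  FILE *f = fopen("/sys/class/dmi/id/product_name", "r");

  if (!f)
    return -1;
  if (fgets(name, sizeof(name), f))
    result = strstr(name, "VMware") != NULL;
  fclose(f);
  return result;
}
```

A caller could combine both results and, when both return 1, apply the HWLOC_COMPONENTS=-x86 workaround mentioned earlier in the thread.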
>>
>>     Jianjun
>>
>>     On Mon, Feb 1, 2016 at 12:26 AM, Brice Goglin
>>     <brice.gog...@inria.fr> wrote:
>>
>>         Looks like you ran hwloc-gather-topology instead of
>>         hwloc-gather-cpuid?
>>         By the way, in (4)
>>             tar cfj cpuid.tbz2 foo
>>         should be
>>             tar cfj cpuid.tbz2 cpuid
>>
>>
>>
>>
>>         On 01/02/2016 07:20, Jianjun Wen wrote:
>>>         Hi Brice,
>>>         Thanks for the workaround -- it works very well.
>>>
>>>         Please find attached the two output files from running
>>>         hwloc-gather-cpuid.
>>>         Let me know after this is fixed!
>>>
>>>         thanks,
>>>         Jianjun
>>>
>>>         On Sun, Jan 31, 2016 at 9:48 PM, Brice Goglin
>>>         <brice.gog...@inria.fr> wrote:
>>>
>>>             Thanks for the debugging. I guess VMware doesn't
>>>             properly emulate the CPUID instruction.
>>>
>>>             Please do:
>>>             1) take a tarball from git master at
>>>             https://ci.inria.fr/hwloc/job/master-0-tarball/ and build it
>>>             2) export HWLOC_COMPONENTS=-x86 in your terminal
>>>             3) do utils/hwloc/hwloc-gather-cpuid
>>>             4) tar cfj cpuid.tbz2 foo and send that cpuid.tbz2
>>>
>>>             Step (3) might do an infinite loop for the same reason,
>>>             please replace
>>>             for(i=0; ; i++) {
>>>             with
>>>             for(i=0; i<10; i++) {
>>>             everywhere in utils/hwloc/hwloc-gather-cpuid.c
>>>
>>>             This tarball will help me find what's buggy in VMware's
>>>             emulation of the CPUID instruction.
>>>
>>>
>>>             In the meantime, you can fix your hwloc by exporting
>>>             HWLOC_COMPONENTS=-x86 in your environment.
>>>
>>>             If somebody knows how to detect VMware by looking under
>>>             /proc or /sys, we could use that to automatically set
>>>             that environment variable.
>>>
>>>             thanks
>>>             Brice
>>>
>>>
>>>
>>>
>>>
>>>             On 01/02/2016 05:59, Jianjun Wen wrote:
>>>>             I did a debug build. Found it loops forever in this
>>>>             loop in topology-x86.c:404.
>>>>               
>>>>
>>>>             /* Get package/core/thread information from cpuid 0x0b
>>>>                * (Intel x2APIC)
>>>>                */
>>>>               if (cpuid_type == intel && has_x2apic(features)) {
>>>>                 unsigned level, apic_nextshift, apic_number, apic_type, apic_id = 0, apic_shift = 0, id;
>>>>                 for (level = 0; ; level++) {
>>>>                   ecx = level;
>>>>                   eax = 0x0b;
>>>>                   hwloc_x86_cpuid(&eax, &ebx, &ecx, &edx);
>>>>                   if (!eax && !ebx)
>>>>                     break;
>>>>                 }
>>>>
>>>>             On Sun, Jan 31, 2016 at 8:30 PM, Christopher Samuel
>>>>             <sam...@unimelb.edu.au> wrote:
>>>>
>>>>                 On 01/02/16 15:09, Jianjun Wen wrote:
>>>>
>>>>                 > 0x00007ffff7bce13c in look_proc () from
>>>>                 /lib64/libhwloc.so.5
>>>>                 >
>>>>                 > Always the same place.
>>>>
>>>>                 pstack on the process when stuck might give more of
>>>>                 an insight as it
>>>>                 should give more of a stack trace.
>>>>
>>>>                 Also running lstopo under strace should show what
>>>>                 it is trying to do at
>>>>                 that point.
>>>>
>>>>                 All the best,
>>>>                 Chris
>>>>                 --
>>>>                  Christopher Samuel        Senior Systems Administrator
>>>>                  VLSCI - Victorian Life Sciences Computation Initiative
>>>>                  Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
>>>>                  http://www.vlsci.org.au/      http://twitter.com/vlsci
>>>>
>>>>                 _______________________________________________
>>>>                 hwloc-users mailing list
>>>>                 hwloc-us...@open-mpi.org
>>>>                 <mailto:hwloc-us...@open-mpi.org>
>>>>                 Subscription:
>>>>                 http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>>>>                 Link to this post:
>>>>                 
>>>> http://www.open-mpi.org/community/lists/hwloc-users/2016/01/1251.php
>>>>
>>>>
>>>>
>>>>
>>>>             -- 
>>>>             -Jianjun Wen
>>>>             Wancube.com - 3D photography
>>>>             Phone: 408 888 7023
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>
