Re: [hwloc-users] lstopo hangs for centos 7

2016-02-03 Thread Brice Goglin
Thanks. I pushed the fix to git branches. It will be included in future
releases (but 1.11.3 isn't planned anytime soon).

It might be good to report a bug to VMware. I don't think they are
supposed to advertise the x2APIC CPU feature unless they also support
the CPUID 0xb leaf.
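
For readers following along, the idea behind the fix can be sketched outside hwloc. The snippet below is a minimal, self-contained illustration and not the hwloc code: the cpuid_fn callback, struct cpuid_regs, the MAX_LEVELS cap, and both fake_* CPUs (with made-up register values) are assumptions introduced here so the logic can run on any machine. It reads the highest basic CPUID leaf (leaf 0, EAX) and the x2APIC feature bit (leaf 1, ECX bit 21), and only walks leaf 0xb when both allow it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register set filled by one CPUID query. */
struct cpuid_regs { unsigned eax, ebx, ecx, edx; };

/* Hypothetical callback standing in for the real CPUID instruction. */
typedef void (*cpuid_fn)(unsigned leaf, unsigned subleaf, struct cpuid_regs *out);

#define X2APIC_FEATURE_BIT (1u << 21) /* CPUID leaf 1, ECX bit 21 */
#define MAX_LEVELS 16u                /* arbitrary safety cap (assumption) */

/* Returns the number of non-empty subleaves of leaf 0xb,
 * or -1 when leaf 0xb must not be queried on this CPU. */
int count_x2apic_levels(cpuid_fn cpuid)
{
    struct cpuid_regs r;
    unsigned highest_leaf, level;

    assert(cpuid != NULL);

    cpuid(0x0, 0, &r);
    highest_leaf = r.eax;            /* highest supported basic leaf */

    cpuid(0x1, 0, &r);
    if (!(r.ecx & X2APIC_FEATURE_BIT) || highest_leaf < 0x0b)
        return -1;                   /* the guard added by the patch */

    for (level = 0; level < MAX_LEVELS; level++) {
        cpuid(0x0b, level, &r);
        if (!r.eax && !r.ebx)
            break;                   /* empty subleaf: end of the list */
    }
    return (int)level;
}

/* Made-up CPU: advertises x2APIC but stops at leaf 0x0a and answers
 * every other query with non-zero data, so an unguarded loop never ends. */
void fake_buggy_cpuid(unsigned leaf, unsigned subleaf, struct cpuid_regs *out)
{
    (void)subleaf;
    out->eax = out->ebx = out->ecx = out->edx = 0;
    if (leaf == 0x0)
        out->eax = 0x0a;
    else if (leaf == 0x1)
        out->ecx = X2APIC_FEATURE_BIT;
    else
        out->eax = out->ebx = 1;
}

/* Made-up CPU: supports leaf 0xb with two levels, then an empty subleaf. */
void fake_sane_cpuid(unsigned leaf, unsigned subleaf, struct cpuid_regs *out)
{
    out->eax = out->ebx = out->ecx = out->edx = 0;
    if (leaf == 0x0)
        out->eax = 0x0d;
    else if (leaf == 0x1)
        out->ecx = X2APIC_FEATURE_BIT;
    else if (leaf == 0x0b && subleaf < 2) {
        out->eax = subleaf ? 4 : 1;
        out->ebx = subleaf ? 8 : 2;
    }
}
```

With these fakes, count_x2apic_levels(fake_buggy_cpuid) refuses to enter the loop, while count_x2apic_levels(fake_sane_cpuid) walks two levels and stops at the empty one.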

Brice





Le 03/02/2016 05:45, Jianjun Wen a écrit :
> Confirmed!
> This patch fixes the problem.
>
> Thanks a lot!
> Jianjun
>
> On Tue, Feb 2, 2016 at 9:05 AM, Brice Goglin wrote:
>
> Does this patch help?
>
> diff --git a/src/topology-x86.c b/src/topology-x86.c
> index efd4300..a602121 100644
> --- a/src/topology-x86.c
> +++ b/src/topology-x86.c
> @@ -403,7 +403,7 @@ static void look_proc(struct hwloc_backend *backend, struct procinfo *infos, uns
>    /* Get package/core/thread information from cpuid 0x0b
>     * (Intel x2APIC)
>     */
> -  if (cpuid_type == intel && has_x2apic(features)) {
> +  if (cpuid_type == intel && highest_cpuid >= 0x0b && has_x2apic(features)) {
>      unsigned level, apic_nextshift, apic_number, apic_type, apic_id = 0, apic_shift = 0, id;
>      for (level = 0; ; level++) {
>        ecx = level;
>
> It looks like VMware advertises the x2APIC feature on the virtual CPU
> without supporting CPUID leaf 0xb. This looks buggy, but can be worked around.
>
> Brice
>
>
>
>
>
> Le 02/02/2016 05:50, Jianjun Wen a écrit :
>> Hi Brice,
>> Oh, I didn't realize that. Only master has hwloc-gather-cpuid.
>>
>> Attached.
>>
>> BTW, /proc/cpuinfo contains a field called flags. If it is a VM,
>> "hypervisor" will be listed there. Also,
>> sudo dmidecode -s system-product-name
>> will output
>> VMware Virtual Platform
>>
>> Jianjun
>>
>> On Mon, Feb 1, 2016 at 12:26 AM, Brice Goglin wrote:
>>
>> Looks like you ran hwloc-gather-topology instead of
>> hwloc-gather-cpuid?
>> By the way, in (4)
>> tar cfj cpuid.tbz2 foo
>> should be
>> tar cfj cpuid.tbz2 cpuid
>>
>>
>>
>>
>> Le 01/02/2016 07:20, Jianjun Wen a écrit :
>>> Hi Brice,
>>> Thanks for the workaround -- it works very well.
>>>
>>> Attached please find the two output files from running
>>> hwloc-gather-cpuid.
>>> Let me know after this is fixed!
>>>
>>> thanks,
>>> Jianjun
>>>
>>> On Sun, Jan 31, 2016 at 9:48 PM, Brice Goglin wrote:
>>>
>>> Thanks for the debugging. I guess VMware doesn't
>>> properly emulate the CPUID instruction.
>>>
>>> Please do:
>>> 1) take a tarball from git master at
>>> https://ci.inria.fr/hwloc/job/master-0-tarball/ and build it
>>> 2) export HWLOC_COMPONENTS=-x86 in your terminal
>>> 3) do utils/hwloc/hwloc-gather-cpuid
>>> 4) tar cfj cpuid.tbz2 foo and send that cpuid.tbz2
>>>
>>> Step (3) might loop forever for the same reason. If it does,
>>> please replace
>>> for(i=0; ; i++) {
>>> with
>>> for(i=0; i<10; i++) {
>>> everywhere in utils/hwloc/hwloc-gather-cpuid.c
>>>
>>> This tarball will help me find what's buggy in VMware's
>>> CPUID emulation.
>>>
>>>
>>> In the meantime, you can work around the hang by exporting
>>> HWLOC_COMPONENTS=-x86 in your environment.
>>>
>>> If somebody knows how to detect VMware by looking under
>>> /proc or /sys, we could use that to automatically set
>>> that environment variable.
>>>
>>> thanks
>>> Brice
>>>
>>>
>>>
>>>
>>>
>>> Le 01/02/2016 05:59, Jianjun Wen a écrit :
>>>> I did a debug build. Found it loops forever in this
>>>> loop in topology-x86.c:404.
>>>>
>>>>   /* Get package/core/thread information from cpuid 0x0b
>>>>    * (Intel x2APIC)
>>>>    */
>>>>   if (cpuid_type == intel && has_x2apic(features)) {
>>>>     unsigned level, apic_nextshift, apic_number,
>>>>       apic_type, apic_id = 0, apic_shift = 0, id;
>>>>     for (level = 0; ; level++) {
>>>>       ecx = level;
>>>>       eax = 0x0b;
>>>>       hwloc_x86_cpuid(&eax, &ebx, &ecx, &edx);
>>>>       if (!eax && !ebx)
>>>>         break;
>>>>     }
>>>>
>>>> On Sun, Jan 31, 2016 at 8:30 PM, Christopher Samuel wrote:
>>>>
>>>>> On 01/02/16 15:09, Jianjun Wen wrote:
>>>>>
>>>>> > 0x77bce13c in look_proc () from /lib64/libhwloc.so.5
>>>>> >

Re: [hwloc-users] lstopo hangs for centos 7

2016-02-01 Thread Jianjun Wen
Hi Brice,
Thanks for the workaround -- it works very well.

Attached please find the two output files from running hwloc-gather-cpuid.
Let me know after this is fixed!

thanks,
Jianjun

On Sun, Jan 31, 2016 at 9:48 PM, Brice Goglin wrote:

> Thanks for the debugging. I guess VMware doesn't properly emulate the
> CPUID instruction.
>
> Please do:
> 1) take a tarball from git master at
> https://ci.inria.fr/hwloc/job/master-0-tarball/ and build it
> 2) export HWLOC_COMPONENTS=-x86 in your terminal
> 3) do utils/hwloc/hwloc-gather-cpuid
> 4) tar cfj cpuid.tbz2 foo and send that cpuid.tbz2
>
> Step (3) might loop forever for the same reason. If it does, please replace
> for(i=0; ; i++) {
> with
> for(i=0; i<10; i++) {
> everywhere in utils/hwloc/hwloc-gather-cpuid.c
>
> This tarball will help me find what's buggy in VMware's CPUID emulation.
>
>
> In the meantime, you can work around the hang by exporting
> HWLOC_COMPONENTS=-x86 in your environment.
>
> If somebody knows how to detect VMware by looking under /proc or /sys, we
> could use that to automatically set that environment variable.
>
> thanks
> Brice
>
>
>
>
>
> Le 01/02/2016 05:59, Jianjun Wen a écrit :
>
> I did a debug build. Found it loops forever in this loop in
> topology-x86.c:404.
>
>
>   /* Get package/core/thread information from cpuid 0x0b
>    * (Intel x2APIC)
>    */
>   if (cpuid_type == intel && has_x2apic(features)) {
>     unsigned level, apic_nextshift, apic_number, apic_type, apic_id = 0,
>       apic_shift = 0, id;
>     for (level = 0; ; level++) {
>       ecx = level;
>       eax = 0x0b;
>       hwloc_x86_cpuid(&eax, &ebx, &ecx, &edx);
>       if (!eax && !ebx)
>         break;
>     }
>
> On Sun, Jan 31, 2016 at 8:30 PM, Christopher Samuel <
> sam...@unimelb.edu.au> wrote:
>
>> On 01/02/16 15:09, Jianjun Wen wrote:
>>
>> > 0x77bce13c in look_proc () from /lib64/libhwloc.so.5
>> >
>> > Always the same place.
>>
>> pstack on the process when stuck might give more of an insight as it
>> should give more of a stack trace.
>>
>> Also running lstopo under strace should show what it is trying to do at
>> that point.
>>
>> All the best,
>> Chris
>> --
>>  Christopher Samuel    Senior Systems Administrator
>>  VLSCI - Victorian Life Sciences Computation Initiative
>>  Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
>>  http://www.vlsci.org.au/  http://twitter.com/vlsci
>>
>> ___
>> hwloc-users mailing list
>> hwloc-us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/hwloc-users/2016/01/1251.php
>>
>
>
>
> --
> -Jianjun Wen
> Wancube.com - 3D photography
> Phone: 408 888 7023
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>
> Link to this post: 
> http://www.open-mpi.org/community/lists/hwloc-users/2016/01/1252.php
>
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
> Link to this post:
> http://www.open-mpi.org/community/lists/hwloc-users/2016/02/1254.php
>



-- 
-Jianjun Wen
Wancube.com - 3D photography
Phone: 408 888 7023


aaa.output
Description: Binary data


aaa.tar.bz2
Description: BZip2 compressed data


Re: [hwloc-users] lstopo hangs for centos 7

2016-02-01 Thread Brice Goglin
Thanks for the debugging. I guess VMware doesn't properly emulate the
CPUID instruction.

Please do:
1) take a tarball from git master at
https://ci.inria.fr/hwloc/job/master-0-tarball/ and build it
2) export HWLOC_COMPONENTS=-x86 in your terminal
3) do utils/hwloc/hwloc-gather-cpuid
4) tar cfj cpuid.tbz2 foo and send that cpuid.tbz2

Step (3) might loop forever for the same reason. If it does, please replace
for(i=0; ; i++) {
with
for(i=0; i<10; i++) {
everywhere in utils/hwloc/hwloc-gather-cpuid.c
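
The bounded-loop workaround above can be sketched in isolation. This is only an illustration under assumptions: the cpuid_fn callback, struct cpuid_regs, walk_subleaves(), and the fake CPU are hypothetical names introduced here, and the termination test is borrowed from the look_proc() excerpt quoted in this thread rather than from hwloc-gather-cpuid.c itself. It shows how a cap turns a potentially endless subleaf walk into one that always returns, and reports whether the cap was what stopped it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register set filled by one CPUID query. */
struct cpuid_regs { unsigned eax, ebx, ecx, edx; };

/* Hypothetical callback standing in for the real CPUID instruction. */
typedef void (*cpuid_fn)(unsigned leaf, unsigned subleaf, struct cpuid_regs *out);

/* Walks subleaves 0, 1, 2, ... of `leaf` until one comes back with
 * EAX == 0 and EBX == 0, or until `limit` subleaves have been read.
 * Returns how many non-empty subleaves were seen and sets *hit_limit
 * to 1 when the cap, not an empty subleaf, ended the walk. */
unsigned walk_subleaves(cpuid_fn cpuid, unsigned leaf, unsigned limit,
                        int *hit_limit)
{
    struct cpuid_regs r;
    unsigned i;

    assert(cpuid != NULL && hit_limit != NULL);

    *hit_limit = 0;
    for (i = 0; i < limit; i++) {    /* was: for (i = 0; ; i++) */
        cpuid(leaf, i, &r);
        if (!r.eax && !r.ebx)
            return i;                /* normal end of the list */
    }
    *hit_limit = 1;                  /* gave up: the list never ended */
    return i;
}

/* Made-up CPU that never returns an empty subleaf. */
void fake_never_empty_cpuid(unsigned leaf, unsigned subleaf,
                            struct cpuid_regs *out)
{
    (void)leaf;
    (void)subleaf;
    out->eax = 1;
    out->ebx = 1;
    out->ecx = 0;
    out->edx = 0;
}
```

A caller that sees hit_limit set knows the data for that leaf is suspect, which is more useful for a bug report than a process that has to be killed.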

This tarball will help me find what's buggy in VMware's CPUID emulation.


In the meantime, you can work around the hang by exporting
HWLOC_COMPONENTS=-x86 in your environment.

If somebody knows how to detect VMware by looking under /proc or /sys,
we could use that to automatically set that environment variable.
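
One possible shape for such a detection, based on the two hints given elsewhere in this thread (the "hypervisor" entry in the /proc/cpuinfo flags field and the "VMware Virtual Platform" product name printed by dmidecode), is sketched below. It is a rough illustration, not something hwloc does: the function names are hypothetical, and it assumes the caller has already read the flags line from /proc/cpuinfo and the product name from a source such as /sys/class/dmi/id/product_name, which on Linux normally carries the same string as dmidecode -s system-product-name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the flags text (for example the "flags" line of
 * /proc/cpuinfo) contains "hypervisor" as a whole word, 0 otherwise. */
int flags_have_hypervisor(const char *flags)
{
    static const char token[] = "hypervisor";
    const size_t len = sizeof(token) - 1;
    const char *p;

    assert(flags != NULL);

    for (p = strstr(flags, token); p != NULL; p = strstr(p + 1, token)) {
        /* Accept the match only when it is delimited on both sides. */
        int starts_word = (p == flags) || p[-1] == ' ' ||
                          p[-1] == '\t' || p[-1] == ':';
        int ends_word = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';
        if (starts_word && ends_word)
            return 1;
    }
    return 0;
}

/* Returns 1 if the DMI product name starts with "VMware". */
int product_is_vmware(const char *product_name)
{
    assert(product_name != NULL);
    return strncmp(product_name, "VMware", 6) == 0;
}
```

The hypervisor flag alone only says that the system is virtualized, and KVM guests were reported in this thread to work fine, so a caller would probably want both checks to pass before setting HWLOC_COMPONENTS=-x86.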

thanks
Brice




Le 01/02/2016 05:59, Jianjun Wen a écrit :
> I did a debug build. Found it loops forever in this loop in
> topology-x86.c:404.
>   
>
>   /* Get package/core/thread information from cpuid 0x0b
>    * (Intel x2APIC)
>    */
>   if (cpuid_type == intel && has_x2apic(features)) {
>     unsigned level, apic_nextshift, apic_number, apic_type, apic_id =
>       0, apic_shift = 0, id;
>     for (level = 0; ; level++) {
>       ecx = level;
>       eax = 0x0b;
>       hwloc_x86_cpuid(&eax, &ebx, &ecx, &edx);
>       if (!eax && !ebx)
>         break;
>     }
>
> On Sun, Jan 31, 2016 at 8:30 PM, Christopher Samuel wrote:
>
> On 01/02/16 15:09, Jianjun Wen wrote:
>
> > 0x77bce13c in look_proc () from /lib64/libhwloc.so.5
> >
> > Always the same place.
>
> pstack on the process when stuck might give more of an insight as it
> should give more of a stack trace.
>
> Also running lstopo under strace should show what it is trying to do at
> that point.
>
> All the best,
> Chris
> --
>  Christopher Samuel    Senior Systems Administrator
>  VLSCI - Victorian Life Sciences Computation Initiative
>  Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
>  http://www.vlsci.org.au/  http://twitter.com/vlsci
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org 
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
> Link to this post:
> http://www.open-mpi.org/community/lists/hwloc-users/2016/01/1251.php
>
>
>
>
> -- 
> -Jianjun Wen
> Wancube.com - 3D photography
> Phone: 408 888 7023
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
> Link to this post: 
> http://www.open-mpi.org/community/lists/hwloc-users/2016/01/1252.php



Re: [hwloc-users] lstopo hangs for centos 7

2016-01-31 Thread Jianjun Wen
I did a debug build. Found it loops forever in this loop in
topology-x86.c:404.


  /* Get package/core/thread information from cpuid 0x0b
   * (Intel x2APIC)
   */
  if (cpuid_type == intel && has_x2apic(features)) {
    unsigned level, apic_nextshift, apic_number, apic_type, apic_id = 0,
      apic_shift = 0, id;
    for (level = 0; ; level++) {
      ecx = level;
      eax = 0x0b;
      hwloc_x86_cpuid(&eax, &ebx, &ecx, &edx);
      if (!eax && !ebx)
        break;
    }

On Sun, Jan 31, 2016 at 8:30 PM, Christopher Samuel wrote:

> On 01/02/16 15:09, Jianjun Wen wrote:
>
> > 0x77bce13c in look_proc () from /lib64/libhwloc.so.5
> >
> > Always the same place.
>
> pstack on the process when stuck might give more of an insight as it
> should give more of a stack trace.
>
> Also running lstopo under strace should show what it is trying to do at
> that point.
>
> All the best,
> Chris
> --
>  Christopher Samuel    Senior Systems Administrator
>  VLSCI - Victorian Life Sciences Computation Initiative
>  Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
>  http://www.vlsci.org.au/  http://twitter.com/vlsci
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
> Link to this post:
> http://www.open-mpi.org/community/lists/hwloc-users/2016/01/1251.php
>



-- 
-Jianjun Wen
Wancube.com - 3D photography
Phone: 408 888 7023


Re: [hwloc-users] lstopo hangs for centos 7

2016-01-31 Thread Jianjun Wen
I just realized that yum on CentOS 7 installs hwloc version 1.7.
I downloaded the 1.11.2 source, built and installed it. It still hangs.

0x77bcb32c in hwloc_x86_cpuid (edx=<optimized out>, ecx=<optimized out>,
    ebx=0x7fffdbec, eax=<optimized out>)
    at /home/wen/Downloads/hwloc-1.11.2/include/private/cpuid-x86.h:67
67  __asm__(



On Sun, Jan 31, 2016 at 8:09 PM, Jianjun Wen wrote:

> Hi Brice
> Thanks for the reply.
> I use yum install hwloc to install it.
> The cpu usage is 100%.
> I got this after Ctrl + C, and c, several times:
>
> 0x77bce13c in look_proc () from /lib64/libhwloc.so.5
>
> Always the same place.
>
> On Sun, Jan 31, 2016 at 12:29 AM, Brice Goglin wrote:
>
>> Hello
>>
>> Thanks for the report. I have never seen this issue. I have CentOS 7 VMs
>> (kvm), lstopo works fine. Did you try this in similar VMs in the past?
>>
>> When you say "latest hwloc", do you mean "build latest tarball" (1.11.2)
>> or "installed latest centos package" (1.7)?
>>
>> First thing to check: run lstopo, let it hang, and check under top
>> whether it uses 100% CPU or 0% CPU (to see if that's an infinite loop or
>> not).
>>
>> Then, run it under gdb:
>> $ gdb lstopo
>> Type 'r' and Enter
>> When things hang, do ctrl-c
>> Type "where" and send the output to us.
>>
>> If you got 100% in top above, you should do this multiple times. After
>> "where", type 'c' to go back to the execution, ctrl+c again, "where" again
>> and check whether the backtrace is similar.
>>
>> Brice
>>
>>
>>
>>
>> Le 31/01/2016 04:48, Jianjun Wen a écrit :
>>
>> I installed the latest centos 7 (1151) on VM (vmware), then installed
>> latest hwloc.
>> lstopo command hangs.
>>
>> hwloc_topology_load()
>> function call also hangs.
>>
>> Is this a known issue? How can I find out what's wrong?
>>
>> thanks
>> --
>> -Jianjun Wen
>> Wancube.com - 3D photography
>> Phone: 408 888 7023
>>
>>
>> ___
>> hwloc-users mailing list
>> hwloc-us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/hwloc-users/2016/01/1247.php
>>
>>
>>
>> ___
>> hwloc-users mailing list
>> hwloc-us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/hwloc-users/2016/01/1248.php
>>
>
>
>
> --
> -Jianjun Wen
> Wancube.com - 3D photography
> Phone: 408 888 7023
>



-- 
-Jianjun Wen
Wancube.com - 3D photography
Phone: 408 888 7023


Re: [hwloc-users] lstopo hangs for centos 7

2016-01-31 Thread Jianjun Wen
Hi Brice
Thanks for the reply.
I used yum install hwloc to install it.
The CPU usage is 100%.
I got this after Ctrl+C and c, several times:

0x77bce13c in look_proc () from /lib64/libhwloc.so.5

Always the same place.

On Sun, Jan 31, 2016 at 12:29 AM, Brice Goglin wrote:

> Hello
>
> Thanks for the report. I have never seen this issue. I have CentOS 7 VMs
> (kvm), lstopo works fine. Did you try this in similar VMs in the past?
>
> When you say "latest hwloc", do you mean "build latest tarball" (1.11.2)
> or "installed latest centos package" (1.7)?
>
> First thing to check: run lstopo, let it hang, and check under top whether
> it uses 100% CPU or 0% CPU (to see if that's an infinite loop or not).
>
> Then, run it under gdb:
> $ gdb lstopo
> Type 'r' and Enter
> When things hang, do ctrl-c
> Type "where" and send the output to us.
>
> If you got 100% in top above, you should do this multiple times. After
> "where", type 'c' to go back to the execution, ctrl+c again, "where" again
> and check whether the backtrace is similar.
>
> Brice
>
>
>
>
> Le 31/01/2016 04:48, Jianjun Wen a écrit :
>
> I installed the latest centos 7 (1151) on VM (vmware), then installed
> latest hwloc.
> lstopo command hangs.
>
> hwloc_topology_load()
> function call also hangs.
>
> Is this a known issue? How can I find out what's wrong?
>
> thanks
> --
> -Jianjun Wen
> Wancube.com - 3D photography
> Phone: 408 888 7023
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
> Link to this post: 
> http://www.open-mpi.org/community/lists/hwloc-users/2016/01/1247.php
>
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
> Link to this post:
> http://www.open-mpi.org/community/lists/hwloc-users/2016/01/1248.php
>



-- 
-Jianjun Wen
Wancube.com - 3D photography
Phone: 408 888 7023


Re: [hwloc-users] lstopo hangs for centos 7

2016-01-31 Thread Brice Goglin
Hello

Thanks for the report. I have never seen this issue. I have CentOS 7 VMs
(KVM) where lstopo works fine. Did you try this in similar VMs in the past?

When you say "latest hwloc", do you mean "build latest tarball" (1.11.2)
or "installed latest centos package" (1.7)?

First thing to check: run lstopo, let it hang, and check under top
whether it uses 100% CPU or 0% CPU (to see if that's an infinite loop or
not).

Then, run it under gdb:
$ gdb lstopo
Type 'r' and Enter
When things hang, do ctrl-c
Type "where" and send the output to us.

If you got 100% in top above, you should do this multiple times. After
"where", type 'c' to go back to the execution, ctrl+c again, "where"
again and check whether the backtrace is similar.

Brice



Le 31/01/2016 04:48, Jianjun Wen a écrit :
> I installed the latest centos 7 (1151) on VM (vmware), then installed
> latest hwloc.
> lstopo command hangs.
>
> hwloc_topology_load()
> function call also hangs.
>
> Is this a known issue? How can I find out what's wrong?
>
> thanks
> -- 
> -Jianjun Wen
> Wancube.com - 3D photography
> Phone: 408 888 7023
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
> Link to this post: 
> http://www.open-mpi.org/community/lists/hwloc-users/2016/01/1247.php