Re: [ovirt-users] Failed to read hardware information

2016-10-13 Thread Nir Soffer
On Thu, Oct 13, 2016 at 11:28 AM, Martin Polednik  wrote:
> On 13/10/16 09:01 +0300, Dan Kenigsberg wrote:
>>
>> On Thu, Oct 13, 2016 at 11:52:17AM +1100, David Pinkerton wrote:
>>>
>>> Nir,
>>>
>>> Looks like it's crashing in the dmidecode call.
>>>
>>> I've attached the output from gdb as well as a dmidecode text dump,
>>> a dmidecode binary dump, and the output of each keyword run individually.
>>>
>>> From the keywords it looks like my DMI info is corrupted.  I have
>>> downloaded an AMI DMI editor, but it only allows access to a limited set
>>> of fields.  Do you know of any other tools to rewrite the DMI info?
>>
>>
>> I don't. But whatever is inside your dmi, dmidecode must not crash.
>> Which version of python-dmidecode do you have installed?
>> Would you open a bug against it?
>
>
> This is really unfortunate - I've reproduced the issue with the
> attached dump

Can you explain how you did that?

> and it's python-dmidecode that crashes. The issue is
> actually fixed upstream, but the version at least in RHEL does not
> contain the fix.

Can you link to the missing fix?

> RHEL version:
> python-dmidecode-3.10.13-11.el7.x86_64
>
> works with (actual upstream):
> python-dmidecode-3.12.2-1.el7.x86_64
> (actually it's ~6 line change in dmioem.c)
>
> VDSM output:
> # vdsClient 0 getVdsHardwareInfo
>    systemFamily = 'To Be Filled By O.E.M.'
>    systemManufacturer = 'Supermicro'
>    systemProductName = 'H8DM8-2'
>    systemSerialNumber = '1234567890'
>    systemUUID = '00020003-0004-0005-0006-000700080009'
>    systemVersion = '1234567890'
>
> Although the upstream version of python-dmidecode is able to deal with
> improper DMI tables, I can't say what else will/will not behave correctly.
>
> mpolednik
>
>
>> I believe that its maintainers would appreciate a simple reproducer that
>> does not involve ovirt or Vdsm. See if you can simplify the code in
>>
>> def __leafDict(d):
>>     ret = {}
>>     for k, v in d.iteritems():
>>         if isinstance(v, dict):
>>             ret.update(__leafDict(v))
>>         else:
>>             ret[k] = v
>>     return ret
>>
>>
>> def getAllDmidecodeInfo():
>>     import dmidecode
>>
>>     myLeafDict = {}
>>     for k in ('system', 'bios', 'cache', 'processor', 'chassis', 'memory'):
>>         myLeafDict[k] = __leafDict(getattr(dmidecode, k)())
>>     return myLeafDict
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Failed to read hardware information

2016-10-13 Thread Nir Soffer
On Thu, Oct 13, 2016 at 12:29 PM, David Pinkerton  wrote:
> Good News.
>
> I installed the fedora 24 version of python-dmidecode and was able to
> successfully add the host to my cluster...
>
> Thank you to everyone who looked at this.  I owe you a beer or at least
> some reward points   :-)

You owe us a rhel bug for dmidecode :-)


Program received signal SIGSEGV, Segmentation fault.
dmi_set_vendor (s=0x0) at src/dmioem.c:45
45        if(strcmp(s, "HP") == 0)

s is NULL here, and strcmp() must not be called with a NULL pointer.

(gdb) thread apply all bt full

Thread 1 (Thread 0x77feb740 (LWP 6318)):
#0  dmi_set_vendor (s=0x0) at src/dmioem.c:45
__s1 = 0x0
__result = 
#1  0x7fffead5f66f in dmi_table (logp=logp@entry=0xaa70e0,
type=type@entry=1, base=, len=,
num=, ver=,
devmem=devmem@entry=0x7fffead645e0 "/dev/mem",
xmlnode=xmlnode@entry=0x6e5420) at src/dmidecode.c:4902
next = 
h = {type = 0 '\000', length = 252 '\374', handle = 13, data =
0xa4f4aa ""}

It looks like the code is trying to get the vendor name from type 0.

The fact that this leads to passing a NULL vendor string is a second bug.

I believe dmidecode is missing this upstream patch:
https://github.com/nirs/dmidecode/commit/6e3a3f3cd36f633a56437b42e40d6769ad8acfe7

Nir

> On Thu, Oct 13, 2016 at 7:28 PM, Martin Polednik 
> wrote:
>>
>> On 13/10/16 09:01 +0300, Dan Kenigsberg wrote:
>>>
>>> On Thu, Oct 13, 2016 at 11:52:17AM +1100, David Pinkerton wrote:

>>>> Nir,
>>>>
>>>> Looks like it's crashing in the dmidecode call.
>>>>
>>>> I've attached the output from gdb as well as a dmidecode text dump,
>>>> a dmidecode binary dump, and the output of each keyword run individually.
>>>>
>>>> From the keywords it looks like my DMI info is corrupted.  I have
>>>> downloaded an AMI DMI editor, but it only allows access to a limited set
>>>> of fields.  Do you know of any other tools to rewrite the DMI info?
>>>
>>>
>>> I don't. But whatever is inside your dmi, dmidecode must not crash.
>>> Which version of python-dmidecode do you have installed?
>>> Would you open a bug against it?
>>
>>
>> This is really unfortunate - I've reproduced the issue with the
>> attached dump and it's python-dmidecode that crashes. The issue is
>> actually fixed upstream, but the version at least in RHEL does not
>> contain the fix.
>>
>> RHEL version:
>> python-dmidecode-3.10.13-11.el7.x86_64
>>
>> works with (actual upstream):
>> python-dmidecode-3.12.2-1.el7.x86_64
>> (actually it's ~6 line change in dmioem.c)
>>
>> VDSM output:
>> # vdsClient 0 getVdsHardwareInfo
>>    systemFamily = 'To Be Filled By O.E.M.'
>>    systemManufacturer = 'Supermicro'
>>    systemProductName = 'H8DM8-2'
>>    systemSerialNumber = '1234567890'
>>    systemUUID = '00020003-0004-0005-0006-000700080009'
>>    systemVersion = '1234567890'
>>
>> Although the upstream version of python-dmidecode is able to deal with
>> improper DMI tables, I can't say what else will/will not behave correctly.
>>
>> mpolednik
>>
>>
>>> I believe that its maintainers would appreciate a simple reproducer that
>>> does not involve ovirt or Vdsm. See if you can simplify the code in
>>>
>>> def __leafDict(d):
>>>     ret = {}
>>>     for k, v in d.iteritems():
>>>         if isinstance(v, dict):
>>>             ret.update(__leafDict(v))
>>>         else:
>>>             ret[k] = v
>>>     return ret
>>>
>>>
>>> def getAllDmidecodeInfo():
>>>     import dmidecode
>>>
>>>     myLeafDict = {}
>>>     for k in ('system', 'bios', 'cache', 'processor', 'chassis', 'memory'):
>>>         myLeafDict[k] = __leafDict(getattr(dmidecode, k)())
>>>     return myLeafDict
>
>
>
>
> --
>
> David Pinkerton
> Consultant
> Red Hat Asia Pacific Pty. Ltd.
> Level 11, Canberra House
> 40 Marcus Clarke Street
> Canberra 2600 ACT
>
> Mobile: +61-488-904-232
> Email: david.pinker...@redhat.com
> Web: http://apac.redhat.com/


Re: [ovirt-users] Failed to read hardware information

2016-10-13 Thread David Pinkerton
Good News.

I installed the fedora 24 version of python-dmidecode and was able to
successfully add the host to my cluster...

Thank you to everyone who looked at this.  I owe you a beer or at least
some reward points   :-)



On Thu, Oct 13, 2016 at 7:28 PM, Martin Polednik 
wrote:

> On 13/10/16 09:01 +0300, Dan Kenigsberg wrote:
>
>> On Thu, Oct 13, 2016 at 11:52:17AM +1100, David Pinkerton wrote:
>>
>>> Nir,
>>>
>>> Looks like it's crashing in the dmidecode call.
>>>
>>> I've attached the output from gdb as well as a dmidecode text dump,
>>> a dmidecode binary dump, and the output of each keyword run individually.
>>>
>>> From the keywords it looks like my DMI info is corrupted.  I have
>>> downloaded an AMI DMI editor, but it only allows access to a limited set
>>> of fields.  Do you know of any other tools to rewrite the DMI info?
>>>
>>
>> I don't. But whatever is inside your dmi, dmidecode must not crash.
>> Which version of python-dmidecode do you have installed?
>> Would you open a bug against it?
>>
>
> This is really unfortunate - I've reproduced the issue with the
> attached dump and it's python-dmidecode that crashes. The issue is
> actually fixed upstream, but the version at least in RHEL does not
> contain the fix.
>
> RHEL version:
> python-dmidecode-3.10.13-11.el7.x86_64
>
> works with (actual upstream):
> python-dmidecode-3.12.2-1.el7.x86_64
> (actually it's ~6 line change in dmioem.c)
>
> VDSM output:
> # vdsClient 0 getVdsHardwareInfo
>    systemFamily = 'To Be Filled By O.E.M.'
>    systemManufacturer = 'Supermicro'
>    systemProductName = 'H8DM8-2'
>    systemSerialNumber = '1234567890'
>    systemUUID = '00020003-0004-0005-0006-000700080009'
>    systemVersion = '1234567890'
>
> Although the upstream version of python-dmidecode is able to deal with
> improper DMI tables, I can't say what else will/will not behave correctly.
>
> mpolednik
>
>
>> I believe that its maintainers would appreciate a simple reproducer that
>> does not involve ovirt or Vdsm. See if you can simplify the code in
>>
>> def __leafDict(d):
>>     ret = {}
>>     for k, v in d.iteritems():
>>         if isinstance(v, dict):
>>             ret.update(__leafDict(v))
>>         else:
>>             ret[k] = v
>>     return ret
>>
>>
>> def getAllDmidecodeInfo():
>>     import dmidecode
>>
>>     myLeafDict = {}
>>     for k in ('system', 'bios', 'cache', 'processor', 'chassis', 'memory'):
>>         myLeafDict[k] = __leafDict(getattr(dmidecode, k)())
>>     return myLeafDict
>>
>


-- 

David Pinkerton
Consultant
Red Hat Asia Pacific Pty. Ltd.
Level 11, Canberra House
40 Marcus Clarke Street
Canberra 2600 ACT

Mobile: +61-488-904-232
Email: david.pinker...@redhat.com
Web: http://apac.redhat.com/ 


Re: [ovirt-users] Failed to read hardware information

2016-10-13 Thread David Pinkerton
python-dmidecode-3.10.13-11.el7.x86_64

I cut and pasted your Python code into a file and ran it with python, but
no luck.

I did find the attached dmidump.py on github.  It segfaults after printing
bios on line 64.

Also attached is a dump from the AMIDEDOS utility.

Happy to do whatever is required to fix this issue..  I have a couple of
months of nights and weekends invested so far...  what's a couple more.  :)






On Thu, Oct 13, 2016 at 5:01 PM, Dan Kenigsberg  wrote:

> On Thu, Oct 13, 2016 at 11:52:17AM +1100, David Pinkerton wrote:
> > Nir,
> >
> > Looks like it's crashing in the dmidecode call.
> >
> > I've attached the output from gdb as well as a dmidecode text dump,
> > a dmidecode binary dump, and the output of each keyword run individually.
> >
> > From the keywords it looks like my DMI info is corrupted.  I have
> > downloaded an AMI DMI editor, but it only allows access to a limited set
> > of fields.  Do you know of any other tools to rewrite the DMI info?
>
> I don't. But whatever is inside your dmi, dmidecode must not crash.
> Which version of python-dmidecode do you have installed?
> Would you open a bug against it?
>
> I believe that its maintainers would appreciate a simple reproducer that
> does not involve ovirt or Vdsm. See if you can simplify the code in
>
> def __leafDict(d):
>     ret = {}
>     for k, v in d.iteritems():
>         if isinstance(v, dict):
>             ret.update(__leafDict(v))
>         else:
>             ret[k] = v
>     return ret
>
>
> def getAllDmidecodeInfo():
>     import dmidecode
>
>     myLeafDict = {}
>     for k in ('system', 'bios', 'cache', 'processor', 'chassis', 'memory'):
>         myLeafDict[k] = __leafDict(getattr(dmidecode, k)())
>     return myLeafDict
>



-- 

David Pinkerton
Consultant
Red Hat Asia Pacific Pty. Ltd.
Level 11, Canberra House
40 Marcus Clarke Street
Canberra 2600 ACT

Mobile: +61-488-904-232
Email: david.pinker...@redhat.com
Web: http://apac.redhat.com/ 
[SMBIOS Header]
===
Name  : SMBIOS Signature Style : 4 BYTEs
Data  : _SM_

Name  : SMBIOS Checksum Style : BYTE
Data  : 7Fh

Name  : SMBIOS Table Length Style : BYTE
Data  : 31 bytes

Name  : SMBIOS Version  Style : WORD
Data  : 2.4

Name  : SMBIOS Max. Struc. Size Style : WORD
Data  : 254 bytes

Name  : SMBIOS Point Revision   Style : BYTE
Data  : 00h

Name  : SMBIOS Formatted Area   Style : 5 BYTEs
Data  : 00 00 00 00 00h

Name  : DMI Signature   Style : 5 BYTEs
Data  : _DMI_

Name  : DMI Checksum Style : BYTE
Data  : 49h

Name  : DMI Table Length Style : WORD
Data  : 2911 bytes

Name  : DMI Table Address   Style : DWORD
Data  : 000FC5B0h

Name  : Number of SMBIOS Stuctures  Style : WORD
Data  : 49

Name  : DMI Revisiion   Style : BYTE
Data  : 0.0

[Type 000] -- BIOS Information
===
Name  : Struc. Length   Style : BYTE
Data  : 18h

Name  : Struc. Handle   Style : WORD
Data  : h

Name  : BIOS Vendor Style : STRING
Data  : "American Megatrends Inc."

Name  : BIOS Version Style : STRING
Data  : "080014"

Name  : BIOS Starting Add. Seg. Style : WORD
Data  : F000h

Name  : BIOS Release Date   Style : STRING
Data  : "10/22/2009"

Name  : BIOS ROM Size   Style : BYTE
Data  : 0Fh
-- 1024 KB

Name  : BIOS Characteristics Style : QWORD
Data  :  0001 7F8B DE90h
-- Bit.04:ISA is supported
-- Bit.07:PCI is Reserved
-- Bit.09:Plug and Play is supported
-- Bit.10:APM is supported
-- Bit.11:BIOS is Upgradeable(Flash)
-- Bit.12:BIOS shadowing is allowed
-- Bit.14:ESCD support is available
-- Bit.15:Boot from CD is supported
-- Bit.16:Selectable Boot is supported
-- Bit.17:BIOS ROM is socketed
-- Bit.19:EDD(Enhanced Disk Drive) Specification is supported
-- Bit.23:Int 13h - 5.25" / 1.2MB Floppy Services are supported
-- Bit.24:Int 13h - 3.5" / 720 KB Floppy Services are supported
-- Bit.25:Int 13h - 3.5" / 2.88 MB Floppy Services are supported
-- Bit.26:Int 5h, Print Screen Service is supported
-- Bit.27:Int 9h, 8042 

Re: [ovirt-users] Failed to read hardware information

2016-10-13 Thread Martin Polednik

On 13/10/16 09:01 +0300, Dan Kenigsberg wrote:
> On Thu, Oct 13, 2016 at 11:52:17AM +1100, David Pinkerton wrote:
>> Nir,
>>
>> Looks like it's crashing in the dmidecode call.
>>
>> I've attached the output from gdb as well as a dmidecode text dump,
>> a dmidecode binary dump, and the output of each keyword run individually.
>>
>> From the keywords it looks like my DMI info is corrupted.  I have
>> downloaded an AMI DMI editor, but it only allows access to a limited set
>> of fields.  Do you know of any other tools to rewrite the DMI info?
>
> I don't. But whatever is inside your dmi, dmidecode must not crash.
> Which version of python-dmidecode do you have installed?
> Would you open a bug against it?


This is really unfortunate - I've reproduced the issue with the
attached dump and it's python-dmidecode that crashes. The issue is
actually fixed upstream, but the version at least in RHEL does not
contain the fix.

RHEL version:
python-dmidecode-3.10.13-11.el7.x86_64

works with (actual upstream):
python-dmidecode-3.12.2-1.el7.x86_64
(actually it's ~6 line change in dmioem.c)

VDSM output:
# vdsClient 0 getVdsHardwareInfo
   systemFamily = 'To Be Filled By O.E.M.'
   systemManufacturer = 'Supermicro'
   systemProductName = 'H8DM8-2'
   systemSerialNumber = '1234567890'
   systemUUID = '00020003-0004-0005-0006-000700080009'
   systemVersion = '1234567890'

Although the upstream version of python-dmidecode is able to deal with
improper DMI tables, I can't say what else will/will not behave correctly.

mpolednik



> I believe that its maintainers would appreciate a simple reproducer that
> does not involve ovirt or Vdsm. See if you can simplify the code in
>
> def __leafDict(d):
>     ret = {}
>     for k, v in d.iteritems():
>         if isinstance(v, dict):
>             ret.update(__leafDict(v))
>         else:
>             ret[k] = v
>     return ret
>
>
> def getAllDmidecodeInfo():
>     import dmidecode
>
>     myLeafDict = {}
>     for k in ('system', 'bios', 'cache', 'processor', 'chassis', 'memory'):
>         myLeafDict[k] = __leafDict(getattr(dmidecode, k)())
>     return myLeafDict


Re: [ovirt-users] Failed to read hardware information

2016-10-13 Thread Dan Kenigsberg
On Thu, Oct 13, 2016 at 11:52:17AM +1100, David Pinkerton wrote:
> Nir,
> 
> Looks like it's crashing in the dmidecode call.
>
> I've attached the output from gdb as well as a dmidecode text dump,
> a dmidecode binary dump, and the output of each keyword run individually.
>
> From the keywords it looks like my DMI info is corrupted.  I have
> downloaded an AMI DMI editor, but it only allows access to a limited set
> of fields.  Do you know of any other tools to rewrite the DMI info?

I don't. But whatever is inside your dmi, dmidecode must not crash.
Which version of python-dmidecode do you have installed?
Would you open a bug against it?

I believe that its maintainers would appreciate a simple reproducer that
does not involve ovirt or Vdsm. See if you can simplify the code in

def __leafDict(d):
    ret = {}
    for k, v in d.iteritems():
        if isinstance(v, dict):
            ret.update(__leafDict(v))
        else:
            ret[k] = v
    return ret


def getAllDmidecodeInfo():
    import dmidecode

    myLeafDict = {}
    for k in ('system', 'bios', 'cache', 'processor', 'chassis', 'memory'):
        myLeafDict[k] = __leafDict(getattr(dmidecode, k)())
    return myLeafDict
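
One possible shape for such a reproducer, as a rough sketch only: run each
dmidecode section in its own child interpreter, so a segfault in one section
is reported instead of killing the whole script. This assumes Python 3.5+
(for subprocess.run) rather than the Python 2.7 on the affected host, and
that the dmidecode module is importable by that interpreter. The helper
names probe and probe_sections are made up for illustration. Reading the DMI
tables normally requires root.

```python
import subprocess
import sys

# Section names taken from getAllDmidecodeInfo() above.
SECTIONS = ('system', 'bios', 'cache', 'processor', 'chassis', 'memory')


def probe(snippet):
    """Run a Python snippet in a child interpreter and describe how it ended."""
    proc = subprocess.run([sys.executable, '-c', snippet],
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL)
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by
        # that signal number (11 is SIGSEGV on Linux).
        return 'killed by signal %d' % -proc.returncode
    if proc.returncode != 0:
        return 'exited with status %d' % proc.returncode
    return 'ok'


def probe_sections(sections=SECTIONS):
    """Call dmidecode.<section>() once per child process."""
    return {name: probe('import dmidecode; dmidecode.%s()' % name)
            for name in sections}


if __name__ == '__main__':
    for name, outcome in probe_sections().items():
        print('%-10s %s' % (name, outcome))
```

A section reported as "killed by signal 11" would be the one to attach to a
bug report, together with the DMI dump.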


Re: [ovirt-users] Failed to read hardware information

2016-10-12 Thread David Pinkerton
Nir,

Looks like it's crashing in the dmidecode call.

I've attached the output from gdb as well as a dmidecode text dump,
a dmidecode binary dump, and the output of each keyword run individually.

From the keywords it looks like my DMI info is corrupted.  I have
downloaded an AMI DMI editor, but it only allows access to a limited set
of fields.  Do you know of any other tools to rewrite the DMI info?


Thanks so much for your help.

Cheers,


On Thu, Oct 13, 2016 at 5:34 AM, Nir Soffer  wrote:

> On Tue, Oct 11, 2016 at 11:59 PM, David Pinkerton 
> wrote:
> > Logs attached
>
> According to vdsm.log and supervdsm.log, each time vdsm tries to call
> getHardwareInfo, supervdsm logs the start of the call, then logs nothing
> for 10 seconds, and then we see the startup message.
>
> So it seems that supervdsm is crashing each time it tries to invoke the
> dmidecode code.
>
> To dig deeper, I suggest you try to run the relevant code from the shell.
> If this code crashes, we will see the details in the shell, and we can
> also run the python shell in gdb to debug this.
>
> Try this:
>
> 1. Open a python shell as root
>
> $ sudo python
>
> 2. In the shell, type this
>
> >>> from vdsm import dmidecodeUtil
> >>> dmidecodeUtil.getHardwareInfoStructure()
>
> If the python shell crashes at this point, please try:
>
> 1. Install python debug-info packages:
>
> $ sudo debuginfo-install -y python
>
> 2. Start python in gdb
>
> $ sudo gdb python
>
> 3. In the gdb shell, run python
>
> (gdb) run
>
> When the python prompt appears, type the code above again.
>
> If this crashes in gdb, please type this in the gdb shell:
>
> (gdb) thread apply all bt full
>
>
> Nir
>
> >
> > On Mon, Oct 10, 2016 at 4:59 PM, Nir Soffer  wrote:
> >>
> >> On Mon, Oct 10, 2016 at 5:05 AM, Charles Kozler 
> >> wrote:
> >>>
> >>> Possibly stupid question but are you doing this on a base empty
> >>> centos/rhel 7?
> >>>
> >>>
> >>> On Oct 9, 2016 9:48 PM, "David Pinkerton"  wrote:
> 
> 
>  I've spent the weekend trying to get to the bottom of this issue.
> 
>  Adding a Host fails:
> 
>  From RHVM
> 
> 
>  VDSM rhv1 command failed: Connection reset by peer
>  Could not get hardware information for host rhv1
>  VDSM rhv1 command failed: Failed to read hardware information
>  Host rhv1 installed
>  Network changes were saved on host rhv1
>  Installing Host rhv1. Stage: Termination.
>  Installing Host rhv1. Retrieving installation logs to:
>  '/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-
> 20161010115606-192.168.21.71-24d39274.log'.
>  Installing Host rhv1. Stage: Pre-termination.
>  Installing Host rhv1. Starting ovirt-vmconsole-host-sshd.
>  Installing Host rhv1. Starting vdsm.
>  Installing Host rhv1. Stopping libvirtd.
>  Installing Host rhv1. Stage: Closing up.
>  Installing Host rhv1. Setting kernel arguments.
>  Installing Host rhv1. Stage: Transaction commit.
>  Installing Host rhv1. Enrolling serial console certificate.
>  Installing Host rhv1. Enrolling certificate.
>  Installing Host rhv1. Stage: Misc configuration.
> 
> 
> 
>  This was in the /var/log/vdsm/vdsm.log on the host trying to be added:
> 
>  jsonrpc.Executor/2::ERROR::2016-10-10
>  11:57:10,276::API::1340::vds::(getHardwareInfo) failed to retrieve
> hardware
>  info
>  Traceback (most recent call last):
>    File "/usr/share/vdsm/API.py", line 1337, in getHardwareInfo
>  hw = supervdsm.getProxy().getHardwareInfo()
>    File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line
> 53, in
>  __call__
>  return callMethod()
>    File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line
> 51, in
>  
>  **kwargs)
>    File "", line 2, in getHardwareInfo
>    File "/usr/lib64/python2.7/multiprocessing/managers.py", line 759,
> in
>  _callmethod
>  kind, result = conn.recv()
>  EOFError
> >>
> >>
> >> If a request to supervdsm fails with EOFError, something bad happened
> >> in supervdsm, and we should see the exception in the supervdsm log.
> >>
> >> Can you share supervdsm.log?
> >>
> >> Nir



-- 

David Pinkerton
Consultant
Red Hat Asia Pacific Pty. Ltd.
Level 11, Canberra House
40 Marcus Clarke Street
Canberra 2600 ACT

Mobile: +61-488-904-232
Email: david.pinker...@redhat.com
Web: http://apac.redhat.com/ 

Re: [ovirt-users] Failed to read hardware information

2016-10-12 Thread Nir Soffer
On Tue, Oct 11, 2016 at 11:59 PM, David Pinkerton  wrote:
> Logs attached

According to vdsm.log and supervdsm.log, each time vdsm tries to call
getHardwareInfo, supervdsm logs the start of the call, then logs nothing
for 10 seconds, and then we see the startup message.

So it seems that supervdsm is crashing each time it tries to invoke the
dmidecode code.

To dig deeper, I suggest you try to run the relevant code from the shell.
If this code crashes, we will see the details in the shell, and we can also
run the python shell in gdb to debug this.

Try this:

1. Open a python shell as root

$ sudo python

2. In the shell, type this

>>> from vdsm import dmidecodeUtil
>>> dmidecodeUtil.getHardwareInfoStructure()

If the python shell crashes at this point, please try:

1. Install python debug-info packages:

$ sudo debuginfo-install -y python

2. Start python in gdb

$ sudo gdb python

3. In the gdb shell, run python

(gdb) run

When the python prompt appears, type the code above again.

If this crashes in gdb, please type this in the gdb shell:

(gdb) thread apply all bt full
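
As a lighter complement to the gdb steps above, Python's standard
faulthandler module can print the Python-level traceback when the
interpreter receives SIGSEGV. This is only a sketch under assumptions:
faulthandler is part of the standard library on Python 3 only, so it assumes
a Python 3 interpreter (passing a file descriptor needs 3.5+) and not the
Python 2.7 on this host, and the two function names are invented for the
example. It shows which Python call was running, not the C frames, so gdb is
still needed for the native backtrace.

```python
import faulthandler


def install_crash_handler():
    """Dump the tracebacks of all threads to stderr on a fatal signal."""
    # File descriptor 2 is stderr.
    faulthandler.enable(file=2, all_threads=True)
    return faulthandler.is_enabled()


def read_hardware_info():
    # Imported lazily so this file can be loaded on machines without vdsm.
    from vdsm import dmidecodeUtil
    return dmidecodeUtil.getHardwareInfoStructure()
```

Calling install_crash_handler() and then read_hardware_info() as root on the
host would be the equivalent of step 2 above.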


Nir

>
> On Mon, Oct 10, 2016 at 4:59 PM, Nir Soffer  wrote:
>>
>> On Mon, Oct 10, 2016 at 5:05 AM, Charles Kozler 
>> wrote:
>>>
>>> Possibly stupid question but are you doing this on a base empty
>>> centos/rhel 7?
>>>
>>>
>>> On Oct 9, 2016 9:48 PM, "David Pinkerton"  wrote:


>>>> I've spent the weekend trying to get to the bottom of this issue.
>>>>
>>>> Adding a Host fails:
>>>>
>>>> From RHVM
>>>>
>>>>
>>>> VDSM rhv1 command failed: Connection reset by peer
>>>> Could not get hardware information for host rhv1
>>>> VDSM rhv1 command failed: Failed to read hardware information
>>>> Host rhv1 installed
>>>> Network changes were saved on host rhv1
>>>> Installing Host rhv1. Stage: Termination.
>>>> Installing Host rhv1. Retrieving installation logs to:
>>>> '/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-20161010115606-192.168.21.71-24d39274.log'.
>>>> Installing Host rhv1. Stage: Pre-termination.
>>>> Installing Host rhv1. Starting ovirt-vmconsole-host-sshd.
>>>> Installing Host rhv1. Starting vdsm.
>>>> Installing Host rhv1. Stopping libvirtd.
>>>> Installing Host rhv1. Stage: Closing up.
>>>> Installing Host rhv1. Setting kernel arguments.
>>>> Installing Host rhv1. Stage: Transaction commit.
>>>> Installing Host rhv1. Enrolling serial console certificate.
>>>> Installing Host rhv1. Enrolling certificate.
>>>> Installing Host rhv1. Stage: Misc configuration.
>>>>
>>>>
>>>>
>>>> This was in the /var/log/vdsm/vdsm.log on the host trying to be added:
>>>>
>>>> jsonrpc.Executor/2::ERROR::2016-10-10
>>>> 11:57:10,276::API::1340::vds::(getHardwareInfo) failed to retrieve hardware
>>>> info
>>>> Traceback (most recent call last):
>>>>   File "/usr/share/vdsm/API.py", line 1337, in getHardwareInfo
>>>>     hw = supervdsm.getProxy().getHardwareInfo()
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 53, in
>>>> __call__
>>>>     return callMethod()
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 51, in
>>>>
>>>>     **kwargs)
>>>>   File "", line 2, in getHardwareInfo
>>>>   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 759, in
>>>> _callmethod
>>>>     kind, result = conn.recv()
>>>> EOFError
>>
>>
>> If a request to supervdsm fails with EOFError, something bad happened
>> in supervdsm, and we should see the exception in the supervdsm log.
>>
>> Can you share supervdsm.log?
>>
>> Nir


Re: [ovirt-users] Failed to read hardware information

2016-10-10 Thread Nir Soffer
On Mon, Oct 10, 2016 at 5:05 AM, Charles Kozler 
wrote:

> Possibly stupid question but are you doing this on a base empty
> centos/rhel 7?
>
> On Oct 9, 2016 9:48 PM, "David Pinkerton"  wrote:
>
>>
>> I've spent the weekend trying to get to the bottom of this issue.
>>
>> Adding a Host fails:
>>
>> From RHVM
>>
>>
>> VDSM rhv1 command failed: Connection reset by peer
>> Could not get hardware information for host rhv1
>> VDSM rhv1 command failed: Failed to read hardware information
>> Host rhv1 installed
>> Network changes were saved on host rhv1
>> Installing Host rhv1. Stage: Termination.
>> Installing Host rhv1. Retrieving installation logs to:
>> '/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-20161010115606-192.168.21.71-24d39274.log'.
>> Installing Host rhv1. Stage: Pre-termination.
>> Installing Host rhv1. Starting ovirt-vmconsole-host-sshd.
>> Installing Host rhv1. Starting vdsm.
>> Installing Host rhv1. Stopping libvirtd.
>> Installing Host rhv1. Stage: Closing up.
>> Installing Host rhv1. Setting kernel arguments.
>> Installing Host rhv1. Stage: Transaction commit.
>> Installing Host rhv1. Enrolling serial console certificate.
>> Installing Host rhv1. Enrolling certificate.
>> Installing Host rhv1. Stage: Misc configuration.
>>
>>
>>
>> This was in the /var/log/vdsm/vdsm.log on the host trying to be added:
>>
>> jsonrpc.Executor/2::ERROR::2016-10-10 
>> 11:57:10,276::API::1340::vds::(getHardwareInfo)
>> failed to retrieve hardware info
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/API.py", line 1337, in getHardwareInfo
>> hw = supervdsm.getProxy().getHardwareInfo()
>>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 53, in
>> __call__
>> return callMethod()
>>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 51, in
>> 
>> **kwargs)
>>   File "", line 2, in getHardwareInfo
>>   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 759, in
>> _callmethod
>> kind, result = conn.recv()
>> EOFError
>>
>
If a request to supervdsm fails with EOFError, something bad happened
in supervdsm, and we should see the exception in the supervdsm log.

Can you share supervdsm.log?

Nir
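
To illustrate why EOFError points at the other process: the traceback ends
in conn.recv() from multiprocessing, and recv() raises EOFError when the
peer has gone away without sending a reply. The sketch below reproduces that
with a plain multiprocessing pipe. It is not how supervdsm is wired up; the
names crashing_worker and call_worker are hypothetical, and it assumes a
POSIX system where the "fork" start method is available.

```python
import multiprocessing
import os


def crashing_worker(conn):
    # Stand-in for a helper process that dies before it can reply.
    os._exit(1)


def call_worker():
    """Wait for a reply from a worker that never sends one."""
    ctx = multiprocessing.get_context('fork')
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=crashing_worker, args=(child_conn,))
    proc.start()
    # The parent must close its copy of the child's end; otherwise recv()
    # would block forever instead of seeing end-of-file.
    child_conn.close()
    try:
        parent_conn.recv()
        return 'reply received'
    except EOFError:
        return 'EOFError'
    finally:
        proc.join()
        parent_conn.close()
```

Here call_worker() returns 'EOFError': the caller learns only that the
connection closed, so the real cause has to come from the worker's own log
or a debugger.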


Re: [ovirt-users] Failed to read hardware information

2016-10-09 Thread Charles Kozler
Possibly stupid question but are you doing this on a base empty centos/rhel
7?

On Oct 9, 2016 9:48 PM, "David Pinkerton"  wrote:

>
> I've spent the weekend trying to get to the bottom of this issue.
>
> Adding a Host fails:
>
> From RHVM
>
>
> VDSM rhv1 command failed: Connection reset by peer
> Could not get hardware information for host rhv1
> VDSM rhv1 command failed: Failed to read hardware information
> Host rhv1 installed
> Network changes were saved on host rhv1
> Installing Host rhv1. Stage: Termination.
> Installing Host rhv1. Retrieving installation logs to:
> '/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-20161010115606-192.168.21.71-24d39274.log'.
> Installing Host rhv1. Stage: Pre-termination.
> Installing Host rhv1. Starting ovirt-vmconsole-host-sshd.
> Installing Host rhv1. Starting vdsm.
> Installing Host rhv1. Stopping libvirtd.
> Installing Host rhv1. Stage: Closing up.
> Installing Host rhv1. Setting kernel arguments.
> Installing Host rhv1. Stage: Transaction commit.
> Installing Host rhv1. Enrolling serial console certificate.
> Installing Host rhv1. Enrolling certificate.
> Installing Host rhv1. Stage: Misc configuration.
>
>
>
> This was in the /var/log/vdsm/vdsm.log on the host trying to be added:
>
> jsonrpc.Executor/2::ERROR::2016-10-10 
> 11:57:10,276::API::1340::vds::(getHardwareInfo)
> failed to retrieve hardware info
> Traceback (most recent call last):
>   File "/usr/share/vdsm/API.py", line 1337, in getHardwareInfo
> hw = supervdsm.getProxy().getHardwareInfo()
>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 53, in
> __call__
> return callMethod()
>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 51, in
> 
> **kwargs)
>   File "", line 2, in getHardwareInfo
>   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 759, in
> _callmethod
> kind, result = conn.recv()
> EOFError
>
>
> and then VDSM fails to start.
>
>
>
> Looking at the source code...
>
> def getHardwareInfoStructure():
>     dmiInfo = getAllDmidecodeInfo()
>     sysStruct = {}
>     for k1, k2 in (('system', 'Manufacturer'),
>                    ('system', 'Product Name'),
>                    ('system', 'Version'),
>                    ('system', 'Serial Number'),
>                    ('system', 'UUID'),
>                    ('system', 'Family')):
>         val = dmiInfo.get(k1, {}).get(k2, None)
>         if val not in [None, 'Not Specified']:
>             sysStruct[(k1 + k2).replace(' ', '')] = val
>
>     return sysStruct
>
>
>
> Running dmidecode from command line I get..
>
> System Information
> Manufacturer: Supermicro
> Product Name: H8DM8-2
> Version: 1234567890
> Serial Number: 1234567890
> UUID: 00020003-0004-0005-0006-000700080009
> Wake-up Type: Power Switch
> SKU Number: To Be Filled By O.E.M.
> Family: To Be Filled By O.E.M.
>
>
> Q: Is the string in Family the source of my problems??
>
> Q: Any work arounds??
>
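
On the question about the Family string: the key-building logic in
getHardwareInfoStructure() quoted above can be tried in isolation with the
values from the dmidecode output in this message. The sketch below is an
adaptation, not the vdsm code itself: the function name hardware_info is
made up, and it takes the DMI data as an argument instead of calling
getAllDmidecodeInfo(), so no dmidecode module is needed.

```python
FIELDS = (('system', 'Manufacturer'),
          ('system', 'Product Name'),
          ('system', 'Version'),
          ('system', 'Serial Number'),
          ('system', 'UUID'),
          ('system', 'Family'))


def hardware_info(dmi_info):
    """Build keys such as 'systemProductName' from section and field names."""
    result = {}
    for section, field in FIELDS:
        value = dmi_info.get(section, {}).get(field)
        # Only missing values and the literal 'Not Specified' are dropped.
        if value not in (None, 'Not Specified'):
            result[(section + field).replace(' ', '')] = value
    return result


# Values copied from the dmidecode command-line output above.
sample = {'system': {'Manufacturer': 'Supermicro',
                     'Product Name': 'H8DM8-2',
                     'Version': '1234567890',
                     'Serial Number': '1234567890',
                     'UUID': '00020003-0004-0005-0006-000700080009',
                     'Wake-up Type': 'Power Switch',
                     'SKU Number': 'To Be Filled By O.E.M.',
                     'Family': 'To Be Filled By O.E.M.'}}

print(hardware_info(sample))
```

With this input the placeholder text passes through as an ordinary string
under the key systemFamily, which suggests that this Python code does not
choke on the Family value.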