[ovirt-users] Re: [External] : Unrecoverable NMI error on HP Gen8 hosts.

2022-01-09 Thread Strahil Nikolov via Users
 
This most likely means you are hitting a firmware issue (most likely, not 
certainly).
Steps I would take:
- Check that the latest iLO firmware is deployed (iLO 4 recently got an update)
- Check that the BIOS is up to date
- Check that the HDD/controller firmware is up to date
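
The checklist above amounts to comparing what is installed against the newest release for each component. A minimal sketch of that comparison follows, assuming the versions are plain dotted numbers; the component names and version values are hypothetical placeholders, not real HPE release numbers, and real BIOS versions may need a different comparison.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '1.10' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


def outdated_components(installed: Dict[str, str], latest: Dict[str, str]) -> List[str]:
    """Return the components that are missing or older than the latest known version."""
    stale = []
    for name, target in latest.items():
        current = installed.get(name)
        if current is None or parse_version(current) < parse_version(target):
            stale.append(name)
    return stale


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT real HPE firmware versions.
    installed = {"ilo": "1.10", "bios": "1.2", "controller": "3.0"}
    latest = {"ilo": "1.20", "bios": "1.2", "controller": "3.1"}
    print(outdated_components(installed, latest))
```

Comparing integer tuples rather than raw strings avoids ranking "1.9" above "1.10".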

Neither the iLO nor the HDD updates are included in the HP SPP, as Gen8 is no 
longer actively supported.

Next, run HP SUM from the 2017.04.00 release 
(https://support.hpe.com/hpesc/public/swd/detail?swItemId=MTX_3f6b4074ed734dc3baf007612d), 
which is the latest for Gen8.

Check the IML logs, and for further hardware-related questions consult the HPE 
community.

Best Regards,
Strahil Nikolov

On Saturday, 8 January 2022, 19:57:52 GMT+2, Diggy Mc wrote:
 
> Hi Diggy,
> 
> I'm not sure if it's an oVirt issue, but it could be a network or firewall 
> issue. 
> Did you test the connection between the oVirt hosts and the iLO interfaces?
> Simple tests such as ping, to confirm each host can reach the other hosts' 
> iLO interfaces, and ipmitool, to confirm you can connect to the management 
> interfaces?
> 
> Marcos
> 

It is not network or firewall related.  There is no firewall between the oVirt 
hosts/engine and the iLO interfaces.  When I configured oVirt power management, 
the offered test passed.  I'm not running IPMI.  On a side note, the problem 
exists both before and after enabling oVirt's power management feature.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OZ4ITKNEGD5MITTYMMPONQQ23EFJQFEY/


[ovirt-users] Re: [External] : Unrecoverable NMI error on HP Gen8 hosts.

2022-01-08 Thread Diggy Mc
> Hi Diggy,
> 
> I'm not sure if it's an oVirt issue, but it could be a network or firewall 
> issue. 
> Did you test the connection between the oVirt hosts and the iLO interfaces?
> Simple tests such as ping, to confirm each host can reach the other hosts' 
> iLO interfaces, and ipmitool, to confirm you can connect to the management 
> interfaces?
> 
> Marcos
> 

It is not network or firewall related.  There is no firewall between the oVirt 
hosts/engine and the iLO interfaces.  When I configured oVirt power management, 
the offered test passed.  I'm not running IPMI.  On a side note, the problem 
exists both before and after enabling oVirt's power management feature.


[ovirt-users] Re: [External] : Unrecoverable NMI error on HP Gen8 hosts.

2022-01-03 Thread Marcos Sungaila
Hi Diggy,

I'm not sure if it's an oVirt issue, but it could be a network or firewall 
issue. 
Did you test the connection between the oVirt hosts and the iLO interfaces?
Simple tests such as ping, to confirm each host can reach the other hosts' iLO 
interfaces, and ipmitool, to confirm you can connect to the management 
interfaces?

Marcos
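
The two checks suggested above could be scripted roughly as follows. This is only a sketch under assumptions: it expects Linux `ping`, an installed `ipmitool`, and IPMI over LAN enabled on each iLO. The `ILO_USER` environment variable and its `admin` default are hypothetical names chosen for illustration; `-E` is the ipmitool option that reads the password from the `IPMI_PASSWORD` environment variable.

```python
import os
import subprocess
import sys
from typing import List


def ping_command(host: str, count: int = 3) -> List[str]:
    """Build a Linux ping command that sends `count` echo requests."""
    return ["ping", "-c", str(count), host]


def ipmi_status_command(host: str, user: str) -> List[str]:
    """Build an ipmitool call that asks the management controller for power status.

    -E tells ipmitool to take the password from the IPMI_PASSWORD variable.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E",
            "chassis", "power", "status"]


def succeeds(command: List[str], timeout: float = 15.0) -> bool:
    """Run a command and report whether it exited with status 0."""
    try:
        result = subprocess.run(command, capture_output=True, timeout=timeout)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # iLO addresses are passed as arguments; ILO_USER is a hypothetical variable.
    user = os.environ.get("ILO_USER", "admin")
    for host in sys.argv[1:]:
        print(host,
              "ping ok:", succeeds(ping_command(host)),
              "ipmi ok:", succeeds(ipmi_status_command(host, user)))
```

Running it from each oVirt host with the other hosts' iLO addresses as arguments would show whether every host can reach every management interface.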

-Original Message-
From: Diggy Mc  
Sent: Thursday, December 30, 2021 15:02
To: users@ovirt.org
Subject: [External] : [ovirt-users] Unrecoverable NMI error on HP Gen8 hosts.


I have oVirt Node v4.4.8.3 running on several HP ProLiant Gen8 servers.  I 
receive the following error under certain circumstances:
"An Unrecoverable System Error (NMI) has occurred (iLO application watchdog 
timeout NMI, Service Information: 0x002B, 0x)"
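
When searching exported log text for this entry, a small parser can help pull out the stated cause and the service information codes. The sketch below assumes the entry appears on a single line in the same wording as quoted above; the function name and the returned dictionary layout are illustrative choices, not an iLO API. The second code is truncated in the quoted message and is kept exactly as it appears.

```python
import re
from typing import Dict, Optional

# Matches the wording of the quoted iLO entry; other firmware may word it differently.
NMI_PATTERN = re.compile(
    r"Unrecoverable System Error \(NMI\) has occurred "
    r"\((?P<cause>[^,]+), Service Information: (?P<info>[^)]*)\)"
)


def parse_nmi_entry(line: str) -> Optional[Dict[str, object]]:
    """Extract the cause and service information codes from one log line."""
    match = NMI_PATTERN.search(line)
    if match is None:
        return None
    codes = [code.strip() for code in match.group("info").split(",") if code.strip()]
    return {"cause": match.group("cause").strip(), "service_info": codes}


if __name__ == "__main__":
    sample = ("An Unrecoverable System Error (NMI) has occurred "
              "(iLO application watchdog timeout NMI, "
              "Service Information: 0x002B, 0x)")
    print(parse_nmi_entry(sample))
```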

When a host starts taking a load (but nowhere near a threshold), I encounter 
the above iLO-logged error and the host locks up.  I have had to grossly 
under-utilize my hosts to avoid this problem.  I'm hoping for a better fix or 
workaround.

I've had the same problem since my oVirt 4.3.x hosts, so it isn't specific to 
one oVirt version.

The little information I could find on the error wasn't helpful.  Red Hat 
acknowledges the issue, but only for shutdown/reboot operations, not during 
"normal" operations.

Has anyone else experienced this problem?  How did you fix it or work around 
it?  I'd like to better utilize my servers if possible.

In advance, thank you to anyone and everyone who offers help.