[ovirt-users] Re: Random reboots

2022-02-17 Thread Strahil Nikolov via Users
As the rest of the cluster didn't have issues (check dmesg on the hypervisors), in 99% of cases it's the network. Check the server NICs, enclosure network devices, switches, and any backup running at the same time. I would start with any firmware upgrades of the server (if there are any). Best

[ovirt-users] Re: Random reboots

2022-02-17 Thread Pablo Olivera
Hi Nir, Thank you very much for all the help and information. We will continue to investigate the NFS server side to find what may be causing one of the hosts to lose access to storage. The strange thing is that it happens only on one of the NFS client hosts and not on all of them at the same

[ovirt-users] Re: Random reboots

2022-02-17 Thread Nir Soffer
On Thu, Feb 17, 2022 at 11:58 AM Nir Soffer wrote: > > On Thu, Feb 17, 2022 at 11:20 AM Pablo Olivera wrote: > > > > Hi Nir, > > > > > > Thank you very much for your detailed explanations. > > > > The pid 6398 looks like it's HostedEngine: > > > > audit/audit.log:type=VIRT_CONTROL

[ovirt-users] Re: Random reboots

2022-02-17 Thread Nir Soffer
On Thu, Feb 17, 2022 at 11:20 AM Pablo Olivera wrote: > > Hi Nir, > > > Thank you very much for your detailed explanations. > > The pid 6398 looks like it's HostedEngine: > > audit/audit.log:type=VIRT_CONTROL msg=audit(1644587639.935:7895): pid=3629 > uid=0 auid=4294967295 ses=4294967295 >

[ovirt-users] Re: Random reboots

2022-02-17 Thread Jiří Sléžka
On 2/16/22 23:37, Nir Soffer wrote: On Wed, Feb 16, 2022 at 9:18 PM Nir Soffer wrote: On Wed, Feb 16, 2022 at 5:12 PM Nir Soffer wrote: On Wed, Feb 16, 2022 at 10:10 AM Pablo Olivera wrote: Hi community, We're dealing with an issue as we occasionally have random reboots on any of our

[ovirt-users] Re: Random reboots

2022-02-17 Thread Pablo Olivera
Hi Nir, Thank you very much for your detailed explanations. The pid 6398 looks like it's HostedEngine: /audit/audit.log:type=VIRT_CONTROL msg=audit(1644587639.935:7895): pid=3629 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm op=start
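When lining an audit record like the one quoted above up against other host logs, it can help to convert the `audit(SECONDS.MILLIS:SERIAL)` stamp into a readable time. The following is only a minimal sketch: the helper name `parse_audit_stamp` is hypothetical, the sample record is shortened from the one in the message, and the result is reported in UTC, so the host's local timezone still has to be accounted for when comparing with logs written in local time.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the "audit(<epoch seconds>.<milliseconds>:<serial>)" stamp of an audit record.
AUDIT_STAMP = re.compile(r"audit\((\d+)\.(\d{3}):(\d+)\)")


def parse_audit_stamp(line):
    """Return (UTC datetime, serial) for an audit line, or None if no stamp is found.

    Hypothetical helper for illustration; not part of any oVirt or audit tooling.
    """
    match = AUDIT_STAMP.search(line)
    if match is None:
        return None
    seconds, millis, serial = (int(group) for group in match.groups())
    # Build the time from integers to avoid float rounding of the milliseconds.
    when = datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)
    return when, serial


# Shortened copy of the record quoted in the message.
record = "type=VIRT_CONTROL msg=audit(1644587639.935:7895): pid=3629 uid=0"
when, serial = parse_audit_stamp(record)
print(when.isoformat(timespec="milliseconds"), serial)
# prints: 2022-02-11T13:53:59.935+00:00 7895
```

The same function can be mapped over every line of an audit log to build a timeline that is directly comparable with timestamps from other hosts, as long as everything is first brought to the same timezone.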

[ovirt-users] Re: Random reboots

2022-02-16 Thread Nir Soffer
On Wed, Feb 16, 2022 at 9:18 PM Nir Soffer wrote: > > On Wed, Feb 16, 2022 at 5:12 PM Nir Soffer wrote: > > > > On Wed, Feb 16, 2022 at 10:10 AM Pablo Olivera wrote: > > > > > > Hi community, > > > > > > We're dealing with an issue as we occasionally have random reboots on > > > any of our

[ovirt-users] Re: Random reboots

2022-02-16 Thread Nir Soffer
On Wed, Feb 16, 2022 at 5:12 PM Nir Soffer wrote: > > On Wed, Feb 16, 2022 at 10:10 AM Pablo Olivera wrote: > > > > Hi community, > > > > We're dealing with an issue as we occasionally have random reboots on > > any of our hosts. > > We're using ovirt 4.4.3 in production with about 60 VM

[ovirt-users] Re: Random reboots

2022-02-16 Thread Nir Soffer
On Wed, Feb 16, 2022 at 10:10 AM Pablo Olivera wrote: > > Hi community, > > We're dealing with an issue as we occasionally have random reboots on > any of our hosts. > We're using ovirt 4.4.3 in production with about 60 VM distributed over > 5 hosts. We've a virtualized engine and a DRBD storage

[ovirt-users] Re: Random reboots

2022-02-16 Thread Pablo Olivera
Hi, Thanks for your answer. On the other nodes there are no errors in this period. In the Cisco log there are only link-down errors caused by the restart of 'nodo1'. There are no errors before that. I attach the Cisco log for this period. We use a bond between nodo1 and the Cisco switch. The storage is

[ovirt-users] Re: Random reboots

2022-02-16 Thread Valkov, Alexey
Hello, Pablo. It looks like nodo1 lost its connection to the storage (sanlock on nodo1 couldn't renew its leases), and then nodo1 was reset by the watchdog. Are there any errors in the logs on the other nodes during this period (15:02 - 15:03)? Are there any errors (near 15:02:13) in cisco9000's log (except