Re: [ovirt-users] Host in kdumping state

2017-11-13 Thread Davide Ferrari
Sorry for not following up. As strange as it sounds, after replacing
the faulty HW while keeping the same old disks (basically moving the disks
into a new, identical server), the kdumping error disappeared and the host is
back to working as usual.
Maybe "kdumping" is displayed after a kernel panic reboot?

Thanks

2017-11-06 6:56 GMT+01:00 Oved Ourfali :

> Hosts should not be stuck in that status.
> Can you please attach the engine logs + the relevant host's logs?
>
> Also, are you using the latest 4.1?
>
> On Fri, Nov 3, 2017 at 3:52 PM, Davide Ferrari 
> wrote:
>
>>
>>
>> On 02/11/17 12:00, Davide Ferrari wrote:
>>
>>> I've got a faulty host that keeps rebooting itself from time to time
>>> (due to HW issues), that is/was part of the 3 hosts group hosting the
>>> HostedEngine, and now it always appears as "Kdumping" in the web
>>> administration panel.
>>>
>>
>> Hello again
>>
>> Does anybody have an idea how to at least reset this "kdumping" status?
>>
>> Thanks
>>
>>
>> --
>> Davide Ferrari
>> Lead System Engineer
>> Billy Performance Network
>>
>>
>
>


-- 
Davide Ferrari
Senior Systems Engineer
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Host in kdumping state

2017-11-05 Thread Oved Ourfali
Hosts should not be stuck in that status.
Can you please attach the engine logs + the relevant host's logs?

Also, are you using the latest 4.1?

On Fri, Nov 3, 2017 at 3:52 PM, Davide Ferrari  wrote:

>
>
> On 02/11/17 12:00, Davide Ferrari wrote:
>
>> I've got a faulty host that keeps rebooting itself from time to time (due
>> to HW issues), that is/was part of the 3 hosts group hosting the
>> HostedEngine, and now it always appears as "Kdumping" in the web
>> administration panel.
>>
>
> Hello again
>
> Does anybody have an idea how to at least reset this "kdumping" status?
>
> Thanks
>
>
> --
> Davide Ferrari
> Lead System Engineer
> Billy Performance Network
>
>


Re: [ovirt-users] Host in kdumping state

2017-11-03 Thread Davide Ferrari



On 02/11/17 12:00, Davide Ferrari wrote:
> I've got a faulty host that keeps rebooting itself from time to time
> (due to HW issues), that is/was part of the 3 hosts group hosting the
> HostedEngine, and now it always appears as "Kdumping" in the web
> administration panel.


Hello again

Does anybody have an idea how to at least reset this "kdumping" status?

Thanks

--
Davide Ferrari
Lead System Engineer
Billy Performance Network



[ovirt-users] Host in kdumping state

2017-11-02 Thread Davide Ferrari

Hello

I've got a faulty host that keeps rebooting itself from time to time 
(due to HW issues), that is/was part of the 3 hosts group hosting the 
HostedEngine, and now it always appears as "Kdumping" in the web 
administration panel.


All my hosts run oVirt 4.1 on CentOS 7.3 with glusterfs 3.7, except this one,
which was updated by mistake to CentOS 7.4 with glusterfs 3.8.


Is this due to the different OS/gluster version? How can I "reset" it? I
want to remove it permanently and assign the HostedEngine to another host.


Moreover, the main glusterfs volume, the one which holds the HE image, 
has some bricks on this failing machine (vm03):


# gluster volume status data_ssd
Status of volume: data_ssd
Gluster process                                           TCP Port  RDMA Port  Online  Pid
---------------------------------------------------------------------------------------------
Brick vm01.storage.billy:/gluster/ssd/data/brick          49156     0          Y       6039
Brick vm02.storage.billy:/gluster/ssd/data/brick          49153     0          Y       99097
Brick vm03.storage.billy:/gluster/ssd/data/arbiter_brick  49159     0          Y       5325
Brick vm03.storage.billy:/gluster/ssd/data/brick          N/A       N/A        N       N/A
Brick vm04.storage.billy:/gluster/ssd/data/brick          49152     0          Y       14811
Brick vm02.storage.billy:/gluster/ssd/data/arbiter_brick  49154     0          Y       99104
Self-heal Daemon on localhost                             N/A       N/A        Y       6753
Self-heal Daemon on vm01.storage.billy                    N/A       N/A        Y       79317
Self-heal Daemon on vm02.storage.billy                    N/A       N/A        Y       41778
Self-heal Daemon on vm04.storage.billy                    N/A       N/A        Y       125116

What's the best way to replace them? Is this guide still useful? 
https://support.rackspace.com/how-to/recover-from-a-failed-server-in-a-glusterfs-array/ 
(I guess so)
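
For reference, the offline bricks in a status listing like the one above can
be picked out with a few lines of Python. This is only a minimal sketch: the
function name offline_bricks and the sample text are made up for illustration,
and it assumes the column layout shown in this post (TCP Port, RDMA Port,
Online, Pid), including the way gluster wraps long brick paths onto a second
line. Other gluster versions may print the table differently.

```python
def offline_bricks(status_text):
    """Return brick paths whose Online column is 'N'.

    Assumes the 'gluster volume status' text layout quoted in this thread;
    this is an illustrative helper, not part of any gluster tooling.
    """
    offline = []
    pending = ""  # first half of a brick entry that was wrapped
    for raw in status_text.splitlines():
        line = raw.strip()
        # Skip headers and self-heal daemon rows unless we are
        # waiting for the second half of a wrapped brick entry.
        if not line.startswith("Brick ") and not pending:
            continue
        fields = (pending + line).split()
        pending = ""
        if len(fields) < 6:
            # Only "Brick <path prefix>" so far; the rest is on the next line.
            pending = line
            continue
        # Layout: Brick <host:path> <tcp port> <rdma port> <online> <pid>
        brick, online = fields[1], fields[-2]
        if online == "N":
            offline.append(brick)
    return offline


# Excerpt of the output posted above, in its wrapped form.
SAMPLE_STATUS = """\
Status of volume: data_ssd
Brick vm01.storage.billy:/gluster/ssd/data/
brick   49156 0  Y   6039
Brick vm03.storage.billy:/gluster/ssd/data/
arbiter_brick   49159 0  Y   5325
Brick vm03.storage.billy:/gluster/ssd/data/
brick   N/A N/A    N   N/A
Self-heal Daemon on localhost   N/A N/A    Y   6753
"""

if __name__ == "__main__":
    for brick in offline_bricks(SAMPLE_STATUS):
        print(brick)
```

On the excerpt this reports only the data brick on vm03; the arbiter brick on
the same host is still listed as online.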


Thanks!

--
Davide Ferrari
Lead System Engineer
Billy Performance Network
