[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-14 Thread Adrian Quintero
Thanks Gobinda, I am in the process of finishing up the 9-node cluster; once done I will test this ansible role... On Fri, Jun 14, 2019 at 12:45 PM Gobinda Das wrote: > We have an ansible role to replace a gluster node. I think it works only with > the same FQDN. >

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-14 Thread Gobinda Das
We have an ansible role to replace a gluster node. I think it works only with the same FQDN. https://github.com/sac/gluster-ansible-maintenance I am not sure if it covers all scenarios, but you can try with the same FQDN. On Fri, Jun 14, 2019 at 7:13 AM Adrian Quintero wrote: > Strahil, > Thanks for all the

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-13 Thread Adrian Quintero
Strahil, Thanks for all the follow-up. I will try to reproduce the same scenario today: deploy a 9-node cluster, completely kill the initiating node (vmm10), and see if I can recover using the extra-server approach (different IP/FQDN). If I am able to recover I will also try to test with your

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-12 Thread Strahil
Hi Adrian, Please keep in mind that when a server dies, the easiest way to recover is to get another freshly installed server with a different IP/FQDN. Then you will need to use 'replace-brick', and once gluster replaces that node you should be able to remove the old entry in oVirt. Once the
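For reference, a minimal CLI sketch of the replace-brick flow described above, run from a healthy gluster node. The volume name (vmstore1), host names (old-node/new-node) and brick paths are placeholders, not taken from this thread:

# probe the freshly installed server into the trusted pool
gluster peer probe new-node.mydomain.com
# replace the dead node's brick with a brick on the new server
gluster volume replace-brick vmstore1 \
    old-node.mydomain.com:/gluster_bricks/vmstore1/vmstore1 \
    new-node.mydomain.com:/gluster_bricks/vmstore1/vmstore1 \
    commit force
# let the replica heal, then verify before detaching the dead peer
gluster volume heal vmstore1 info summary
gluster peer detach old-node.mydomain.com

The "commit force" form swaps the brick immediately and lets the self-heal daemon copy the data onto the new brick in the background.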

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-12 Thread Adrian Quintero
Strahil, I don't use the GUI that much; in this case I need to understand how it is all tied together if I want to move to production. As far as Gluster goes, I can do the administration through the CLI; however, my test environment was set up using gdeploy for the hyperconverged setup

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-11 Thread Strahil Nikolov
Do you have empty space to store the VMs? If yes, you can always script the migration of the disks via the API. Even a bash script and curl can do the trick. About the /dev/sdb, I still don't get it. A plain "df -hT" from a node will make it much clearer. I guess '/dev/sdb' is a PV and you got
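A rough bash-and-curl sketch of what that scripted disk migration could look like, assuming the disk "move" action of the oVirt v4 REST API; the engine URL, credentials and both UUIDs are placeholders:

# list disks to find the ones sitting on the affected storage domain
curl -s -k -u admin@internal:PASSWORD \
    https://engine.mydomain.com/ovirt-engine/api/disks
# move one disk to another storage domain (both UUIDs are placeholders)
curl -s -k -u admin@internal:PASSWORD \
    -X POST -H "Content-Type: application/xml" \
    -d '<action><storage_domain id="TARGET-SD-UUID"/></action>' \
    https://engine.mydomain.com/ovirt-engine/api/disks/DISK-UUID/move

Wrapping the second call in a loop over the disk IDs returned by the first is all the "script" really needs.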

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-11 Thread Adrian Quintero
adding gluster pool list:
UUID                                  Hostname            State
2c86fa95-67a2-492d-abf0-54da625417f8  vmm12.mydomain.com  Connected
ab099e72-0f56-4d33-a16b-ba67d67bdf9d  vmm13.mydomain.com  Connected
c35ad74d-1f83-4032-a459-079a27175ee4  vmm14.mydomain.com  Connected
aeb7712a-e74e-4492-b6af-9c266d69bfd3

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-11 Thread Adrian Quintero
Strahil, Looking at your suggestions, I think I need to provide a bit more info on my current setup.
1. I have 9 hosts in total.
2. I have 5 storage domains:
   - hosted_storage (Data Master)
   - vmstore1 (Data)
   - data1 (Data)
   - data2

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-10 Thread Strahil
Hi Adrian, You have several options: A) If you have space on another gluster volume (or volumes) or on NFS-based storage, you can migrate all VMs live. Once you do that, the simplest way will be to stop and remove the storage domain (from the UI) and the gluster volume that corresponds to the problematic
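On the gluster side, option A would end with dropping the volume that backed the removed storage domain. A minimal sketch, assuming a volume named data1 and that the storage domain has already been put into maintenance and removed in the UI:

# stop and delete the gluster volume that backed the removed storage domain
gluster volume stop data1
gluster volume delete data1
# the old brick directories can then be cleaned up on the surviving nodes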

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-10 Thread Adrian Quintero
Thanks for pointing me in the right direction. I was able to add the server to the cluster by adding /etc/vdsm/vdsm.id. I will now try to create the new bricks and try a brick replacement; this part I think I will have to do through the command line, because my hyperconverged setup with a replica 3 is as
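For the brick-creation part, a hedged sketch of preparing one brick by hand, roughly mirroring the gdeploy-style layout; the device, VG/LV names and volume name are placeholders, not taken from this thread:

pvcreate /dev/sdb
vgcreate gluster_vg_sdb /dev/sdb
# thin pool plus a thin LV for the brick
lvcreate -L 500G -T gluster_vg_sdb/gluster_thinpool_sdb
lvcreate -V 500G -T gluster_vg_sdb/gluster_thinpool_sdb -n gluster_lv_vmstore1
mkfs.xfs -i size=512 /dev/gluster_vg_sdb/gluster_lv_vmstore1
mkdir -p /gluster_bricks/vmstore1
mount /dev/gluster_vg_sdb/gluster_lv_vmstore1 /gluster_bricks/vmstore1
mkdir /gluster_bricks/vmstore1/vmstore1
# label the mount so SELinux does not block the brick
semanage fcontext -a -t glusterd_brick_t "/gluster_bricks/vmstore1(/.*)?"
restorecon -Rv /gluster_bricks/vmstore1

Remember to also add the brick mount to /etc/fstab so it survives a reboot.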

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-10 Thread Leo David
https://stijn.tintel.eu/blog/2013/03/02/ovirt-problem-duplicate-uuids On Mon, Jun 10, 2019, 18:13 Adrian Quintero wrote: > Ok I have tried reinstalling the server from scratch with a different name > and IP address and when trying to add it to the cluster I get the following > error: > > Event

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-10 Thread Leo David
Hi, I think you can generate and use a new UUID, although I am not sure about the procedure right now... On Mon, Jun 10, 2019, 18:13 Adrian Quintero wrote: > Ok I have tried reinstalling the server from scratch with a different name > and IP address and when trying to add it to the cluster I get
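The procedure hinted at here is most likely the duplicate-UUID fix from the stijn.tintel.eu post linked in the other reply: give the reinstalled host a fresh VDSM ID before adding it to the engine. A minimal sketch, to be run on the reinstalled host (treat it as an assumption, not an official procedure):

# regenerate the VDSM host id so it no longer collides with the dead host's entry
uuidgen > /etc/vdsm/vdsm.id
# then retry adding the host from the engine UI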

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-10 Thread Adrian Quintero
Can you let me know how to fix the gluster and missing brick? I tried removing it by going to "Storage > Volumes > vmstore > Bricks" and selecting the brick. However, it is showing an unknown status (which is expected because the server was completely wiped), so if I try to "remove" or "replace brick"

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-10 Thread Dmitry Filonov
At this point I'd go to the engine VM and remove the host from the postgres DB manually. A bit of a hack, but...
ssh root@
su - postgres
cd /opt/rh/rh-postgresql10/
source enable
psql engine
select vds_id from vds_static where host_name='myhost1.mydomain.com';
select DeleteVds('');
Of course, keep in

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-10 Thread Strahil
Hi Adrian, Did you fix the issue with the gluster and the missing brick? If yes, try to set the 'old' host in maintenance and then forcefully remove it from oVirt. If it succeeds (and it should), then you can add the server back and then check what happens. Best Regards, Strahil Nikolov On Jun

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-10 Thread Adrian Quintero
Ok, I have tried reinstalling the server from scratch with a different name and IP address, and when trying to add it to the cluster I get the following error:
Event details
ID: 505
Time: Jun 10, 2019, 10:00:00 AM
Message: Host myshost2.virt.iad3p installation failed. Host myhost2.mydomain.com reports

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-08 Thread Adrian Quintero
Leo, I did try putting it under maintenance and checking the option to ignore gluster, and it did not work. Error while executing action: - Cannot remove Host. Server having Gluster volume. Note: the server was already reinstalled, so gluster will never see the volumes or bricks for this server. I will rename

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-07 Thread Leo David
You will need to remove the storage role from that server first (so it is no longer part of the gluster cluster). I cannot test this right now on production, but maybe putting the host (although it has already died) under "maintenance" while checking the option to ignore the gluster warning will let you remove it. Maybe I am wrong

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-06 Thread Dmitry Filonov
Can you remove the bricks that belong to the fried server, either from the GUI or the CLI? You should be able to do so, and then it should allow you to remove the host from the oVirt setup. -- Dmitry Filonov Linux Administrator SBGrid Core | Harvard Medical School 250 Longwood Ave, SGM-114 Boston, MA 02115 On
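A hedged CLI sketch of removing the dead node's brick from a replica 3 volume; the volume name and brick path are placeholders. Note that this temporarily drops the volume to replica 2 until a replacement brick is added back:

# drop the dead node's brick, reducing the replica count from 3 to 2
gluster volume remove-brick vmstore1 replica 2 \
    deadnode.mydomain.com:/gluster_bricks/vmstore1/vmstore1 force
# later, restore redundancy with "gluster volume add-brick ... replica 3" on the new host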

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-06 Thread Edward Berger
I'll presume you didn't fully back up the root file system on the host which was fried. It may be easier to replace it with a new hostname/IP. I would focus on the gluster config first, since it was hyperconverged. I don't know which way the engine UI is using to detect the gluster mount on the missing

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-06 Thread adrianquintero
It is definitely a challenge trying to replace a bad host. So let me tell you what I see and have done so far:
1.- I have a host that went bad due to HW issues.
2.- This bad host is still showing in the Compute --> Hosts section.
3.- This host was part of a hyperconverged setup with Gluster.
4.- The

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-06 Thread Strahil Nikolov
Have you tried with the "Force remove" tick? Best Regards, Strahil Nikolov On Thursday, June 6, 2019, 21:47:20 GMT+3, Adrian Quintero wrote: I tried removing the bad host but am running into the following issue, any idea? Operation Canceled Error while executing action:

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-06 Thread Adrian Quintero
I tried removing the bad host but am running into the following issue, any idea? Operation Canceled. Error while executing action: host1.mydomain.com - Cannot remove Host. Server having Gluster volume. On Thu, Jun 6, 2019 at 11:18 AM Adrian Quintero wrote: > Leo, I forgot to mention that I

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-06 Thread Adrian Quintero
Leo, I forgot to mention that I have 1 SSD disk for caching purposes; I am wondering how that setup should be achieved. Thanks, Adrian On Wed, Jun 5, 2019 at 11:25 PM Adrian Quintero wrote: > Hi Leo, yes, this helps a lot, this confirms the plan we had in mind. > > Will test tomorrow and post the
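For the SSD cache question, a hedged lvmcache sketch of what a hyperconverged deployment typically does: attach the SSD as a cache pool to the brick thin pool. The device and VG/LV names are placeholders matching the brick sketch earlier in the thread:

# add the SSD to the brick VG and carve a cache pool out of it
pvcreate /dev/sdc
vgextend gluster_vg_sdb /dev/sdc
lvcreate -L 400G -n gluster_cache_sdb gluster_vg_sdb /dev/sdc
lvconvert --type cache-pool gluster_vg_sdb/gluster_cache_sdb
# attach the cache pool to the thin pool that holds the bricks
lvconvert --type cache --cachepool gluster_vg_sdb/gluster_cache_sdb \
    gluster_vg_sdb/gluster_thinpool_sdb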

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-05 Thread Adrian Quintero
Hi Leo, yes, this helps a lot; this confirms the plan we had in mind. Will test tomorrow and post the results. Thanks again, Adrian On Wed, Jun 5, 2019 at 11:18 PM Leo David wrote: > Hi Adrian, > I think the steps are: > - reinstall the host > - join it to the virtualisation cluster > And if it was

[ovirt-users] Re: Replace bad Host from a 9 Node hyperconverged setup 4.3.3

2019-06-05 Thread Leo David
Hi Adrian, I think the steps are:
- reinstall the host
- join it to the virtualisation cluster
And if it was a member of the gluster cluster as well:
- go to Host - Storage Devices - create the bricks on the devices, as they are on the other hosts
- go to Storage - Volumes - replace each failed brick with the