Thanks Gobinda. I am in the process of finishing up the 9-node cluster; once done, I will test this Ansible role...
On Fri, Jun 14, 2019 at 12:45 PM Gobinda Das <[email protected]> wrote:
> We have an ansible role to replace a gluster node. I think it works only with the same FQDN.
> https://github.com/sac/gluster-ansible-maintenance
> I am not sure if it covers all scenarios, but you can try with the same FQDN.
>
> On Fri, Jun 14, 2019 at 7:13 AM Adrian Quintero <[email protected]> wrote:
>
>> Strahil,
>> Thanks for all the follow-up. I will try to reproduce the same scenario today: deploy a 9-node cluster, completely kill the initiating node (vmm10), and see if I can recover using the extra-server approach (different IP/FQDN). If I am able to recover, I will also test your suggested second approach (same IP/FQDN).
>> My objective here is to document the possible recovery scenarios without any downtime or impact.
>>
>> I have already documented a few setup and recovery scenarios with 6 and 9 nodes in a hyperconverged setup, and I will make them available to the community, hopefully this week, including the tests you have been helping me with. Hopefully this will help others who are in the same situation, and it will also get me feedback from more knowledgeable admins so that I can take this into production in the near future.
>>
>> Thanks again.
>>
>> On Wed, Jun 12, 2019 at 11:58 PM Strahil <[email protected]> wrote:
>>
>>> Hi Adrian,
>>>
>>> Please keep in mind that when a server dies, the easiest way to recover is to get another freshly installed server with a different IP/FQDN. Then you will need to use 'replace-brick', and once gluster replaces that node you should be able to remove the old entry in oVirt. Once the old entry is gone, you can add the new installation in oVirt via the UI.
>>>
>>> Another approach is to keep the same IP/FQDN for the fresh install. In this situation, you need to have the same gluster ID (which should be a text file) and the peer IDs. Most probably you can create them on your own, based on the data on the other gluster peers. Once the fresh install shows up in 'gluster peer', you can initiate a 'reset-brick' (don't forget to set up SELinux, the firewall and the repos) and a full heal. From there you can reinstall the machine from the UI and it should be available for usage.
>>>
>>> P.S.: I know that the whole procedure is not so easy :)
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>> On Jun 12, 2019 19:02, Adrian Quintero <[email protected]> wrote:
>>>
>>> Strahil, I don't use the GUI that much; in this case I need to understand how it is all tied together if I want to move to production. As far as Gluster goes, I can do the administration through the CLI; however, my test environment was set up using gdeploy for the hyperconverged setup under oVirt.
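>>> For reference, this is roughly how I pull the layout shown below from the CLI (plain Gluster/LVM/df commands, nothing oVirt-specific; the device and volume names are just the ones from my setup):
>>>
>>>     # list volumes and their bricks
>>>     gluster volume info | grep -E 'Volume Name|Brick[0-9]'
>>>     # block devices backing the bricks (sde is the SSD cache disk)
>>>     lsblk /dev/sdb /dev/sdc /dev/sdd /dev/sde
>>>     # brick filesystems and the fuse mounts oVirt uses
>>>     df -hT | grep -E 'gluster_bricks|glusterSD'
>>>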
>>> The initial setup was 3 servers with the same set of physical disks: sdb, sdc, sdd, sde (the last one used for caching, as it is an SSD).
>>>
>>> vmm10.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>>> vmm10.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>>> vmm10.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>>> vmm10.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>>
>>> vmm11.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>>> vmm11.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>>> vmm11.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>>> vmm11.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>>
>>> vmm12.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>>> vmm12.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>>> vmm12.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>>> vmm12.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>>
>>> As you can see from the above, the engine volume is composed of bricks on hosts vmm10 (the initiating cluster server, now the dead server), vmm11 and vmm12, on block device /dev/sdb (100GB LV); the vmstore1 volume is also on /dev/sdb (2600GB LV).
>>>
>>> /dev/mapper/gluster_vg_sdb-gluster_lv_engine    xfs             100G  2.0G   98G   2%  /gluster_bricks/engine
>>> /dev/mapper/gluster_vg_sdb-gluster_lv_vmstore1  xfs             2.6T   35M  2.6T   1%  /gluster_bricks/vmstore1
>>> /dev/mapper/gluster_vg_sdc-gluster_lv_data1     xfs             2.7T  4.6G  2.7T   1%  /gluster_bricks/data1
>>> /dev/mapper/gluster_vg_sdd-gluster_lv_data2     xfs             2.7T  9.5G  2.7T   1%  /gluster_bricks/data2
>>> vmm10.mydomain.com:/engine                      fuse.glusterfs  300G  9.2G  291G   4%  /rhev/data-center/mnt/glusterSD/vmm10.virt.iad3p:_engine
>>> vmm10.mydomain.com:/vmstore1                    fuse.glusterfs  5.1T   53G  5.1T   2%  /rhev/data-center/mnt/glusterSD/vmm10.virt.iad3p:_vmstore1
>>> vmm10.mydomain.com:/data1                       fuse.glusterfs  8.0T   95G  7.9T   2%  /rhev/data-center/mnt/glusterSD/vmm10.virt.iad3p:_data1
>>> vmm10.mydomain.com:/data2                       fuse.glusterfs  8.0T  112G  7.8T   2%  /rhev/data-center/mnt/glusterSD/vmm10.virt.iad3p:_data2
>>>
>>> Before any issues occurred, I expanded the cluster and the gluster cluster with the following hosts, creating 4 distributed-replicated volumes (engine, vmstore1, data1, data2):
>>>
>>> vmm13.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>>> vmm13.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>>> vmm13.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>>> vmm13.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>>
>>> vmm14.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>>> vmm14.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>>> vmm14.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>>> vmm14.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>>
>>> vmm15.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>>> vmm15.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>>> vmm15.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>>> vmm15.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>>
>>> vmm16.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>>> vmm16.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>>> vmm16.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>>> vmm16.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>>
>>> vmm17.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>>> vmm17.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>>> vmm17.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>>> vmm17.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>>
>>> vmm18.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>>> vmm18.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>>> vmm18.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>>> vmm18.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>>
>>> With your first suggestion I don't think it is possible to recover, as I will lose the engine if I stop the "engine" volume. It might be doable for vmstore1, data1 and data2, but not for the engine:
>>> A) If you have space on another gluster volume (or volumes) or on NFS-based storage, you can migrate all VMs live. Once you do it, the simple way will be to stop and remove the storage domain (from the UI) and the gluster volume that corresponds to the problematic brick. Once gone, you can remove the entry in oVirt for the old host and add the newly built one. Then you can recreate your volume and migrate the data back.
>>>
>>> I tried removing the brick using the CLI but got the following error:
>>> volume remove-brick start: failed: Host node of the brick vmm10.mydomain.com:/gluster_bricks/engine/engine is down
>>>
>>> So I used the force command:
>>> gluster vol remove-brick engine vmm10.mydomain.com:/gluster_bricks/engine/engine vmm11.mydomain.com:/gluster_bricks/engine/engine vmm12.mydomain.com:/gluster_bricks/engine/engine force
>>> Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume.
>>> Do you want to continue? (y/n) y
>>> volume remove-brick commit force: success
>>>
>>> So I lost my engine:
>>> Please enter your authentication name: vdsm@ovirt
>>> Please enter your password:
>>>  Id    Name            State
>>> ----------------------------------------------------
>>>  3     HostedEngine    paused
>>>
>>> hosted-engine --vm-start
>>> The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.
>>>
>>> I guess this failure scenario is more complex than I thought; the hosted engine should have survived. As far as gluster goes, I can get around on the command line; the issue is the engine: even though it was running on vmm18 and not on any bricks belonging to vmm10, 11 or 12 (the original setup), it still failed...
>>> virsh list --all
>>> Please enter your authentication name: vdsm@ovirt
>>> Please enter your password:
>>>  Id    Name            State
>>> ----------------------------------------------------
>>>  -     HostedEngine    shut off
>>>
>>> Now I can't get it to start:
>>> hosted-engine --vm-start
>>> The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.
>>>
>>> df -hT is still showing mounts from the old host's bricks; could the problem be that this was the initiating host of the hyperconverged setup?
>>> vmm10.mydomain.com:/engine  fuse.glusterfs  200G  6.2G  194G  4%  /rhev/data-center/mnt/glusterSD/vmm10.mydomain.com:_engine
>>>
>>> I will re-create everything from scratch and simulate this again, and see why it is so complex to recover oVirt's engine with gluster when a server dies completely. Maybe it is my lack of understanding of how oVirt integrates with gluster, though I have a decent enough understanding of Gluster to work with it...
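>>> Next time, before concluding the engine is unrecoverable, I plan to at least check the HA services and the engine volume from one of the surviving hosts; a rough checklist (standard oVirt/Gluster commands, nothing specific to my environment):
>>>
>>>     # is the HA stack up on the hosts that can run the engine?
>>>     systemctl status ovirt-ha-agent ovirt-ha-broker
>>>     # current view of the hosted engine across hosts
>>>     hosted-engine --vm-status
>>>     # is the engine volume still defined, started and fully replicated?
>>>     gluster volume info engine
>>>     gluster volume status engine
>>>     gluster volume heal engine info
>>>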
>>> I will let you know once I have the cluster recreated; I will kill the same server and see if I missed anything from the recommendations you provided.
>>>
>>> Thanks,
>>>
>>> --
>>> Adrian.
>>>
>>> On Tue, Jun 11, 2019 at 4:13 PM Strahil Nikolov <[email protected]> wrote:
>>>
>>> Do you have empty space to store the VMs? If yes, you can always script the migration of the disks via the API. Even a bash script and curl can do the trick.
>>>
>>> About the /dev/sdb, I still don't get it. A plain "df -hT" from a node will make it way clearer. I guess '/dev/sdb' is a PV and you have 2 LVs on top of it.
>>>
>>> Note: I should admit that as an admin I don't use the UI for gluster management.
>>>
>>> For now do not try to remove the brick. The approach is either to migrate the qemu disks to another storage or to reset-brick/replace-brick in order to restore the replica count. I will check the file and I will try to figure it out.
>>>
>>> Redeployment never fixes the issue, it just speeds up the recovery. If you can afford the time to spend on fixing the issue - then do not redeploy.
>>>
>>> I would be able to take a look next week, but keep in mind that I'm not so deep into oVirt - I only started playing with it when I deployed my lab.
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>> Strahil,
>>>
>>> Looking at your suggestions, I think I need to provide a bit more info on my current setup.
>>>
>>> 1. I have 9 hosts in total.
>>> 2. I have 5 storage domains:
>>>    - hosted_storage (Data Master)
>>>    - vmstore1 (Data)
>>>    - data1 (Data)
>>>    - data2 (Data)
>>>    - ISO (NFS) // had to create this one because oVirt 4.3.3.1 would not let me upload disk images to a data domain without an ISO domain (I think this is due to a bug)
>>> 3. Each volume is of the type "Distributed Replicate" and each one is composed of 9 bricks. I started with 3 bricks per volume due to the initial hyperconverged setup, then I expanded the cluster and the gluster cluster by 3 hosts at a time until I got to a total of 9 hosts.
>>>
>>> Disks, bricks and sizes used per volume:
>>>    /dev/sdb  engine    100GB
>>>    /dev/sdb  vmstore1  2600GB
>>>    /dev/sdc  data1     2600GB
>>>    /dev/sdd  data2     2600GB
>>>    /dev/sde  --------  400GB SSD, used for caching purposes
>>>
>>> From the above layout a few questions came up:
>>>
>>> 1. Using the web UI, how can I create a 100GB brick and a 2600GB brick to replace the bad bricks for "engine" and "vmstore1" within the same block device (sdb)? What about /dev/sde (the caching disk)? When I tried creating a new brick through the UI I saw that I could use /dev/sde for caching, but only for 1 brick (i.e. vmstore1), so if I try to create another brick, how would I specify that the same /dev/sde device is to be used for caching?
>>>
>>> 2. If I want to remove a brick, it being a replica 3, I go to Storage > Volumes > select the volume > Bricks; once in there I can select the 3 servers that compose the replicated bricks and click remove. This gives a pop-up window with the following info:
>>>
>>> Are you sure you want to remove the following Brick(s)?
>>>    - vmm11:/gluster_bricks/vmstore1/vmstore1
>>>    - vmm12.virt.iad3p:/gluster_bricks/vmstore1/vmstore1
>>>    - 192.168.0.100:/gluster-bricks/vmstore1/vmstore1
>>>    - Migrate Data from the bricks?
>>>
>>> If I proceed with this, that means I will have to do it for all 4 volumes, which is just not very efficient. If that is the only way, then I am hesitant to put this into a real production environment, as there is no way I can take that kind of a hit for 500+ VMs :) and I also won't have that much storage or extra volumes to play with in a real scenario.
>>>
>>> 3. After modifying /etc/vdsm/vdsm.id yesterday, following https://stijn.tintel.eu/blog/2013/03/02/ovirt-problem-duplicate-uuids, I was able to add the server back to the cluster using a new FQDN and a new IP, and I tested replacing one of the bricks. This is my mistake, as mentioned above: I used /dev/sdb entirely for 1 brick, because through the UI I could not split the block device to be used for 2 bricks (one for the engine and one for vmstore1). So in the "gluster vol info" output you might see vmm102.mydomain.com, but in reality it is myhost1.mydomain.com.
>>>
>>> 4. I am also attaching gluster_peer_status.txt; in the last 2 entries of that file you will see an entry for vmm10.mydomain.com (old/bad entry) and vmm102.mydomain.com (new entry, same server vmm10, but renamed to vmm102). Also please find the gluster_vol_info.txt file.
>>>
>>> 5. I am ready to redeploy this environment if needed, but I am also ready to test any other suggestion. If I can get a good understanding of how to recover from this, I will be ready to move to production.
>>>
>>> 6. Wondering if you'd be willing to have a look at my setup through a shared screen?
>>>
>>> Thanks,
>>>
>>> Adrian
>>>
>>> On Mon, Jun 10, 2019 at 11:41 PM Strahil <[email protected]> wrote:
>>>
>>> Hi Adrian,
>>>
>>> You have several options:
>>> A) If you have space on another gluster volume (or volumes) or on NFS-based storage, you can migrate all VMs live. Once you do it, the simple way will be to stop and remove the storage domain (from the UI) and the gluster volume that corresponds to the problematic brick. Once gone, you can remove the entry in oVirt for the old host and add the newly built one. Then you can recreate your volume and migrate the data back.
>>>
>>> B) If you don't have space, you have to use a riskier approach (usually it shouldn't be risky, but I had bad experience in gluster v3):
>>> - The new server has the same IP and hostname:
>>> Use the command line and run 'gluster volume reset-brick VOLNAME HOSTNAME:BRICKPATH HOSTNAME:BRICKPATH commit'.
>>> Replace VOLNAME with your volume name.
>>> A more practical example would be:
>>> 'gluster volume reset-brick data ovirt3:/gluster_bricks/data/brick ovirt3:/gluster_bricks/data/brick commit'
>>>
>>> If it refuses, then you have to clean up '/gluster_bricks/data' (which should be empty).
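>>> A rough sketch of that cleanup, in case the brick directory is not empty because it was reused (the paths here match the example above; the xattrs only exist if the directory was previously part of a volume):
>>>
>>>     # remove leftover gluster metadata from the old brick directory
>>>     setfattr -x trusted.glusterfs.volume-id /gluster_bricks/data/brick
>>>     setfattr -x trusted.gfid /gluster_bricks/data/brick
>>>     rm -rf /gluster_bricks/data/brick/.glusterfs
>>>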
>>> Also check that the new peer has been probed via 'gluster peer status'. Check that the firewall is allowing gluster communication (you can compare it to the firewalls on the other gluster hosts).
>>>
>>> The automatic healing will kick in within 10 minutes (if it succeeds) and will stress the other 2 replicas, so pick your time properly.
>>> Note: I'm not recommending you use the 'force' option in the previous command ... for now :)
>>>
>>> - The new server has a different IP/hostname:
>>> Instead of 'reset-brick' you can use 'replace-brick'. It should be like this:
>>> gluster volume replace-brick data old-server:/path/to/brick new-server:/new/path/to/brick commit force
>>>
>>> In both cases check the status via:
>>> gluster volume info VOLNAME
>>>
>>> If your cluster is in production, I really recommend the first option, as it is less risky and the chance of unplanned downtime will be minimal.
>>>
>>> The 'reset-brick' in your previous e-mail shows that one of the servers is not connected. Check the peer status on all servers; if there are fewer peers than there should be, check for network and/or firewall issues. On the new node check that glusterd is enabled and running.
>>>
>>> In order to debug, you should provide more info, like 'gluster volume info' and the peer status from each node.
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>> On Jun 10, 2019 20:10, Adrian Quintero <[email protected]> wrote:
>>> >
>>> > Can you let me know how to fix the gluster and the missing brick?
>>> > I tried removing it by going to Storage > Volumes > vmstore > Bricks and selecting the brick.
>>> > However, it is showing an unknown status (which is expected because the server was completely wiped), so if I try to "remove", "replace brick" or "reset brick" it won't work.
>>> > If I do remove brick: Incorrect bricks selected for removal in Distributed Replicate volume. Either all the selected bricks should be from the same sub volume or one brick each for every sub volume!
>>> > If I try "replace brick" I can't, because I don't have another server with extra bricks/disks.
>>> > And if I try "reset brick": Error while executing action Start Gluster Volume Reset Brick: Volume reset brick commit force failed: rc=-1 out=() err=['Host myhost1_mydomain_com not connected']
>>> >
>>> > Are you suggesting to try and fix the gluster using the command line?
>>> >
>>> > Note that I can't "peer detach" the server, so if I force the removal of the bricks, would I need to force a downgrade to replica 2 instead of 3? What would happen to oVirt, as it only supports replica 3?
>>> >
>>> > thanks again.
>>> >
>>> > On Mon, Jun 10, 2019 at 12:52 PM Strahil <[email protected]> wrote:
>>> >>
>>> >> Hi Adrian,
>>> >> Did you fix the issue with the gluster and the missing brick?
>>> >> If yes, try to set the 'old' host in maintenance an
>>>
>>> --
>>> Adrian Quintero
>>>
>>> --
>>> Adrian Quintero
>>
>> --
>> Adrian Quintero
>
> --
> Thanks,
> Gobinda

--
Adrian Quintero