On February 6, 2020 6:06:18 PM GMT+02:00, Christian Reiss <[email protected]> wrote:
>Hey folks,
>
>Running a 3-way HCI (again (sigh)) on gluster. The _inside_ of the
>VMs is backed up separately using bareos on an hourly basis, so files
>are present with a worst-case data loss of 59 minutes.
>
>On the outside, I thought of taking gluster snapshots and then
>syncing those .snap dirs away to a remote 10gig-connected machine on a
>weekly-or-so basis. As the contents of the snaps are the oVirt images
>(entire DC), I could re-set up gluster, copy those files back into
>gluster, and be done with it.
>
>Now some questions, if I may:
>
>- If the hosts remain intact but gluster dies, I simply set up Gluster,
>stop the oVirt engine (separate standalone hardware), copy everything
>back, and start the oVirt engine again. All disks are accessible again
>(tested). The bricks are marked as down (new bricks, same name). There
>is a "reset brick" button that made the bricks come back online again.
>What _exactly_ does it do? Does it reset the brick info in oVirt, or
>copy all the data over from another node and really, really reset the
>brick?
>
>- If the hosts remain intact but the engine dies: can I re-attach the
>engine to the running cluster?
>
>- If hosts and engine die and everything needs to be re-set up, would
>it be possible to do the setup wizard(s) again up to a running point,
>then copy the disk images to the new gluster-dc-data-dir? Would oVirt
>rescan the dir for newly found VMs?
>
>- If _one_ host dies but 2 and the engine remain online: what's the
>oVirt way of re-setting up the failed one? Reinstall the node and then
>what? Of all the cases above, this is the most likely one.
>
>Having had to reinstall the entire cluster three times already scares
>me. Always gluster related.
>
>Again, thank you community for your great efforts!
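[The weekly snapshot-and-sync idea above could be sketched roughly like this. The volume name "data", the target host "backuphost", and the destination path are placeholders, not taken from the thread, and the snapshot mount path may differ between Gluster versions -- a sketch, not a tested procedure:]

```
# Take a named volume snapshot (volume "data" is an assumption)
SNAP="weekly-$(date +%F)"
gluster snapshot create "$SNAP" data no-timestamp
gluster snapshot activate "$SNAP"

# Activated snapshots are typically mounted under /run/gluster/snaps/<snapname>/
# (path may vary by distribution/version -- verify on your nodes)
rsync -aHAX "/run/gluster/snaps/$SNAP/" "backuphost:/backups/gluster/$SNAP/"

gluster snapshot deactivate "$SNAP"
```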
Gluster's "reset brick" actually wipes the brick and starts a heal process from another brick. If your node dies, oVirt won't allow you to remove it until you restore the 'replica 3' status of gluster.

I think the fastest way to restore a node is:
1. Reinstall the node with the same hostname and network settings
2. Restore the gluster configuration directory /var/lib/glusterd/ from backup
3. Restart the node and initiate a reset brick
4. Go to the UI and remove the node that was defective
5. Add the node again

Voila.

About the gluster issues - you are not testing your upgrades enough, and if you use the cluster in production, that will be quite disruptive. For example, the ACL issue I ran into (and actually you did too) was discussed on the mailing list for 2 weeks before I managed to resolve it. I'm using the latest oVirt with Gluster v7 - but this is my lab and I can afford a downtime of a week (or even more). The more tested an oVirt/Gluster release is, the more reliable it will be.

Best Regards,
Strahil Nikolov
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/GUAT3VEJ4BAJN7PN4VCT4PGDXSL4OE4M/
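[For reference, steps 2-3 of the restore procedure above map onto commands roughly like the following. The volume name "data", the hostname "node3", the brick path, and the backup location are all assumptions for illustration; reset-brick wipes the brick, so double-check names before running anything like this:]

```
# Step 2: restore the node's gluster configuration from backup
# (/backup/var/lib/glusterd/ is a placeholder location)
systemctl stop glusterd
rsync -a /backup/var/lib/glusterd/ /var/lib/glusterd/
systemctl start glusterd

# Step 3: re-initialise the brick; the commit triggers a heal
# from the surviving replicas onto the emptied brick
gluster volume reset-brick data node3:/gluster_bricks/data/brick start
gluster volume reset-brick data node3:/gluster_bricks/data/brick \
    node3:/gluster_bricks/data/brick commit force

# Optionally force a full heal and watch progress
gluster volume heal data full
gluster volume heal data info
```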

