[ovirt-users] Re: Data recovery from (now unused, but still mounted) Gluster Volume for a single VM
Hi David,

I hope you manage to recover the VM or most of the data. If you have multiple disks in that VM (easily observable in the oVirt UI), you might need to repeat the procedure for the rest of the disks.

Check the inode size (isize) with xfs_info, as the default used to be 256, but I have noticed that in some cases mkfs.xfs picked a higher value (EL7). Also, check Gluster's logs, or at least keep them for a later check. A smaller inode size can cause a lot of really awkward issues in Gluster, but this needs to be verified.

Once the RAID is fully rebuilt, you will have to add both the HW RAID brick and the arbiter brick (add-brick replica 3 arbiter 1). As you will be reusing the arbiter brick, the safest approach is to mkfs.xfs it again and also increase the inode ratio to 90%.

Can you provide your volume info? The default shard size is just 64MB and transfer is quite fast, so there should be no locking or the symptoms reported. Once the healing is over, you should be ready for the rebuild of the other node.

Best Regards,
Strahil Nikolov

Ok, so right now, my production cluster is operating off of a single brick. I was planning on expanding the storage on the 2nd host next week, adding that back into the cluster, and getting the Replica 2, Arbiter 1 redundancy working again. How would you recommend I proceed with that plan, knowing that I'm currently operating off of a single brick on which I did NOT specify the size with `mkfs.xfs -i size=512`? Should I specify the size on the new brick I build next week, and then once everything is healed, reformat the current brick?

> And then there is a lot of information missing between the lines: I guess you
> are using a 3 node HCI setup and were adding new disks (/dev/sdb) on all
> three nodes and trying to move the glusterfs to those new bigger disks?

You are correct in that I'm using 3-node HCI. I originally built HCI with Gluster replication on all 3 nodes (Replica 3). As I'm increasing the storage, I'm also moving to an architecture of Replica 2/Arbiter 1. So yes, the plan was:

1) Convert FROM Replica 3 TO Replica 2/Arbiter 1
2) Convert again down to Replica 1 (so no replication... just operating storage on a single host)
3) Rebuild the RAID array (with larger storage) on one of the unused hosts, and rebuild the gluster bricks
4) Add the larger RAID back into gluster, let it heal
5) Now, remove the bricks from the host with the smaller storage -- THIS is where things went awry, and what caused the data loss on this 1 particular VM
--- This is where I am currently ---
6) Rebuild the RAID array on the remaining host that is now unused (this is what I am / was planning to do next week)

Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐
On Thursday, August 5th, 2021 at 3:12 PM, Thomas Hoberg wrote:

> If you manage to export the disk image via the GUI, the result should be a
> qcow2 format file, which you can mount/attach to anything Linux (well, if the
> VM was Linux... it didn't say)
>
> But it's perhaps easier to simply try to attach the disk of the failed VM as
> a secondary to a live VM to recover the data.

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XPUTVULWCBIRWJJZMHWRO7XB4VBVBSHU/
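The isize check Strahil recommends can be scripted. The sketch below parses xfs_info-style output; the sample line is canned so the snippet is self-contained, and the brick path in the comment is a placeholder:

```shell
# Sketch of checking a brick's XFS inode size, as suggested above.
# The xfs_output line is a canned sample; on a real host you would capture:
#   xfs_output="$(xfs_info /gluster_bricks/data)"   # path is a placeholder
xfs_output='meta-data=/dev/sdb1 isize=512 agcount=4, agsize=655360 blks'

# Extract the numeric isize value from the meta-data line.
isize=$(printf '%s\n' "$xfs_output" | grep -o 'isize=[0-9]*' | cut -d= -f2)
if [ "$isize" -ge 512 ]; then
    echo "OK: inode size is $isize"
else
    echo "WARNING: inode size is $isize (512 is recommended for Gluster bricks)"
fi
```

On EL7 and later, mkfs.xfs may already have picked 512, which is exactly why checking before deciding to reformat is worthwhile.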
[ovirt-users] Re: Data recovery from (now unused, but still mounted) Gluster Volume for a single VM
Thank you for all the responses. Following Strahil's instructions, I *think* that I was able to reconstruct the disk image. I'm just waiting for that image to finish downloading onto my local machine, at which point I'll try to import it into VirtualBox or something. Fingers crossed!

Worst case scenario, I do have backups for that particular VM from 3 months ago, which I have already restored onto a new VM. Losing 3 months of data is much better than losing 100% of the data from the past 2-3+ years. Thank you.

> First of all you didn't 'mkfs.xfs -i size=512'. You just 'mkfs.xfs', which
> is not good and could have caused your VM problems. Also, check with
> xfs_info the isize of the FS.

Ok, so right now, my production cluster is operating off of a single brick. I was planning on expanding the storage on the 2nd host next week, adding that back into the cluster, and getting the Replica 2, Arbiter 1 redundancy working again. How would you recommend I proceed with that plan, knowing that I'm currently operating off of a single brick on which I did NOT specify the size with `mkfs.xfs -i size=512`? Should I specify the size on the new brick I build next week, and then once everything is healed, reformat the current brick?

> And then there is a lot of information missing between the lines: I guess you
> are using a 3 node HCI setup and were adding new disks (/dev/sdb) on all
> three nodes and trying to move the glusterfs to those new bigger disks?

You are correct in that I'm using 3-node HCI. I originally built HCI with Gluster replication on all 3 nodes (Replica 3). As I'm increasing the storage, I'm also moving to an architecture of Replica 2/Arbiter 1. So yes, the plan was:

1) Convert FROM Replica 3 TO Replica 2/Arbiter 1
2) Convert again down to Replica 1 (so no replication... just operating storage on a single host)
3) Rebuild the RAID array (with larger storage) on one of the unused hosts, and rebuild the gluster bricks
4) Add the larger RAID back into gluster, let it heal
5) Now, remove the bricks from the host with the smaller storage -- THIS is where things went awry, and what caused the data loss on this 1 particular VM
--- This is where I am currently ---
6) Rebuild the RAID array on the remaining host that is now unused (this is what I am / was planning to do next week)

Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐
On Thursday, August 5th, 2021 at 3:12 PM, Thomas Hoberg wrote:

> If you manage to export the disk image via the GUI, the result should be a
> qcow2 format file, which you can mount/attach to anything Linux (well, if the
> VM was Linux... it didn't say)
>
> But it's perhaps easier to simply try to attach the disk of the failed VM as
> a secondary to a live VM to recover the data.

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/CZDHKGQES4ZOGGFJIBB46CZEGD647DLZ/
[ovirt-users] Re: Data recovery from (now unused, but still mounted) Gluster Volume for a single VM
If you manage to export the disk image via the GUI, the result should be a qcow2 format file, which you can mount/attach to anything Linux (well, if the VM was Linux... it didn't say).

But it's perhaps easier to simply try to attach the disk of the failed VM as a secondary to a live VM to recover the data.

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/SLXLQ4BLQUPBV5355DFFACF6LFJX4MWY/
[ovirt-users] Re: Data recovery from (now unused, but still mounted) Gluster Volume for a single VM
First off, I have very little hope you'll be able to recover your data working at the gluster level...

And then there is a lot of information missing between the lines: I guess you are using a 3 node HCI setup and were adding new disks (/dev/sdb) on all three nodes and trying to move the glusterfs to those new, bigger disks? Resizing/moving/adding or removing disks are "natural" operations for Gluster. But oVirt isn't "gluster native" and may not be so forgiving if you just swap device paths on bricks. Practical guides on how to replace the storage without downtime (after all, this is an HA solution, right?) are somehow missing from the oVirt documentation, and if I were a rich man, perhaps I'd get myself an RHV support contract and see if RHEL engineers would say anything but "not supported".

The first thing I'd recommend is to create some temporary space. I found using an extra disk as NFS storage on one of the hosts was a good way to gain some maneuvering room, e.g. for backups.

You can try to attach the disk of the broken VM as a secondary to another, good VM to see if the data can be salvaged from there. But before you attach it (and perhaps an automatic fsck ruins it for you), you can perhaps create a copy on the NFS export/backup (domain). If you weren't out of space, you'd just create a local copy and work with that. You can also try exporting the disk image, but in my experience there is a lot of untested or slow code in that operation.

If that image happens to be empty (I've seen that happen) or the data on it cannot be recovered, there is little to be gained by trying to work at the GlusterFS level. The logical disk image file will be chunked into 64MB bits, and their order is buried deep either in GlusterFS or in oVirt, and perhaps your business is the better place to invest your energy. But there is a good chance the data portion of that disk image still has your data.

The fact that oVirt/KVM generally pauses VMs when it has issues with the storage tends to preserve and protect your data rather better than what happens when physical hosts suffer brownouts or power glitches.

I guess you'll have learned that oVirt doesn't protect you from making mistakes; it only tries to offer some resilience against faults. It's good and valuable to report these things, because it helps others to learn, too.

I sincerely hope you'll make do!

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XSCXBSKCI3RKM5FUH57WG6JCMHOE7PMB/
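Thomas's "copy it before anything can touch it" advice can be sketched as follows. All paths are placeholders, and a small dummy file stands in for the real disk image so the snippet is self-contained:

```shell
# Sketch: copy a disk image to scratch/NFS space and verify the copy before
# attaching anything to a live VM (so an automatic fsck cannot ruin the only
# copy). Paths are placeholders; a 1 MiB dummy file stands in for the image.
workdir=$(mktemp -d)
image="$workdir/vm-disk.img"        # stand-in for the broken VM's disk image
backup="$workdir/vm-disk.img.bak"   # stand-in for the copy on the NFS export

head -c 1048576 /dev/urandom > "$image"
dd if="$image" of="$backup" bs=65536 2>/dev/null

# Verify the copy is byte-for-byte identical before touching the original.
orig_sum=$(sha256sum "$image" | cut -d' ' -f1)
copy_sum=$(sha256sum "$backup" | cut -d' ' -f1)
[ "$orig_sum" = "$copy_sum" ] && echo "copy verified" || echo "checksum mismatch"
```

Verifying the checksum matters here because any later repair attempt (fsck, qemu-img operations) should run against the copy, with the original kept pristine.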
[ovirt-users] Re: Data recovery from (now unused, but still mounted) Gluster Volume for a single VM
*should be 2

On Thu, Aug 5, 2021 at 7:42, Strahil Nikolov wrote:

> when you use 'remove-brick replica 1', you need to specify the removed
> bricks, which should be 1 (data brick and arbiter). Something is missing
> in your description.
>
> Best Regards,
> Strahil Nikolov
>
> On Thu, Aug 5, 2021 at 7:33, Strahil Nikolov via Users wrote:

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/GQHJPBVCOT3N6AD2TJMF2RT4PANR2KPK/
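The remove-brick form being discussed can be sketched as below: dropping from replica 2 + arbiter to replica 1 means naming both bricks being removed. Volume, host, and path names are invented, and the snippet only assembles and prints the command for review rather than executing it:

```shell
# Sketch of the 'remove-brick replica 1' invocation discussed above: both the
# data brick AND the arbiter brick of the removed replica must be listed.
# Volume/host/path names are invented; the command is printed, not executed.
volume="data"
data_brick="host2:/gluster_bricks/data/data"
arbiter_brick="host3:/gluster_bricks/data/data"

cmd="gluster volume remove-brick $volume replica 1 $data_brick $arbiter_brick force"
echo "$cmd"
```

Printing the command first is deliberate: a remove-brick that names the wrong bricks is exactly the kind of step that caused the data loss in this thread.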
[ovirt-users] Re: Data recovery from (now unused, but still mounted) Gluster Volume for a single VM
when you use 'remove-brick replica 1', you need to specify the removed bricks, which should be 1 (data brick and arbiter). Something is missing in your description.

Best Regards,
Strahil Nikolov

On Thu, Aug 5, 2021 at 7:33, Strahil Nikolov via Users wrote:

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/26R3AM3EEAEQ5LTORWFX7RMY4F5YYI65/
[ovirt-users] Re: Data recovery from (now unused, but still mounted) Gluster Volume for a single VM
First of all you didn't 'mkfs.xfs -i size=512'. You just 'mkfs.xfs', which is not good and could have caused your VM problems. Also, check with xfs_info the isize of the FS.

You have to find the UUID of the disks of the affected VM. Then go to the removed host and find that file -> this is the so-called shard 1. Then you need to find the gfid of the file. The easiest way is to go to the "dead" cluster and find the hard links in the .glusterfs directory. Something like this:

ssh old host (the one specified in the remove-brick)
cd /gluster_bricks/data/data/<domain_uuid>/images/<disk_uuid>
ls -li -> take the first number
find /gluster_bricks/data/data -inum <that number>

It should show you both the file and the gfid.

Then copy the file from images/<disk_uuid>/<file>. Go to /gluster_bricks/data/data/.shard and list all files of that gfid:

ls -l <gfid>.*

These are your shards. Just cat the first file + the shards (in numeric order) into another file. This should be your VM disk.

Best Regards,
Strahil Nikolov

On Tue, Aug 3, 2021 at 12:58, David White via Users wrote:

Hi Patrick,

This would be amazing, if possible. Checking /gluster_bricks/data/data on the host where I've removed (but not replaced) the bricks, I see a single directory. When I go into that directory, I see two directories:

dom_md
images

If I go into the images directory, I think I see the hash folders that you're referring to, and inside each of those, I see the 3 files you referenced. Unfortunately, those files clearly don't have all of the data. The parent folder for all of the hash folders is only 687M.

[root@cha1-storage data]# du -skh *
687M    31366488-d845-445b-b371-e059bf71f34f

And the "iso" files are small. The one I'm looking at now is only 19M. It appears that most of the actual data is located in /gluster_bricks/data/data/.glusterfs, and all of those folders are totally random, incomprehensible directories that I'm not sure how to understand. Perhaps you were on an older version of Gluster, and the actual data hierarchy is different? I don't know.

But I do see the 3 files you referenced, so that's a start, even if they are nowhere near the correct size.

Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐
On Tuesday, August 3rd, 2021 at 1:49 AM, Patrick Lomakin wrote:

> Greetings, I once wondered how data is stored between replicated bricks.
> Specifically, how disks are stored on the storage domain in Gluster. I
> checked a mounted brick via the standard path (path may be different)
> /gluster/data/data and saw many directories there. Maybe the hierarchy is
> different, can't check now. But in the end I got a list of directories. Each
> directory name is a disk image hash. After going to a directory such as /<HASH>
> there were 3 files. The first is a disk in raw/iso/qcow2 format (but the file
> has no extension, I looked at the size); the other two files are the
> configuration and metadata. I downloaded the disk image file (.iso) to my
> computer via the curl command and the service www.station307.com (no ads). And I
> got the original .iso which I had uploaded to the storage domain through the hosted
> engine interface. Maybe this way you can download the disk image to your
> computer and then load it via the GUI and connect it to a virtual machine.
> Good luck!

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/K6DFRDDV5HTRZFLXKO5274AB4RUXOHV6/
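Strahil's reassembly step (cat the base file plus its shards in numeric order) can be sketched as follows. Everything here is simulated with tiny dummy files so the snippet is self-contained; on a real brick the base file sits under images/ and the pieces under .shard/ as <gfid>.1, <gfid>.2, and so on, each up to the shard size (64MB by default). The brick layout and gfid below are fabricated:

```shell
# Sketch of rebuilding a sharded image: base file + <gfid>.N pieces, appended
# in NUMERIC order (lexical order would put .10 before .2 and corrupt the
# result). The brick directory and gfid are made up for this demo.
brick=$(mktemp -d)
gfid="3d1d9c55-0000-0000-0000-000000000000"   # fabricated example gfid
mkdir -p "$brick/.shard"

printf 'BASE' > "$brick/base_file"            # stands in for the file under images/
i=1
while [ "$i" -le 11 ]; do
    printf 'S%02d' "$i" > "$brick/.shard/$gfid.$i"   # dummy shard contents
    i=$((i+1))
done

# Reassemble: start from a copy of the base file, then append shard 1, 2, 3,
# ... until a numbered shard is missing. Counting numerically sidesteps any
# sort-order bugs entirely.
out="$brick/recovered.img"
cp "$brick/base_file" "$out"
i=1
while [ -f "$brick/.shard/$gfid.$i" ]; do
    cat "$brick/.shard/$gfid.$i" >> "$out"
    i=$((i+1))
done
echo "rebuilt $(wc -c < "$out") bytes"
```

One caveat: on a real volume a missing numbered shard can also mean a hole in a sparsely written image, which would stop this loop early, so it is worth comparing the rebuilt size against the disk's virtual size before trusting the result.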
[ovirt-users] Re: Data recovery from (now unused, but still mounted) Gluster Volume for a single VM
Hi Patrick,

This would be amazing, if possible. Checking /gluster_bricks/data/data on the host where I've removed (but not replaced) the bricks, I see a single directory. When I go into that directory, I see two directories:

dom_md
images

If I go into the images directory, I think I see the hash folders that you're referring to, and inside each of those, I see the 3 files you referenced. Unfortunately, those files clearly don't have all of the data. The parent folder for all of the hash folders is only 687M.

[root@cha1-storage data]# du -skh *
687M    31366488-d845-445b-b371-e059bf71f34f

And the "iso" files are small. The one I'm looking at now is only 19M. It appears that most of the actual data is located in /gluster_bricks/data/data/.glusterfs, and all of those folders are totally random, incomprehensible directories that I'm not sure how to understand. Perhaps you were on an older version of Gluster, and the actual data hierarchy is different? I don't know.

But I do see the 3 files you referenced, so that's a start, even if they are nowhere near the correct size.

Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐
On Tuesday, August 3rd, 2021 at 1:49 AM, Patrick Lomakin wrote:

> Greetings, I once wondered how data is stored between replicated bricks.
> Specifically, how disks are stored on the storage domain in Gluster. I
> checked a mounted brick via the standard path (path may be different)
> /gluster/data/data and saw many directories there. Maybe the hierarchy is
> different, can't check now. But in the end I got a list of directories. Each
> directory name is a disk image hash. After going to a directory such as /<HASH>
> there were 3 files. The first is a disk in raw/iso/qcow2 format (but the file
> has no extension, I looked at the size); the other two files are the
> configuration and metadata. I downloaded the disk image file (.iso) to my
> computer via the curl command and the service www.station307.com (no ads). And I
> got the original .iso which I had uploaded to the storage domain through the hosted
> engine interface. Maybe this way you can download the disk image to your
> computer and then load it via the GUI and connect it to a virtual machine.
> Good luck!

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/QFCDE36MUQFFNQQULUYWWI7DBU3GG2KF/
[ovirt-users] Re: Data recovery from (now unused, but still mounted) Gluster Volume for a single VM
Greetings, I once wondered how data is stored between replicated bricks; specifically, how disks are stored on the storage domain in Gluster. I checked a mounted brick via the standard path (the path may be different), /gluster/data/data, and saw many directories there. Maybe the hierarchy is different; I can't check now. But in the end I got a list of directories. Each directory name is a disk image hash.

After going to a directory such as /<HASH>, there were 3 files. The first is a disk in raw/iso/qcow2 format (but the file has no extension; I looked at the size); the other two files are the configuration and metadata. I downloaded the disk image file (.iso) to my computer via the curl command and the service www.station307.com (no ads). And I got the original .iso which I had uploaded to the storage domain through the hosted engine interface. Maybe this way you can download the disk image to your computer and then load it via the GUI and connect it to a virtual machine. Good luck!

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/A6XITCEX5RNQB37YKDCR4EUKTV6W4HIR/
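Rather than guessing an extensionless image's format from its size, the first bytes identify it. This sketch checks the qcow2 and ISO 9660 magic values; a fake qcow2 header is written so the snippet is self-contained, and on a real host `file <path>` or `qemu-img info <path>` does the same job:

```shell
# Sketch: identify an extensionless disk image by its magic bytes.
# qcow2 files start with the bytes 'Q' 'F' 'I' 0xfb; ISO 9660 images carry
# "CD001" at offset 32769. A fake qcow2 header is created for demonstration.
img=$(mktemp)
printf 'QFI\373' > "$img"            # fabricated qcow2 magic
head -c 1024 /dev/zero >> "$img"     # padding, stands in for image data

if [ "$(head -c 3 "$img")" = "QFI" ]; then
    kind="qcow2"
elif [ "$(dd if="$img" bs=1 skip=32769 count=5 2>/dev/null)" = "CD001" ]; then
    kind="iso9660"
else
    kind="raw or unknown"
fi
echo "detected: $kind"
```

Raw images have no magic of their own, which is why they land in the fallback branch; anything not recognized as qcow2 or ISO is treated as raw.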