Re: [Gluster-users] Questions about healing
Yes, it's configurable with network.ping-timeout, and the default is 42 seconds, I believe.

On 22 May 2016 at 03:39, Kevin Lemonnier wrote:
> > Let's assume 10,000 shards on a server being healed.
> > Gluster heals 1 shard at a time, so would the other 9,999 pieces be read
> > from the other servers to keep the VM running? If yes, this is good.
> > If not, the whole VM needs to be healed and thus the whole VM would hang.
>
> Yes, that seems to be what's happening on 3.7.11.
> I couldn't notice any freeze during heals, except for a brief one when
> a node just went down: it looks like gluster hangs for a few seconds
> while waiting for the node before deciding to mark it down and continue
> without it.
>
> --
> Kevin Lemonnier
> PGP Fingerprint : 89A5 2283 04A0 E6E9 0111

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
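For anyone who wants to check or change that timeout, here is a minimal sketch using the gluster CLI. The volume name "myvol" is a placeholder, and the GLUSTER variable is only a convenience added for this example so the commands can be previewed with GLUSTER=echo before running them for real.

```shell
#!/bin/bash
# Sketch only: "myvol" in the usage note is a placeholder volume name.
# Set GLUSTER=echo to print the commands instead of running them.
GLUSTER="${GLUSTER:-gluster}"

# Print the current value of network.ping-timeout for a volume.
show_ping_timeout() {
    $GLUSTER volume get "$1" network.ping-timeout
}

# Set network.ping-timeout (in seconds) for a volume.
set_ping_timeout() {
    $GLUSTER volume set "$1" network.ping-timeout "$2"
}
```

Usage would look like `show_ping_timeout myvol` and `set_ping_timeout myvol 30`. A lower value should shorten the pause described above when a node disappears, but it probably also makes nodes get marked down more readily on a flaky network, so test it before changing it in production.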
Re: [Gluster-users] Questions about healing
2016-05-21 23:51 GMT+02:00 Kevin Lemonnier:
> But with 3.7.11 the shards are getting locked only during the heal of that
> specific shard. That means even if all your shards need healing, each shard
> is only getting locked for a few seconds, so the VM keeps running during the
> heal, it's great.

Let's assume 10,000 shards on a server being healed. Gluster heals 1 shard at a time, so would the other 9,999 pieces be read from the other servers to keep the VM running? If yes, this is good. If not, the whole VM needs to be healed and thus the whole VM would hang.
Re: [Gluster-users] Questions about healing
> That's clear, but if you have a cluster of 3 with replica 3 and you lose
> a node, you have to heal the whole VM image, as shards are not
> distributed because there aren't nodes to distribute to.
>
> All nodes have all shards in my case.
>
> Sharding would be useful in a very large cluster where each chunk is
> distributed, but on a small cluster it doesn't change much whether you have
> a single file or 500 files on the same server for the same VM image:
> you still have to heal everything every time you lose a node.

Oh no, I'm doing the same thing. The problem you are thinking about is, I think, present in 3.7.6: all shards get locked and it takes a while to heal (still less time than without sharding, as you don't need to heal everything). But with 3.7.11 the shards are locked only during the heal of that specific shard. That means even if all your shards need healing, each shard is only locked for a few seconds, so the VM keeps running during the heal. It's great.

Now, to be honest, I have a corruption problem with that, but since I seem to be the only one encountering it I think it might be hardware on my side. You really should test it yourself to see how it works for you.

--
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
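To try this on a test volume, a rough sketch of turning sharding on and watching a heal could look like the following. The volume name and the 64MB block size are example values, not recommendations from this thread, and the GLUSTER variable is again just a dry-run convenience for this example.

```shell
#!/bin/bash
# Sketch only: volume name and block size are example values.
# Set GLUSTER=echo to print the commands instead of running them.
GLUSTER="${GLUSTER:-gluster}"

# Enable the shard translator and choose a shard block size for a volume.
enable_sharding() {
    $GLUSTER volume set "$1" features.shard on
    $GLUSTER volume set "$1" features.shard-block-size "$2"
}

# List the entries that still need healing on each brick.
heal_status() {
    $GLUSTER volume heal "$1" info
}
```

For example: `enable_sharding myvol 64MB`, then `heal_status myvol` after bringing a node back. As far as I know, sharding only applies to files created after it is enabled, so existing VM images would have to be copied into the volume again to benefit from it.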
Re: [Gluster-users] Questions about healing
> What is the OS reaction to locked storage? Is it transparent or could it
> lead to FS issues?
> What would happen if healing starts in the middle of a write, for example
> when MySQL flushes to disk?

In my experience the OS can't notice that, actually. The access to the disk won't fail, it'll just wait for the heal to finish before returning: your VM will freeze. When the heal finishes, the access will return correctly and everything will continue as if nothing happened.

It might depend on the OS though. I have no idea how Windows would react, but for Linux at least, even hour-long heals never caused any problem to the VM, except of course the fact that the VM froze and all services on it stopped responding during the heal. We didn't have sharding at first, and the behaviour is pretty much the same, but the heals take forever.

--
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
Re: [Gluster-users] Questions about healing
On 21 May 2016 08:38, "Kevin Lemonnier" wrote:
> Yeah, but healing a shard of a few MB takes a few seconds, so the VM is
> frozen for a very small amount of time. Without sharding, the VM is frozen
> as long as the whole disk hasn't been healed, which will take hours on big
> clusters.

What is the OS reaction to locked storage? Is it transparent or could it lead to FS issues? What would happen if healing starts in the middle of a write, for example when MySQL flushes to disk?

Let's assume a 100 GB image, sharded (10 MB), in a replica 3 on a 3-node cluster. The image is split into 10,000 pieces, stored on all servers (replica 3). If a server goes down, the whole VM needs to be healed, thus the whole VM is locked.

Sharding is useful for single brick failures, which could be avoided by using RAID, not for server failures.
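The numbers in that example can be checked with a little shell arithmetic. This is only a sketch: the sizes use decimal units to match the 10,000 figure above, and the count of 50 changed shards in the usage note is invented purely for illustration.

```shell
#!/bin/bash
# shard_count IMAGE_BYTES SHARD_BYTES
# Number of pieces an image is split into (ceiling division).
shard_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# heal_bytes CHANGED_SHARDS SHARD_BYTES
# Upper bound on the data copied if only CHANGED_SHARDS pieces need healing.
heal_bytes() {
    echo $(( $1 * $2 ))
}
```

`shard_count 100000000000 10000000` gives 10000 pieces for a 100 GB image with 10 MB shards (with binary units, `shard_count 107374182400 10485760` gives 10240). If, say, 50 of those pieces changed while a node was down, `heal_bytes 50 10000000` gives 500000000 bytes, about 500 MB to copy instead of the full 100 GB.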
Re: [Gluster-users] Questions about healing
> Anyway, how is it possible to keep a VM up and running when healing is
> happening on a shard? That part of the disk image is not accessible and thus
> the VM could have some issue on a filesystem.

Yeah, but healing a shard of a few MB takes a few seconds, so the VM is frozen for a very small amount of time. Without sharding, the VM is frozen as long as the whole disk hasn't been healed, which will take hours on big clusters.

--
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
Re: [Gluster-users] Questions about healing
Well, it's not magic. There is an algorithm that is documented, and it is trivial to script the recreation of the file from the shards if gluster was truly unavailable:

    #!/bin/bash
    #
    # quick and dirty reconstruct file from shards
    # takes brick path and file name as arguments
    # Copyright May 20th 2016 A. Neil
    #
    brick=$1
    filen=$2
    # locate the base file (first block) on the brick
    file=$(find "$brick" -name "$filen" | head -n 1)
    inode=$(ls -i "$file" | awk '{print $1}')
    # the GFID is the name of the .glusterfs hard link sharing that inode
    gfid=$(basename "$(find "$brick/.glusterfs" -inum "$inode" | head -n 1)")
    nshard=$(ls -1 "$brick/.shard/$gfid".* | wc -l)
    cp "$file" "./$filen.restored"
    # append shards 1..N in numeric order
    for i in $(seq 1 "$nshard"); do
        cat "$brick/.shard/$gfid.$i" >> "./$filen.restored"
    done

Admittedly this is not as easy as pulling the image from the brick file system, but then the advantages are pretty big. The point is that each shard is small and healing them is fast. The majority of the time when you need to heal a VM, only a few blocks have changed, and without sharding you might have to heal 10, 20 or 100 GB. In my experience, if you have 30 or 40 VMs it can take hours to heal. With the limited testing I have done, I have found that yes, some VMs will experience IO timeouts, freeze, and then need to be restarted. However, at least you don't need to wait hours before you can do that.

On 20 May 2016 at 15:20, Gandalf Corvotempesta <gandalf.corvotempe...@gmail.com> wrote:
> On 20 May 2016 20:14, "Alastair Neil" wrote:
> >
> > I think you are confused about what sharding does. In a sharded
> > replica 3 volume all the shards exist on all the replicas so there is no
> > distribution. Might you be getting confused with erasure coding? The
> > upshot of sharding is that if you have a failure, instead of healing
> > multiple gigabyte VM files for example, you only heal the shards that have
> > changed. This generally shortens the heal time dramatically.
>
> I know what sharding is. It splits each file into multiple, smaller chunks.
>
> But if everything goes bad, how can I reconstruct a file from its shards
> without gluster? It would be a pain.
> Let's assume tens of terabytes of shards to be manually reconstructed ...
>
> Anyway, how is it possible to keep a VM up and running when healing is
> happening on a shard? That part of the disk image is not accessible and thus
> the VM could have some issue on a filesystem.
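One caveat with plain concatenation: it assumes every shard from 1 to N exists and that every piece except the last is exactly one block long. For sparse images that may not hold. Below is a hedged sketch of a variant that writes each shard at its own offset instead, assuming that shard number i holds the bytes starting at i times the block size, with the base file as block 0. The function name and its parameters are made up for this example, and the block size has to be supplied by the caller and must match the shard block size in effect when the image was created.

```shell
#!/bin/bash
# restore_sharded BASE_FILE SHARD_DIR GFID BLOCK_SIZE_BYTES OUTPUT
# Hypothetical helper: rebuilds OUTPUT from the base file plus the
# SHARD_DIR/GFID.N files, placing shard N at byte offset N * BLOCK_SIZE.
# Shards that do not exist are simply left as holes (zeros) in OUTPUT.
restore_sharded() {
    base=$1; shard_dir=$2; gfid=$3; bs=$4; out=$5

    # Block 0 is the base file itself.
    cp "$base" "$out" || return 1

    for shard in "$shard_dir/$gfid".*; do
        [ -e "$shard" ] || continue          # glob matched nothing
        idx=${shard##*.}                     # suffix after the last dot
        case $idx in
            ''|*[!0-9]*) continue ;;         # skip anything non-numeric
        esac
        # seek=N with bs=BLOCK_SIZE positions the write at N * BLOCK_SIZE;
        # conv=notrunc keeps what has already been written to OUTPUT.
        dd if="$shard" of="$out" bs="$bs" seek="$idx" conv=notrunc 2>/dev/null || return 1
    done
}
```

A call would look like `restore_sharded /bricks/b1/data/vm1.img /bricks/b1/data/.shard <gfid> 67108864 ./vm1.img.restored`, where the paths are placeholders and the GFID is found as in the script above. I have only reasoned this through on small dummy files, not on a real brick, so compare checksums against a known-good copy before trusting it.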
Re: [Gluster-users] Questions about healing
On 20 May 2016 20:14, "Alastair Neil" wrote:
>
> I think you are confused about what sharding does. In a sharded replica 3
> volume all the shards exist on all the replicas so there is no distribution.
> Might you be getting confused with erasure coding? The upshot of sharding is
> that if you have a failure, instead of healing multiple gigabyte VM files
> for example, you only heal the shards that have changed. This generally
> shortens the heal time dramatically.

I know what sharding is. It splits each file into multiple, smaller chunks.

But if everything goes bad, how can I reconstruct a file from its shards without gluster? It would be a pain. Let's assume tens of terabytes of shards to be manually reconstructed ...

Anyway, how is it possible to keep a VM up and running when healing is happening on a shard? That part of the disk image is not accessible and thus the VM could have some issue on a filesystem.
Re: [Gluster-users] Questions about healing
I think you are confused about what sharding does. In a sharded replica 3 volume all the shards exist on all the replicas, so there is no distribution. Might you be getting confused with erasure coding? The upshot of sharding is that if you have a failure, instead of healing multiple gigabyte VM files for example, you only heal the shards that have changed. This generally shortens the heal time dramatically.

Alastair

On 18 May 2016 at 12:54, Gandalf Corvotempesta <gandalf.corvotempe...@gmail.com> wrote:
> On 18/05/2016 13:55, Kevin Lemonnier wrote:
>
>> Yes, that's why you need to use sharding. With sharding, the heal is much
>> quicker and the whole VM isn't frozen during the heal, only the shard
>> being healed. I'm testing that right now myself and that's almost invisible
>> for the VM using 3.7.11. Use the latest version though, it really really
>> wasn't transparent in 3.7.6 :).
>>
> I don't like sharding. With sharding all "files" are split into shards and
> distributed across the whole cluster.
> If everything went bad, reconstructing a file from its shards could be hard.
Re: [Gluster-users] Questions about healing
On 18 May 2016 19:31, "Kevin Lemonnier" wrote:
> Seems like a non-issue, you are planning on using replica, right?

Yes, but what about a gluster bug? Replica protects against hardware failure, but software could fail too. What if the sharding algorithm is changed in the future and I'm not able to upgrade for whatever reason? I prefer to have files always available and not modified by software, if possible.

The same goes for hardware RAID. What if you have to change the card vendor? You have to manually back up and restore all files to the newer card, because RAID properties could be incompatible between different vendors.
Re: [Gluster-users] Questions about healing
On Wed, May 18, 2016 at 06:54:57PM +0200, Gandalf Corvotempesta wrote:
> On 18/05/2016 13:55, Kevin Lemonnier wrote:
> > Yes, that's why you need to use sharding. With sharding, the heal is
> > much quicker and the whole VM isn't frozen during the heal, only the
> > shard being healed. I'm testing that right now myself and that's
> > almost invisible for the VM using 3.7.11. Use the latest version
> > though, it really really wasn't transparent in 3.7.6 :).
> I don't like sharding. With sharding all "files" are split into shards and
> distributed across the whole cluster.
> If everything went bad, reconstructing a file from its shards could be hard.

Seems like a non-issue, you are planning on using replica, right? Anyway, I tried, and without sharding it's pretty much unusable: the VM just freezes for as long as it needs to redownload the whole disk.

--
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
Re: [Gluster-users] Questions about healing
On 18/05/2016 13:55, Kevin Lemonnier wrote:
> Yes, that's why you need to use sharding. With sharding, the heal is much
> quicker and the whole VM isn't frozen during the heal, only the shard being
> healed. I'm testing that right now myself and that's almost invisible for
> the VM using 3.7.11. Use the latest version though, it really really wasn't
> transparent in 3.7.6 :).

I don't like sharding. With sharding all "files" are split into shards and distributed across the whole cluster. If everything went bad, reconstructing a file from its shards could be hard.
Re: [Gluster-users] Questions about healing
On Wed, May 18, 2016 at 01:39:58PM +0200, Gandalf Corvotempesta wrote:
> Ciao,
> I'm planning a new infrastructure. I have some questions about
> healing to better optimize performance in case of brick failure.
>
> Let's assume this environment:
>
> 3 Supermicro servers, replica 3, with 12 SATA disks each.
> Each server has 2 bricks in RAID-6 (software or
> hardware, I don't know) made of 6 disks each.
>
> 1) In case of a single disk failure, healing would not
> happen as RAID is recovering on its own.

Correct.

> 2) In case of total brick failure (3 broken disks in a RAID-6),
> healing would happen, right? During the healing, is the whole brick
> locked for writes? Even if the other 2 servers are working properly?

Yeah, but that's transparent. You don't access bricks directly, you access the volume. Even if you think you are mounting a specific brick, you aren't.

> 3) In case of total server failure, healing would happen like in point 2?
>
> I'm asking this because I don't know whether to use Gluster to store
> virtual machine disk images or to mount a filesystem inside each virtual
> machine.
> In the first case, when healing happens, the whole VM is locked down, right?
> If the same brick stores multiple VM images, all VMs would be locked.

Yes, that's why you need to use sharding. With sharding, the heal is much quicker and the whole VM isn't frozen during the heal, only the shard being healed. I'm testing that right now myself and that's almost invisible for the VM using 3.7.11. Use the latest version though, it really really wasn't transparent in 3.7.6 :).

> In the second case, only the healed file is locked. As we host mainly
> webservers with tons of small files, healing would be almost transparent
> (healing a 20 KB file would take a second, not hours).

Yes, but gluster isn't great for small files. We do have a few websites on GlusterFS, but to make the performance acceptable you'll have to enable APCu with stat = 0. Using gluster for the VM disks instead of the application would avoid that. You should test both solutions though, and see what fits best!

--
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
[Gluster-users] Questions about healing
Ciao,
I'm planning a new infrastructure. I have some questions about healing to better optimize performance in case of brick failure.

Let's assume this environment:

3 Supermicro servers, replica 3, with 12 SATA disks each. Each server has 2 bricks in RAID-6 (software or hardware, I don't know) made of 6 disks each.

1) In case of a single disk failure, healing would not happen as RAID is recovering on its own.

2) In case of total brick failure (3 broken disks in a RAID-6), healing would happen, right? During the healing, is the whole brick locked for writes? Even if the other 2 servers are working properly?

3) In case of total server failure, healing would happen like in point 2?

I'm asking this because I don't know whether to use Gluster to store virtual machine disk images or to mount a filesystem inside each virtual machine.

In the first case, when healing happens, the whole VM is locked down, right? If the same brick stores multiple VM images, all VMs would be locked.

In the second case, only the healed file is locked. As we host mainly webservers with tons of small files, healing would be almost transparent (healing a 20 KB file would take a second, not hours).
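For reference, the layout described above (3 servers, 2 bricks each, replica 3) would roughly correspond to a volume created as in the sketch below. The host names server1 to server3 and the brick paths are invented for illustration, and GLUSTER=echo can be set to preview the command without running it.

```shell
#!/bin/bash
# Sketch only: host names and brick paths are hypothetical.
# Set GLUSTER=echo to print the command instead of running it.
GLUSTER="${GLUSTER:-gluster}"

# create_volume VOLNAME
# With "replica 3", each consecutive group of three bricks forms one replica
# set, so the first RAID-6 brick of every server goes in the first group and
# the second brick of every server goes in the second group.
create_volume() {
    $GLUSTER volume create "$1" replica 3 \
        server1:/bricks/b1/data server2:/bricks/b1/data server3:/bricks/b1/data \
        server1:/bricks/b2/data server2:/bricks/b2/data server3:/bricks/b2/data
}
```

Running `create_volume vmstore` this way should give a distributed-replicate volume with two replica sets, so every file has a copy on each of the three servers but lives on only one of the two bricks per server.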