Ok, here's the output for 1816/images/disk.0:

# file: bricks/VmDir01/1816/images/disk.0
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.VmDir-client-0=0x000000000000000000000000
trusted.afr.VmDir-client-1=0x000000000000000000000000
trusted.gfid=0x1cef9d386f1c4424af6d95dfbcf2989b

# file: bricks/VmDir02/1816/images/disk.0
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.VmDir-client-0=0x000000000000000000000000
trusted.afr.VmDir-client-1=0x000000000000000000000000
trusted.gfid=0x1cef9d386f1c4424af6d95dfbcf2989b

And for 1814/images/disk.0:

# file: bricks/VmDir01/1814/images/disk.0
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.VmDir-client-0=0x000000010000000000000000
trusted.afr.VmDir-client-1=0x000000010000000000000000
trusted.gfid=0xaabc0c344ccc4cfe8e2ed588dd78323b

# file: bricks/VmDir02/1814/images/disk.0
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.VmDir-client-0=0x000000010000000000000000
trusted.afr.VmDir-client-1=0x000000010000000000000000
trusted.gfid=0xaabc0c344ccc4cfe8e2ed588dd78323b
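If I decode those trusted.afr values correctly (assuming the usual AFR changelog layout of three big-endian 32-bit counters for data, metadata and entry operations), the non-zero value on 1814 would simply mean one pending data operation recorded on each brick. A quick bash sketch of that decoding, for reference; please correct me if my assumption about the layout is wrong:

# decode a trusted.afr changelog value, assuming it is made of three
# big-endian 32-bit counters: data, metadata, entry
val=000000010000000000000000
echo "data=$((16#${val:0:8})) metadata=$((16#${val:8:8})) entry=$((16#${val:16:8}))"
# prints: data=1 metadata=0 entry=0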
Note that these are just two sample files, since the problem occurs with 100% of our "big" virtual machines.

Here's the whole content of the GlusterFS volume along with file sizes:

6.3G  ./1981/images/disk.0
 53M  ./1820/images/disk.0
9.7G  ./1838/images/disk.0
 10G  ./1819/images/disk.0
9.2G  ./1818/images/disk.0
 10G  ./1816/images/disk.0
 53M  ./1962/images/disk.0
 10G  ./1814/images/disk.0
6.2G  ./1988/images/disk.0
 10G  ./1817/images/disk.0
 53M  ./1821/images/disk.0

We currently have 11 running VMs. The "small" ones (53 MB big) have never shown any problem so far. *All* the other VMs (6 to 10 GB big) periodically show up in the output of:

gluster volume heal VmDir info

when there's some intense I/O occurring, disappearing again shortly afterwards.
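Just to give an idea of what "intense I/O" means here, this is roughly how we can make the entries show up at will (the dd command below is only an illustration of the kind of write load, not the exact workload our VMs run):

# inside one of the "big" VMs: generate some sustained write load
dd if=/dev/zero of=/root/ioload.tmp bs=1M count=4096 oflag=direct

# meanwhile, on one of the hypervisors mounting the volume:
watch -n1 'gluster volume heal VmDir info'

# the disk.0 files of the busy VMs appear among the entries for a while,
# then disappear again once the writes settle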
Thanks, cheers,
--
: Dario Berzano
: CERN PH-SFT & Università di Torino (Italy)
: Wiki: http://newton.ph.unito.it/~berzano
: GPG: http://newton.ph.unito.it/~berzano/gpg
: Mobiles: +41 766124782 (CH), +39 3487222520 (IT)

On 14 Sep 2012, at 18:21, Pranith Kumar Karampuri <[email protected]> wrote:

> Dario,
> Ok that confirms that it is not a split-brain. Could you post the getfattr
> output I requested as well? What is the size of the VM files?
>
> Pranith
>
> ----- Original Message -----
> From: "Dario Berzano" <[email protected]>
> To: "Pranith Kumar Karampuri" <[email protected]>
> Cc: "<[email protected]>" <[email protected]>
> Sent: Friday, September 14, 2012 9:42:38 PM
> Subject: Re: [Gluster-users] Virtual machines and self-healing on GlusterFS v3.3
>
> # gluster volume heal VmDir info healed
>
> Heal operation on volume VmDir has been successful
>
> Brick one-san-01:/bricks/VmDir01
> Number of entries: 259
> Segmentation fault (core dumped)
>
> (same story for heal-failed) which seems to be exactly this bug:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=836421
>
> Should I upgrade to the latest QA RPMs to see what is going on?
>
> Btw, with split-brain I have no entries:
>
> Heal operation on volume VmDir has been successful
>
> Brick one-san-01:/bricks/VmDir01
> Number of entries: 0
>
> Brick one-san-02:/bricks/VmDir02
> Number of entries: 0
>
> Thank you, cheers,
> --
> : Dario Berzano
> : CERN PH-SFT & Università di Torino (Italy)
> : Wiki: http://newton.ph.unito.it/~berzano
> : GPG: http://newton.ph.unito.it/~berzano/gpg
> : Mobiles: +41 766124782 (CH), +39 3487222520 (IT)
>
> On 14 Sep 2012, at 17:16, Pranith Kumar Karampuri <[email protected]> wrote:
>
> hi Dario,
> Could you post the output of the following commands:
> gluster volume heal VmDir info healed
> gluster volume heal VmDir info split-brain
>
> Also provide the output of 'getfattr -d -m . -e hex' on both the bricks for
> the two files listed in the output of 'gluster volume heal VmDir info'
>
> Pranith.
>
> ----- Original Message -----
> From: "Dario Berzano" <[email protected]>
> To: [email protected]
> Sent: Friday, September 14, 2012 6:57:32 PM
> Subject: [Gluster-users] Virtual machines and self-healing on GlusterFS v3.3
>
> Hello,
>
> in our computing centre we have an infrastructure with a GlusterFS volume
> made of two bricks in replicated mode:
>
> Volume Name: VmDir
> Type: Replicate
> Volume ID: 9aab85df-505c-460a-9e5b-381b1bf3c030
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: one-san-01:/bricks/VmDir01
> Brick2: one-san-02:/bricks/VmDir02
>
> We are using this volume to store the running images of some KVM virtual
> machines, thinking we could benefit from the replicated storage to achieve
> more robustness as well as the ability to live-migrate VMs.
>
> Our GlusterFS volume VmDir is mounted on several (three at the moment)
> hypervisors.
>
> However, in many cases (it is difficult to reproduce: the best way is to
> stress VM I/O), either when one brick becomes unavailable for some reason,
> or when we perform live migrations, the virtual machines decide to remount
> the filesystems on their virtual disks read-only. At the same time, on the
> hypervisors mounting the GlusterFS partitions, we spot kernel messages like:
>
> INFO: task kvm:13560 blocked for more than 120 seconds.
>
> By googling it I have found some "workarounds" to mitigate this problem,
> like mounting disks within virtual machines with barrier=0:
>
> http://invalidlogic.com/2012/04/28/ubuntu-precise-on-xenserver-disk-errors/
>
> but I actually fear damaging my virtual machine disks by doing such a thing!
>
> AFAIK, from GlusterFS v3.3 self-healing should be performed server-side
> (with no self-healing at all performed on the clients, and with granular
> locking of big files). When I connect to my GlusterFS pool, if I monitor
> the self-healing status continuously:
>
> watch -n1 'gluster volume heal VmDir info'
>
> I obtain an output like:
>
> Heal operation on volume VmDir has been successful
>
> Brick one-san-01:/bricks/VmDir01
> Number of entries: 2
> /1814/images/disk.0
> /1816/images/disk.0
>
> Brick one-san-02:/bricks/VmDir02
> Number of entries: 2
> /1816/images/disk.0
> /1814/images/disk.0
>
> with a list of virtual machine disks healed by GlusterFS. Those and other
> files continuously appear and disappear from the list.
> This is a behavior I don't understand at all: does this mean that those
> files continuously get corrupted and healed, and that self-healing is just
> a natural part of the replication process?! Or is some kind of corruption
> actually happening on our virtual disks for some reason? Is this related
> to the "remount read-only" problem?
>
> A more general question would perhaps be: is GlusterFS v3.3 ready for
> storing running virtual machines (and is there some special configuration
> option needed on the volumes and clients for that)?
>
> Thank you in advance for shedding some light...
>
> Regards,
>
> --
> : Dario Berzano
> : CERN PH-SFT & Università di Torino (Italy)
> : Wiki: http://newton.ph.unito.it/~berzano
> : GPG: http://newton.ph.unito.it/~berzano/gpg
