Dario,
Nothing to worry about then :-). It was a transient state. Every time an
update is in progress these flags are set, and once the update is over they
are reset. Similarly, the output of 'gluster volume heal <volname> info'
shows entries while these flags are set and shows nothing once they are
reset. I thought it was a persistent state. It seems your VM files are doing
fine.
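For what it's worth, each trusted.afr value packs three big-endian 32-bit
counters: pending data, metadata and entry operations, in that order. A
quick sketch to decode one by hand in bash, using a value from your output:

# hex value with the leading 0x stripped; fields are data/metadata/entry
v=000000010000000000000000
echo "data=$((16#${v:0:8})) metadata=$((16#${v:8:8})) entry=$((16#${v:16:8}))"
# prints: data=1 metadata=0 entry=0

A non-zero data counter just means a write was in flight at that moment.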
Pranith.
----- Original Message -----
From: "Dario Berzano" <[email protected]>
To: "Pranith Kumar Karampuri" <[email protected]>
Cc: "gluster-users" <[email protected]>
Sent: Monday, September 17, 2012 1:36:56 PM
Subject: Re: [Gluster-users] Virtual machines and self-healing on GlusterFS v3.3
Hi Pranith,
those bricks live on different servers connected to the same switch: the only
possibility I see is that the switch went down for some reason; it is our only
single point of failure. The servers themselves never went down at the same
time.
However, I do not understand why, if I run getfattr continuously:
watch -n1 'getfattr -d -m . -e hex 1814/images/disk.0'
I get alternating:
trusted.afr.VmDir-client-0=0x000000010000000000000000
trusted.afr.VmDir-client-1=0x000000010000000000000000
and:
trusted.afr.VmDir-client-0=0x000000000000000000000000
trusted.afr.VmDir-client-1=0x000000000000000000000000
This again happens with every "big" file.
Does this suggest a network problem, maybe? One of the servers has 1 GbE while
the other one has a faster 10 GbE, but I do not think this is enough to
continuously de-synchronize the bricks...
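If it would help, I could log the raw values with timestamps and try to
correlate the flag changes with the VMs' I/O, with something along these
lines:

while sleep 1; do
    date +%T
    getfattr -n trusted.afr.VmDir-client-0 -e hex 1814/images/disk.0
    getfattr -n trusted.afr.VmDir-client-1 -e hex 1814/images/disk.0
done >> afr-log.txt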
Cheers
--
: Dario Berzano
: CERN PH-SFT & Università di Torino (Italy)
: Wiki: http://newton.ph.unito.it/~berzano
: GPG: http://newton.ph.unito.it/~berzano/gpg
: Mobiles: +41 766124782 (CH), +39 3487222520 (IT)
On 17 Sep 2012, at 00:11, Pranith Kumar Karampuri <[email protected]> wrote:
> 1814/images/disk.0 has a pending data changelog for both subvolumes, i.e.
> 0x00000001. This happens when both bricks go down at the same time while
> an operation is in progress. Did that happen?
>
> Pranith.
>
> ----- Original Message -----
> From: "Dario Berzano" <[email protected]>
> To: "Pranith Kumar Karampuri" <[email protected]>
> Cc: "gluster-users" <[email protected]>
> Sent: Sunday, September 16, 2012 9:20:23 PM
> Subject: Re: [Gluster-users] Virtual machines and self-healing on GlusterFS
> v3.3
>
> Ok, here's the output for 1816/images/disk.0:
>
> # file: bricks/VmDir01/1816/images/disk.0
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
> trusted.afr.VmDir-client-0=0x000000000000000000000000
> trusted.afr.VmDir-client-1=0x000000000000000000000000
> trusted.gfid=0x1cef9d386f1c4424af6d95dfbcf2989b
>
> # file: bricks/VmDir02/1816/images/disk.0
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
> trusted.afr.VmDir-client-0=0x000000000000000000000000
> trusted.afr.VmDir-client-1=0x000000000000000000000000
> trusted.gfid=0x1cef9d386f1c4424af6d95dfbcf2989b
>
> And for 1814/images/disk.0:
>
> # file: bricks/VmDir01/1814/images/disk.0
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
> trusted.afr.VmDir-client-0=0x000000010000000000000000
> trusted.afr.VmDir-client-1=0x000000010000000000000000
> trusted.gfid=0xaabc0c344ccc4cfe8e2ed588dd78323b
>
> # file: bricks/VmDir02/1814/images/disk.0
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
> trusted.afr.VmDir-client-0=0x000000010000000000000000
> trusted.afr.VmDir-client-1=0x000000010000000000000000
> trusted.gfid=0xaabc0c344ccc4cfe8e2ed588dd78323b
>
> Note that these are just two sample files, since the problem occurs with 100%
> of our "big" virtual machines. Here's the whole content of the GlusterFS
> volume along with file sizes:
>
> 6.3G ./1981/images/disk.0
> 53M ./1820/images/disk.0
> 9.7G ./1838/images/disk.0
> 10G ./1819/images/disk.0
> 9.2G ./1818/images/disk.0
> 10G ./1816/images/disk.0
> 53M ./1962/images/disk.0
> 10G ./1814/images/disk.0
> 6.2G ./1988/images/disk.0
> 10G ./1817/images/disk.0
> 53M ./1821/images/disk.0
>
> We currently have 11 running VMs. The "small" ones (53 MB) have never
> shown any problem so far. *All* the other VMs (6 to 10 GB) periodically
> show up in the output of:
>
> gluster volume heal VmDir info
>
> when there is some intense I/O occurring, and disappear shortly afterwards.
>
> Thanks, cheers,
> --
> : Dario Berzano
> : CERN PH-SFT & Università di Torino (Italy)
> : Wiki: http://newton.ph.unito.it/~berzano
> : GPG: http://newton.ph.unito.it/~berzano/gpg
> : Mobiles: +41 766124782 (CH), +39 3487222520 (IT)
>
>
> On 14 Sep 2012, at 18:21, Pranith Kumar Karampuri <[email protected]> wrote:
>
>> Dario,
>> Ok, that confirms that it is not a split-brain. Could you post the getfattr
>> output I requested as well? What is the size of the VM files?
>>
>> Pranith
>> ----- Original Message -----
>> From: "Dario Berzano" <[email protected]>
>> To: "Pranith Kumar Karampuri" <[email protected]>
>> Cc: "<[email protected]>" <[email protected]>
>> Sent: Friday, September 14, 2012 9:42:38 PM
>> Subject: Re: [Gluster-users] Virtual machines and self-healing on GlusterFS
>> v3.3
>>
>>
>> # gluster volume heal VmDir info healed
>>
>>
>> Heal operation on volume VmDir has been successful
>>
>>
>> Brick one-san-01:/bricks/VmDir01
>> Number of entries: 259
>> Segmentation fault (core dumped)
>>
>>
>> (same story for heal-failed), which seems to be exactly this bug:
>>
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=836421
>>
>>
>> Should I upgrade to the latest QA RPMs to see what is going on?
>>
>>
>> Btw, with split-brain I have no entries:
>>
>> Heal operation on volume VmDir has been successful
>>
>>
>> Brick one-san-01:/bricks/VmDir01
>> Number of entries: 0
>>
>>
>> Brick one-san-02:/bricks/VmDir02
>> Number of entries: 0
>>
>>
>> Thank you, cheers,
>> --
>> : Dario Berzano
>> : CERN PH-SFT & Università di Torino (Italy)
>> : Wiki: http://newton.ph.unito.it/~berzano
>> : GPG: http://newton.ph.unito.it/~berzano/gpg
>> : Mobiles: +41 766124782 (CH), +39 3487222520 (IT)
>>
>> On 14 Sep 2012, at 17:16, Pranith Kumar Karampuri <[email protected]> wrote:
>>
>>
>> Hi Dario,
>> Could you post the output of the following commands:
>> gluster volume heal VmDir info healed
>> gluster volume heal VmDir info split-brain
>>
>> Also provide the output of 'getfattr -d -m . -e hex' on both the bricks for
>> the two files listed in the output of 'gluster volume heal VmDir info'.
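>> For example (assuming you can ssh to both servers; adjust the brick paths
>> if they differ):
>>
>> for h in one-san-01 one-san-02; do
>>     # replace <file> with each path shown by 'gluster volume heal VmDir info'
>>     ssh "$h" "getfattr -d -m . -e hex /bricks/VmDir0*/<file>"
>> done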
>>
>> Pranith.
>>
>> ----- Original Message -----
>> From: "Dario Berzano" < [email protected] >
>> To: [email protected]
>> Sent: Friday, September 14, 2012 6:57:32 PM
>> Subject: [Gluster-users] Virtual machines and self-healing on GlusterFS v3.3
>>
>> Hello,
>>
>>
>> in our computing centre we have an infrastructure with a GlusterFS volume
>> made of two bricks in replicated mode:
>>
>> Volume Name: VmDir
>> Type: Replicate
>> Volume ID: 9aab85df-505c-460a-9e5b-381b1bf3c030
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: one-san-01:/bricks/VmDir01
>> Brick2: one-san-02:/bricks/VmDir02
>>
>> We are using this volume to store running images of some KVM virtual
>> machines and thought we could benefit from the replicated storage in order
>> to achieve more robustness as well as the ability to live-migrate VMs.
>>
>>
>> Our GlusterFS volume VmDir is mounted on several (three at the moment)
>> hypervisors.
>>
>>
>> However, in many cases (it is difficult to reproduce; the best way is to
>> stress VM I/O), either when one brick becomes unavailable for some reason
>> or when we perform live migrations, the virtual machines remount the
>> filesystems on their virtual disks read-only. At the same time, on the
>> hypervisors mounting the GlusterFS volume, we spot kernel messages
>> like:
>>
>> INFO: task kvm:13560 blocked for more than 120 seconds.
>>
>> By googling it I have found some "workarounds" to mitigate this problem,
>> like mounting disks within virtual machines with barrier=0:
>>
>>
>> http://invalidlogic.com/2012/04/28/ubuntu-precise-on-xenserver-disk-errors/
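>>
>> For reference, the workaround amounts to a guest /etc/fstab entry along
>> these lines (device and mount point here are just illustrative):
>>
>> /dev/vda1  /  ext4  defaults,barrier=0  1  1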
>>
>>
>> but I actually fear to damage my virtual machine disks by doing such a
>> thing!
>>
>>
>> AFAIK, from GlusterFS v3.3 self-healing should be performed server-side
>> (no self-healing at all is performed on the clients, and big files are
>> locked granularly while they heal). When I connect to my GlusterFS pool
>> and monitor the self-healing status continuously:
>>
>>
>> watch -n1 'gluster volume heal VmDir info'
>>
>>
>> I obtain an output like:
>>
>> Heal operation on volume VmDir has been successful
>>
>>
>> Brick one-san-01:/bricks/VmDir01
>> Number of entries: 2
>> /1814/images/disk.0
>> /1816/images/disk.0
>>
>>
>> Brick one-san-02:/bricks/VmDir02
>> Number of entries: 2
>> /1816/images/disk.0
>> /1814/images/disk.0
>>
>> with a list of virtual machine disks healed by GlusterFS. Those and other
>> files continuously appear and disappear from the list.
>>
>>
>> This is a behavior I don't understand at all: does this mean that those
>> files continuously get corrupted and healed, and self-healing is just a
>> natural part of the replication process?! Or is some kind of corruption
>> actually happening on our virtual disks for some reason? Is this related
>> to the "remount read-only" problem?
>>
>>
>> Maybe a more general question would be: is GlusterFS v3.3 ready for
>> storing running virtual machines (and is there some special configuration
>> option needed on the volumes and clients for that)?
>>
>> Thank you in advance for shedding some light...
>>
>>
>> Regards,
>>
>> --
>> : Dario Berzano
>> : CERN PH-SFT & Università di Torino (Italy)
>> : Wiki: http://newton.ph.unito.it/~berzano
>> : GPG: http://newton.ph.unito.it/~berzano/gpg
>> _______________________________________________
>> Gluster-users mailing list
>> [email protected]
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users