Hi all,

In a distributed-replicated volume hosting the disk images of some VMs (GlusterFS 3.4.2 on CentOS 6.5, qemu-kvm with GlusterFS native support, no fuse mount), I always see the same two files listed as needing healing.
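The guests reach the images through the qemu gluster driver (libgfapi), not through a fuse mount on the hypervisor; just to give an idea, the drive definition looks roughly like this (host, port and options shown only as an example):

# native libgfapi access to one of the images on gv_pri (illustrative;
# 24007 is the standard glusterd port, any server of the pool can be used)
qemu-kvm ... \
    -drive file=gluster://nw1glus.gem.local:24007/gv_pri/alfresco.qc2,if=virtio,format=qcow2,cache=none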
This is the heal info output:

[root@networker ~]# gluster volume heal gv_pri info
Gathering Heal info on volume gv_pri has been successful

Brick nw1glus.gem.local:/glustexp/pri1/brick
Number of entries: 2
/alfresco.qc2
/remlog.qc2

Brick nw2glus.gem.local:/glustexp/pri1/brick
Number of entries: 2
/alfresco.qc2
/remlog.qc2

Brick nw3glus.gem.local:/glustexp/pri2/brick
Number of entries: 0

Brick nw4glus.gem.local:/glustexp/pri2/brick
Number of entries: 0

This is not a split-brain situation (I checked), and if I stop the two VMs that use these images, the two files get healed/synced in about 15 minutes, which is too long, IMHO. Other VMs in this volume have (smaller) disk images replicated on the same bricks, and those get synced in real time.

These are the volume's details; the host "networker" is nw1glus.gem.local:

[root@networker ~]# gluster volume info gv_pri

Volume Name: gv_pri
Type: Distributed-Replicate
Volume ID: 3d91b91e-4d72-484f-8655-e5ed8d38bb28
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: nw1glus.gem.local:/glustexp/pri1/brick
Brick2: nw2glus.gem.local:/glustexp/pri1/brick
Brick3: nw3glus.gem.local:/glustexp/pri2/brick
Brick4: nw4glus.gem.local:/glustexp/pri2/brick
Options Reconfigured:
server.allow-insecure: on
storage.owner-uid: 107
storage.owner-gid: 107

[root@networker ~]# gluster volume status gv_pri detail
Status of volume: gv_pri
------------------------------------------------------------------------------
Brick             : Brick nw1glus.gem.local:/glustexp/pri1/brick
Port              : 50178
Online            : Y
Pid               : 25721
File System       : xfs
Device            : /dev/mapper/vg_guests-lv_brick1
Mount Options     : rw,noatime
Inode Size        : 512
Disk Space Free   : 168.4GB
Total Disk Space  : 194.9GB
Inode Count       : 102236160
Free Inodes       : 102236130
------------------------------------------------------------------------------
Brick             : Brick nw2glus.gem.local:/glustexp/pri1/brick
Port              : 50178
Online            : Y
Pid               : 27832
File System       : xfs
Device            : /dev/mapper/vg_guests-lv_brick1
Mount Options     : rw,noatime
Inode Size        : 512
Disk Space Free   : 168.4GB
Total Disk Space  : 194.9GB
Inode Count       : 102236160
Free Inodes       : 102236130
------------------------------------------------------------------------------
Brick             : Brick nw3glus.gem.local:/glustexp/pri2/brick
Port              : 50182
Online            : Y
Pid               : 14571
File System       : xfs
Device            : /dev/mapper/vg_guests-lv_brick2
Mount Options     : rw,noatime
Inode Size        : 512
Disk Space Free   : 418.3GB
Total Disk Space  : 433.8GB
Inode Count       : 227540992
Free Inodes       : 227540973
------------------------------------------------------------------------------
Brick             : Brick nw4glus.gem.local:/glustexp/pri2/brick
Port              : 50181
Online            : Y
Pid               : 21942
File System       : xfs
Device            : /dev/mapper/vg_guests-lv_brick2
Mount Options     : rw,noatime
Inode Size        : 512
Disk Space Free   : 418.3GB
Total Disk Space  : 433.8GB
Inode Count       : 227540992
Free Inodes       : 227540973

Fuse mount of the gv_pri volume:

[root@networker ~]# ll -h /mnt/gluspri/
total 37G
-rw-------. 1 qemu qemu 7.7G 24 Jan 10:21 alfresco.qc2
-rw-------. 1 qemu qemu 4.2G 24 Jan 10:22 check_mk-salmo.qc2
-rw-------. 1 qemu qemu  27M 23 Jan 16:42 newnxserver.qc2
-rw-------. 1 qemu qemu 1.1G 23 Jan 13:38 newubutest1.qc2
-rw-------. 1 qemu qemu  11G 24 Jan 10:17 nxserver.qc2
-rw-------. 1 qemu qemu 8.1G 24 Jan 10:17 remlog.qc2
-rw-------. 1 qemu qemu 5.6G 24 Jan 10:19 ubutest1.qc2

Do you think this is the expected behaviour, maybe due to caching? And what if the most up-to-date node goes down while the VMs are running?

Thanks a lot,
Fabio Rosati
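P.S. For completeness, the split-brain check was done more or less like this (exact commands from memory, run as root; the getfattr is run directly on a brick host):

# list anything gluster itself flags as split-brain (nothing showed up for gv_pri)
gluster volume heal gv_pri info split-brain

# inspect the AFR changelog xattrs of one of the two files on a brick;
# the trusted.afr.gv_pri-client-* counters show which replica has pending writes,
# and in a real split-brain the two replicas accuse each other
getfattr -d -m . -e hex /glustexp/pri1/brick/alfresco.qc2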
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
