Re: [Gluster-users] Issue recreating volumes

Amar Tumballi Thu, 07 Jun 2012 22:02:58 -0700

Hi Brian,

Answers inline.

Here are a couple of wrinkles I have come across while trying gluster 3.3.0
under ubuntu-12.04.

(1) At one point I decided to delete some volumes and recreate them. But
it would not let me recreate them:

     root@dev-storage2:~# gluster volume create fast dev-storage1:/disk/storage1/fast dev-storage2:/disk/storage2/fast
     /disk/storage2/fast or a prefix of it is already part of a volume

This is even though "gluster volume info" showed no volumes.

Restarting glusterd didn't help either. Nor indeed did a complete reinstall
of glusterfs, even with apt-get remove --purge and rm -rf'ing the state
directories.

Digging around, I found some hidden state files:

     # ls -l /disk/storage1/*/.glusterfs/00/00
     /disk/storage1/fast/.glusterfs/00/00:
     total 0
     lrwxrwxrwx 1 root root 8 Jun  7 14:23 00000000-0000-0000-0000-000000000001 -> ../../..

     /disk/storage1/safe/.glusterfs/00/00:
     total 0
     lrwxrwxrwx 1 root root 8 Jun  7 14:21 00000000-0000-0000-0000-000000000001 -> ../../..

I deleted them on both machines:

     rm -rf /disk/*/.glusterfs

Problem solved? No, not even with glusterd restart :-(

     root@dev-storage2:~# gluster volume create safe replica 2 dev-storage1:/disk/storage1/safe dev-storage2:/disk/storage2/safe
     /disk/storage2/safe or a prefix of it is already part of a volume

In the end, what I needed was to delete the actual data bricks themselves:

     rm -rf /disk/*/fast
     rm -rf /disk/*/safe

That allowed me to recreate the volumes.

This is probably an understanding/documentation issue. I'm sure there's a
lot of magic going on in the gluster 3.3 internals (is that long ID some
sort of replica update sequence number?) which if it were fully documented
would make it easier to recover from these situations.

Preventing the 'recreation' of a volume is very much intentional, to prevent disasters (like data loss) from happening. Internally, it just prevents you from 're-using' the bricks; you can still create a volume with the same name on different bricks.

We treat data as separate from a volume's config information. Hence, when a volume is 'delete'd, only the configuration details of the volume are lost; the data belonging to the volume remains on its bricks as is. It is at the admin's discretion to handle the data later.

Considering the above point, if we allowed 're-using' a brick which was part of some volume earlier, it could lead to issues such as data being placed on the wrong brick, internal inode number clashes, etc. These could in turn trigger a 'heal' of the data from the client's perspective, which might delete files that are important.

If the admin is aware of this, and knows that there is no 'data' inside the brick, then the easier option is to delete the export dir; it gets created again by 'gluster volume create'. If you want to fix it without deleting the export directory, that is also possible, by deleting the extended attributes on the brick as below.


bash# setfattr -x trusted.glusterfs.volume-id $brickdir
bash# setfattr -x trusted.gfid $brickdir


And now, creating the volume should succeed.
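If you have several bricks to clean up, the two setfattr commands above can be wrapped in a small helper. This is only a rough sketch, not an official tool: the function name reset_brick and the APPLY variable are made up for illustration, and the path in the example is the one from your mail. By default it only prints what it would run; set APPLY=1 (as root, on each server) to actually remove the attributes.

```shell
#!/bin/sh
# Sketch only: reset_brick and APPLY are hypothetical names, not part of gluster.
# To see what is currently set on a brick before touching it, as root:
#   getfattr -d -m . -e hex /disk/storage2/fast

reset_brick() {
    brickdir=$1
    # The two attributes that mark a directory as belonging to a volume.
    for attr in trusted.glusterfs.volume-id trusted.gfid; do
        if [ "${APPLY:-0}" = 1 ]; then
            # Real removal; trusted.* attributes need root.
            setfattr -x "$attr" "$brickdir"
        else
            # Dry run: just show the command.
            echo "setfattr -x $attr $brickdir"
        fi
    done
}

# Dry run against one of the bricks from this thread.
reset_brick /disk/storage2/fast
```

The dry-run default is deliberate: removing these attributes from a brick that still holds data you care about defeats the safety check described above, so it is worth reading the printed commands before applying them.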


(2) Minor point: the FUSE client no longer seems to understand or need the
"_netdev" option, however it still invokes it if you use "defaults" in
/etc/fstab, and so you get a warning about an unknown option:

     root@dev-storage1:~# grep gluster /etc/fstab
     storage1:/safe /gluster/safe glusterfs defaults,nobootwait 0 0
     storage1:/fast /gluster/fast glusterfs defaults,nobootwait 0 0

     root@dev-storage1:~# mount /gluster/safe
     unknown option _netdev (ignored)


Will look into this.

Regards,
Amar