I'm trying to understand exactly how Gluster's geo-replication works. I have a general idea, but I still have some questions.

How exactly does gsyncd's crawl determine which files to update? I have a filesystem with more than 50 million inodes, and I'm wondering how that crawl will scale. My assumption is that when an inode is modified, some xattr is set on each parent directory up to the root, and that gsyncd reads this xattr so it can crawl the tree efficiently and find the updates. Am I completely wrong?
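
To make that guess concrete, here is a minimal Python sketch of the scheme I have in mind. It is only a model of my assumption, not gsyncd's actual code: the tree lives in memory, the `mark` field stands in for whatever xattr is really used, and `Node`, `touch`, and `crawl` are names I made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical in-memory stand-in for an inode (file or directory)."""
    name: str
    mark: int = 0                      # stands in for the assumed per-inode xattr
    parent: Optional["Node"] = None
    children: Dict[str, "Node"] = field(default_factory=dict)

    def add(self, name: str) -> "Node":
        child = Node(name, parent=self)
        self.children[name] = child
        return child

    def path(self) -> str:
        parts = []
        node = self
        while node.parent is not None:
            parts.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(parts))


def touch(node: Optional[Node], now: int) -> None:
    """Record a modification: stamp the node and every ancestor up to the root."""
    while node is not None:
        node.mark = max(node.mark, now)
        node = node.parent


def crawl(root: Node, last_sync: int) -> Tuple[List[str], int]:
    """Return (changed leaf paths, nodes visited), pruning unchanged subtrees."""
    changed: List[str] = []
    visited = 0
    stack = [root]
    while stack:
        node = stack.pop()
        visited += 1
        if node.mark <= last_sync:
            continue                   # nothing below here changed; skip the subtree
        if node.children:
            stack.extend(node.children.values())
        else:
            changed.append(node.path())
    return sorted(changed), visited


# Small example: two directories, three files, one file modified after the sync.
root = Node("")
a, b = root.add("a"), root.add("b")
f1, f2, f3 = a.add("f1"), a.add("f2"), b.add("f3")
for f in (f1, f2, f3):
    touch(f, now=1)                    # initial state, already synced at time 1
touch(f2, now=5)                       # later modification

changed, visited = crawl(root, last_sync=1)
print(changed, visited)
```

If the real crawl works anything like this, its cost would depend on the number of directories along changed paths rather than on the total inode count, which is what matters to me at 50 million inodes. In the example, the crawl never descends into `b` because its mark is not newer than the last sync.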

My two sites will be connected via a dedicated leased line on a non-routable address space, so I'm not concerned about using SSH at the moment. I see that gsyncd recognizes Gluster volume definitions for the master, in the form server:vol.

Does it also recognize Gluster volume definitions for the slave system, e.g.

gluster volume geo-replication glusterfs://master:vol glusterfs://slave:vol ...

or does it need a directory path for the slave,

... glusterfs://master:vol file:///mnt/slave_vol
... glusterfs://master:vol ssh://slave:vol
...

I assume that the latter case uses SSH to start a gsyncd process on the slave, which then listens over the SSH connection.

Is there a document somewhere with more details on this? The docs on the Gluster site leave a lot of questions unanswered.


Thanks,
-Brian

Brian Smith
Senior Systems Administrator
IT Research Computing, University of South Florida
4202 E. Fowler Ave. ENB308
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu