On 11/16/2010 07:54 PM, Craig Carl wrote:
On 11/16/2010 03:07 PM, Stephan von Krawczynski wrote:
which files are not in sync in a replication setup? There is no trivial answer to this question I already brought up in early 2.X development phase...
How can you sell someone a storage platform if you're unable to answer such an essential question? Really, nobody needed auto-healing. All you need is the answer to this question and then stat exactly this file list at a time _of your choice_.

The sync question you brought up is only an issue in the rare case of split brain (if I understand your scenario correctly). Split brain is a difficult problem with no answer right now. Gluster 3.1 added much more aggressive locking to reduce the possibility of split brain. The process you described as "...the deamons are talking with each other about whatever..." will also reduce the likelihood of split brain by eliminating the possibility that client or server vol files differ across the cluster, which is the cause of the vast majority of split brain issues with Gluster.

Auto heal is slow. We have some processes along the lines you are thinking of; please let me know if these address some of your ideas around stat -

#cd <gluster mount>
#find ./ -type f -exec stat /<backend device>'{}' \;
This will heal only the files on that device.

If you know when the failure you want to recover from occurred, this is even faster -

#cd <gluster mount>
#find ./ -type f -mmin -<minutes since failure + some extra> -exec stat /<backend device>'{}' \;
This will heal only the files on that device that changed within the last x minutes (note the leading minus: -mmin -N matches files modified less than N minutes ago).
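
For anyone who wants to script this, the two find invocations above could be wrapped in a small shell function. This is only a rough sketch under a few assumptions: the function name heal_walk is made up for illustration, it needs a find that supports -mmin (GNU and BSD find do, strict POSIX find does not), and it stats each path exactly as found under the directory you pass in rather than using the /<backend device> prefix form shown above. It prints every file it stat'ed, so you also get a list of what was touched.

```shell
#!/bin/sh
# heal_walk: hypothetical helper, not part of Gluster.
# Stats every regular file under a directory so that a replicated
# volume gets a chance to self-heal each file when it is accessed.
#   $1 = directory to walk (assumed to be the Gluster client mount)
#   $2 = optional; only visit files modified within the last N minutes
heal_walk() {
    dir=$1
    minutes=${2:-}

    if [ -n "$minutes" ]; then
        # -mmin -N: modified less than N minutes ago
        find "$dir" -type f -mmin "-$minutes" \
            -exec sh -c 'stat "$1" >/dev/null 2>&1' sh {} \; -print
    else
        # No time limit: visit every regular file
        find "$dir" -type f \
            -exec sh -c 'stat "$1" >/dev/null 2>&1' sh {} \; -print
    fi
}

# Example (path is a placeholder):
#   heal_walk /mnt/gluster        # walk everything
#   heal_walk /mnt/gluster 90     # only files changed in the last 90 minutes
```

A path is printed only when its stat succeeds, so a file missing from the output is one worth looking at by hand.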

See also http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2088 which is an enhancement request addressing exactly this issue.
