Joe, Scott had sent me a private email and I provided the workaround. For some (unknown) reason, all the nodes ended up with two UUIDs for one particular peer, which caused the problem. I've asked for the log files to debug further.
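For anyone hitting the same symptom, here is a minimal sketch of how one might check a node for the two conditions discussed in this thread: a peer entry carrying the node's own UUID, and a hostname listed under more than one UUID. It assumes the usual on-disk layout, which is not shown in this thread: `glusterd.info` holding a `UUID=` line, and one file per peer under `peers/` with `uuid=` and `hostnameN=` lines. The function names `read_kv` and `check_peers` are made up for illustration and are not part of Gluster.

```python
import os
from collections import defaultdict


def read_kv(path):
    """Parse a simple key=value file into a dict (assumed file format)."""
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if "=" in line:
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
    return values


def check_peers(glusterd_dir="/var/lib/glusterd"):
    """Return (own_uuid, self_entries, duplicates) for one node.

    self_entries: peer files whose uuid equals this node's own UUID.
    duplicates:   hostnames that appear under more than one UUID.
    """
    info = read_kv(os.path.join(glusterd_dir, "glusterd.info"))
    own_uuid = info.get("UUID")  # assumption: key is spelled "UUID"

    peers_dir = os.path.join(glusterd_dir, "peers")
    uuids_by_host = defaultdict(set)
    self_entries = []

    for name in sorted(os.listdir(peers_dir)):
        peer = read_kv(os.path.join(peers_dir, name))
        # Fall back to the file name if the file has no uuid= line.
        uuid = peer.get("uuid", name)
        if uuid == own_uuid:
            self_entries.append(name)
        for key, value in peer.items():
            if key.startswith("hostname"):
                uuids_by_host[value].add(uuid)

    duplicates = {
        host: sorted(uuids)
        for host, uuids in uuids_by_host.items()
        if len(uuids) > 1
    }
    return own_uuid, self_entries, duplicates
```

Calling `check_peers()` on each node with glusterd stopped and comparing the results should show which peer is listed twice. The sketch only reads files; it does not attempt any repair.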
On Fri, 17 Feb 2017 at 21:58, Joe Julian <[email protected]> wrote:

> Does your repaired server have the correct uuid in
> /var/lib/glusterd/glusterd.info?
>
> On February 16, 2017 9:49:56 PM PST, Scott Hazelhurst
> <[email protected]> wrote:
>
> Dear all
>
> Last week I posted a query about a problem I had with a machine that had
> failed but the underlying hard disk with the gluster brick was good. I've
> made some progress in restoring. I now have the problem with my new restored
> machine where it becomes its own peer, which then breaks everything.
>
> 1. Gluster daemons are off on all peers, content of /var/lib/glusterd/peers
>    looks good.
> 2. I start the gluster daemons on all peers. All looks good.
> 3. For about 2 minutes, there's no obvious problem: if I do a gluster peer
>    status on any machine it looks good, and if I do a gluster volume status
>    A01 on any machine it looks good.
> 4. Then at some point, the /var/lib/glusterd/peers directory of the new,
>    restored machine gets an entry for itself and things start breaking. A
>    typical error message is the understandable
>
>    : Unable to get lock for uuid: 4fb930f7-554e-462a-9204-4592591feeb8, lock
>    held by: 4fb930f7-554e-462a-9204-4592591feeb8
>
> 5. This is repeatable. If I stop the daemons, remove the offending entry in
>    /var/lib/glusterd/peers, and restart, the same behavior occurs: all good
>    for a minute or two, and then something magically puts an entry in
>    /var/lib/glusterd/peers.
>
> In a previous step in restoring my machine, I had a different error of
> mismatching cksums, and what I did then may be the cause of the problem. In
> searching the list archives I found someone with a similar cksum problem, and
> the proposed solution was to copy /var/lib/glusterd/vols/ from another of
> the peers to the new machine. This may not be the issue, but it is the only
> thing I think I did that was unconventional.
>
> I am running version 3.7.5-19 on Scientific Linux 6.8
>
> If anyone can suggest a way forward I would be grateful
>
> Many thanks
>
> Scott
>
> ------------------------------
>
> Gluster-users mailing list
> [email protected]
> http://lists.gluster.org/mailman/listinfo/gluster-users

--
- Atin (atinm)
