Hi Pavel,

Killing the brick process is the way to go. That way, all the other bricks on
that server keep working. After you replace or fix the disk, a restart of the
glusterd process should be enough to get the brick back online. (The
self-healing scan can take some I/O.)

Do you have any logs from the brick that would not start?

By the way, an I/O error on XFS? Did you lose some files from
brick/.glusterfs? That could explain why the brick will not start up.

Grtz,
Jiri

> On 15 Apr 2015, at 17:05, Pavel Riha <[email protected]> wrote:
>
> Thank you for your reply.
>
> But by the way, what is the right way to do this?
> Stopping the glusterd service does not stop the glusterfsd daemons themselves:
> https://bugzilla.redhat.com/show_bug.cgi?id=988946
>
> And I have more volumes running, but only one with this problem.
> I haven't found any official way to stop the process, so I just KILLed
> them.
> It worked: partition repaired, seems OK for now.
>
> But how do I run the brick again?
> I didn't save the command line shown in ps, but it was crazy. Looking at the
> other running ones, there are crazy numbers (UUID, socket, port),
> and the port (for example) is not the same as on the other server...
>
> So I restarted the glusterd service. Nothing happened. I was hopeless,
> but after a while I noticed that the process was running, so maybe
> glusterd started it after a while.
>
> There should be some way to stop, or at least start, a single brick.
>
> Pavel
>
> On 15.4.2015 11:59, Sander Zijlstra wrote:
>> Hi Pavel,
>>
>> You can simply stop the glusterd service and run the fsck; it's similar to
>> rebooting a server which is part of a replicated volume. If all is OK
>> beforehand, you can simply take down one of the two, and once it comes back
>> online it will heal each file which hasn't been copied already.
>>
>> Do take care of any client which has the volume mounted using the server you
>> take down; that will lose its connection too.
>>
>> Met vriendelijke groet / kind regards,
>>
>> Sander Zijlstra
>>
>> Linux Engineer | SURFsara | Science Park 140 | 1098XG Amsterdam |
>> +31 (0)6 43 99 12 47 | [email protected] | www.surfsara.nl |
>>
>> ----- Original Message -----
>> From: "Pavel Riha" <[email protected]>
>> To: [email protected]
>> Sent: Wednesday, 15 April, 2015 10:28:50
>> Subject: [Gluster-users] how to check/fix underlaying partition error?
>>
>> Hi guys,
>>
>> I have a replicated glusterfs (v3.4.2) on two servers, and I found the logs
>> filled with I/O errors on one server only. But there is no hardware error in
>> /var/log/messages, only an XFS error, so I guess the filesystem could be
>> corrupted.
>>
>> My question is: how do I stop or pause this brick and run fsck?
>> From the replicate feature I expect there is no need to stop the gluster
>> volume (there are some Xen VMs running).
>>
>> What is the right way to do it, with the later re-adding and fast
>> rebuild/sync in mind?
>>
>> Thanks for any tips.
>>
>> Pavel
>>
>> _______________________________________________
>> Gluster-users mailing list
>> [email protected]
>> http://www.gluster.org/mailman/listinfo/gluster-users
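
On the question raised in the thread of how to pick out the single brick
process to stop: what follows is a minimal Python sketch, not an official
Gluster tool. It assumes that each glusterfsd command line carries a
`--brick-name <path>` argument (worth confirming with ps on your own server)
and that the input is the text printed by `ps -eo pid,args`. The function name
find_brick_pids and the sample hosts, paths, PIDs and ports are made up for
illustration.

```python
def find_brick_pids(ps_output, brick_path):
    """Return PIDs of glusterfsd processes serving the given brick path.

    ps_output: text as printed by `ps -eo pid,args` (assumed format).
    brick_path: the brick directory, compared against --brick-name
                (assumption: glusterfsd is started with that argument).
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header line and anything that does not start with a PID.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        pid, args = int(fields[0]), fields[1:]
        # Only brick daemons are of interest, not glusterd itself.
        if not args[0].endswith("glusterfsd"):
            continue
        # Look for "--brick-name <path>" as two consecutive arguments.
        for i, arg in enumerate(args[:-1]):
            if arg == "--brick-name" and args[i + 1] == brick_path:
                pids.append(pid)
                break
    return pids


# Hypothetical ps output; real command lines are longer (uuid, socket, ...).
SAMPLE = """\
  PID COMMAND
 2101 /usr/sbin/glusterd -p /var/run/glusterd.pid
 2345 /usr/sbin/glusterfsd -s server1 --volfile-id vol1.server1.export-brick1 --brick-name /export/brick1 --brick-port 49152
 2390 /usr/sbin/glusterfsd -s server1 --volfile-id vol2.server1.export-brick2 --brick-name /export/brick2 --brick-port 49153
"""

if __name__ == "__main__":
    print(find_brick_pids(SAMPLE, "/export/brick1"))
```

The PID returned is the one to stop before running the filesystem check; the
other glusterfsd processes on the server are left alone. Restarting glusterd
afterwards, as suggested in the reply, is what brings the brick back.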
