On 10/15/2016 12:27 PM, Dmitri Maziuk wrote:
> On 2016-10-15 01:56, Jay Scott wrote:
>
>> So, what's wrong? (I'm a newbie, of course.)
>
> Here's what worked for me on centos 7:
> http://octopus.bmrb.wisc.edu/dokuwiki/doku.php?id=sysadmin:pacemaker
> YMMV and all that.

PS. I can't in all honesty recommend this setup for running NFS clusters at this point. About 1 in 3 times I do 'pcs standby <primary>' I get

> Oct 15 15:31:52 lionfish crmd[1137]: notice: Initiating action 46: stop drbd_filesystem_stop_0 on lionfish (local)
> Oct 15 15:31:52 lionfish Filesystem(drbd_filesystem)[32120]: INFO: Running stop for /dev/drbd0 on /raid
> Oct 15 15:31:52 lionfish Filesystem(drbd_filesystem)[32120]: INFO: Trying to unmount /raid
> Oct 15 15:31:52 lionfish Filesystem(drbd_filesystem)[32120]: ERROR: Couldn't unmount /raid; trying cleanup with TERM
> Oct 15 15:31:52 lionfish Filesystem(drbd_filesystem)[32120]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> Oct 15 15:31:53 lionfish Filesystem(drbd_filesystem)[32120]: ERROR: Couldn't unmount /raid; trying cleanup with TERM
> Oct 15 15:31:53 lionfish Filesystem(drbd_filesystem)[32120]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> Oct 15 15:31:54 lionfish Filesystem(drbd_filesystem)[32120]: ERROR: Couldn't unmount /raid; trying cleanup with TERM
> Oct 15 15:31:54 lionfish Filesystem(drbd_filesystem)[32120]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> Oct 15 15:31:56 lionfish Filesystem(drbd_filesystem)[32120]: ERROR: Couldn't unmount /raid; trying cleanup with KILL
> Oct 15 15:31:56 lionfish Filesystem(drbd_filesystem)[32120]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> Oct 15 15:31:57 lionfish Filesystem(drbd_filesystem)[32120]: ERROR: Couldn't unmount /raid; trying cleanup with KILL
> Oct 15 15:31:57 lionfish Filesystem(drbd_filesystem)[32120]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> Oct 15 15:31:58 lionfish Filesystem(drbd_filesystem)[32120]: ERROR: Couldn't unmount /raid; trying cleanup with KILL
> Oct 15 15:31:58 lionfish Filesystem(drbd_filesystem)[32120]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> Oct 15 15:31:59 lionfish Filesystem(drbd_filesystem)[32120]: ERROR: Couldn't unmount /raid, giving up!
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ umount: /raid: target is busy. ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ (In some cases useful info about processes that use ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ the device is found by lsof(8) or fuser(1)) ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ umount: /raid: target is busy. ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ (In some cases useful info about processes that use ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ the device is found by lsof(8) or fuser(1)) ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ umount: /raid: target is busy. ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ (In some cases useful info about processes that use ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ the device is found by lsof(8) or fuser(1)) ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ umount: /raid: target is busy. ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ (In some cases useful info about processes that use ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ the device is found by lsof(8) or fuser(1)) ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with KILL ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ umount: /raid: target is busy. ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ (In some cases useful info about processes that use ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ the device is found by lsof(8) or fuser(1)) ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with KILL ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ umount: /raid: target is busy. ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ (In some cases useful info about processes that use ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ the device is found by lsof(8) or fuser(1)) ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with KILL ]
> Oct 15 15:32:00 lionfish lrmd[1134]: notice: drbd_filesystem_stop_0:32120:stderr [ ocf-exit-reason:Couldn't unmount /raid, giving up! ]
> Oct 15 15:32:00 lionfish crmd[1137]: notice: Operation drbd_filesystem_stop_0: unknown error (node=lionfish, call=91, rc=1, cib-update=107, confirmed=true)
> Oct 15 15:32:00 lionfish crmd[1137]: notice: lionfish-drbd_filesystem_stop_0:91 [ umount: /raid: target is busy.\n (In some cases useful info about processes that use\n the device is found by lsof(8) or fuser(1))\nocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM\numount: /raid: target is busy.\n (In some cases useful info about processes that use\n the device is found by lsof(8) or fuser(1))\nocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM\numount: /raid: target is busy.\n
> Oct 15 15:32:00 lionfish crmd[1137]: warning: Action 46 (drbd_filesystem_stop_0) on lionfish failed (target: 0 vs. rc: 1): Error
> Oct 15 15:32:00 lionfish crmd[1137]: notice: Transition aborted by drbd_filesystem_stop_0 'modify' on lionfish: Event failed (magic=0:1;46:4:0:700f71e0-d565-496f-a2c6-6b97f0cfd940, cib=0.128.10, source=match_graph_event:381, 0)

and I have to take a trip to the server room to power-cycle (aka stonith) the nodes.

I haven't tried digging into it yet, for all I know the problem may be between the centos kernel and tainted elrepo drbd module -- "no processes were signalled" while "target is busy" may be a bug in the RA of course...

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
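
For anyone who wants to dig into the "target is busy" but "no processes were signalled" combination, here is a minimal diagnostic sketch. It rests on an assumption that the message above does not confirm: that the mount is being held by the kernel NFS server through a still-active export, which fuser(1) would not report because no user-space process has files open there. The function names exports_under and report_holders are made up for illustration; /var/lib/nfs/etab is the export table described in exportfs(8), and /raid is the mount point from the log.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Pacemaker or the Filesystem RA.

# Print every export path in the export table that is the mount point
# itself or lies below it.
#   $1 = mount point
#   $2 = export table (optional, defaults to /var/lib/nfs/etab)
exports_under() {
    eu_mnt=${1%/}
    eu_etab=${2:-/var/lib/nfs/etab}
    [ -r "$eu_etab" ] || return 0
    # Field 1 of each etab line is the exported path.
    awk -v m="$eu_mnt" '$1 == m || index($1, m "/") == 1 { print $1 }' "$eu_etab"
}

# Show user-space holders (via fuser, if installed) next to any NFS
# exports on the same mount point.
report_holders() {
    rh_mnt=$1
    echo "user-space processes on $rh_mnt:"
    if command -v fuser >/dev/null 2>&1; then
        # fuser exits non-zero when it finds nothing.
        fuser -vm "$rh_mnt" 2>&1 || echo "  (none reported)"
    else
        echo "  (fuser not installed)"
    fi
    echo "NFS exports on or below $rh_mnt:"
    exports_under "$rh_mnt" | sed 's/^/  /'
}
```

Running `report_holders /raid` on the node right after a failed stop would show whether an export is still listed while fuser comes back empty. If so, that would point at the order in which the NFS resources and the filesystem are stopped rather than at the RA or the DRBD module, but that is a guess until someone checks it on the affected nodes.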
_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
