On 03/24/2012 12:09 AM, Robert Langley wrote:
> Maybe I need to post this with Pacemaker? Not sure.
> I am a bit new to this scene and trying my best to learn all of this
> (Linux/DRBD/Pacemaker/Heartbeat).
>
> I am in the middle of following the document "Highly available NFS storage
> with DRBD and Pacemaker", located at:
> http://www.linbit.com/en/education/tech-guides/highly-available-nfs-with-drbd-and-pacemaker/
>
> OS: Ubuntu 11.10
> DRBD version: 8.3.11
> Pacemaker version: 1.1.5
>
> I have two servers with 2.4 TB of internal hard drive space each, plus
> mirrored hard drives for the OS. They both have 10 NICs (2 onboard in a
> bond and 8 across two 4-port Intel NICs).
>
> Issue: I got to the end of part 4.3 (commit) and that is when things went
> bad. I ended up with a split-brain, which I seem to have recovered from,
> but now my resources are as follows (running crm_mon -1):
>
> My slave node is actually showing as the Master under the Master/Slave
> Set: ms_drbd_nfs [p_drbd_nfs], and the clone set is Started.
> In the Resource Group, only p_lvm_nfs is Started, on my slave node. All
> of the Filesystem resources are Stopped.
>
> Then, I have this at the bottom:
>
> Failed actions:
>     p_fs_vol01_start_0 (node=ds01, call=46, rc=5, status=complete):
>         not installed
>     p_fs_vol01_start_0 (node=ds02, call=430, rc=5, status=complete):
>         not installed
Mountpoint created on both nodes, and the resource defined with the correct
device and a valid file system? What happens after a cleanup?

    crm resource cleanup p_fs_vol01

Grep for "Filesystem" in your logs to get the error output from the
resource agent. For more ... please share your current DRBD
state/configuration and your cluster configuration.

Regards,
Andreas

--
Need help with DRBD?
http://www.hastexo.com/now

> Looking in the syslog on ds01 (primary node) does not reveal anything
> worth mentioning; but looking at the syslog on ds02 (secondary node)
> shows the following messages:
>
> pengine: [11725]: notice: unpack_rsc_op: Hard error - p_fs_vol01_start_0
>     failed with rc=5: Preventing p_fs_vol01 from re-starting on ds01
> pengine: [11725]: WARN: unpack_rsc_op: Processing failed op
>     p_fs_vol01_start_0 on ds01: not installed (5)
> pengine: [11725]: notice: unpack_rsc_op: Operation
>     p_lsb_nfsserver:1_monitor_0 found resource p_lsb_nfsserver:1 active
>     on ds02
> pengine: [11725]: notice: unpack_rsc_op: Hard error - p_fs_vol01_start_0
>     failed with rc=5: Preventing p_fs_vol01 from re-starting on ds02
> pengine: [11725]: WARN: unpack_rsc_op: Processing failed op
>     p_fs_vol01_start_0 on ds02: not installed (5)
> pengine: [11725]: notice: native_print:
>     failover-ip (ocf::heartbeat:IPaddr): Stopped
> pengine: [11725]: notice: clone_print: Master/Slave Set: ms_drbd_nfs
>     [p_drbd_nfs]
> ...
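For what it's worth, rc=5 ("not installed") from ocf:heartbeat:Filesystem
generally points at a missing prerequisite: wrong or inactive device, a
mountpoint that doesn't exist on one node, or no kernel support for the
configured fstype. A minimal sketch of those preflight checks follows; the
device, mountpoint and fstype values are assumptions, so substitute the
params from your actual p_fs_vol01 primitive:

```shell
#!/bin/sh
# Hedged sketch: preflight checks mirroring what ocf:heartbeat:Filesystem
# needs before "start" can succeed. The arguments used in the example
# call are assumptions, not values from this thread's configuration.
check_fs_resource() {
    dev=$1 mnt=$2 fstype=$3
    # The configured device must exist as a block device (LV activated).
    [ -b "$dev" ] || { echo "no block device: $dev"; return 1; }
    # The mountpoint must exist on every node that may run the resource.
    [ -d "$mnt" ] || { echo "no mountpoint: $mnt"; return 1; }
    # The kernel must know the configured file system type.
    grep -qw "$fstype" /proc/filesystems \
        || { echo "kernel lacks $fstype support"; return 1; }
    echo "prerequisites ok"
}

# Example with assumed values (run on both ds01 and ds02):
#   check_fs_resource /dev/nfs/vol01 /srv/nfs/vol01 ext3
```

Running it on both nodes should show which prerequisite the resource agent
is tripping over.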
> pengine: [11725]: WARN: common_apply_stickiness: Forcing p_fs_vol01 away
>     from ds01 after 1000000 failures (max=1000000)
> pengine: [11725]: notice: common_apply_stickiness: p_lvm_nfs can fail
>     999999 more times on ds02 before being forced off
> pengine: [11725]: WARN: common_apply_stickiness: Forcing p_fs_vol01 away
>     from ds02 after 1000000 failures (max=1000000)
> pengine: [11725]: notice: LogActions: Leave failover-ip (Stopped)
> pengine: [11725]: notice: LogActions: Leave p_drbd_nfs:0 (Slave ds01)
> pengine: [11725]: notice: LogActions: Leave p_drbd_nfs:1 (Master ds02)
>
> Thank you in advance for any assistance,
> Robert
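The common_apply_stickiness warnings quoted above mean both nodes have hit
the migration-threshold of 1000000 (effectively INFINITY) for p_fs_vol01,
so Pacemaker will not try to start it anywhere until the failcount is
cleared. Once the underlying mount problem is fixed, a sketch of the reset
in the crm shell of this Pacemaker generation (node and resource names
taken from the thread; verify the syntax against your crm version):

```
# Clear the failed state so the resource may be started again:
crm resource cleanup p_fs_vol01

# Or clear the failcount on each node explicitly:
crm resource failcount p_fs_vol01 delete ds01
crm resource failcount p_fs_vol01 delete ds02
```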
_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
