On 03/24/2012 12:09 AM, Robert Langley wrote:
> Maybe I need to post this with Pacemaker? Not sure.
> I am a bit new to this scene and trying my best to learn all of this
> (Linux/DRBD/Pacemaker/Heartbeat).
>
> I am in the middle of following the document "Highly available NFS storage
> with DRBD and Pacemaker", located at:
> http://www.linbit.com/en/education/tech-guides/highly-available-nfs-with-drbd-and-pacemaker/
>
> OS: Ubuntu 11.10
> DRBD version: 8.3.11
> Pacemaker version: 1.1.5
>
> I have two servers with 2.4 TB of internal hard drive space each, plus
> mirrored hard drives for the OS. They both have 10 NICs (2 onboard in a
> bond and 8 across two 4-port Intel NICs).
>
> Issue: I got to the end of part 4.3 (commit) and that is when things went
> bad. I ended up with a split-brain, which I seem to have recovered from,
> but now my resources are as follows (running crm_mon -1):
>
> My slave node is actually showing as the Master under the Master/Slave
> Set: ms_drbd_nfs [p_drbd_nfs], and the clone set is Started.
> In the Resource Group, only p_lvm_nfs is Started, on my slave node. All
> of the Filesystem resources are Stopped.
>
> Then, I have this at the bottom:
>
> Failed actions:
>     p_fs_vol01_start_0 (node=ds01, call=46, rc=5, status=complete):
>         not installed
>     p_fs_vol01_start_0 (node=ds02, call=430, rc=5, status=complete):
>         not installed
Mountpoint created on both nodes, and the resource defined with the correct
device and a valid file system? What happens after a cleanup?

    crm resource cleanup p_fs_vol01

Grep for "Filesystem" in your logs to get the error output from the
resource agent. For more ... please share your current DRBD
state/configuration and your cluster configuration.

Regards,
Andreas

--
Need help with DRBD?
http://www.hastexo.com/now

> Looking in the syslog on ds01 (primary node) does not reveal anything
> worth mentioning; but looking at the syslog on ds02 (secondary node)
> shows the following messages:
>
> pengine: [11725]: notice: unpack_rsc_op: Hard error - p_fs_vol01_start_0
>     failed with rc=5: Preventing p_fs_vol01 from re-starting on ds01
> pengine: [11725]: WARN: unpack_rsc_op: Processing failed op
>     p_fs_vol01_start_0 on ds01: not installed (5)
> pengine: [11725]: notice: unpack_rsc_op: Operation
>     p_lsb_nfsserver:1_monitor_0 found resource p_lsb_nfsserver:1 active
>     on ds02
> pengine: [11725]: notice: unpack_rsc_op: Hard error - p_fs_vol01_start_0
>     failed with rc=5: Preventing p_fs_vol01 from re-starting on ds02
> pengine: [11725]: WARN: unpack_rsc_op: Processing failed op
>     p_fs_vol01_start_0 on ds02: not installed (5)
> pengine: [11725]: notice: native_print:
>     failover-ip (ocf::heartbeat:IPaddr): Stopped
> pengine: [11725]: notice: clone_print: Master/Slave Set: ms_drbd_nfs
>     [p_drbd_nfs]
> ...
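For what it's worth, rc=5 ("not installed") from ocf:heartbeat:Filesystem
generally points at a missing prerequisite: wrong or inactive device, a
mountpoint that doesn't exist on one node, or no kernel support for the
configured fstype. A minimal sketch of those preflight checks follows; the
device, mountpoint and fstype values are assumptions, so substitute the
params from your actual p_fs_vol01 primitive:

```shell
#!/bin/sh
# Hedged sketch: preflight checks mirroring what ocf:heartbeat:Filesystem
# needs before "start" can succeed. The arguments used in the example
# call are assumptions, not values from this thread's configuration.
check_fs_resource() {
    dev=$1 mnt=$2 fstype=$3
    # The configured device must exist as a block device (LV activated).
    [ -b "$dev" ] || { echo "no block device: $dev"; return 1; }
    # The mountpoint must exist on every node that may run the resource.
    [ -d "$mnt" ] || { echo "no mountpoint: $mnt"; return 1; }
    # The kernel must know the configured file system type.
    grep -qw "$fstype" /proc/filesystems \
        || { echo "kernel lacks $fstype support"; return 1; }
    echo "prerequisites ok"
}

# Example with assumed values (run on both ds01 and ds02):
#   check_fs_resource /dev/nfs/vol01 /srv/nfs/vol01 ext3
```

Running it on both nodes should show which prerequisite the resource agent
is tripping over.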
> pengine: [11725]: WARN: common_apply_stickiness: Forcing p_fs_vol01 away
>     from ds01 after 1000000 failures (max=1000000)
> pengine: [11725]: notice: common_apply_stickiness: p_lvm_nfs can fail
>     999999 more times on ds02 before being forced off
> pengine: [11725]: WARN: common_apply_stickiness: Forcing p_fs_vol01 away
>     from ds02 after 1000000 failures (max=1000000)
> pengine: [11725]: notice: LogActions: Leave failover-ip (Stopped)
> pengine: [11725]: notice: LogActions: Leave p_drbd_nfs:0 (Slave ds01)
> pengine: [11725]: notice: LogActions: Leave p_drbd_nfs:1 (Master ds02)
>
> Thank you in advance for any assistance,
> Robert
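The common_apply_stickiness warnings quoted above mean both nodes have hit
the migration-threshold of 1000000 (effectively INFINITY) for p_fs_vol01,
so Pacemaker will not try to start it anywhere until the failcount is
cleared. Once the underlying mount problem is fixed, a sketch of the reset
in the crm shell of this Pacemaker generation (node and resource names
taken from the thread; verify the syntax against your crm version):

```
# Clear the failed state so the resource may be started again:
crm resource cleanup p_fs_vol01

# Or clear the failcount on each node explicitly:
crm resource failcount p_fs_vol01 delete ds01
crm resource failcount p_fs_vol01 delete ds02
```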
_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
