Re: [ClusterLabs] drbd clone not becoming master

2017-11-03 Thread Dennis Jacobfeuerborn
On 03.11.2017 15:49, Ken Gaillot wrote:
> On Thu, 2017-11-02 at 23:18 +0100, Dennis Jacobfeuerborn wrote:
>> On 02.11.2017 23:08, Dennis Jacobfeuerborn wrote:
>>> Hi,
>>> I'm setting up a redundant NFS server for some experiments but almost
>>> immediately ran into a strange issue. The drbd clone resource never
>>> promotes either of the two clones to the Master state.
>>>
>>> The state says this:
>>>
>>>  Master/Slave Set: drbd-clone [drbd]
>>>      Slaves: [ nfsserver1 nfsserver2 ]
>>>  metadata-fs (ocf::heartbeat:Filesystem): Stopped
>>>
>>> The resource configuration looks like this:
>>>
>>> Resources:
>>>  Master: drbd-clone
>>>   Meta Attrs: master-node-max=1 clone-max=2 notify=true master-max=1 clone-node-max=1
>>>   Resource: drbd (class=ocf provider=linbit type=drbd)
>>>    Attributes: drbd_resource=r0
>>>    Operations: demote interval=0s timeout=90 (drbd-demote-interval-0s)
>>>                monitor interval=60s (drbd-monitor-interval-60s)
>>>                promote interval=0s timeout=90 (drbd-promote-interval-0s)
>>>                start interval=0s timeout=240 (drbd-start-interval-0s)
>>>                stop interval=0s timeout=100 (drbd-stop-interval-0s)
>>>  Resource: metadata-fs (class=ocf provider=heartbeat type=Filesystem)
>>>   Attributes: device=/dev/drbd/by-res/r0/0 directory=/var/lib/nfs_shared fstype=ext4 options=noatime
>>>   Operations: monitor interval=20 timeout=40 (metadata-fs-monitor-interval-20)
>>>               start interval=0s timeout=60 (metadata-fs-start-interval-0s)
>>>               stop interval=0s timeout=60 (metadata-fs-stop-interval-0s)
>>>
>>> Location Constraints:
>>> Ordering Constraints:
>>>   promote drbd-clone then start metadata-fs (kind:Mandatory)
>>> Colocation Constraints:
>>>   metadata-fs with drbd-clone (score:INFINITY) (with-rsc-role:Master)
>>>
>>> Shouldn't one of the clones be promoted to the Master state
>>> automatically?
>>
>> I think the source of the issue is this:
>>
>> Nov  2 23:12:03 nfsserver1 drbd(drbd)[4673]: ERROR: r0: Called /usr/sbin/crm_master -Q -l reboot -v 1
>> Nov  2 23:12:03 nfsserver1 drbd(drbd)[4673]: ERROR: r0: Exit code 107
>> Nov  2 23:12:03 nfsserver1 drbd(drbd)[4673]: ERROR: r0: Command output:
>> Nov  2 23:12:03 nfsserver1 lrmd[2163]:  notice: drbd_monitor_6:4673:stderr [ Error signing on to the CIB service: Transport endpoint is not connected ]
>>
>> It seems the drbd resource agent tries to use crm_master to promote the
>> clone but fails because it cannot "sign on to the CIB service". Does
>> anybody know what that means?
>>
>> Regards,
>>   Dennis
>>
> 
> That's odd, it should only happen if the cluster is not running, but
> then the agent wouldn't have been called.
> 
> The CIB is one of the core daemons of pacemaker; it manages the cluster
> configuration and status. If it's not running, the cluster can't do
> anything.
> 
> Perhaps the CIB is crashing, or something is blocking the communication
> between the agent and the CIB.

SELinux was the culprit. After disabling it the problem went away.
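
For anyone who runs into the same symptom, it may help to first confirm which SELinux mode a node is actually in. The following is only a minimal Python sketch: the function name `selinux_mode` is made up for illustration, and it assumes selinuxfs is mounted at the usual `/sys/fs/selinux` location. It reports the mode and changes nothing.

```python
from pathlib import Path


def selinux_mode(enforce_path="/sys/fs/selinux/enforce"):
    """Return 'Disabled', 'Permissive' or 'Enforcing' for this node.

    Assumption: selinuxfs is mounted at /sys/fs/selinux, so the file
    'enforce' holds "1" (enforcing) or "0" (permissive). If the file
    is missing, SELinux is treated as disabled.
    """
    path = Path(enforce_path)
    if not path.exists():
        return "Disabled"
    value = path.read_text().strip()
    return "Enforcing" if value == "1" else "Permissive"


if __name__ == "__main__":
    print(selinux_mode())
```

Running it on both nodes shows whether they are in the same mode, which is worth checking before comparing their behaviour.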

Regards,
  Dennis


___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] drbd clone not becoming master

2017-11-03 Thread Ken Gaillot
On Thu, 2017-11-02 at 23:18 +0100, Dennis Jacobfeuerborn wrote:
> On 02.11.2017 23:08, Dennis Jacobfeuerborn wrote:
> > Hi,
> > I'm setting up a redundant NFS server for some experiments but almost
> > immediately ran into a strange issue. The drbd clone resource never
> > promotes either of the two clones to the Master state.
> > 
> > The state says this:
> > 
> >  Master/Slave Set: drbd-clone [drbd]
> >      Slaves: [ nfsserver1 nfsserver2 ]
> >  metadata-fs (ocf::heartbeat:Filesystem): Stopped
> > 
> > The resource configuration looks like this:
> > 
> > Resources:
> >  Master: drbd-clone
> >   Meta Attrs: master-node-max=1 clone-max=2 notify=true master-max=1 clone-node-max=1
> >   Resource: drbd (class=ocf provider=linbit type=drbd)
> >    Attributes: drbd_resource=r0
> >    Operations: demote interval=0s timeout=90 (drbd-demote-interval-0s)
> >                monitor interval=60s (drbd-monitor-interval-60s)
> >                promote interval=0s timeout=90 (drbd-promote-interval-0s)
> >                start interval=0s timeout=240 (drbd-start-interval-0s)
> >                stop interval=0s timeout=100 (drbd-stop-interval-0s)
> >  Resource: metadata-fs (class=ocf provider=heartbeat type=Filesystem)
> >   Attributes: device=/dev/drbd/by-res/r0/0 directory=/var/lib/nfs_shared fstype=ext4 options=noatime
> >   Operations: monitor interval=20 timeout=40 (metadata-fs-monitor-interval-20)
> >               start interval=0s timeout=60 (metadata-fs-start-interval-0s)
> >               stop interval=0s timeout=60 (metadata-fs-stop-interval-0s)
> > 
> > Location Constraints:
> > Ordering Constraints:
> >   promote drbd-clone then start metadata-fs (kind:Mandatory)
> > Colocation Constraints:
> >   metadata-fs with drbd-clone (score:INFINITY) (with-rsc-role:Master)
> > 
> > Shouldn't one of the clones be promoted to the Master state
> > automatically?
> 
> I think the source of the issue is this:
> 
> Nov  2 23:12:03 nfsserver1 drbd(drbd)[4673]: ERROR: r0: Called /usr/sbin/crm_master -Q -l reboot -v 1
> Nov  2 23:12:03 nfsserver1 drbd(drbd)[4673]: ERROR: r0: Exit code 107
> Nov  2 23:12:03 nfsserver1 drbd(drbd)[4673]: ERROR: r0: Command output:
> Nov  2 23:12:03 nfsserver1 lrmd[2163]:  notice: drbd_monitor_6:4673:stderr [ Error signing on to the CIB service: Transport endpoint is not connected ]
> 
> It seems the drbd resource agent tries to use crm_master to promote the
> clone but fails because it cannot "sign on to the CIB service". Does
> anybody know what that means?
> 
> Regards,
>   Dennis
> 

That's odd, it should only happen if the cluster is not running, but
then the agent wouldn't have been called.

The CIB is one of the core daemons of pacemaker; it manages the cluster
configuration and status. If it's not running, the cluster can't do
anything.

Perhaps the CIB is crashing, or something is blocking the communication
between the agent and the CIB.
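
As a side note on the quoted log: on Linux, 107 is the numeric value of ENOTCONN, whose system message is "Transport endpoint is not connected", so the exit code and the stderr text appear to describe the same failed connection. A small Python sketch for decoding such codes follows. The helper name `describe_exit_code` is made up for illustration, and it assumes the tool exited with a plain errno value, which the matching text suggests but the log does not prove.

```python
import errno
import os


def describe_exit_code(code):
    """Map a numeric exit code to (errno name, system message).

    Assumption: the tool exited with a plain errno value, which is
    what the matching error text in the log suggests.
    """
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 107 is the exit code reported by the drbd agent in the log.
    print(describe_exit_code(107))
```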
-- 
Ken Gaillot 



Re: [ClusterLabs] drbd clone not becoming master

2017-11-02 Thread Dennis Jacobfeuerborn
On 02.11.2017 23:08, Dennis Jacobfeuerborn wrote:
> Hi,
> I'm setting up a redundant NFS server for some experiments but almost
> immediately ran into a strange issue. The drbd clone resource never
> promotes either of the two clones to the Master state.
> 
> The state says this:
> 
>  Master/Slave Set: drbd-clone [drbd]
>      Slaves: [ nfsserver1 nfsserver2 ]
>  metadata-fs (ocf::heartbeat:Filesystem): Stopped
> 
> The resource configuration looks like this:
> 
> Resources:
>  Master: drbd-clone
>   Meta Attrs: master-node-max=1 clone-max=2 notify=true master-max=1 clone-node-max=1
>   Resource: drbd (class=ocf provider=linbit type=drbd)
>    Attributes: drbd_resource=r0
>    Operations: demote interval=0s timeout=90 (drbd-demote-interval-0s)
>                monitor interval=60s (drbd-monitor-interval-60s)
>                promote interval=0s timeout=90 (drbd-promote-interval-0s)
>                start interval=0s timeout=240 (drbd-start-interval-0s)
>                stop interval=0s timeout=100 (drbd-stop-interval-0s)
>  Resource: metadata-fs (class=ocf provider=heartbeat type=Filesystem)
>   Attributes: device=/dev/drbd/by-res/r0/0 directory=/var/lib/nfs_shared fstype=ext4 options=noatime
>   Operations: monitor interval=20 timeout=40 (metadata-fs-monitor-interval-20)
>               start interval=0s timeout=60 (metadata-fs-start-interval-0s)
>               stop interval=0s timeout=60 (metadata-fs-stop-interval-0s)
> 
> Location Constraints:
> Ordering Constraints:
>   promote drbd-clone then start metadata-fs (kind:Mandatory)
> Colocation Constraints:
>   metadata-fs with drbd-clone (score:INFINITY) (with-rsc-role:Master)
> 
> Shouldn't one of the clones be promoted to the Master state automatically?

I think the source of the issue is this:

Nov  2 23:12:03 nfsserver1 drbd(drbd)[4673]: ERROR: r0: Called /usr/sbin/crm_master -Q -l reboot -v 1
Nov  2 23:12:03 nfsserver1 drbd(drbd)[4673]: ERROR: r0: Exit code 107
Nov  2 23:12:03 nfsserver1 drbd(drbd)[4673]: ERROR: r0: Command output:
Nov  2 23:12:03 nfsserver1 lrmd[2163]:  notice: drbd_monitor_6:4673:stderr [ Error signing on to the CIB service: Transport endpoint is not connected ]

It seems the drbd resource agent tries to use crm_master to promote the
clone but fails because it cannot "sign on to the CIB service". Does
anybody know what that means?

Regards,
  Dennis






[ClusterLabs] drbd clone not becoming master

2017-11-02 Thread Dennis Jacobfeuerborn
Hi,
I'm setting up a redundant NFS server for some experiments but almost
immediately ran into a strange issue. The drbd clone resource never
promotes either of the two clones to the Master state.

The state says this:

 Master/Slave Set: drbd-clone [drbd]
     Slaves: [ nfsserver1 nfsserver2 ]
 metadata-fs (ocf::heartbeat:Filesystem): Stopped

The resource configuration looks like this:

Resources:
 Master: drbd-clone
  Meta Attrs: master-node-max=1 clone-max=2 notify=true master-max=1 clone-node-max=1
  Resource: drbd (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=r0
   Operations: demote interval=0s timeout=90 (drbd-demote-interval-0s)
               monitor interval=60s (drbd-monitor-interval-60s)
               promote interval=0s timeout=90 (drbd-promote-interval-0s)
               start interval=0s timeout=240 (drbd-start-interval-0s)
               stop interval=0s timeout=100 (drbd-stop-interval-0s)
 Resource: metadata-fs (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=/dev/drbd/by-res/r0/0 directory=/var/lib/nfs_shared fstype=ext4 options=noatime
  Operations: monitor interval=20 timeout=40 (metadata-fs-monitor-interval-20)
              start interval=0s timeout=60 (metadata-fs-start-interval-0s)
              stop interval=0s timeout=60 (metadata-fs-stop-interval-0s)

Location Constraints:
Ordering Constraints:
  promote drbd-clone then start metadata-fs (kind:Mandatory)
Colocation Constraints:
  metadata-fs with drbd-clone (score:INFINITY) (with-rsc-role:Master)

Shouldn't one of the clones be promoted to the Master state automatically?
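
The "nothing was promoted" condition in the state output above can also be checked mechanically, which is handy when watching several nodes. This is a rough Python sketch rather than an official tool: the function name `parse_master_slave` is made up, and the `Masters: [ ... ]` line format is an assumption modelled on the `Slaves: [ ... ]` line shown in the state output.

```python
import re


def parse_master_slave(status_text):
    """Collect node names per role from 'Masters: [ ... ]' and
    'Slaves: [ ... ]' lines in cluster status text."""
    roles = {"Masters": [], "Slaves": []}
    pattern = r"(Masters|Slaves): \[([^\]]*)\]"
    for role, nodes in re.findall(pattern, status_text):
        roles[role].extend(nodes.split())
    return roles


# Sample text taken from the state output in this message.
STATUS = """\
 Master/Slave Set: drbd-clone [drbd]
     Slaves: [ nfsserver1 nfsserver2 ]
"""

if __name__ == "__main__":
    roles = parse_master_slave(STATUS)
    if not roles["Masters"]:
        print("no instance promoted; slaves:", roles["Slaves"])
```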

Regards,
  Dennis
