On Wed, Nov 12, 2014 at 12:33:49PM -0600, Zev Weiss wrote:
> Hi,
> 
> I recently had the following occur on the primary node of a DRBD resource, 
> running DRBD 8.4.5 on CentOS 6.6 (kernel 2.6.32-504.el6.x86_64):
> 
> Nov 11 05:34:54 kernel: block drbd5: Remote failed to finish a request within ko-count * timeout
> Nov 11 05:34:54 kernel: block drbd5: peer( Secondary -> Unknown ) conn( Connected -> Timeout ) pdsk( UpToDate -> DUnknown )
> 
> Being unfamiliar with ko-count, I looked at the documentation and found:
> 
>     ko-count number
>         In case the secondary node fails to complete a single write request
>         for count times the timeout, it is expelled from the cluster. (I.e. the
>         primary node goes into StandAlone mode.) The default value is 0, which
>         disables this feature.
> 
> The thing is -- nowhere in my config was ko-count set.  So seeing it 
> apparently kick in was an unwelcome surprise.  I have since set ko-count and 
> timeout to "large" values in the hope that it doesn't happen again.

So documentation is NEVER EVER wrong, right?

ko-count 0 (disabled) was the 8.3 default.
ko-count 7 (iirc) is the 8.4 default.

To see the settings actually in effect, defaults included:

    drbdsetup 5 show --show-default

How about explicitly configuring ko-count 0, if you mean it?
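
A minimal sketch of what that could look like in an 8.4-style
configuration. The resource name r0 is hypothetical, and the rest of the
resource definition is left out:

```
resource r0 {
    net {
        ko-count 0;    # 0 disables the ko-count mechanism
    }
    # ... remaining options, volumes and host sections unchanged
}
```

Something like "drbdadm adjust r0" should then apply the change to the
running resource; check the result with the drbdsetup show command.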

> Is this a DRBD bug, or expected behavior?  If it's somehow the latter,
> I think the combination of the documentation and error messages is
> quite misleading and should be fixed.

-- 
: Lars Ellenberg
: http://www.LINBIT.com | Your Way to High Availability
: DRBD, Linux-HA  and  Pacemaker support and consulting
