On Wed, Nov 12, 2014 at 12:33:49PM -0600, Zev Weiss wrote:
> Hi,
>
> I recently had the following occur on the primary node of a DRBD resource,
> running DRBD 8.4.5 on CentOS 6.6 (kernel 2.6.32-504.el6.x86_64):
>
> Nov 11 05:34:54 kernel: block drbd5: Remote failed to finish a request within
>   ko-count * timeout
> Nov 11 05:34:54 kernel: block drbd5: peer( Secondary -> Unknown ) conn(
>   Connected -> Timeout ) pdsk( UpToDate -> DUnknown )
>
> Being unfamiliar with ko-count, I looked at the documentation and found:
>
>   ko-count number
>       In case the secondary node fails to complete a single write request
>       for count times the timeout, it is expelled from the cluster. (I.e.
>       the primary node goes into StandAlone mode.) The default value is 0,
>       which disables this feature.
>
> The thing is -- nowhere in my config was ko-count set. So seeing it
> apparently kick in was an unwelcome surprise. I have since set ko-count and
> timeout to "large" values in the hope that it doesn't happen again.
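(For reference: ko-count and timeout are options of the net section of a DRBD
8.4 resource. The snippet below is only an illustrative sketch of the kind of
override described above; the resource name and the values are placeholders,
not the poster's actual configuration.)

    resource r0 {
        net {
            timeout   120;   # unit is tenths of a second, i.e. 12 seconds
            ko-count  10;    # peer is dropped after it stalls a single write
                             # request for ko-count * timeout
        }
    }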
So documentation is NEVER EVER wrong, right?

ko-count 0 (disabled) was the 8.3 default.
ko-count 7 (iirc) is the 8.4 default.

    drbdsetup 5 show --show-defaults

How about explicitly configuring ko-count 0, if that is what you mean?

> Is this a DRBD bug, or expected behavior? If it's somehow the latter,
> I think the combination of the documentation and error messages is
> quite misleading and should be fixed.

--
: Lars Ellenberg
: http://www.LINBIT.com | Your Way to High Availability
: DRBD, Linux-HA and Pacemaker support and consulting

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
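(A rough sketch of how one might act on the advice above, assuming the device
in the log is minor 5 and the resource is named r0; both are placeholders.
Dump the effective settings, defaults included, then pin ko-count explicitly
if the old 8.3 behaviour is what you want:)

    # show effective options, including compiled-in defaults, for minor 5
    drbdsetup 5 show --show-defaults

    # drbd.conf / resource file: explicitly disable the mechanism
    resource r0 {
        net {
            ko-count 0;    # 0 turns ko-count handling off entirely
        }
    }

    # apply the changed configuration to the running resource
    drbdadm adjust r0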
