[
https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483105#comment-14483105
]
Robert Munteanu commented on OAK-2682:
--------------------------------------
[~egli] - I've looked into this issue briefly, as I'm interested in contributing
a patch.
You mention in the issue description 'all nodes of the cluster'. I assume that
you mean an Oak cluster, not a MongoDB cluster. When talking about clock skew
in MongoDB, we actually have two situations:
- replica sets
- sharded clusters
For replica sets, the different MongoDB instances are actually visible to the
DocumentNodeStore as cluster members. For sharded clusters, Oak would connect
only to a {{mongos}} instance. We can of course find out the shards from the
config database, and connect separately to those {{mongod}} instances to run
the {{serverStatus}} command, but I find it unnecessarily cumbersome.
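For reference, a minimal sketch of how a skew estimate could be derived from a single remote timestamp, such as the {{localTime}} value reported by {{serverStatus}}. The class and method names are hypothetical and not part of Oak; the sketch assumes the remote time was sampled roughly halfway through the request round trip, which is only an approximation.

```java
// Hypothetical helper, not part of Oak: estimates clock skew from one
// request/response exchange with a remote server.
class ClockSkewEstimate {

    /**
     * Estimates how far the remote clock is ahead of the local clock.
     *
     * @param localBeforeMillis local time just before the request was sent
     * @param remoteMillis      time reported by the remote server
     * @param localAfterMillis  local time just after the response arrived
     * @return estimated skew in milliseconds (positive: remote clock is ahead)
     */
    static long estimateSkewMillis(long localBeforeMillis, long remoteMillis,
                                   long localAfterMillis) {
        if (localAfterMillis < localBeforeMillis) {
            throw new IllegalArgumentException("local clock went backwards");
        }
        // Assumption: the server read its clock halfway through the round trip.
        long midpoint = localBeforeMillis
                + (localAfterMillis - localBeforeMillis) / 2;
        return remoteMillis - midpoint;
    }

    public static void main(String[] args) {
        long before = 1_000_000L;
        long after = 1_000_040L;   // 40 ms round trip
        long remote = 1_003_020L;  // value the server reported
        System.out.println(estimateSkewMillis(before, remote, after)); // prints 3000
    }
}
```

The error of such an estimate is bounded by half the round-trip time, so a slow request makes the result correspondingly less reliable.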
Furthermore, I see that MongoDB has its own clock skew detection for both
replica sets (each replica set member performs this check) and for sharded
clusters (the {{mongos}} instances perform the check). MongoDB also tolerates
some clock skew, but not too much ([Mongos throwing clock skew
error?|https://groups.google.com/forum/#!topic/mongodb-user/SPi4Kqox16I]).
TBH I see this more as an operations issue than as something that
can/should be done in Oak, and would rather suggest dropping this. Thoughts?
/CC [~chetanm], [~mreutegg]
> Introduce time difference detection for mongoMk
> -----------------------------------------------
>
> Key: OAK-2682
> URL: https://issues.apache.org/jira/browse/OAK-2682
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core, mongomk
> Reporter: Stefan Egli
> Fix For: 1.3.0
>
>
> Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the
> assumption that the clocks are in perfect sync between all nodes of the
> cluster. The lease is valid for 60sec with a timeout of 30sec. If clocks are
> off by too much, and background operations happen to take a couple of seconds, you
> run the risk of timing out a lease. So introducing a check which WARNs if the
> clocks in a cluster are off by too much (1st threshold, eg 5sec?) would help
> increase awareness. Further drastic measure could be to prevent a startup of
> Oak at all if the difference is for example higher than a 2nd threshold
> (optional I guess, but could be 20sec?).
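
The two-threshold idea in the description above could be sketched roughly as follows. The class, enum and constant names are hypothetical and not part of Oak; the 5 sec and 20 sec values are only the tentative examples floated in the description, not agreed defaults.

```java
// Hypothetical sketch, not part of Oak: maps a measured clock skew to the
// action suggested in the issue description.
class ClockSkewPolicy {

    enum Action { OK, WARN, REFUSE_STARTUP }

    // Example values taken from the description; both are open questions there.
    static final long WARN_THRESHOLD_MILLIS = 5_000L;
    static final long REFUSE_THRESHOLD_MILLIS = 20_000L;

    /** Classifies a skew in milliseconds; the sign of the skew does not matter. */
    static Action classify(long skewMillis) {
        if (exceeds(skewMillis, REFUSE_THRESHOLD_MILLIS)) {
            return Action.REFUSE_STARTUP;
        }
        if (exceeds(skewMillis, WARN_THRESHOLD_MILLIS)) {
            return Action.WARN;
        }
        return Action.OK;
    }

    // Compares in both directions instead of using Math.abs, which
    // overflows for Long.MIN_VALUE.
    private static boolean exceeds(long skewMillis, long thresholdMillis) {
        return skewMillis > thresholdMillis || skewMillis < -thresholdMillis;
    }

    public static void main(String[] args) {
        System.out.println(classify(3_000L));   // prints OK
        System.out.println(classify(-7_000L));  // prints WARN
        System.out.println(classify(25_000L));  // prints REFUSE_STARTUP
    }
}
```

With a 30 sec lease timeout, a 20 sec skew would leave only about 10 sec of headroom for background operations, which is why the second threshold is the more drastic one.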
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)