Stefan Egli created OAK-2844:
--------------------------------

             Summary: Introducing a simple mongo-based discovery-light service 
(to circumvent mongoMk's eventual consistency delays)
                 Key: OAK-2844
                 URL: https://issues.apache.org/jira/browse/OAK-2844
             Project: Jackrabbit Oak
          Issue Type: New Feature
          Components: mongomk
            Reporter: Stefan Egli
             Fix For: 1.4


When running discovery.impl on a mongoMk-backed jcr repository, there is a risk 
of hitting problems such as the one described in "SLING-3432 
pseudo-network-partitioning": when a jcr-level heartbeat does not reach the 
peers within the configured heartbeat timeout, the peers treat the affected 
instance as dead, remove it from the topology, and continue with the remaining 
instances, potentially electing a new leader and thus running the risk of 
duplicate leaders. This happens when delays in mongoMk grow larger than the 
(configured) heartbeat timeout. These problems are ultimately due to the 
'eventual consistency' nature not only of mongoDB, but even more so of mongoMk. 
The only option so far is to increase the heartbeat timeout to match the 
expected or measured delays that mongoMk can produce (under, say, given 
load/performance scenarios).

Assuming that mongoMk will always carry a risk of certain delays, and that a 
reasonable upper bound (reasonable for the discovery.impl timeout, that is) 
cannot be guaranteed, a better solution is to provide discovery with more 
'real-time'-like information and/or privileged access to mongoDb.

Here's a summary of the alternatives that have been floating around so far as 
ways to circumvent eventual consistency:
 # expose existing (jmx) information about active 'clusterIds' - this has been 
proposed in SLING-4603. The pros: reuse of existing functionality. The cons: 
going via jmx, and the exposed functionality becomes a 'to be maintained' API
 # expose a plain mongo db/collection (via osgi injection) such that a higher 
(sling) level discovery could directly write heartbeats there. The pros: 
heartbeat latency would be minimal (assuming the collection is not sharded). 
The cons: potentially exposes the mongo db/collection to anyone else as well, 
with the risk of it being used in unintended ways
 # introduce a simple 'discovery-light' API to oak which solely provides 
information about which instances are active in a cluster. The implementation 
of this is not exposed. The pros: no need to expose a mongoDb/collection, 
allows any other jmx-functionality to remain unchanged. The cons: a new API 
that must be maintained

This ticket is about the 3rd option: a new mongo-based discovery-light service 
to be introduced in oak. The functionality in short:
 * it defines a 'local instance id' that is non-persisted, i.e. it can change 
at each bundle activation.
 * it defines a 'view id' that uniquely identifies a particular incarnation of 
a 'cluster view/state' (which is: a list of active instance ids)
 * and it defines a list of active instance ids
 * the above attributes are passed to interested components via a listener that 
can be registered. That listener is called whenever the discovery-light service 
notices that the cluster view has changed.
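
To make the shape of such an API more concrete, here is a minimal sketch. All 
names in it ({{ClusterView}}, {{ClusterViewListener}}, {{DiscoveryLiteSketch}}, 
{{updateActiveInstances}}) are hypothetical and not part of oak. It is an 
in-memory stand-in only: the view id is generated locally, whereas a real 
implementation would need a shared store so that all instances agree on it.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.UUID;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical immutable snapshot of the cluster as seen by discovery-light. */
final class ClusterView {
    final String viewId;                  // identifies this incarnation of the view
    final String localInstanceId;         // non-persisted id of the local instance
    final Set<String> activeInstanceIds;  // ids of all instances considered active

    ClusterView(String viewId, String localInstanceId, Set<String> activeInstanceIds) {
        this.viewId = viewId;
        this.localInstanceId = localInstanceId;
        this.activeInstanceIds =
                Collections.unmodifiableSet(new TreeSet<>(activeInstanceIds));
    }
}

/** Hypothetical callback, invoked whenever the cluster view has changed. */
interface ClusterViewListener {
    void viewChanged(ClusterView newView);
}

/** In-memory stand-in for the service; the heartbeat logic is omitted. */
final class DiscoveryLiteSketch {
    // Created anew per object, mimicking "can change at each bundle activation".
    private final String localInstanceId = UUID.randomUUID().toString();
    private final List<ClusterViewListener> listeners = new CopyOnWriteArrayList<>();
    private ClusterView current;

    String getLocalInstanceId() {
        return localInstanceId;
    }

    void addListener(ClusterViewListener listener) {
        listeners.add(listener);
    }

    /** Would be driven by the heartbeat logic; notifies listeners only on change. */
    synchronized void updateActiveInstances(Set<String> activeIds) {
        if (current != null && current.activeInstanceIds.equals(activeIds)) {
            return; // same set of active instances: keep the current view id
        }
        // Assumption: a fresh random id per view; a real service would share it.
        current = new ClusterView(UUID.randomUUID().toString(), localInstanceId, activeIds);
        for (ClusterViewListener listener : listeners) {
            listener.viewChanged(current);
        }
    }
}
```

A consumer such as a sling-level discovery would register a listener once and 
react to each {{viewChanged}} call, without ever seeing how the list of active 
instances was determined.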

While the actual implementation could in fact be based on the existing 
{{getActiveClusterNodes()}} and {{getClusterId()}} of the 
{{DocumentNodeStoreMBean}}, the suggestion is not to touch that part, as it 
has dependencies on other logic. Instead, the suggestion is to create a 
separate, dedicated collection ('discovery') where the heartbeats as well as 
the currentView are stored.
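
As a rough illustration of how heartbeats in such a collection could be turned 
into a list of active instances, the sketch below uses an in-memory map in place 
of the 'discovery' collection. The layout (instance id mapped to the time of its 
last heartbeat), the class name {{HeartbeatStoreSketch}} and the timeout 
handling are assumptions made for this example, not the attached implementation.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for the 'discovery' collection (hypothetical layout). */
final class HeartbeatStoreSketch {
    // instance id -> timestamp (millis) of the last heartbeat written
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    HeartbeatStoreSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records a heartbeat; against mongo this would be a single write. */
    void heartbeat(String instanceId, long nowMillis) {
        lastHeartbeat.put(instanceId, nowMillis);
    }

    /** Returns the instances whose last heartbeat is within the timeout. */
    Set<String> activeInstances(long nowMillis) {
        Set<String> active = new TreeSet<>();
        for (Map.Entry<String, Long> entry : lastHeartbeat.entrySet()) {
            if (nowMillis - entry.getValue() <= timeoutMillis) {
                active.add(entry.getKey());
            }
        }
        return active;
    }
}
```

Because heartbeats would be written to and read from this collection directly, 
the time until a peer sees them no longer depends on mongoMk's own delays.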

Will attach a suggestion for an initial version of this for review.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
