[
https://issues.apache.org/jira/browse/OAK-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chetan Mehrotra updated OAK-1645:
---------------------------------
Attachment: OAK-1645-1.patch
[patch|^OAK-1645-1.patch] implementing the above approach. The changes can also
be seen at [1].
The major design change is that the code has to explicitly specify the
ReadPreference wherever it needs to be {{ReadPreference.primary()}} or
{{ReadPreference.primaryPreferred()}}. Everywhere else the DB-level default
value is used. If not set, that default is {{ReadPreference.primary()}}.
Based on the principles outlined above (in the issue description), the code flow
decides whether a read from a secondary is OK. If the read can be served by a
secondary, the read preference specified in the DB settings is used.
The user can then specify the read preference as required via the MongoURI [2].
For example, if reads from secondaries should prefer a secondary tagged
{{dc:ny,rack:1}} and otherwise go to another secondary, this can be specified
via the following MongoURI:
bq.
mongodb://example1.com,example2.com,example3.com/?readPreference=secondary&readPreferenceTags=dc:ny,rack:1&readPreferenceTags=dc:ny&readPreferenceTags=
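To make the structure of that URI easier to see, here is a minimal sketch that pulls the read preference mode and the ordered tag sets out of the query string. It is not the MongoDB driver's parser; the class {{ReadPreferenceUriSketch}} and its methods are hypothetical names used only for illustration, and the fallback to "primary" when no mode is given simply mirrors the default described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only; the real driver does its own parsing.
public class ReadPreferenceUriSketch {

    /** Returns the readPreference mode from the URI query, or "primary" if absent. */
    public static String readPreferenceMode(String uri) {
        for (String param : queryParams(uri)) {
            if (param.startsWith("readPreference=")) {
                return param.substring("readPreference=".length());
            }
        }
        return "primary";
    }

    /** Returns one tag set per readPreferenceTags option, in the order given in the URI. */
    public static List<Map<String, String>> tagSets(String uri) {
        List<Map<String, String>> result = new ArrayList<>();
        for (String param : queryParams(uri)) {
            if (!param.startsWith("readPreferenceTags=")) {
                continue;
            }
            Map<String, String> tags = new LinkedHashMap<>();
            String value = param.substring("readPreferenceTags=".length());
            for (String pair : value.split(",")) {
                int colon = pair.indexOf(':');
                if (colon > 0) {
                    tags.put(pair.substring(0, colon), pair.substring(colon + 1));
                }
            }
            // An empty map stands for the trailing "readPreferenceTags=" (no tag constraint).
            result.add(tags);
        }
        return result;
    }

    /** Splits the part after '?' into individual key=value options. */
    private static String[] queryParams(String uri) {
        int q = uri.indexOf('?');
        return q < 0 ? new String[0] : uri.substring(q + 1).split("&");
    }
}
```

For the URI above this yields the mode {{secondary}} and three tag sets, tried in order: {{dc:ny,rack:1}}, then {{dc:ny}}, then the empty set.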
A couple of points to note:
# DocumentNodeStore now exposes a setting {{maxReplicationLagInSecs}}. This
determines the duration beyond which it can safely be assumed that the state on
the secondaries is consistent with the primary and it is safe to read from
them. For example, while reading /foo/bar we check the modified time of /foo.
If that modified time is before {{currentTime - maxReplicationLagInSecs}}, it
can be assumed that /foo/bar has not been modified either and its value can
safely be read from a secondary
# Previous docs created while splitting are now fetched with
maxAge=Integer.MAX_VALUE. As split docs are immutable, it is safe to read
them from a secondary. If a read from a secondary fails (say, because the split
doc has not yet reached the secondary), the read is retried against the primary
# BlobGC no longer specifies a ReadPreference and instead relies on the default
DB-level preference
# In case it is not safe to read from a secondary we use
{{ReadPreference.primaryPreferred()}} instead of {{ReadPreference.primary()}}.
This should allow reads to proceed even while a primary is still being
elected. {color:brown}Needs review{color}
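The routing rules in the points above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in the patch: {{ReadRoutingSketch}}, {{Route}}, {{routeFor}} and {{readWithFallback}} are hypothetical names, times are taken as seconds since the epoch, a parent modified time of -1 is assumed to mean "parent not in cache", and a {{null}} result is assumed to mean "document not found on the secondary".

```java
import java.util.function.Supplier;

// Hypothetical sketch of the routing decision; names and signatures are assumptions.
public class ReadRoutingSketch {

    /** Simplified stand-in for the two outcomes: force primaryPreferred, or use the DB-level default. */
    public enum Route { PRIMARY_PREFERRED, DB_DEFAULT }

    /**
     * Decides how to route a read of a child document.
     *
     * @param parentModifiedSecs      modified time of the cached parent, or -1 if the parent is not cached
     * @param nowSecs                 current time in seconds
     * @param maxReplicationLagInSecs lag beyond which secondaries are assumed to be consistent
     * @param maxCacheAge             maxAge passed by the caller
     */
    public static Route routeFor(long parentModifiedSecs, long nowSecs,
                                 long maxReplicationLagInSecs, int maxCacheAge) {
        // Split docs are fetched with maxAge=Integer.MAX_VALUE and are immutable.
        if (maxCacheAge == Integer.MAX_VALUE) {
            return Route.DB_DEFAULT;
        }
        // Parent unchanged for longer than the replication lag: the subtree is assumed unchanged too.
        if (parentModifiedSecs >= 0 && parentModifiedSecs < nowSecs - maxReplicationLagInSecs) {
            return Route.DB_DEFAULT;
        }
        // Otherwise it is not safe to read from a secondary.
        return Route.PRIMARY_PREFERRED;
    }

    /** Tries the secondary read first and falls back to the primary if nothing was found. */
    public static <T> T readWithFallback(Supplier<T> secondaryRead, Supplier<T> primaryRead) {
        T doc = secondaryRead.get();
        return doc != null ? doc : primaryRead.get();
    }
}
```

{{DB_DEFAULT}} here only means "do not override the read preference", so whether the read actually reaches a secondary still depends on what the user configured in the MongoURI.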
*How to enable*
By default, even if a replica set is configured like
_mongodb://example1.com,example2.com_, reads are directed to the primary. To
enable reads from secondaries, the user should specify the readPreference as
part of the Mongo URI.
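As an illustration, such a URI might look like the one below. The host names are the placeholders used above, and {{secondaryPreferred}} is only one of the standard modes accepted by the connection string [2]; which mode suits a deployment is left to the user.

```
mongodb://example1.com,example2.com/?readPreference=secondaryPreferred
```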
[~mreutegg], [~tmueller], [~amitj_76] Kindly review the patch
[1]
https://github.com/chetanmeh/jackrabbit-oak/compare/apache:eb9f3e7d8fd27128be4d429ef90dd4ad5b8d7ff9...OAK-1645
[2]
http://docs.mongodb.org/manual/reference/connection-string/#read-preference-options
> Route find queries to Mongo secondary in MongoDocumentStore
> -----------------------------------------------------------
>
> Key: OAK-1645
> URL: https://issues.apache.org/jira/browse/OAK-1645
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: mongomk
> Reporter: Chetan Mehrotra
> Assignee: Chetan Mehrotra
> Fix For: 1.1
>
> Attachments: OAK-1645-1.patch
>
>
> Currently MongoDocumentStore routes all find queries to the primary. In some
> cases it is possible to route the call to a secondary safely
> *1. Make use of Max Age*
> Find call takes a maxAge parameter
> {code}
> find(Collection<T> collection, String key, int maxCacheAge)
> {code}
> If the maxAge is high then it is safe to route the call to a secondary, as the
> caller explicitly does not want the latest version. This would be especially
> useful for fetching split documents, as such docs are immutable. So the logic
> can first check the secondary and, if the doc is not found, call the primary
> *2. Make use of modified time of parent*
> When fetching a path it is possible to check whether the parent exists in the
> cache. If the parent is present in the cache we can make use of its
> {{modified}} time. If the modified time is old, it indicates that the subtree
> under it has not been modified either. So calls for such children can be
> routed to a secondary
> In both cases we need to have a time interval defined for switching the logic
> to the secondary call
--
This message was sent by Atlassian JIRA
(v6.2#6252)