[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2018-05-24 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488777#comment-16488777
 ] 

Marcel Reutegger commented on OAK-6087:
---

Documentation updated here: 
https://jackrabbit.apache.org/oak/docs/nodestore/document/mongo-document-store.html

> Avoid reads from MongoDB primary
> 
>
> Key: OAK-6087
> URL: https://issues.apache.org/jira/browse/OAK-6087
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Major
>  Labels: scalability
> Fix For: 1.10, 1.9.3
>
> Attachments: OAK-6087.patch
>
>
> With OAK-2106 Oak now attempts to read from a MongoDB secondary when it 
> detects the requested data is available on the secondary.
> When multiple Oak cluster nodes are deployed on a MongoDB replica set, many 
> reads are still directed to the primary. One reason this is seen in practice 
> is that observers and JCR event listeners are triggered rather soon after a 
> change happens and therefore read recently modified documents. 
> This makes it difficult for Oak to direct calls to a nearby secondary, 
> because changes may not yet be available there.
> A rather simple solution for the observers may be to delay processing of 
> changes until they are available on the near secondary.
> A more sophisticated solution discussed offline could hide the replica set 
> entirely and always read from the nearest secondary. Writes would obviously 
> still go to the primary, but only return when the write is available also on 
> the nearest secondary. This guarantees that any subsequent read is able to 
> see the preceding write.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2018-05-08 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467305#comment-16467305
 ] 

Marcel Reutegger commented on OAK-6087:
---

For completeness, without latency, numbers look slightly better when client 
sessions are disabled:

{noformat}
# latency 0ms
# ConcurrentCreateNodesTest    C     min     10%     50%     90%     max       N
# oak.mongo.clientSession = false
Oak-Mongo                      1    3146    3176    3505    4203    4509      17
Oak-Mongo                      1    3160    3183    3443    4248    4641      18
Oak-Mongo                      1    3164    3185    3558    4428    4587      17

# oak.mongo.clientSession = true
Oak-Mongo                      1    3415    3577    3933    4663    4764      16
Oak-Mongo                      1    3267    3324    3707    4732    5432      16
Oak-Mongo                      1    3007    3129    3459    4530    4988      17
{noformat}




[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2018-05-08 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467283#comment-16467283
 ] 

Marcel Reutegger commented on OAK-6087:
---

I ran some more tests on a three-node replica set. The setup is basically what 
Tomek built [here|https://github.com/trekawek/global-mongo]. The members of the 
replica set were also tagged: the primary and the secondary in the first 
datacenter with {{dc:dc1}}, and the other (near) secondary with {{dc:dc2}}. 
Additional latency is introduced only between Oak and the first datacenter, 
where the primary and one secondary are located. The connection to the other 
secondary has no additional latency.

All tests were performed with
{noformat}
w=majority&readPreference=secondaryPreferred&readPreferenceTags=dc:dc2&readPreferenceTags=
{noformat}
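
As a rough illustration of how such an option string breaks down, here is a minimal Java sketch. It assumes the options are the standard MongoDB connection string options {{w}}, {{readPreference}} and {{readPreferenceTags}}; the class {{ConnectionOptions}} is hypothetical and exists neither in Oak nor in the MongoDB driver. With repeated {{readPreferenceTags}}, MongoDB tries the tag sets in order, and an empty tag set matches any member, so the trailing empty entry acts as a fallback when no {{dc:dc2}} member is available.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative helper only; not part of Oak or the MongoDB driver.
class ConnectionOptions {

    // Splits "k1=v1&k2=v2" into an ordered map.
    // Repeated keys keep all of their values in the order they appear.
    static Map<String, List<String>> parse(String query) {
        Map<String, List<String>> options = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            options.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return options;
    }

    public static void main(String[] args) {
        // Assumed full form of the options used for the benchmarks.
        String query = "w=majority&readPreference=secondaryPreferred"
                + "&readPreferenceTags=dc:dc2&readPreferenceTags=";
        Map<String, List<String>> options = parse(query);
        System.out.println("write concern:   " + options.get("w"));
        System.out.println("read preference: " + options.get("readPreference"));
        // Two tag sets: first dc:dc2, then the empty "any member" fallback.
        System.out.println("tag sets:        " + options.get("readPreferenceTags"));
    }
}
```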
and each benchmark was run three times.

The ConcurrentReadWriteTest measures read operations while the system is under 
load with 20 more readers and one writer. The client session doesn't show a 
significant impact on the minimum and 10th percentile timings, but the 90th 
percentile is already noticeably better and the maximum time drops 
significantly.

{noformat}
# latency 5ms
# ConcurrentReadWriteTest      C     min     10%     50%     90%     max       N
# oak.mongo.clientSession = false
Oak-Mongo                      1     102     191     269     425    2291     173
Oak-Mongo                      1     137     201     280     563    2300     164
Oak-Mongo                      1     119     188     278     435    1980     178

# oak.mongo.clientSession = true
Oak-Mongo                      1     145     207     285     436     654     196
Oak-Mongo                      1     126     183     261     369    1273     215
Oak-Mongo                      1     106     196     268     391    1246     205
{noformat}
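
To put numbers on the 90% and max observation, the three runs per setting can be averaged. The following is only a small sketch; it reads the result columns as C, min, 10%, 50%, 90%, max and N, and the class name {{ReadWriteSummary}} is made up for this example.

```java
// Illustrative arithmetic on the ConcurrentReadWriteTest numbers (5ms latency).
class ReadWriteSummary {

    // Arithmetic mean of the given values.
    static double mean(int... values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    public static void main(String[] args) {
        // 90% and max columns, three runs each, copied from the table above.
        double p90Off = mean(425, 563, 435);
        double p90On = mean(436, 369, 391);
        double maxOff = mean(2291, 2300, 1980);
        double maxOn = mean(654, 1273, 1246);
        System.out.printf("90%% mean: clientSession=false %.1f, true %.1f%n", p90Off, p90On);
        System.out.printf("max mean: clientSession=false %.1f, true %.1f%n", maxOff, maxOn);
    }
}
```

Averaged over the three runs, the 90% value goes from roughly 474 to roughly 399 and the max from roughly 2190 to roughly 1058 when client sessions are enabled.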

ConcurrentCreateNodesTest adds nodes with 20 concurrent writers. The test is 
write-heavy, but also performs many read operations, because the 
DocumentNodeStore must check whether a node already exists when it is added. 
With client sessions enabled, timings are slightly better at 5 ms latency to 
the primary, and the difference becomes bigger with higher latency (e.g. 20 ms).

{noformat}
# latency 5ms
# ConcurrentCreateNodesTest    C     min     10%     50%     90%     max       N
# oak.mongo.clientSession = false
Oak-Mongo                      1    3670    3672    4136    4665    4756      15
Oak-Mongo                      1    3735    3761    4039    4388    4617      15
Oak-Mongo                      1    3740    3747    4079    4614    4957      15

# oak.mongo.clientSession = true
Oak-Mongo                      1    3356    3411    3836    4335    4537      16
Oak-Mongo                      1    3364    3396    3748    4006    4220      17
Oak-Mongo                      1    3500    3550    3814    4373    4463      16
{noformat}

{noformat}
# latency 20ms
# ConcurrentCreateNodesTest    C     min     10%     50%     90%     max       N
# oak.mongo.clientSession = false
Oak-Mongo                      1    6940    6940    7242    7900    7900       9
Oak-Mongo                      1    6960    6960    7401    7970    7970       9
Oak-Mongo                      1    6972    6972    7173    7729    7729       9

# oak.mongo.clientSession = true
Oak-Mongo                      1    5610    5617    5877    6159    6191      11
Oak-Mongo                      1    5661    5679    5950    6296    6313      11
Oak-Mongo  15503553861456649
  10
{noformat}



[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2018-05-03 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462246#comment-16462246
 ] 

Marcel Reutegger commented on OAK-6087:
---

bq. if I understand this correctly, this means that we might pick a lagging 
nearest secondary which can block our query until it catches up to our client 
session?

Yes, this is indeed possible and we should probably have a way to detect this. 
We could configure the MongoDocumentStore with a write concern that includes 
the nearest secondary. This would ensure each write operation done by a 
MongoDocumentStore is immediately available on the secondary. However, the 
performance penalty would then just be shifted to the write operation.

bq. Imo, we should see that we still are ok perf wise!?

Agreed. The tests I've done so far show a slight impact on performance when 
everything is on the same machine. However, depending on the deployment and 
how close a secondary is to Oak, the performance may even improve, e.g. because 
latency to the secondary is lower than to the primary.

I'll perform some more tests and report the results. 



[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2018-04-30 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16458422#comment-16458422
 ] 

Tomek Rękawek commented on OAK-6087:


With regards to OAK-3865: big +1, if MongoDB now provides this kind of 
consistency we should use it.



[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2018-04-26 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454316#comment-16454316
 ] 

Vikas Saurabh commented on OAK-6087:


[~mreutegg], quoting from \[0]
{quote}
If you query a secondary that hasn’t yet caught up to that point in time, 
according to the Lamport Clock, then your query blocks until the secondary 
replicates to that point.
{quote}
Maybe that blog is a bit old, but if I understand this correctly, this means 
that we _might_ pick a lagging nearest secondary which can block our query 
until it catches up to our client session? Otoh, our current impl, afaik, goes 
for secondary reads only if all secondaries are caught up. That would often go 
to the primary, but it won't block a query, afaict.

I'm not saying that it's not a good idea to go for the causally consistent 
session api, just that our access pattern might change a bit. Imo, we should 
check that we are still ok perf wise!?

\[0]: 
https://emptysqua.re/blog/driver-features-for-mongodb-3-6/#causal-consistency
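
The difference between the two behaviours discussed above can be sketched with a toy model in Java. None of the types below exist in Oak or in the MongoDB driver; logical time is a plain counter, and the routing rule is a simplified stand-in for the current heuristic, not the actual implementation.

```java
// Toy model only: illustrates blocking vs. falling back to the primary.
class CausalReadModel {

    // A replica-set member that has applied all operations up to appliedTime.
    static final class Member {
        final String name;
        long appliedTime;

        Member(String name, long appliedTime) {
            this.name = name;
            this.appliedTime = appliedTime;
        }
    }

    // A client session remembers the logical time of the latest operation it saw.
    static final class Session {
        long operationTime;

        void observe(long time) {
            operationTime = Math.max(operationTime, time);
        }
    }

    // Logical time a causally consistent read on this member would have to wait.
    static long lag(Session session, Member member) {
        return Math.max(0, session.operationTime - member.appliedTime);
    }

    // Simplified heuristic: use the secondary only when it has caught up,
    // otherwise read from the primary instead of waiting.
    static Member route(Session session, Member primary, Member secondary) {
        return lag(session, secondary) == 0 ? secondary : primary;
    }

    public static void main(String[] args) {
        Member primary = new Member("primary", 10);
        Member secondary = new Member("secondary", 7);
        Session session = new Session();
        session.observe(10); // the session just wrote at logical time 10

        System.out.println("session read on secondary waits for: " + lag(session, secondary));
        System.out.println("heuristic reads from: " + route(session, primary, secondary).name);

        secondary.appliedTime = 10; // replication catches up
        System.out.println("after catch-up, heuristic reads from: "
                + route(session, primary, secondary).name);
    }
}
```

In this model a session read against the lagging secondary has to wait for the missing operations, while the heuristic never waits but sends the read to the primary until the secondary has caught up.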



[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2018-04-26 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454258#comment-16454258
 ] 

Marcel Reutegger commented on OAK-6087:
---

Proposed changes in [^OAK-6087.patch]. The patch was created from the current 
state of the github branch mentioned earlier in this issue.

The patch introduces a new line to the travis job matrix. The module 
oak-store-document is built and tested against a MongoDB replica set with a 
read preference set to secondaryPreferred.
MongoDB client sessions with causal consistency are used by default when 
running on MongoDB 3.6 or newer. The 'feature' can be disabled by setting 
{{-Doak.mongo.clientSession=false}}. The MongoDocumentStore then falls back to 
the Oak 1.8 behaviour.
Some tests had to be changed to run on a MongoDB 3.6 replica set with a 
readPreference of secondaryPreferred. Those tests assumed causally consistent 
reads across multiple MongoDocumentStore instances, which is not the case with 
secondaryPreferred.
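
The switch described above could be evaluated roughly as in the following sketch. This is not the code from the patch: the class {{ClientSessionSupport}} and its methods are hypothetical, and only the system property name {{oak.mongo.clientSession}} and the MongoDB 3.6 requirement come from this issue.

```java
// Hypothetical helper; the actual MongoDocumentStore code may look different.
class ClientSessionSupport {

    static final String PROPERTY = "oak.mongo.clientSession";

    // True if a version string such as "3.6.4" denotes MongoDB 3.6 or newer.
    static boolean isAtLeast36(String version) {
        String[] parts = version.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        return major > 3 || (major == 3 && minor >= 6);
    }

    // Client sessions are on by default and only usable on MongoDB 3.6+.
    // propertyValue is the raw system property value, or null if it is unset.
    static boolean useClientSession(String serverVersion, String propertyValue) {
        boolean enabled = propertyValue == null || Boolean.parseBoolean(propertyValue);
        return enabled && isAtLeast36(serverVersion);
    }

    public static void main(String[] args) {
        String property = System.getProperty(PROPERTY);
        // Try: java -Doak.mongo.clientSession=false ClientSessionSupport
        System.out.println("MongoDB 3.6.4: " + useClientSession("3.6.4", property));
        System.out.println("MongoDB 3.4.10: " + useClientSession("3.4.10", property));
    }
}
```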



[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2018-04-23 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449258#comment-16449258
 ] 

Chetan Mehrotra commented on OAK-6087:
--

+1 to using client sessions and removing the custom time-based heuristic that 
routes calls to a secondary.



[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2018-04-23 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447775#comment-16447775
 ] 

Marcel Reutegger commented on OAK-6087:
---

[~tomek.rekawek], [~chetanm], using client sessions with causal consistency on 
MongoDB 3.6 looks very promising and would allow us to remove existing code you 
two wrote, more specifically OAK-3865 and OAK-1645.

What do you think about removing code introduced with OAK-1645 and OAK-3865 
after this improvement is implemented? It would reduce the code base and 
complexity of the MongoDocumentStore implementation quite a bit. On the other 
hand, MongoDB 3.6 would be required to route reads to a secondary. Earlier 
versions of MongoDB will still work, but reads would all go to the primary.



[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2018-03-22 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409745#comment-16409745
 ] 

Marcel Reutegger commented on OAK-6087:
---

Work in progress available on github: 
https://github.com/mreutegg/jackrabbit-oak/tree/OAK-6087



[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2018-03-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16406035#comment-16406035
 ] 

Marcel Reutegger commented on OAK-6087:
---

MongoDB 3.6 with causal consistency using client sessions requires some more 
work in Oak because support for client sessions was only added to the 
{{com.mongodb.client}} API introduced with version 3.0 of the Java driver. The 
MongoDocumentStore still uses the legacy API in {{com.mongodb}}, e.g. 
{{com.mongodb.DB}} vs {{com.mongodb.client.MongoDatabase}}.



[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2018-02-21 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371486#comment-16371486
 ] 

Marcel Reutegger commented on OAK-6087:
---

Using MongoDB 3.6 [client sessions with causal 
consistency|https://docs.mongodb.com/manual/core/read-isolation-consistency-recency/#client-sessions]
 may also be an option.



[jira] [Commented] (OAK-6087) Avoid reads from MongoDB primary

2017-10-10 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198359#comment-16198359
 ] 

Marcel Reutegger commented on OAK-6087:
---

Reading entirely from a secondary could be achieved with a custom write 
concern. An Oak cluster node would be tied to a nearby secondary and the write 
concern would use a tag set that requires the write to propagate to that 
secondary.

> Avoid reads from MongoDB primary
> 
>
> Key: OAK-6087
> URL: https://issues.apache.org/jira/browse/OAK-6087
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Marcel Reutegger
>  Labels: scalability
> Fix For: 1.8
>
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)