[ 
https://issues.apache.org/jira/browse/OAK-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530655#comment-14530655
 ] 

Marcel Reutegger commented on OAK-2829:
---------------------------------------

I modified the existing ObservationTest benchmark in oak-run to run it 
concurrently with multiple Oak cluster nodes. I was able to reproduce a 
growing observation queue running two cluster nodes like this:

{noformat}
java -DsaveInterval=4 -DwriterCount=4 -Xmx1g -jar 
target/oak-run-1.4-SNAPSHOT.jar benchmark ObservationTest Oak-Mongo --db 
oak-observation --dropDBAfterTest false
{noformat}

Only one cluster node showed a growing observation queue. The other cluster 
node (started first) only had short spikes up to 100 or 200 entries in the 
queue.

The graphs below show 1) the queue size and the number of external changes, 
and 2) the number of nodes and events created.

!graph.png|width=100%! 

Whenever the number of external changes goes down by one, that is, when the 
change processor dequeues an external change and starts processing it, the 
queue starts to grow. This is because local changes keep being added but can 
only be removed after the external change has finished processing. The graph 
also shows pauses in event processing, which coincide with pauses in node 
creation. It looks like the pauses are related to background read operations. 
Nodes are created again (and the queue starts to grow again) at about the same 
time a new external change is enqueued.
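
The queue growth described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Oak's implementation: the function `simulate` and its parameters (`external_every`, `external_cost`, `drain_rate`) are hypothetical, time is counted in abstract ticks, and a single consumer is assumed to be fully blocked while it diffs an external change.

```python
from collections import deque

LOCAL, EXTERNAL = "local", "external"


def simulate(ticks, external_every, external_cost, drain_rate=2):
    """Return the queue length after each tick of a toy observation queue.

    Hypothetical model: one local change is enqueued per tick and one
    external change every `external_every` ticks. The consumer removes up
    to `drain_rate` entries per tick, but once it dequeues an external
    change it is blocked for the next `external_cost` ticks.
    """
    queue = deque()
    busy = 0  # ticks left on the external change currently being processed
    sizes = []
    for t in range(ticks):
        # Producers keep adding changes regardless of the consumer's state.
        queue.append(LOCAL)
        if t % external_every == 0:
            queue.append(EXTERNAL)

        if busy > 0:
            # Consumer is still diffing an external change: nothing is removed.
            busy -= 1
        else:
            for _ in range(drain_rate):
                if not queue:
                    break
                if queue.popleft() == EXTERNAL:
                    # Dequeuing an external change blocks the consumer.
                    busy = external_cost
                    break
        sizes.append(len(queue))
    return sizes


print(simulate(ticks=12, external_every=6, external_cost=4))
# → [0, 1, 2, 3, 4, 3, 3, 2, 2, 3, 4, 5]
```

In this model the queue grows by one entry per tick right after an external change is dequeued (ticks 1 to 4 and 9 to 11) and only shrinks once the consumer is free again, which mirrors the pattern in the graph.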

At about 90 seconds into the test (90'000 on the x-axis) the other cluster node 
finished the test. From this point on there are no more pauses, probably 
because no more background reads are necessary. The queue size still does not 
decrease because local changes are compacted and turned into pseudo external 
changes. Those in turn trigger the slow calculation of diffs, which fills the 
queue again, and so on.

> Comparing node states for external changes is too slow
> ------------------------------------------------------
>
>                 Key: OAK-2829
>                 URL: https://issues.apache.org/jira/browse/OAK-2829
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core, mongomk
>            Reporter: Marcel Reutegger
>            Assignee: Marcel Reutegger
>             Fix For: 1.3.0
>
>         Attachments: graph.png
>
>
> Comparing node states for local changes has been improved already with 
> OAK-2669. But in a clustered setup generating events for external changes 
> cannot make use of the introduced cache and is therefore slower. This can 
> result in a growing observation queue, eventually reaching the configured 
> limit. See also OAK-2683.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
