[jira] [Updated] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Alex Parvulescu (Updated) (JIRA) Tue, 06 Dec 2011 04:30:05 -0800

     [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Alex Parvulescu updated JCR-3162:
---------------------------------

    Attachment: JCR-3162-v3.patch

V3 comes with a complete redesign of the patch.

After further analysis we've decided to go with inspecting the incoming journal 
changes in the case of an initial index re-build.

I'll try to clarify. The scope of the JCR-905 fix should *only* be the initial 
index build. 
The initial indexing operation can cause duplicates to appear, as some nodes 
can be seen by a slave before the ADD event has reached it. This happens 
because the cluster nodes share storage.
So, when a slave starts to re-index the repository content, it will include 
*everything* (potentially also nodes that it hasn't received an ADD event for 
yet). 
When the indexing finishes, the repository continues its startup. A bit later, 
the cluster component also initializes and consequently syncs. This pulls in 
the ADD events that were pending in a newer revision on the master.

V3 tries to poll the changes before the cluster.sync call and preemptively 
generates DELETE events for all the ADD events that it finds on the current 
workspace.
(This is similar to the JCR-905 patch, but with a much smaller scope.)
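
To illustrate the idea, here is a minimal sketch in plain Java. It is not the 
code from the attached patch and uses no Jackrabbit classes: JournalEvent, 
Type and preemptiveDeletes are hypothetical names made up for this example. It 
only shows the filtering step, assuming the pending journal entries have 
already been polled into a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; none of these types are Jackrabbit classes.
final class PreemptiveDeleteSketch {

    enum Type { ADD, DELETE }

    /** A pending journal entry: workspace name, kind of change, affected node. */
    static final class JournalEvent {
        final String workspace;
        final Type type;
        final String nodeId;

        JournalEvent(String workspace, Type type, String nodeId) {
            this.workspace = workspace;
            this.type = type;
            this.nodeId = nodeId;
        }
    }

    /**
     * For every pending ADD on the current workspace, produce a matching
     * DELETE. Applying these to the index before the sync means that a node
     * already picked up by the initial build is removed first, so replaying
     * the ADD afterwards leaves a single index entry.
     */
    static List<JournalEvent> preemptiveDeletes(List<JournalEvent> pending,
                                                String currentWorkspace) {
        List<JournalEvent> deletes = new ArrayList<>();
        for (JournalEvent e : pending) {
            if (e.type == Type.ADD && e.workspace.equals(currentWorkspace)) {
                deletes.add(new JournalEvent(e.workspace, Type.DELETE, e.nodeId));
            }
        }
        return deletes;
    }

    public static void main(String[] args) {
        List<JournalEvent> pending = Arrays.asList(
                new JournalEvent("default", Type.ADD, "uuid-1"),
                new JournalEvent("default", Type.DELETE, "uuid-2"),
                new JournalEvent("other", Type.ADD, "uuid-3"));
        for (JournalEvent e : preemptiveDeletes(pending, "default")) {
            System.out.println(e.type + " " + e.nodeId);
        }
    }
}
```

Only ADD events of the current workspace are considered, which is what keeps 
the scope smaller than a blanket delete-before-add on every revision.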

 
Another feature introduced in the patch is a forced flush of the index after 
the initial index has been created.
In the original test case (no unit test, though) this was done artificially:
> However, when I debug clusternode 2 and have a breakpoint (i.e., a pause of a 
> few seconds at line 306 of RepositoryImpl.java - just before the clusternode 
> is started), then the resultset contains two results, both with the same UUID.

So forcing the index flush correctly reproduces the original problem, and I 
think it should be the correct behaviour of the original index creation.
On the other hand, not flushing the index hides the problem, because the 
indexing queue is smart enough to remove duplicates.
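
The interaction between the flush and the duplicates can be pictured with a 
toy model. This is only a sketch under assumptions: FlushSketch is a 
hypothetical class, the pending set stands in for the indexing queue, and the 
committed list stands in for an index that does not enforce unique ids. It 
does not reflect how Jackrabbit actually implements either.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only; not Jackrabbit's actual index or indexing queue.
final class FlushSketch {

    /** Entries already written; duplicates are possible here. */
    private final List<String> committed = new ArrayList<>();

    /** Pending additions; a set, so adding the same node twice collapses. */
    private final Set<String> pending = new LinkedHashSet<>();

    void add(String nodeId) {
        pending.add(nodeId);
    }

    void delete(String nodeId) {
        pending.remove(nodeId);
        committed.removeIf(nodeId::equals);
    }

    /** Writes all pending additions to the committed entries. */
    void flush() {
        committed.addAll(pending);
        pending.clear();
    }

    /** Number of results a query for this node would return. */
    int hits(String nodeId) {
        return Collections.frequency(committed, nodeId)
                + (pending.contains(nodeId) ? 1 : 0);
    }

    /** Replays the startup sequence described above for a single node. */
    static int run(boolean flushAfterInitialBuild, boolean preemptiveDelete) {
        FlushSketch index = new FlushSketch();
        index.add("uuid-1");          // initial build sees the node via shared storage
        if (flushAfterInitialBuild) {
            index.flush();
        }
        if (preemptiveDelete) {
            index.delete("uuid-1");   // DELETE generated before the sync
        }
        index.add("uuid-1");          // sync replays the pending ADD event
        index.flush();
        return index.hits("uuid-1");
    }

    public static void main(String[] args) {
        System.out.println("no flush:                " + run(false, false));
        System.out.println("flush:                   " + run(true, false));
        System.out.println("flush + preemptive del.: " + run(true, true));
    }
}
```

In this model the duplicate only shows up once the initial build has been 
flushed, and generating the DELETE before the replayed ADD removes it again.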

But flushing the index basically invalidates JCR-905, which is a bit 
unexpected (to see this, switch the feature flags off in the attached patch).

On the code itself: I guess the AbstractJournal could use a bit of refactoring 
on the event polling side.


                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node 
> when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
