[ http://issues.apache.org/jira/browse/JCR-169?page=comments#action_12432083 ]

Marcel Reutegger commented on JCR-169:
--------------------------------------

Ian, thanks a lot for your comments.

Here are my current thoughts on clustering the search index in Jackrabbit:

I think the preferred approach is to put the index into the repository itself. 
See: http://article.gmane.org/gmane.comp.apache.jackrabbit.devel/8530 and 
following messages.
This would also allow us to distribute index updates to cluster nodes using the 
repository-internal observation mechanism, e.g. an update to a deleted 
documents file or the addition of new index segments.

> I found the best indexing strategy was to have local copies of segments, 
> stored centrally as masters.

I agree. In particular, the design of Lucene, where index files are created 
once and never modified, supports this approach very nicely.
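
To illustrate why write-once files make local copies cheap, here is a minimal 
sketch in plain Java. It is not Jackrabbit or Lucene code: the class 
SegmentSync and its methods are hypothetical, and it assumes that every file 
name in the master directory is written exactly once, so a file that already 
exists locally never needs to be compared or re-copied. Files that are 
rewritten in place (a deleted documents file, for example) would need separate 
handling.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: mirrors write-once index files from a master directory. */
class SegmentSync {

    /** Returns the sorted names of all regular files directly inside dir. */
    static Set<String> fileNames(Path dir) throws IOException {
        Set<String> names = new TreeSet<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    names.add(p.getFileName().toString());
                }
            }
        }
        return names;
    }

    /**
     * Copies files that exist only in the master directory and deletes local
     * files that the master no longer has. Returns the names of the copied
     * files in sorted order. File contents are never compared, which is only
     * safe under the write-once assumption stated above.
     */
    static List<String> sync(Path master, Path local) throws IOException {
        Set<String> masterFiles = fileNames(master);
        Set<String> localFiles = fileNames(local);
        List<String> copied = new ArrayList<>();
        for (String name : masterFiles) {
            if (!localFiles.contains(name)) {
                Files.copy(master.resolve(name), local.resolve(name));
                copied.add(name);
            }
        }
        for (String name : localFiles) {
            if (!masterFiles.contains(name)) {
                Files.delete(local.resolve(name));
            }
        }
        return copied;
    }
}
```

A cluster node could call SegmentSync.sync(master, local) whenever it is told 
that the master copy changed. Only new segment files cross the network.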

> In the search application, speed of update of segments is not that critical,
> you probably have a different requirement in JCR. 

JCR is more restrictive in that respect, at least if we want to be compliant 
with the specification. As soon as a node is created in the workspace it must 
be searchable using a query. For most real-life systems this is not a hard 
requirement though. E.g. when a document is added to a repository, it usually 
doesn't matter if it is retrievable by query only after a couple of seconds and 
not right away.


> Make Jackrabbit clusterable
> ---------------------------
>
>                 Key: JCR-169
>                 URL: http://issues.apache.org/jira/browse/JCR-169
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: core
>            Reporter: Marcel Reutegger
>            Priority: Minor
>
> This jira issue discusses the technical implications on the current design of 
> Jackrabbit to introduce clustering.
> Particularly the following areas require thorough investigation:
> - SharedItemStateManager and its cache
>     - cache integrity
>     - cache design: look aside, write through?
>     - hook for distributed cache, interface?
>     - isolation level
>     - transaction integrity within Jackrabbit, interaction with transient 
> layer
> - VirtualItemStateProvider
>     - same strategy as SharedItemStateManager?
> - Search index
>     - single or per cluster node index?
> - Observation
> Please state more areas if needed.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
