Ian Boston wrote:
So, if you have 50 x 200 MB of Lucene index, for example, and wanted that to be accessible in a cluster environment, would Jackrabbit be a good place to put those segments?
Just to clarify: would this Lucene index be 'application data' that is stored like regular content through the JCR API? Or do you mean Jackrabbit's internal Lucene segments?
The big killer for Lucene is the ability to seek efficiently on the central blob (I think), but presumably by choosing the right binary storage strategy that comes partially for free?
Jackrabbit always copies a binary to a temp file or into memory when the property value is accessed. That is, the seek would always be local. But as I already mentioned in another thread, JCR does not support random access on binary properties. A binary property returns a plain InputStream.
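To illustrate what "the seek would always be local" means in practice, here is a minimal sketch in plain Java. It assumes you already hold the InputStream of a binary property (e.g. from Property.getStream()); the LocalSeek class and its spool/readAt methods are hypothetical helpers for this example, not part of Jackrabbit or the JCR API, and a ByteArrayInputStream stands in for the repository stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only; not a Jackrabbit or JCR class.
class LocalSeek {

    // Spool a forward-only stream into a local temp file so it can be seeked.
    static Path spool(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("segment-", ".bin");
        try (InputStream src = in) {
            // The temp file already exists, so it must be replaced explicitly.
            Files.copy(src, tmp, StandardCopyOption.REPLACE_EXISTING);
        }
        return tmp;
    }

    // Read `length` bytes starting at `offset` from the local copy.
    static byte[] readAt(Path file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);
            byte[] buf = new byte[length];
            raf.readFully(buf); // throws EOFException if the file is too short
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream a binary property would return.
        InputStream stream = new ByteArrayInputStream(
                "0123456789".getBytes(StandardCharsets.US_ASCII));
        Path local = spool(stream);
        try {
            byte[] slice = readAt(local, 4, 3);
            System.out.println(new String(slice, StandardCharsets.US_ASCII)); // prints "456"
        } finally {
            Files.delete(local);
        }
    }
}
```

The cost of this approach is that the whole binary is copied before the first seek, so for a 200 MB segment the copy, not the seek, is what you pay for.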
If this is the case, I could replace my, slightly odd, segment distribution mechanism with Jackrabbit.
Yes, you certainly get a couple of goodies you otherwise don't have, e.g. observation on the index files ;)
Last question: is JCR-169 being actively worked on?
It doesn't have a high priority, but we are working on it on a conceptual level: discussions during coffee breaks, etc. Basically, how the problems stated in JCR-169 can be solved and what needs to be changed in the core to implement the feature blocks in a clustered environment.
Is there an area where another pair of hands would help? I would like to be able to deploy Jackrabbit in a cluster.
One major area is how changes from one cluster node are distributed to other cluster nodes. Giota implemented something like a prototype, but I'm not sure what the current state is. See also this discussion: http://thread.gmane.org/gmane.comp.apache.jackrabbit.devel/6935
Or any other area mentioned in JCR-169; you can simply pick one ;)

regards
 marcel
