Hello Marcel!
I'm mitziuro's colleague. I want to thank you for what you suggested
regarding JR's SearchIndex; it was very helpful!

With this configuration for a workspace's SearchIndex:
-------------------------
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${wsp.home}/index"/>
    <param name="textFilterClasses"
           value="org.apache.jackrabbit.extractor.PlainTextExtractor,org.apache.jackrabbit.extractor.MsWordTextExtractor,org.apache.jackrabbit.extractor.MsExcelTextExtractor,org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,org.apache.jackrabbit.extractor.PdfTextExtractor,org.apache.jackrabbit.extractor.OpenOfficeTextExtractor,org.apache.jackrabbit.extractor.RTFTextExtractor,org.apache.jackrabbit.extractor.HTMLTextExtractor,org.apache.jackrabbit.extractor.XMLTextExtractor"/>
    <param name="extractorPoolSize" value="2"/>
    <param name="supportHighlighting" value="true"/>
    <param name="respectDocumentOrder" value="true"/>
</SearchIndex>
-------------------------
our application behaved like this:
1. Add a document.
2. After saving the document, go to the document listing. Result: the
   newly added document isn't there.
3. Click on the document listing page again and voila, the document
   appears (sometimes it takes 2-3 clicks on the listing page).
So my conclusion is that, even when text extraction takes longer than the
default extractorTimeout value (100 ms) and a new background thread is
started to handle it, newly added nodes still don't appear right away. Why
is this happening? What else is keeping a lock on the node?

Now that I have commented out the textFilterClasses parameter in the
SearchIndex element, JR works fine (I no longer see that latency). The
good news is that our application doesn't need text extraction for the
moment. But when we do need it, the problem will reappear.
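
For when we do need text extraction again, here is a sketch of what I
would try first. It is untested, and it assumes the delay really is tied
to how long save() waits for extraction before handing the work to the
background thread. The extractorTimeout value of 5000 is an arbitrary
guess of mine, not a recommended setting, and the shortened extractor list
is only there to keep the example readable:
-------------------------
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${wsp.home}/index"/>
    <param name="textFilterClasses"
           value="org.apache.jackrabbit.extractor.PlainTextExtractor,org.apache.jackrabbit.extractor.PdfTextExtractor"/>
    <param name="extractorPoolSize" value="2"/>
    <!-- assumption: wait up to 5 s for extraction instead of the default 100 ms -->
    <param name="extractorTimeout" value="5000"/>
    <param name="supportHighlighting" value="true"/>
    <param name="respectDocumentOrder" value="true"/>
</SearchIndex>
-------------------------
If the listing still lags with this in place, then extraction timing is
probably not the cause and something else is delaying the query results.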


All the best,

Dan




On Fri, Sep 11, 2009 at 7:07 PM, Guo Du <[email protected]> wrote:

> On Fri, Sep 11, 2009 at 3:05 PM, Marcel Reutegger
> <[email protected]> wrote:
> > that's not quite correct. everything except text extraction is
> > guaranteed to be indexed as soon as the save (or transaction commit)
> > returns.
>
> Thanks for the knowledge!
>
> --Guo
>



-- 
Arcassis
