Hi Jérôme,

the logs indicate that jackrabbit still orders results in document order:

org.apache.jackrabbit.core.query.lucene.DocOrderNodeIteratorImpl - 1537 node(s) ordered in 331882 ms

I assume you forgot to also apply the configuration change to the workspace.xml of each existing workspace.

keep in mind that the repository.xml file has two purposes:
- configure repository wide services like security, versioning, etc.
- provide a *template* configuration for new workspaces

if you change the workspace section in repository.xml you only modify the behaviour for newly created workspaces; each existing workspace keeps using its own workspace.xml.
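
As a rough sketch of what that means in practice: the fragment below shows how the SearchIndex element in an existing workspace's workspace.xml might look after the change. The file location is an assumption (the workspace home directory, e.g. ${rep.home}/workspaces/default/workspace.xml, inferred from the index path in the quoted log), and the parameter values simply mirror the quoted repository.xml; the rest of the file is assumed to stay untouched.

```xml
<!-- workspace.xml of an already existing workspace.
     Assumed location: ${rep.home}/workspaces/default/workspace.xml -->
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${wsp.home}/index"/>
    <param name="autoRepair" value="false"/>
    <!-- do not sort query results in document order -->
    <param name="respectDocumentOrder" value="false"/>
</SearchIndex>
```

The repository most likely has to be restarted before the changed workspace.xml is picked up.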

as an alternative you can also add an 'order by' clause to your query:

//*[jcr:contains(., 'foo')] order by jcr:score descending

this will force jackrabbit to order result nodes by relevance, instead of the expensive document order.

regards
 marcel


Jérôme BENOIS wrote:
Hi All,

Thanks for your response.

I carried out some tests with 50000 nodes (small nodes with 3
properties). I created them in 25 minutes, and my store weighs 2.5 GB.

When I execute a simple query, it still takes a long time: ~5 minutes.

I applied your suggestion about document order here:

<?xml version="1.0" encoding="ISO-8859-1"?>
<Repository>
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
        <param name="path" value="${rep.home}/repository"/>
        <param name="persistent" value="true"/>
    </FileSystem>
    <Security appName="Jackrabbit">
        <AccessManager class="org.apache.jackrabbit.core.security.SimpleAccessManager"/>
        <LoginModule class="org.apache.jackrabbit.core.security.SimpleLoginModule">
            <param name="anonymousId" value="anonymous"/>
        </LoginModule>
    </Security>
    <Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="default"/>
    <Workspace name="${wsp.name}">
        <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
            <param name="path" value="${wsp.home}"/>
        </FileSystem>
        <PersistenceManager class="org.apache.jackrabbit.core.state.obj.ObjectPersistenceManager"/>
        <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
            <param name="path" value="${wsp.home}/index"/>
            <param name="autoRepair" value="false"/>
            <param name="respectDocumentOrder" value="false"/>
        </SearchIndex>
    </Workspace>
    <Versioning rootPath="${rep.home}/version">
        <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
            <param name="path" value="${rep.home}/version"/>
        </FileSystem>
        <PersistenceManager class="org.apache.jackrabbit.core.state.obj.ObjectPersistenceManager"/>
    </Versioning>
</Repository>

I am using the version from Subversion, and I enabled debug mode when I ran my
simple query:
DEBUG main org.apache.jackrabbit.core.query.lucene.QueryImpl - Executing query:
+ Root node
+ Select properties: *
  + PathQueryNode
    + LocationStepQueryNode:  NodeTest={} Descendants=false Index=NONE
    + LocationStepQueryNode:  NodeTest=* Descendants=true Index=NONE
      + AndQueryNode
        + NodeTypeQueryNode: Prop={http://www.jcp.org/jcr/1.0}primaryType Value={http://www.jcp.org/jcr/nt/1.0}unstructured
        + AndQueryNode
          + RelationQueryNode: Op: LIKE Prop={}email Type=STRING Value=a%

DEBUG main org.apache.jackrabbit.core.query.lucene.AbstractIndex -
merging segments _0 (1 docs) into _1 (1 docs)
DEBUG main org.apache.jackrabbit.core.query.lucene.AbstractIndex -
closing IndexWriter.
INFO main org.apache.jackrabbit.core.query.lucene.DocNumberCache -
size=60/1024, #accesses=1001, #hits=941, #misses=60, cacheRatio=95%
DEBUG Timer-2 org.apache.jackrabbit.core.query.lucene.MultiIndex -
Flushing index after being idle for 3615 ms.
DEBUG Timer-2 org.apache.jackrabbit.core.query.lucene.IndexMerger -
index added: name=_ii, numDocs=1
DEBUG Timer-2 org.apache.jackrabbit.core.query.lucene.MultiIndex -
Committed in-memory index in 2ms.
DEBUG IndexMerger org.apache.jackrabbit.core.query.lucene.AbstractIndex
- merging segments _0 (8416 docs) into _1 (8416 docs)
INFO IndexMerger org.apache.jackrabbit.core.query.lucene.IndexMerger -
merged 8416 documents in 4206 ms into _ih.
DEBUG IndexMerger org.apache.jackrabbit.core.query.lucene.IndexMerger -
replace indexes
DEBUG IndexMerger org.apache.jackrabbit.core.query.lucene.AbstractIndex
- closing IndexWriter.
DEBUG IndexMerger org.apache.jackrabbit.core.query.lucene.IndexMerger -
index added: name=_ih, numDocs=8416
DEBUG Timer-2 org.apache.jackrabbit.core.query.lucene.MultiIndex -
Flushing index after being idle for 3339 ms.
DEBUG main
org.apache.jackrabbit.core.query.lucene.DocOrderNodeIteratorImpl - 1537
node(s) ordered in 331882 ms
INFO main fr.openmodel.cms.imports.process.TestImportMgtProcess -
testInsert 3 contentUnits.size()=1537
INFO Thread-4
org.apache.jackrabbit.core.observation.ObservationManagerFactory -
Notification of EventListeners stopped.
DEBUG Thread-4 org.apache.jackrabbit.core.query.lucene.IndexMerger -
dispose IndexMerger
INFO IndexMerger org.apache.jackrabbit.core.query.lucene.IndexMerger -
IndexMerger terminated
DEBUG Thread-4 org.apache.jackrabbit.core.query.lucene.IndexMerger -
quit sent
DEBUG Thread-4 org.apache.jackrabbit.core.query.lucene.IndexMerger -
IndexMerger thread stopped
DEBUG Thread-4 org.apache.jackrabbit.core.query.lucene.IndexMerger -
merge queue size: 0
INFO Thread-4 org.apache.jackrabbit.core.query.lucene.SearchIndex -
Index closed: /opt/jackrabbit/repotest/workspaces/default/index
DEBUG Thread-4
org.apache.jackrabbit.core.observation.ObservationManagerImpl - removing
EventListener: [EMAIL PROTECTED]
DEBUG Thread-4
org.apache.jackrabbit.core.observation.ObservationManagerImpl - removing
EventListener: [EMAIL PROTECTED]
DEBUG Thread-4
org.apache.jackrabbit.core.observation.ObservationManagerImpl - removing
EventListener: [EMAIL PROTECTED]


Can you help me, please?

Thanks for your help,

Best Regards,
Jérôme.


On Thursday, 12 January 2006 at 12:08 +0100, Marcel Reutegger wrote:

try disabling document order on query results:

<SearchIndex
class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    [...]
    <param name="respectDocumentOrder" value="false"/>

</SearchIndex>


information about document order is not stored in the index. that means if you have a large result set, the query handler has to load the nodes from storage to sort them, which is expensive compared to index lookups.

regards
 marcel
