No, the property 'calais' is not used anywhere else. So if I run the query without the path restriction, it returns the same result.
Marek

> ------------ Original message ------------
> From: Alessandro <[email protected]>
> Subject: Re: XPath query performance question
> Date: 03.2.2012 16:12:15
> ----------------------------------------
> If you were running the query without path restrictions, would it return more
> than one node? In other words, outside the /companies tree, are there other
> company nodes with the same calais attribute value?
> Results are generated from the predicate, and then filtered by the path.
>
> Alessandro
>
> On Feb 3, 2012, at 7:13 AM, [email protected] wrote:
>
> > Hi,
> >
> > I have the following use case:
> >
> > I have about 2000 company nodes under the node companies:
> > /companies/company[1]
> > /companies/company[2]
> > ....
> > /companies/company[N]
> >
> > I query for one company by property value: exact match, no wildcards. The
> > result should contain just one node. For example, I use the query:
> >
> > //companies/company[@calais='http://d.opencalais.com/er/company/ralg-tr1r/2c970a55-e08d-3af8-ad1d-3c46f341e749']
> >
> > and then one call of NodeIterator.next to get the unique result (or the
> > first one, as there is no constraint on uniqueness). So there is no big
> > result set.
> >
> > The property 'calais' is of type string, and when set it is unique, i.e. a
> > small number of company nodes may have this property either empty or
> > missing. The property value can be up to 100 chars long, if that makes any
> > difference for the index.
> >
> > When only one thread is running it takes 100-200ms. When 4 threads are
> > running it is about 500ms on average. I used a profiler with sampling to
> > get some profiling data. It seems to be too much, given that the number of
> > nodes is not that high and it is using the Lucene index. Calls of
> > query.execute and nodeIterator.next both take about the same time.
> > When I checked thread dumps it uses the Lucene index, so it does not look
> > like it scans all nodes.
> >
> > Question: Is there any way to speed up this kind of lookup? The only way I
> > have found so far is to incorporate the property most often used for
> > lookup into the node path, as session.getNode(path) is much faster.
> >
> > I use Jackrabbit 2.2.9 and Postgres 9.1 for saving all data except the
> > Lucene index. It runs on JBoss 7.
> >
> > I searched for Jackrabbit XPath performance but found no match for my use
> > case:
> > a) exact property match without like/wildcards
> > b) small result set - just one result item
> >
> > Thanks
> >
> > Marek
> >
>
Marek Slama [email protected]
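
For illustration, the workaround mentioned in the original question (putting the lookup property into the node path so that session.getNode(path) can be used instead of a query) might look roughly like the sketch below. This is only a sketch under assumptions: the class CompanyPaths, its methods, and the two-level /companies/<bucket>/<hash> layout are hypothetical and not part of Jackrabbit or the JCR API. The 'calais' value is hashed with SHA-1 because the raw URL contains '/' and ':' characters, which cannot be used directly in a node name.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Jackrabbit or the JCR API.
final class CompanyPaths {
    // Assumed parent node under which companies are stored.
    static final String BASE = "/companies";

    /** Maps a 'calais' property value to a deterministic absolute node path. */
    static String pathFor(String calais) {
        String hex = sha1Hex(calais);
        // A two-character bucket level keeps the child count per node small.
        return BASE + "/" + hex.substring(0, 2) + "/" + hex;
    }

    /** Returns the SHA-1 digest of the value as 40 lowercase hex characters. */
    static String sha1Hex(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        String calais = "http://d.opencalais.com/er/company/ralg-tr1r/"
                + "2c970a55-e08d-3af8-ad1d-3c46f341e749";
        // Prints the path at which the company node would be stored and read.
        System.out.println(pathFor(calais));
    }
}
```

The same function would have to be used both when a company node is created and when it is looked up, so that the path passed to session.getNode(path) always matches. Nodes whose 'calais' property is empty or missing would still need some other naming scheme.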
