Re: Re: XPath query performance question

mslama Fri, 03 Feb 2012 07:25:03 -0800

No, the property 'calais' is not used anywhere else. So if I run the query
without the path restriction, it returns the same result.
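
For reference, here is a minimal sketch of the two query strings being compared, with and without the path restriction. The class and method names (CalaisQueries, withPath, withoutPath) are made up for illustration, and escaping a string literal by doubling the apostrophe is an assumption about the XPath dialect in use:

```java
final class CalaisQueries {
    private CalaisQueries() {}

    // Wrap a value in a single-quoted XPath string literal, doubling any
    // apostrophes inside it (assumed escaping rule, see the note above).
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Query restricted to company nodes under a companies node,
    // as in the original post.
    static String withPath(String calais) {
        return "//companies/company[@calais=" + quote(calais) + "]";
    }

    // Same predicate without the path restriction: any node whose
    // 'calais' property equals the given value.
    static String withoutPath(String calais) {
        return "//*[@calais=" + quote(calais) + "]";
    }
}
```

Either string would then be passed to QueryManager.createQuery(statement, Query.XPATH) in the usual way.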


Marek

> ------------ Original message ------------
> From: Alessandro <[email protected]>
> Subject: Re: XPath query performance question
> Date: 03.2.2012 16:12:15
> ----------------------------------------
> If you were running the query without path restrictions, would it return more
> than one node? In other words, outside the /companies tree, are there other
> company nodes with the same calais attribute value?
> Results are generated from the predicate, and then filtered by the path.
>
> Alessandro
>
> On Feb 3, 2012, at 7:13 AM, [email protected] wrote:
>
> > Hi,
> >
> > I have the following use case:
> >
> > I have about 2000 company nodes under node companies:
> > /companies/company[1]
> > /companies/company[2]
> > ....
> > /companies/company[N]
> >
> > I query for one company by property value: exact match, no wildcards. The
> > result should contain just one node. For example, I use the query:
> >
> >
> > //companies/company[@calais='http://d.opencalais.com/er/company/ralg-tr1r/2c970a55-e08d-3af8-ad1d-3c46f341e749']
> >
> > and then one call of NodeIterator.next to get the unique result (or the
> > first one, as there is no constraint on uniqueness). So there is no big
> > result set.
> >
> > The property 'calais' is of string type and, when set, it is unique, i.e. a
> > small number of company nodes may have this property either empty or
> > missing. The property value can be up to 100 characters long, if that makes
> > any difference for the index.
> >
> > When only one thread is running, it takes 100-200 ms. When 4 threads are
> > running, it is about 500 ms on average. I used a profiler with sampling to
> > get some profiling data. It seems to be too much, given that the number of
> > nodes is not that high and it is using the Lucene index. Calls to
> > query.execute and nodeIterator.next each take about the same time.
> > When I checked thread dumps, it uses the Lucene index, so it does not look
> > like it scans all nodes.
> >
> > Question: Is there any way to speed up this kind of lookup? The only way I
> > have found so far is to incorporate the property most often used for lookup
> > into the node path, as session.getNode(path) is much faster.
> >
> > I use Jackrabbit 2.2.9 and Postgres 9.1 for saving all data except the
> > Lucene index. It runs on JBoss 7.
> >
> > I searched for Jackrabbit XPath performance but found no match for my use case:
> > a) exact property match without like/wildcards
> > b) small resultset - just one result item
> >
> > Thanks
> >
> > Marek
>
>
>
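
The workaround mentioned in the quoted message (putting the lookup property into the node path so that session.getNode(path) can be used) could be sketched roughly as follows. The SHA-1 hex encoding, the /companies parent, and the class and method names are assumptions for illustration only, not the layout of the repository described above:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class CompanyPaths {
    private CompanyPaths() {}

    // Hex-encoded SHA-1 of the property value: a fixed-length name made of
    // [0-9a-f] only, which avoids the '/' and ':' characters in the URL.
    static String nodeNameFor(String calais) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(calais.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a standard algorithm every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // Absolute path under the (assumed) /companies parent node.
    static String pathFor(String calais) {
        return "/companies/" + nodeNameFor(calais);
    }
}
```

With such a scheme the lookup becomes session.getNode(CompanyPaths.pathFor(value)) instead of a query, at the cost of fixing the node name when the company is created.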

Marek Slama
[email protected]
