Re: Query Performance and Optimization

Christoph Kiehl Wed, 14 Mar 2007 06:33:54 -0800

Marcel Reutegger wrote:

Christoph Kiehl wrote:
Christoph Kiehl wrote:
I was digging a bit into Jackrabbit today and found another place where some caching provided a substantial performance gain for queries that check one attribute for more than one value (like /foo/*[@foo:bar='john' or @foo:bar='doe']). The BitSet in calculateDocFilter() is currently created twice for the query above. On large repositories this takes about 200ms per BitSet on my machine for a particular field. Caching these BitSets per IndexReader and field in a WeakHashMap with the IndexReader as the key gave me some real improvements.
agreed, this should definitely be cached per index segment and is doable with reasonable effort.
I've created a jira issue: http://issues.apache.org/jira/browse/JCR-791


Are you working on this issue? Or should I try to implement something?
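
For illustration, the per-reader, per-field cache described above could be sketched roughly as follows. This is only a sketch under stated assumptions, not the code from the issue: the class name FieldBitSetCache and the loader parameter are hypothetical, the reader is typed as Object so the example does not depend on Lucene, and it uses lambdas from current Java versions.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Supplier;

/** Hypothetical helper, not part of Jackrabbit: one cached BitSet per (reader, field). */
class FieldBitSetCache {
    // Weak keys: an entry becomes collectable once its reader is no longer
    // strongly referenced anywhere else (e.g. after the index segment is closed).
    private final Map<Object, Map<String, BitSet>> cache = new WeakHashMap<>();

    /**
     * Returns the cached BitSet for this reader and field, running the loader
     * only on the first request. Callers must treat the result as read-only.
     */
    synchronized BitSet get(Object reader, String field, Supplier<BitSet> loader) {
        Map<String, BitSet> perField = cache.computeIfAbsent(reader, r -> new HashMap<>());
        return perField.computeIfAbsent(field, f -> loader.get());
    }
}
```

With something like this, a query that touches the same field twice would pay for the expensive BitSet calculation only once per reader. The cached values must not hold a strong reference back to the reader, otherwise the weak keys would never be released.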

- I was referring to calculateDocFilter() in org.apache.jackrabbit.core.query.lucene.MatchAllScorer
- The achieved performance improvement varied between 30-60% depending on the actual query
but that means your query is rather:

/foo/*[@foo:bar]

right?


Actually it's /foo/*[@foo:bar!='john']

@foo:bar='john' should be translated into a term query.

You are right. "="-comparisons translate into term queries whereas "!="-comparisons get translated into MatchAllQueries.


It seems like if I rewrite the following query from

/foo/*[@foo:bar!='john' and @foo:bar!='doe']

to

/foo/*[not(@foo:bar='john' or @foo:bar='doe')]

I get better performance. Can you confirm this?
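
To make the difference between the two forms concrete, here is a small sketch of how each could be evaluated over sets of document ids. It is not Jackrabbit code: the class NotEqualsRewrite and its method names are hypothetical, the property is assumed to be single-valued, and the handling of nodes without the property follows standard XPath semantics, where a comparison against a missing attribute is false.

```java
import java.util.BitSet;

/** Hypothetical illustration, not Jackrabbit code: doc-id sets for the two query forms. */
class NotEqualsRewrite {
    /** not(@p='a' or @p='b'): complement of the union of two term matches. */
    static BitSet notEither(BitSet equalsA, BitSet equalsB, int maxDoc) {
        BitSet result = (BitSet) equalsA.clone(); // leave the inputs untouched
        result.or(equalsB);
        result.flip(0, maxDoc);
        return result;
    }

    /** @p!='a' and @p!='b': docs that have the property, minus both term matches. */
    static BitSet bothNotEqual(BitSet hasProperty, BitSet equalsA, BitSet equalsB) {
        BitSet result = (BitSet) hasProperty.clone();
        result.andNot(equalsA);
        result.andNot(equalsB);
        return result;
    }
}
```

The first form needs only the two term matches, while the second also needs the set of all documents carrying the property, which is the expensive part on a large index. The two forms are not strictly equivalent, though: with four documents where doc 0 is 'john', doc 1 is 'doe', doc 2 has some other value and doc 3 lacks the property, notEither yields docs 2 and 3 but bothNotEqual yields only doc 2. The rewrite therefore returns the same result only if every matched node has the property.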


Cheers,
Christoph
