Re: improving the scalability in searching

Christoph Kiehl Tue, 21 Aug 2007 12:40:49 -0700

Ard Schrijvers wrote:

So, WDOT about indexing properties in separate lucene Fields, and about
possibly indexing more information of one property. My experience with
lucene is that indexing tactically eases querying a lot, and gains you lots
of performance. So, if you do agree on these changes, which I can try to
build in Jackrabbit, then I think these changes might validate a new
QueryHandler class to be built aside the old one. WDOT?

In general I think it's a good idea to have a 1:1 mapping of properties to lucene fields. It's just more natural and easier to understand as you said.

Performance-wise, I'm not sure it will gain you "lots of performance". I just had a quick look at the code and found the following places where I think performance will improve:

1. DerefQuery can directly query for matching documents instead of iterating over all context hits.

2. MatchAllScorer would perform better. But you made an even better suggestion how to handle those in the future.

3. WildcardQuery will probably improve a bit because there are fewer terms to go through.

4. Regarding sorting: We will still need our own sorting because we cache the document order per subreader, whereas lucene's sorting only caches per reader, and that cache gets invalidated after every write operation. But the initial cache creation will be faster.
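
To make point 3 a bit more concrete, here is a minimal sketch in plain Java. It is a toy model of a term dictionary, not Lucene's or Jackrabbit's API. The class name, the method names and the "name + separator + value" encoding for the shared field are all assumptions made for illustration. The scan is deliberately naive: it walks every term in the field, whereas a real term enumeration could seek to the property-name prefix first, which would shrink the difference. That is one reason to expect only "a bit" of improvement.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model only: compares a single shared field with one field per property.
class FieldLayoutSketch {
    // Hypothetical separator between property name and value in the shared field.
    static final char SEP = '\uFFFF';

    // Layout A: all properties in one field, terms look like "name<SEP>value".
    private final SortedSet<String> sharedTerms = new TreeSet<>();
    // Layout B: 1:1 mapping, one term set per property name, terms are bare values.
    private final Map<String, SortedSet<String>> fieldTerms = new HashMap<>();

    // Adds the same property value to both layouts.
    void index(String property, String value) {
        sharedTerms.add(property + SEP + value);
        fieldTerms.computeIfAbsent(property, k -> new TreeSet<>()).add(value);
    }

    // Naive "*suffix" match on the shared field; returns the number of terms examined.
    int scanSharedField(String property, String suffix, List<String> hits) {
        String prefix = property + SEP;
        int examined = 0;
        for (String term : sharedTerms) {
            examined++;
            if (term.startsWith(prefix)
                    && term.substring(prefix.length()).endsWith(suffix)) {
                hits.add(term.substring(prefix.length()));
            }
        }
        return examined;
    }

    // Same match against the property's own field; returns the number of terms examined.
    int scanPropertyField(String property, String suffix, List<String> hits) {
        int examined = 0;
        for (String term : fieldTerms.getOrDefault(property, Collections.emptySortedSet())) {
            examined++;
            if (term.endsWith(suffix)) {
                hits.add(term);
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        FieldLayoutSketch idx = new FieldLayoutSketch();
        idx.index("title", "lucene");
        idx.index("title", "jackrabbit");
        idx.index("author", "ard");
        idx.index("author", "christoph");
        idx.index("mimeType", "text/plain");

        List<String> sharedHits = new ArrayList<>();
        List<String> fieldHits = new ArrayList<>();
        int shared = idx.scanSharedField("title", "bit", sharedHits);
        int perField = idx.scanPropertyField("title", "bit", fieldHits);
        System.out.println("shared field: examined " + shared + ", hits " + sharedHits);
        System.out.println("per-property field: examined " + perField + ", hits " + fieldHits);
    }
}
```

Both layouts return the same hits; the only difference in this toy is how many terms a full scan of the field has to look at.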

Overall I wouldn't expect _much_ better performance. Or could you explain what other performance improvements you expect?

But I would definitely like to see the 1:1 mapping, because some parts of the code become better/easier to understand and even those small performance improvements are a gain.

I wouldn't mind if you just start working on it ;) I'm sure Marcel is happy to answer your questions, as am I if I'm able to ;)

You could open a second issue for the 1:1 mapping. Then just use those two issues and attach patches. I'll definitely review them and try to help.


Thanks a lot for your efforts!

Cheers,
Christoph
