Ard Schrijvers wrote:

So, WDOT about indexing properties in separate Lucene fields, and about
possibly indexing more information for each property? My experience with
Lucene is that indexing tactically eases querying a lot and gains you lots
of performance. So, if you do agree on these changes, which I can try to
build into Jackrabbit, then I think these changes might justify a new
QueryHandler class to be built alongside the old one. WDOT?

In general I think it's a good idea to have a 1:1 mapping of properties to Lucene fields. It's just more natural and easier to understand, as you said.
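
To make sure we mean the same thing by "1:1 mapping", here is a rough sketch in plain Java, without Lucene, that models an index as a map from field name to its sorted terms. The class name, the combined field name "_:PROPERTIES" and the separator character are assumptions for illustration only, not necessarily what the current code uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class FieldLayoutSketch {
    // Assumed separator between property name and value in the combined layout.
    static final char SEP = '\uFFFF';
    // Assumed name of the single field that holds all properties.
    static final String COMBINED_FIELD = "_:PROPERTIES";

    /** Combined layout: one field, every property stored as a "name<SEP>value" term. */
    static Map<String, SortedSet<String>> combined(Map<String, String> props) {
        SortedSet<String> terms = new TreeSet<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            terms.add(e.getKey() + SEP + e.getValue());
        }
        Map<String, SortedSet<String>> fields = new TreeMap<>();
        fields.put(COMBINED_FIELD, terms);
        return fields;
    }

    /** 1:1 layout: one field per property, the term is just the value. */
    static Map<String, SortedSet<String>> oneToOne(Map<String, String> props) {
        Map<String, SortedSet<String>> fields = new TreeMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            fields.computeIfAbsent(e.getKey(), k -> new TreeSet<>()).add(e.getValue());
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("title", "Hello");
        props.put("author", "ard");
        // Field names under each layout.
        System.out.println(combined(props).keySet());
        System.out.println(oneToOne(props).keySet());
    }
}
```

With the second layout a query on a property can address its field directly instead of building a prefixed term inside one shared field.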

Performance-wise, I'm not sure it will gain you "lots of performance". I just had a quick look at the code and found the following places where I think performance will improve:

1. DerefQuery can directly query for matching documents instead of iterating over all context hits.
2. MatchAllScorer would perform better. But you made an even better suggestion for how to handle those in the future.
3. WildcardQuery will probably improve a bit because you have fewer terms.
4. Regarding sorting: We will still need our own sorting because we cache the document order per subreader, whereas Lucene's sorting only caches per reader, which gets invalidated after every write operation. But the initial cache creation will be faster.
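
To illustrate point 4, here is a simplified model in plain Java, not the actual Jackrabbit or Lucene code. It assumes that a write only adds a new segment (subreader) and leaves the existing ones unchanged. All names (Segment, warmPerSegment, warmPerReader, buildCount) are made up for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class SortCacheSketch {
    /** Immutable stand-in for one subreader (index segment). */
    static final class Segment {
        final String name;
        final List<String> values;
        Segment(String name, List<String> values) {
            this.name = name;
            this.values = values;
        }
    }

    /** Counts how many values had to be sorted while building caches. */
    static int buildCount = 0;

    /** Cache keyed by segment identity: entries survive as long as the segment does. */
    static final Map<Segment, List<String>> perSegment = new IdentityHashMap<>();

    static List<String> buildOrder(Segment s) {
        buildCount += s.values.size();
        List<String> sorted = new ArrayList<>(s.values);
        Collections.sort(sorted);
        return sorted;
    }

    /** Per-subreader caching: only segments not seen before are processed. */
    static void warmPerSegment(List<Segment> reader) {
        for (Segment s : reader) {
            if (!perSegment.containsKey(s)) {
                perSegment.put(s, buildOrder(s));
            }
        }
    }

    /** Per-reader caching: a new reader after a write rebuilds every segment. */
    static void warmPerReader(List<Segment> reader) {
        for (Segment s : reader) {
            buildOrder(s);
        }
    }

    public static void main(String[] args) {
        Segment a = new Segment("a", Arrays.asList("c", "a", "b"));
        Segment b = new Segment("b", Arrays.asList("e", "d"));

        warmPerSegment(Arrays.asList(a));
        warmPerSegment(Arrays.asList(a, b)); // after a write: only b is new
        System.out.println("per subreader: " + buildCount);

        buildCount = 0;
        warmPerReader(Arrays.asList(a));
        warmPerReader(Arrays.asList(a, b)); // after a write: everything again
        System.out.println("per reader: " + buildCount);
    }
}
```

In this model the per-subreader cache only pays for the new segment after a write, while the per-reader variant pays for all segments again.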

Overall I wouldn't expect _much_ better performance. Or could you explain what other performance improvements you expect?

But I would definitely like to see the 1:1 mapping, because some parts of the code become better/easier to understand and even those small performance improvements are a gain. I wouldn't mind if you just start working on it ;) I'm sure Marcel is happy to answer your questions, as am I if I'm able to ;) You could open a second issue for the 1:1 mapping. Then just use those two issues and attach patches. I'll definitely review them and try to help.

Thanks a lot for your efforts!

Cheers,
Christoph
