Ard Schrijvers wrote:
> So, WDOT about indexing properties in separate Lucene fields, and about
> possibly indexing more information for each property? My experience with
> Lucene is that indexing tactically eases querying a lot and gains you lots
> of performance. So, if you agree on these changes, which I can try to
> build into Jackrabbit, then I think they might justify a new
> QueryHandler class to be built alongside the old one. WDOT?
In general I think it's a good idea to have a 1:1 mapping of properties to
Lucene fields. It's just more natural and easier to understand, as you said.
Performance-wise, I'm not sure it will gain you "lots of performance". I just
had a quick look at the code and found the following places where I think
performance will improve:
1. DerefQuery can directly query for matching documents instead of iterating
over all context hits.
2. MatchAllScorer would perform better. But you made an even better suggestion
for how to handle those in the future.
3. WildcardQuery will probably improve a bit because there are fewer terms.
4. Regarding sorting: we will still need our own sorting because we cache the
document order per subreader, whereas Lucene's sorting only caches per reader,
which gets invalidated after every write operation. But the initial cache
creation will be faster.
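
To make the two layouts a bit more concrete, here is a rough sketch in plain Java. It is a toy term dictionary, not Lucene or Jackrabbit code: the class, the `PROPERTIES` field name and the `\uFFFF` separator are all placeholders I made up for illustration, and the real term format may well differ.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model of an inverted index; all names here are hypothetical. */
public class FieldLayoutSketch {

    // Placeholder names, not taken from the Jackrabbit sources.
    public static final String SHARED_FIELD = "PROPERTIES";
    public static final char SEP = '\uFFFF';

    // field name -> sorted term text -> ids of documents containing the term
    private final Map<String, TreeMap<String, Set<Integer>>> fields = new TreeMap<>();

    private void add(String field, String term, int doc) {
        fields.computeIfAbsent(field, f -> new TreeMap<>())
              .computeIfAbsent(term, t -> new TreeSet<>())
              .add(doc);
    }

    /** Shared layout: the property name is encoded into the term text. */
    public void indexShared(int doc, String property, String value) {
        add(SHARED_FIELD, property + SEP + value, doc);
    }

    /** 1:1 layout: the property name is the field name. */
    public void indexPerProperty(int doc, String property, String value) {
        add(property, value, doc);
    }

    public Set<Integer> prefixShared(String property, String valuePrefix) {
        return scan(SHARED_FIELD, property + SEP + valuePrefix);
    }

    public Set<Integer> prefixPerProperty(String property, String valuePrefix) {
        return scan(property, valuePrefix);
    }

    /** Walks the sorted terms of one field, starting at the prefix. */
    private Set<Integer> scan(String field, String prefix) {
        Set<Integer> hits = new TreeSet<>();
        TreeMap<String, Set<Integer>> terms = fields.getOrDefault(field, new TreeMap<>());
        for (Map.Entry<String, Set<Integer>> e : terms.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // terms are sorted, so nothing further can match
            }
            hits.addAll(e.getValue());
        }
        return hits;
    }

    public int termCount(String field) {
        return fields.getOrDefault(field, new TreeMap<>()).size();
    }

    public static void main(String[] args) {
        FieldLayoutSketch shared = new FieldLayoutSketch();
        FieldLayoutSketch split = new FieldLayoutSketch();
        String[][] props = { {"title", "lucene"}, {"title", "lunch"}, {"author", "luke"} };
        for (int doc = 0; doc < props.length; doc++) {
            shared.indexShared(doc, props[doc][0], props[doc][1]);
            split.indexPerProperty(doc, props[doc][0], props[doc][1]);
        }
        System.out.println(shared.prefixShared("title", "lu"));    // [0, 1]
        System.out.println(split.prefixPerProperty("title", "lu")); // [0, 1]
        System.out.println(shared.termCount(SHARED_FIELD) + " terms vs "
                + split.termCount("title"));                        // 3 terms vs 2
    }
}
```

Both lookups return the same documents. The difference is that the shared layout has to build and strip the property prefix on every term, and its single field holds the terms of all properties, while in the 1:1 layout each field only holds the terms of its own property.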
Overall I wouldn't expect _much_ better performance. Or could you explain what
other performance improvements you expect?
But I would definitely like to see the 1:1 mapping, because some parts of the
code become better/easier to understand and even those small performance
improvements are a gain.
I wouldn't mind if you just start working on it ;) I'm sure Marcel is happy to
answer your questions, as am I if I'm able to ;)
You could open a second issue for the 1:1 mapping. Then just use those two
issues and attach patches. I'll definitely review them and try to help.
Thanks a lot for your efforts!
Cheers,
Christoph