Ard Schrijvers wrote:
Christoph Kiehl wrote:
4. Regarding sorting: We will still need our own sorting because we cache
the document order per subreader, whereas Lucene's sorting only caches per
reader, which gets invalidated after every write operation. But the initial
cache creation will be faster.

That is a good point! I don't think the field prefixes of the terms were used
in the sorting cache, or were they? If they were, then instead of a performance
gain we might gain quite some memory efficiency (though I am guessing here a
little :-) )

Unfortunately, it doesn't even help with memory consumption, because we only cache the terms themselves, without the prefixes.
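
To illustrate why caching per subreader survives writes better than caching per reader, here is a minimal sketch. It is not the Jackrabbit or Lucene code: the class PerSegmentSortCache, its methods, and the string segment ids are all hypothetical stand-ins. The idea is only that entries are keyed by subreader, so after a write only the entries of subreaders that disappeared need to be dropped.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a sort cache keyed by subreader (segment).
// None of these names come from Jackrabbit or Lucene.
final class PerSegmentSortCache {
    // segment id -> cached sort terms (stored without the field prefix)
    private final Map<String, String[]> cache = new HashMap<>();
    private int loads = 0;

    // Returns the cached terms for a segment, loading them only on a miss.
    String[] termsFor(String segmentId, Function<String, String[]> loader) {
        String[] terms = cache.get(segmentId);
        if (terms == null) {
            terms = loader.apply(segmentId);
            cache.put(segmentId, terms);
            loads++;
        }
        return terms;
    }

    // After a write, keep the entries of segments that still exist and
    // drop the rest; unchanged segments do not have to be reloaded.
    void retainSegments(Collection<String> liveSegments) {
        cache.keySet().retainAll(liveSegments);
    }

    // Number of times the loader actually ran (cache misses).
    int loadCount() {
        return loads;
    }
}
```

A cache keyed by the top-level reader would instead have to be rebuilt completely whenever that reader is replaced.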

I think that, besides making sure all unit tests keep working, I might/should
include a performance unit test to see if there are substantial gains.

Well, it would be great to have such a performance test, but in my experience the repository you run your test against has to be of a certain minimum size to show a notable difference. It's difficult to create such a repository in a portable way: it's too big to check into Subversion and too big to create on the fly. It would be great to have some kind of reference repository. I thought about taking a Wikipedia snapshot (these are available for download) and pumping this data into the repository. This will result in quite a big repository ...

I am not sure if there is an XPath equivalent to "give me all different
values of a property"...probably not, right?

I'm afraid not.
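
Without a query for it, one fallback is to collect the distinct values on the client side while iterating over the query result. The sketch below only shows that idea under simplifying assumptions: each node is modelled as a plain Map of property name to value, and the class DistinctValues is a made-up helper, not part of the JCR or Jackrabbit API. Note that this touches every matching node, so it will not be fast on a big repository.

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper: gathers the distinct values of one property.
// Nodes are modelled as Map<String, String> purely for illustration.
final class DistinctValues {
    static SortedSet<String> of(Iterable<Map<String, String>> nodes, String property) {
        SortedSet<String> values = new TreeSet<>();
        for (Map<String, String> node : nodes) {
            String value = node.get(property);
            if (value != null) {
                // The set silently ignores values it already contains.
                values.add(value);
            }
        }
        return values;
    }
}
```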

I wouldn't mind if you just start working on it ;) I'm sure Marcel is happy
to answer your questions, as am I if I'm able to ;) You could open a second
issue for the 1:1 mapping. Then just use those two issues and attach
patches. I'll definitely review them and try to help.

Ok. I'll file a JIRA issue for this on Thursday, because tomorrow I am
occupied all day.

Great!

Cheers,
Christoph
