Hi Ard,
excellent work!
Ard Schrijvers wrote:
> Christoph Kiehl wrote:
>> Very nice analysis! It's indeed a very tricky bug ;)
>> UUIDDocId should not use WeakReferences on the one hand and
>> equals() on the other hand.
>> Maybe we should better return the same instance of a
>> CombinedIndexReader in SearchIndex.getIndexReader() if
>> possible and use a "==" comparison in UUIDDocId instead?
> Yes, this is IMO the best solution. We could have a static HashMap with
> key-value pairs workspacename-combinedIndexReader, and in
> SearchIndex.getIndexReader() return the combinedIndexReader from the
> static hashmap, and on changing an index, clear the combinedIndexReader
> from the hashmap. But, perhaps somebody has a much neater solution..?
> :-)
Caching the combined index reader works as long as there are no changes, but
that is rarely the case. Things in a workspace will change frequently, which
means that with every change the UUIDDocId must be recalculated.
Instead of keeping a reference to the combined index reader, the UUIDDocId should
rather keep a reference to the reader of the index segment that returned the
document number for the uuid. With this change the document number does not have
to be recalculated just because the combined index reader changed.
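
A minimal sketch of that idea, to make the difference concrete. It does not use the real Lucene or Jackrabbit classes: SegmentStub, CombinedStub and SegmentUuidDocId are hypothetical stand-ins, and the sketch assumes that a segment reader instance survives unrelated index changes while the combined reader is rebuilt each time. It also ignores deleted documents.

```java
import java.lang.ref.WeakReference;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the reader of a single index segment. */
final class SegmentStub {
    private final Map<String, Integer> docs = new HashMap<>();
    int lookups = 0; // counts uuid lookups, only to make the caching visible

    SegmentStub(String... uuids) {
        for (int i = 0; i < uuids.length; i++) {
            docs.put(uuids[i], i);
        }
    }

    int maxDoc() {
        return docs.size();
    }

    /** Segment-local document number, or -1 if the uuid is not in this segment. */
    int docNumber(String uuid) {
        lookups++;
        Integer n = docs.get(uuid);
        return n == null ? -1 : n;
    }
}

/** Hypothetical combined reader; a new instance is built after every index change. */
final class CombinedStub {
    final List<SegmentStub> segments;

    CombinedStub(SegmentStub... segments) {
        this.segments = Arrays.asList(segments);
    }
}

/** Sketch of a doc id that remembers the segment reader, not the combined reader. */
final class SegmentUuidDocId {
    private final String uuid;
    private WeakReference<SegmentStub> segment = new WeakReference<>(null);
    private int localDoc = -1;

    SegmentUuidDocId(String uuid) {
        this.uuid = uuid;
    }

    /** Document number within the given combined reader, or -1 if not found. */
    int resolve(CombinedStub reader) {
        SegmentStub cached = segment.get();
        int base = 0;
        if (cached != null) {
            for (SegmentStub s : reader.segments) {
                if (s == cached) {
                    return base + localDoc; // identity comparison, not equals()
                }
                base += s.maxDoc();
            }
        }
        // Cached segment is gone (or nothing cached yet): look the uuid up again.
        base = 0;
        for (SegmentStub s : reader.segments) {
            int n = s.docNumber(uuid);
            if (n >= 0) {
                segment = new WeakReference<>(s);
                localDoc = n;
                return base + n;
            }
            base += s.maxDoc();
        }
        return -1;
    }
}

final class SegmentDocIdDemo {
    public static void main(String[] args) {
        SegmentStub a = new SegmentStub("u1", "u2");
        SegmentStub b = new SegmentStub("u3", "u4");
        SegmentStub c = new SegmentStub("u5");
        SegmentUuidDocId id = new SegmentUuidDocId("u4");
        System.out.println(id.resolve(new CombinedStub(a, b)));    // looks up u4 in a, then b
        System.out.println(id.resolve(new CombinedStub(a, b, c))); // reuses cached segment b
        System.out.println(id.resolve(new CombinedStub(b, c)));    // only the offset changes
    }
}
```

Only the first call performs uuid lookups. The later calls find the cached segment by identity and merely recompute its offset within the new combined reader.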
>> But that's just a quick guess. Unfortunately I haven't had the time to
>> really dig into it, and I'm out of town until Wednesday. But
>> maybe Marcel could comment on this?
> Also, I want to change the DocId.UUIDDocId String uuid into storing only
> two longs, the lsb and msb, since when re-using a combinedIndexReader
> instance, the number of UUIDDocId instances can grow very large, implying
> quite a bit more memory use.
> WDYT? Shall I create two separate JIRA issues for it, or just a single
> one?
Please create two separate issues. We can get rid of the uuid string independently
of the other changes/optimizations.
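
For the second issue, a rough sketch of what storing the two longs could look like. It assumes the uuid strings are in the canonical form accepted by java.util.UUID. The class name CompactUuid is made up for illustration and is not existing Jackrabbit code.

```java
import java.util.UUID;

/** Hypothetical holder that keeps a uuid as two longs instead of a String. */
final class CompactUuid {
    private final long msb; // most significant 64 bits
    private final long lsb; // least significant 64 bits

    CompactUuid(String uuid) {
        UUID parsed = UUID.fromString(uuid);
        this.msb = parsed.getMostSignificantBits();
        this.lsb = parsed.getLeastSignificantBits();
    }

    long getMostSignificantBits() {
        return msb;
    }

    long getLeastSignificantBits() {
        return lsb;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof CompactUuid)) {
            return false;
        }
        CompactUuid other = (CompactUuid) o;
        return msb == other.msb && lsb == other.lsb;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(msb ^ lsb);
    }

    /** Rebuilds the string form only when it is actually needed. */
    @Override
    public String toString() {
        return new UUID(msb, lsb).toString();
    }
}
```

The string form is then only materialized on demand, so each id carries 16 bytes of uuid data rather than a 36-character String.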
regards
marcel