Duplicate entries in the index
------------------------------
Key: JCR-2357
URL: https://issues.apache.org/jira/browse/JCR-2357
Project: Jackrabbit Content Repository
Issue Type: Bug
Components: indexing
Affects Versions: 2.0-alpha9, 1.6.0, 1.5.0
Reporter: Benjamin Papez
Priority: Critical
We are very frequently having problems with queries, because the results are
always duplicated. The reason for it has been identified to be in the indexing,
because when we get the problem, we now look at the index with Luke, and we can
see that documents with the same UUID are there twice. This happens with
Jackrabbit 1.5.0, 1.6.0 and also with 2.0-alpha10.
I am able to recreate the problem quite regular. We have some event listeners,
which themselves call the node.save or session.save method. I made a breakpoint
in NodeIndexer.createDoc() and stop when there is a specific name in the
fullpath property. I can see that two listeners we wrote (
FieldReferenceListener and RulesListener) lead to the NodeIndexer.createDoc() -
below are the stacktraces (based on Jackrabbit 2.0-alpha10)
NodeIndexer.createDoc() line: 247
SearchIndex.createDocument(NodeState, NamespaceMappings, IndexFormatVersion)
line: 1122
SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) line: 635
SearchManager.onEvent(EventIterator) line: 459
EventConsumer.consumeEvents(EventStateCollection) line: 244
ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198
EventStateCollection.dispatch() line: 455
SharedItemStateManager$Update.end() line: 756
SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line:
1102
XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351
XAItemStateManager.update(ChangeLog) line: 354
XAItemStateManager(LocalItemStateManager).update() line: 326
SessionItemStateManager.update() line: 326
NodeImpl(ItemImpl).save() line: 1098
FieldReferenceListener.updateFullPath(Node) line: 193
FieldReferenceListener.onEvent(EventIterator) line: 101
EventConsumer.consumeEvents(EventStateCollection) line: 244
ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198
EventStateCollection.dispatch() line: 455
SharedItemStateManager$Update.end() line: 756
SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line:
1102
XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351
XAItemStateManager.update(ChangeLog) line: 354
XAItemStateManager(LocalItemStateManager).update() line: 326
SessionItemStateManager.update() line: 326
NodeImpl(ItemImpl).save() line: 1098
NodeIndexer.createDoc() line: 247
SearchIndex.createDocument(NodeState, NamespaceMappings, IndexFormatVersion)
line: 1122
SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) line: 635
SearchManager.onEvent(EventIterator) line: 459
EventConsumer.consumeEvents(EventStateCollection) line: 244
ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198
EventStateCollection.dispatch() line: 455
SharedItemStateManager$Update.end() line: 756
SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line:
1102
XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351
XAItemStateManager.update(ChangeLog) line: 354
XAItemStateManager(LocalItemStateManager).update() line: 326
SessionItemStateManager.update() line: 326
NodeImpl(ItemImpl).save() line: 1098
XASessionImpl(SessionImpl).save() line: 928
RulesListener.onEvent(EventIterator) line: 376
EventConsumer.consumeEvents(EventStateCollection) line: 244
ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198
EventStateCollection.dispatch() line: 455
SharedItemStateManager$Update.end() line: 756
SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line:
1102
XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351
XAItemStateManager.update(ChangeLog) line: 354
XAItemStateManager(LocalItemStateManager).update() line: 326
SessionItemStateManager.update() line: 326
NodeImpl(ItemImpl).save() line: 1098
Now when continuing to debug after the second NodeIndexer.createDoc() I see
that the node with the watched UUID is also passed within the remove collection
into SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) , but when
looking at the index with Luke before and after finishing this
SearchIndex.updateNodes, I see that the first document with the UUID was NOT
deleted and then we always have two documents with the same UUID.
Now it would be interesting, why the deletion failed. When debugging
DeleteNode.exceute and stopping at the watched UUID, I come down the following
methods:
CommittableIndexReader(IndexReader).deleteDocuments(Term) line: 873
PersistentIndex(AbstractIndex).removeDocument(Term) line: 225
PersistentIndex.removeDocument(Term) line: 90
MultiIndex$DeleteNode.execute(MultiIndex) line: 1951
MultiIndex.executeAndLog(MultiIndex$Action) line: 1084
MultiIndex.update(Collection<NodeId>, Collection<Document>) line: 442
SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) line: 645
What I can in IndexReader.deleteDocuments(Term) is that docs.next() returns
true, but docs.doc() then returns 0 as the doc-id, which I think is the cause
that the right doc-id is not deleted.
Any clues what goes wrong here or hints what else I could watch for on
debugging ?
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.