[
https://issues.apache.org/jira/browse/OAK-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vikas Saurabh reassigned OAK-4358:
----------------------------------
Assignee: Marcel Reutegger (was: Vikas Saurabh)
[~mreutegg], as discussed, I was trying to optimize the iteration in
getChanges(prop, minRev), so that getNewestRevision could then be changed to
compute max(getChanges(_revision), getChanges(_commitRoot)). Here's the test I
was working with (an adaptation of the existing {{getChangesMixedClusterIds}}):
{code}
@Test
public void getChangesMixedClusterIdsTooManyPrevDocsRead() throws Exception {
    final int numChanges = 200;
    Random random = new Random();
    final List<RevisionVector> splitHeads = Lists.newArrayList();
    // the test fails for 0 too... I'm just being conservative though!
    final int SPLIT_DOC_IDX = 4;
    MemoryDocumentStore store = new MemoryDocumentStore() {
        @Override
        public <T extends Document> T find(Collection<T> collection,
                                           String key) {
            String path = Utils.getPathFromId(key);
            // only lookups of previous (split) documents are checked
            if (path.startsWith("p")) {
                RevisionVector minRevBeingRead =
                        splitHeads.get(splitHeads.size() - SPLIT_DOC_IDX - 1);
                String maxRevStrInSplitTree = path.substring(2,
                        path.length() - 2);
                Revision maxRevInSplitTree =
                        Revision.fromString(maxRevStrInSplitTree);
                // fail if the min rev is not newer than the max rev encoded in the id
                assertFalse("Previous doc (" + key + ") read for min rev "
                        + minRevBeingRead,
                        !minRevBeingRead.isRevisionNewer(maxRevInSplitTree));
            }
            return super.find(collection, key);
        }
    };
    DocumentNodeStore ns1 = createTestStore(store, 1, 0);
    DocumentNodeStore ns2 = createTestStore(store, 2, 0);
    List<DocumentNodeStore> nodeStores = Lists.newArrayList(ns1, ns2);
    for (int i = 0; i < numChanges; i++) {
        // change property "p" from a randomly chosen cluster node
        DocumentNodeStore ns =
                nodeStores.get(random.nextInt(nodeStores.size()));
        ns.runBackgroundOperations();
        NodeBuilder builder = ns.getRoot().builder();
        builder.setProperty("p", i);
        merge(ns, builder);
        ns.runBackgroundOperations();
        // split the root document after roughly every fifth change
        if (random.nextDouble() < 0.2) {
            RevisionVector splitHead = ns.getHeadRevision();
            splitHeads.add(splitHead);
            for (UpdateOp op : SplitOperations.forDocument(
                    getRootDocument(store), ns, splitHead,
                    Predicates.<String>alwaysFalse(), 2)) {
                store.createOrUpdate(NODES, op);
            }
        }
    }
    NodeDocument doc = getRootDocument(store);
    RevisionVector minRevBeingRead = splitHeads.get(splitHeads.size() -
            SPLIT_DOC_IDX - 1);
    // iterate all changes so that previous docs actually get read
    Lists.newArrayList(doc.getChanges("p", minRevBeingRead));
    ns1.dispose();
    ns2.dispose();
}
{code}
I think the assertion, that NO previous doc which is strictly older than
minRevVector should be read, turned out a little too strict: satisfying it
seemed to require introducing some notion of a minimum revision when reading
{{PropertyHistory}}.
In the end, I think fixing the issue is beyond my comfort level. If the test
seems fine, I can post a similar one for getNewestRevision as well.
Assigning the issue to you, though I will keep trying some things in the
meantime.
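To make the "min concept" idea concrete, here is a minimal sketch of it in a toy model. It is not Oak code: revisions are modelled as plain longs instead of {{Revision}}/{{RevisionVector}}, and {{MinBoundedHistorySketch}}, {{PrevDoc}}, {{changesNewerThan}}, {{newestRevision}} and {{readLog}} are all hypothetical names made up for illustration. The assumption is that each previous doc knows the newest revision it contains, so a doc that is entirely at or below the min revision can be skipped without being read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: revisions are plain longs and none of these names exist in Oak.
class MinBoundedHistorySketch {

    /** Hypothetical stand-in for one previous (split) document of a property. */
    static final class PrevDoc {
        final List<Long> changes; // revisions in this document, newest first
        final long newest;        // newest revision contained in this document

        /** Expects at least one revision, passed newest first. */
        PrevDoc(Long... changesNewestFirst) {
            this.changes = Arrays.asList(changesNewestFirst);
            this.newest = changes.get(0);
        }
    }

    /**
     * Collects the changes strictly newer than minRev. A document whose newest
     * revision is not newer than minRev is skipped without being "read";
     * every document that is read is appended to readLog.
     */
    static List<Long> changesNewerThan(List<PrevDoc> prevDocs, long minRev,
                                       List<PrevDoc> readLog) {
        List<Long> result = new ArrayList<>();
        for (PrevDoc doc : prevDocs) {
            if (doc.newest <= minRev) {
                continue; // whole document is at or below the bound
            }
            readLog.add(doc);
            for (long rev : doc.changes) {
                if (rev > minRev) {
                    result.add(rev);
                }
            }
        }
        return result;
    }

    /**
     * Newest revision as the max over the bounded changes of two properties
     * (standing in for _revision and _commitRoot); returns minRev if neither
     * property has a newer change.
     */
    static long newestRevision(List<PrevDoc> revisionDocs,
                               List<PrevDoc> commitRootDocs,
                               long minRev, List<PrevDoc> readLog) {
        long newest = minRev;
        for (long rev : changesNewerThan(revisionDocs, minRev, readLog)) {
            newest = Math.max(newest, rev);
        }
        for (long rev : changesNewerThan(commitRootDocs, minRev, readLog)) {
            newest = Math.max(newest, rev);
        }
        return newest;
    }
}
```

In this model, the condition the test above asserts corresponds to every entry in {{readLog}} having a newest revision greater than the min revision. Whether the real {{PropertyHistory}} can be bounded this cheaply is exactly the open question.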
> Stale cluster ids can potentially lead to lots of previous docs traversal in
> NodeDocument.getNewestRevision
> -----------------------------------------------------------------------------------------------------------
>
> Key: OAK-4358
> URL: https://issues.apache.org/jira/browse/OAK-4358
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: documentmk
> Reporter: Vikas Saurabh
> Assignee: Marcel Reutegger
>
> When some of the following conditions are met (the actual test case and
> exact conditions are still being investigated):
> * There are property value changes from different cluster ids
> * There are very old and stale cluster ids (probably older incarnations of
> the current node itself)
> * A parallel background split removes all _commitRoot and _revision entries
> such that the latest one (which is less than baseRev) is very old
> finding the newest revision traverses a lot of previous docs. Since the root
> document gets split a lot and is a very common commitRoot (thus participating
> in checkConflicts during a lot of commits), the issue can slow down commits
> by a lot.