[
https://issues.apache.org/jira/browse/OAK-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcel Reutegger resolved OAK-3653.
-----------------------------------
Resolution: Fixed
Fix Version/s: (was: 1.4)
1.3.14
Depending issue OAK-3634 is fixed. Resolving this issue as well.
> Incorrect last revision of cached node state
> --------------------------------------------
>
> Key: OAK-3653
> URL: https://issues.apache.org/jira/browse/OAK-3653
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: documentmk
> Reporter: Vikas Saurabh
> Assignee: Marcel Reutegger
> Fix For: 1.3.14
>
> Attachments: Capture.JPG, Capture1.JPG, Capture2.JPG, diagram.png,
> oplog.json
>
>
> While installing a package in one of the systems (Oak-1.0.16), we observed
> broken workflow model. Upon further investigation, it was found that
> {{baseVersion}} of the model was pointing to a node in version storage which
> wasn't visible on the instance. This node was available in mongo and could
> also be seen on a different instance of the cluster.
> The node was created on the same instance where it wasn't visible (cluster id
> - 3)
> Further investigation showed that for some (yet unknown) reason,
> {{lastRevision}} for cached node state of node's parent at revision where the
> node was created was stale (an older revision) which must have led to invalid
> children list and hence unavailable node.
> Attaching a few files which were captured during the investigation:
> * [^oplog.json] - oplog for +-10s for r151191c7601-0-3
> * [^Capture.JPG]- snapshot of groovy console output using script at \[0] to
> list node children cache entry for parent
> * [^Capture1.JPG]- snapshot of groovy console output using script at \[1] to
> traverse to invisible node
> * [^Capture2.JPG]- Almost same as Capture.jpg but this time for node cache
> instead (using script at \[2])
> The node which wasn't visible in mongo:
> {noformat}
> db.nodes.findOne({_id:
> "7:/jcr:system/jcr:versionStorage/d9/9f/22/d99f2279-dcc9-40a0-9aa4-c91fe310287e/1.7"})
> {
> "_id" :
> "7:/jcr:system/jcr:versionStorage/d9/9f/22/d99f2279-dcc9-40a0-9aa4-c91fe310287e/1.7",
> "_deleted" : {
> "r151191c7601-0-3" : "false"
> },
> "_commitRoot" : {
> "r151191c7601-0-3" : "0"
> },
> "_lastRev" : {
> "r0-0-3" : "r151191c7601-0-3"
> },
> "jcr:primaryType" : {
> "r151191c7601-0-3" : "\"nam:nt:version\""
> },
> "jcr:uuid" : {
> "r151191c7601-0-3" :
> "\"40234f1d-2435-4c6b-9962-20af22b1c948\""
> },
> "_modified" : NumberLong(1447825270),
> "jcr:created" : {
> "r151191c7601-0-3" : "\"dat:2015-11-17T21:41:14.058-08:00\""
> },
> "_children" : true,
> "jcr:predecessors" : {
> "r151191c7601-0-3" :
> "[\"ref:6cecc77b-3020-4a47-b9cc-084a618aa957\"]"
> },
> "jcr:successors" : {
> "r151191c7601-0-3" : "\"[0]:Reference\""
> },
> "_modCount" : NumberLong(1)
> }
> {noformat}
> Last revision entry of cached node state of
> {{/jcr:system/jcr:versionStorage/d9/9f/22/d99f2279-dcc9-40a0-9aa4-c91fe310287e}}
> (parent) at revision {{r151191c7601-0-3}} was {{r14f035ebdf8-0-3}} (the last
> revs were propagated correctly up till root for r151191c7601-0-3).
> As a work-around, doing {{cache.invalidate(k)}} for node cache (inside the
> loop of script at \[2]) revived the visibility of the node.
> *Disclaimer*: _The issue was reported and investigated on a cluster backed by
> mongo running on Oak-1.0.16. But, the problem itself doesn't seem like any
> which have been fixed in later versions_
> \[0]: {noformat}
> def printNodeChildrenCache(def path) {
> def session =
> osgi.getService(org.apache.sling.jcr.api.SlingRepository.class).loginAdministrative(null)
> try {
> def rootNode = session.getRootNode()
> def cache = rootNode.sessionDelegate.root.store.nodeChildrenCache
> def cacheMap = cache.asMap()
> cacheMap.each{k,v ->
> if (k.toString().startsWith(path + "@")) {
> println "${k}:${v}"
> }
> }
> } finally {
> session.logout()
> }
> }
>
> printNodeChildrenCache("/jcr:system/jcr:versionStorage/d9/9f/22/d99f2279-dcc9-40a0-9aa4-c91fe310287e")
> {noformat}
> \[1]: {noformat}
> def traverse(def path) {
> def session =
> osgi.getService(org.apache.sling.jcr.api.SlingRepository.class).loginAdministrative(null)
> try {
> def rootNode = session.getRootNode()
> def nb = rootNode.sessionDelegate.root.store.root
> def p = ''
> path.tokenize('/').each() {
> p = p + '/' + it
> nb = nb.getChildNode(it)
> println "${p} ${nb}"
> }
> } finally {
> session.logout()
> }
> }
> traverse("/jcr:system/jcr:versionStorage/d9/9f/22/d99f2279-dcc9-40a0-9aa4-c91fe310287e/1.7")
> {noformat}
> \[2]: {noformat}
> def printNodeChildrenCache(def path) {
> def session =
> osgi.getService(org.apache.sling.jcr.api.SlingRepository.class).loginAdministrative(null)
> try {
> def rootNode = session.getRootNode()
> def cache = rootNode.sessionDelegate.root.store.nodeCache
> def cacheMap = cache.asMap()
> cacheMap.each{k,v ->
> if (k.toString().startsWith(path + "@")) {
> println "${k}:${v}"
> println "lastRev -> ${v.lastRevision}"
> }
> }
> } finally {
> session.logout()
> }
> }
>
> printNodeChildrenCache("/jcr:system/jcr:versionStorage/d9/9f/22/d99f2279-dcc9-40a0-9aa4-c91fe310287e")
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)