[
https://issues.apache.org/jira/browse/OAK-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14348198#comment-14348198
]
Chetan Mehrotra edited comment on OAK-2570 at 3/5/15 6:02 AM:
--------------------------------------------------------------
Thanks [~mreutegg] for taking care of this!
The test case was failing for two reasons:
* The executor used in the test's OSGi environment was not working properly. Fixed with
OAK-2579
* Lucene index changes are now applied asynchronously, so a retry loop was added to account for that
Reapplied the commit on trunk. To clarify, there was no problem with the actual
fix done for this issue; the problem was with the test case and how we configured Oak
in the OSGi environment for that test case. Would wait for the Buildbot report before proceeding
to merge this to the branch
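The retry loop mentioned above can be sketched roughly as follows. This is a minimal illustration, not the actual test code; the {{retry}} helper and its parameters are hypothetical:

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Minimal sketch of a test-side retry loop: because async Lucene index
// updates become visible only eventually, the test polls the query until
// the expected result shows up or a timeout elapses. Names are
// illustrative, not Oak's actual test API.
public class RetrySupport {

    public static <T> T retry(long timeoutMillis, long intervalMillis,
                              Supplier<T> query, Predicate<T> expected)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        T result = query.get();
        while (!expected.test(result) && System.currentTimeMillis() < deadline) {
            Thread.sleep(intervalMillis); // give the async indexer time to catch up
            result = query.get();
        }
        return result; // caller asserts on the final value
    }
}
```

A test would then call something like {{retry(TIMEOUT, 100, () -> runQuery(), r -> r.contains(expectedPath))}} (again, hypothetical names) and assert on the returned result instead of asserting immediately after the commit.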
> Open indexes in IndexTracker non blocking way
> ---------------------------------------------
>
> Key: OAK-2570
> URL: https://issues.apache.org/jira/browse/OAK-2570
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: oak-lucene
> Reporter: Chetan Mehrotra
> Assignee: Chetan Mehrotra
> Fix For: 1.1.8, 1.0.13
>
> Attachments: OAK-2570.patch
>
>
> Currently {{IndexTracker}} opens the indexes on the same thread where the commit
> happens, since it acts as an {{Observer}}. If opening an index takes a long time,
> it blocks the commit thread and may delay other operations on that thread.
> For example, {{DocumentNodeStore}} invokes the {{Observer}} in
> {{backgroundRead}}, and that call does not complete until the {{Observer}}
> returns. For the system to remain responsive, {{backgroundRead}} has to be
> quick, as it holds the write lock. Due to the current implementation, the
> system at times turns unresponsive.
> Looking at the thread dump from one such unresponsive system, it appears that
> the slowness is caused by the Lucene index taking a long time to reopen.
> The thread which holds the {{backgroundOperationLock}} for {{read}} is:
> {noformat}
> "pool-9-thread-1" Id=101 in RUNNABLE
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:146)
>     at org.apache.jackrabbit.core.data.LazyFileInputStream.open(LazyFileInputStream.java:104)
>     at org.apache.jackrabbit.core.data.LazyFileInputStream.read(LazyFileInputStream.java:163)
>     at com.google.common.io.ByteStreams.read(ByteStreams.java:828)
>     at com.google.common.io.ByteStreams.readFully(ByteStreams.java:695)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:218)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:264)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:350)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:356)
>     at org.apache.lucene.store.DataInput.readInt(DataInput.java:84)
>     at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:126)
>     at org.apache.lucene.util.fst.FST.<init>(FST.java:318)
>     at org.apache.lucene.util.fst.FST.<init>(FST.java:304)
>     at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader.<init>(BlockTreeTermsReader.java:484)
>     at org.apache.lucene.codecs.BlockTreeTermsReader.<init>(BlockTreeTermsReader.java:176)
>     at org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:437)
>     at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:116)
>     at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96)
>     at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:62)
>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
>     at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
>     at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.IndexNode.<init>(IndexNode.java:94)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.IndexNode.open(IndexNode.java:62)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker$1.leave(IndexTracker.java:96)
>     at org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:153)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.dispatch(DocumentNodeState.java:433)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:260)
>     at org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.dispatch(DocumentNodeState.java:433)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:260)
>     at org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:52)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.update(IndexTracker.java:105)
>     - locked org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker@6243264e
>     at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider.contentChanged(LuceneIndexProvider.java:61)
>     at org.apache.jackrabbit.oak.spi.commit.CompositeObserver.contentChanged(CompositeObserver.java:53)
>     - locked org.apache.jackrabbit.oak.spi.commit.CompositeObserver@2687fd48
>     at org.apache.jackrabbit.oak.spi.commit.ChangeDispatcher.contentChanged(ChangeDispatcher.java:79)
>     - locked org.apache.jackrabbit.oak.spi.commit.ChangeDispatcher@2a0595bb
>     at org.apache.jackrabbit.oak.plugins.document.CommitQueue.afterTrunkCommit(CommitQueue.java:131)
>     - locked org.apache.jackrabbit.oak.plugins.document.CommitQueue@54a93d8a
>     at org.apache.jackrabbit.oak.plugins.document.CommitQueue.done(CommitQueue.java:83)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.done(DocumentNodeStore.java:586)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:187)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:91)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:39)
>     at org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$InMemory.merge(AbstractNodeStoreBranch.java:525)
>     at org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch.merge(AbstractNodeStoreBranch.java:318)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.merge(DocumentNodeStoreBranch.java:135)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.merge(DocumentRootBuilder.java:159)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.merge(DocumentNodeStore.java:1339)
>     at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.mergeWithConcurrencyCheck(AsyncIndexUpdate.java:420)
>     at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:389)
>     at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:300)
>     - locked org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@401daa3f
>     at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:105)
>     at org.quartz.core.JobRunShell.run(JobRunShell.java:207)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Further, the {{DocumentNodeStore background thread}} is also waiting to acquire
> the write lock:
> {noformat}
> "DocumentNodeStore background thread" Id=82 in WAITING on lock=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@6990d994
>     at sun.misc.Unsafe.park(Native Method)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
>     at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
>     at org.apache.jackrabbit.oak.plugins.document.UnsavedModifications.persist(UnsavedModifications.java:156)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.backgroundWrite(DocumentNodeStore.java:1662)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.runBackgroundOperations(DocumentNodeStore.java:1456)
>     - locked org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore@350b4c49
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$BackgroundOperation.execute(DocumentNodeStore.java:2064)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$NodeStoreTask.run(DocumentNodeStore.java:2039)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Now, per the javadoc of {{ReentrantReadWriteLock}}, a read lock is not
> granted while there is a waiter for the write lock. Hence, in such a case no
> further reads are possible and the system becomes unresponsive.
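One way to address this, sketched very roughly below, is to move the expensive index open off the observation thread: {{contentChanged}} only schedules the reopen on a background executor and atomically swaps in the new index once it is ready, so readers keep using the previous index in the meantime. This is an illustration of the direction only, not the actual patch; all class and method names here are hypothetical stand-ins.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch (not Oak's actual API): the slow open happens on a
// dedicated executor, and the current index is swapped atomically, so the
// observation/commit thread is never blocked on index I/O.
public class NonBlockingTrackerSketch {

    private final AtomicReference<String> current = new AtomicReference<>("index-v1");
    private final ExecutorService opener = Executors.newSingleThreadExecutor();

    // Called from the commit/observation thread; must stay cheap.
    public Future<?> contentChanged(String newState) {
        return opener.submit(() -> {
            String reopened = expensiveOpen(newState); // slow I/O runs off-thread
            current.set(reopened);                     // atomic swap once ready
        });
    }

    // Readers always get *some* usable index, possibly slightly stale.
    public String acquire() {
        return current.get();
    }

    private String expensiveOpen(String state) {
        try {
            Thread.sleep(50); // stands in for the slow Lucene reopen
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "index-" + state;
    }

    public void close() {
        opener.shutdown();
    }
}
```

The trade-off is that queries may briefly run against a slightly stale index, but {{backgroundRead}} no longer has to wait for Lucene I/O while holding the lock.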
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)