Chetan Mehrotra created OAK-2570:
------------------------------------
Summary: Open indexes in IndexTracker non blocking way
Key: OAK-2570
URL: https://issues.apache.org/jira/browse/OAK-2570
Project: Jackrabbit Oak
Issue Type: Improvement
Components: oak-lucene
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
Fix For: 1.2, 1.0.12
Currently {{IndexTracker}} opens the indexes in same thread where the commit
happens as it acts as an {{Observer}}. If opening the index takes long time it
would block the commit thread and might introduce delay in other operation on
same thread.
For example {{DocumentNodeStore}} invokes the {{Observer}} in
{{backgroundRead}} and that call would not complete unless the {{Observer}}
returns back. For a system to be responsive the {{backgroundRead}} has to be
quick as it holds the write lock. Due to current implementation at time system
turn unresponsive.
Looking at the threaddump from one such unresponsive system it appears that
slowness is due to Lucene index taking time to reopen.
Thread which hold the {{backgroundOperationLock}} for {{read}} is
{noformat}
"pool-9-thread-1" Id=101 in RUNNABLE
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:146)
at
org.apache.jackrabbit.core.data.LazyFileInputStream.open(LazyFileInputStream.java:104)
at
org.apache.jackrabbit.core.data.LazyFileInputStream.read(LazyFileInputStream.java:163)
at com.google.common.io.ByteStreams.read(ByteStreams.java:828)
at com.google.common.io.ByteStreams.readFully(ByteStreams.java:695)
at
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:218)
at
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:264)
at
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:350)
at
org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:356)
at org.apache.lucene.store.DataInput.readInt(DataInput.java:84)
at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:126)
at org.apache.lucene.util.fst.FST.<init>(FST.java:318)
at org.apache.lucene.util.fst.FST.<init>(FST.java:304)
at
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader.<init>(BlockTreeTermsReader.java:484)
at
org.apache.lucene.codecs.BlockTreeTermsReader.<init>(BlockTreeTermsReader.java:176)
at
org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:437)
at
org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:116)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96)
at
org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:62)
at
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
at
org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)
at
org.apache.jackrabbit.oak.plugins.index.lucene.IndexNode.<init>(IndexNode.java:94)
at
org.apache.jackrabbit.oak.plugins.index.lucene.IndexNode.open(IndexNode.java:62)
at
org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker$1.leave(IndexTracker.java:96)
at
org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:153)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.dispatch(DocumentNodeState.java:433)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:260)
at
org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.dispatch(DocumentNodeState.java:433)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:260)
at
org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:52)
at
org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.update(IndexTracker.java:105)
- locked
org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker@6243264e
at
org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider.contentChanged(LuceneIndexProvider.java:61)
at
org.apache.jackrabbit.oak.spi.commit.CompositeObserver.contentChanged(CompositeObserver.java:53)
- locked org.apache.jackrabbit.oak.spi.commit.CompositeObserver@2687fd48
at
org.apache.jackrabbit.oak.spi.commit.ChangeDispatcher.contentChanged(ChangeDispatcher.java:79)
- locked org.apache.jackrabbit.oak.spi.commit.ChangeDispatcher@2a0595bb
at
org.apache.jackrabbit.oak.plugins.document.CommitQueue.afterTrunkCommit(CommitQueue.java:131)
- locked org.apache.jackrabbit.oak.plugins.document.CommitQueue@54a93d8a
at
org.apache.jackrabbit.oak.plugins.document.CommitQueue.done(CommitQueue.java:83)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.done(DocumentNodeStore.java:586)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:187)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:91)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:39)
at
org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$InMemory.merge(AbstractNodeStoreBranch.java:525)
at
org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch.merge(AbstractNodeStoreBranch.java:318)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.merge(DocumentNodeStoreBranch.java:135)
at
org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.merge(DocumentRootBuilder.java:159)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.merge(DocumentNodeStore.java:1339)
at
org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.mergeWithConcurrencyCheck(AsyncIndexUpdate.java:420)
at
org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:389)
at
org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:300)
- locked org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@401daa3f
at
org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:105)
at org.quartz.core.JobRunShell.run(JobRunShell.java:207)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{noformat}
Further {{DocumentNodeStore background thread}} is also waiting to acquire the
write lock
{noformat}
"DocumentNodeStore background thread" Id=82 in WAITING on
lock=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@6990d994
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
at
java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
at
org.apache.jackrabbit.oak.plugins.document.UnsavedModifications.persist(UnsavedModifications.java:156)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.backgroundWrite(DocumentNodeStore.java:1662)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.runBackgroundOperations(DocumentNodeStore.java:1456)
- locked
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore@350b4c49
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$BackgroundOperation.execute(DocumentNodeStore.java:2064)
at
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$NodeStoreTask.run(DocumentNodeStore.java:2039)
at java.lang.Thread.run(Thread.java:745)
{noformat}
Now as javadoc of {{ReentrantReadWriteLock}} the readLock would not be granted
if there is a waiter for write lock. Hence in such a case no further reads
would be possible and system would not respond well
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)