[
https://issues.apache.org/jira/browse/OAK-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112788#comment-15112788
]
Tom Blackford commented on OAK-3813:
------------------------------------
That is possibly the case, but I can confirm that, as Alex states, the fragility
in the indexing still exists in later Oak versions (in this case 1.2.9).
> Exception in datastore leads to async index stop indexing new content
> ---------------------------------------------------------------------
>
> Key: OAK-3813
> URL: https://issues.apache.org/jira/browse/OAK-3813
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: lucene
> Affects Versions: 1.2.2
> Reporter: Alexander Klimetschek
> Priority: Critical
>
> We are using an S3 based datastore and that (for some other reasons)
> sometimes starts to miss certain blobs and throws an exception, see below.
> Unfortunately, it seems that this blocks the indexing of any new content - as
> the index will try again and again to index that missing binary and fail at
> the same point.
> It would be great if the indexing process could be more resilient against
> errors like this. (I think the datastore implementation should probably not
> propagate that exception to the outside but just log it, but that's a
> separate issue).
> This is seen with oak 1.2.2. I had a look at the [latest version on
> trunk|https://github.com/apache/jackrabbit-oak/blob/d5da738aa6b43424f84063322987b765aead7813/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java#L427-L431]
> but it seems the behavior has not changed since then.
> {noformat}
> 17.12.2015 20:50:26.418 -0500 *ERROR* [pool-7-thread-5] org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception during job execution of org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@5cc5e2f6 : Error occurred while obtaining InputStream for blobId [2832539c16b1a2e5745370ee89e41ab562436c5f#109419]
> java.lang.RuntimeException: Error occurred while obtaining InputStream for blobId [2832539c16b1a2e5745370ee89e41ab562436c5f#109419]
> 	at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:49)
> 	at org.apache.jackrabbit.oak.plugins.segment.SegmentBlob.getNewStream(SegmentBlob.java:84)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:216)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:264)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:350)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:356)
> 	at org.apache.lucene.store.DataInput.readInt(DataInput.java:84)
> 	at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:126)
> 	at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.<init>(Lucene41PostingsReader.java:75)
> 	at org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:430)
> 	at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:195)
> 	at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:244)
> 	at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:116)
> 	at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96)
> 	at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:141)
> 	at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:279)
> 	at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3191)
> 	at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3182)
> 	at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3155)
> 	at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3123)
> 	at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:988)
> 	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:932)
> 	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:894)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext.closeWriter(LuceneIndexEditorContext.java:169)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:190)
> 	at org.apache.jackrabbit.oak.plugins.index.IndexUpdate.leave(IndexUpdate.java:221)
> 	at org.apache.jackrabbit.oak.spi.commit.VisibleEditor.leave(VisibleEditor.java:63)
> 	at org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:56)
> 	at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:367)
> 	at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:312)
> 	at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:105)
> 	at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: org.apache.jackrabbit.core.data.DataStoreException: Could not length of dataIdentifier 2832539c16b1a2e5745370ee89e41ab562436c5f
> 	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:465)
> 	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getInputStream(DataStoreBlobStore.java:297)
> 	at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:47)
> 	... 34 common frames omitted
> Caused by: org.apache.jackrabbit.core.data.DataStoreException: Could not length of dataIdentifier 2832539c16b1a2e5745370ee89e41ab562436c5f
> 	at org.apache.jackrabbit.aws.ext.ds.S3Backend.getLength(S3Backend.java:474)
> 	at org.apache.jackrabbit.core.data.CachingDataStore.getLength(CachingDataStore.java:669)
> 	at org.apache.jackrabbit.core.data.CachingDataStore.getRecord(CachingDataStore.java:467)
> 	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getDataRecord(DataStoreBlobStore.java:474)
> 	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:463)
> 	... 36 common frames omitted
> Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: E29ADB7F4BE7E12F)
> 	at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1078)
> 	at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:726)
> 	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:461)
> 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:296)
> 	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3736)
> 	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
> 	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1005)
> 	at org.apache.jackrabbit.aws.ext.ds.S3Backend.getLength(S3Backend.java:467)
> 	... 40 common frames omitted
> {noformat}
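To illustrate the kind of resilience the report asks for (log and skip a missing binary rather than letting the exception abort the whole async index cycle), here is a minimal, hypothetical Java sketch. The helper name and the skip-on-RuntimeException policy are illustrative assumptions, not Oak's actual API or behavior:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Optional;
import java.util.function.Supplier;

public class ResilientBlobRead {

    // Hypothetical helper: try to open a blob's stream; on failure, log
    // and return empty so the caller can skip just this binary instead
    // of failing (and endlessly retrying) the entire index update.
    static Optional<InputStream> openOrSkip(String blobId, Supplier<InputStream> opener) {
        try {
            return Optional.of(opener.get());
        } catch (RuntimeException e) {
            // In Oak this would go to a logger rather than stderr.
            System.err.println("Skipping unreadable blob " + blobId + ": " + e.getMessage());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A blob whose stream opens normally...
        Optional<InputStream> ok =
                openOrSkip("good", () -> new ByteArrayInputStream(new byte[] {1}));
        // ...and one whose backend throws, as with the S3 404 in the trace above.
        Optional<InputStream> missing = openOrSkip("missing", () -> {
            throw new RuntimeException("Error occurred while obtaining InputStream for blobId");
        });
        System.out.println(ok.isPresent() + " " + missing.isPresent()); // prints "true false"
    }
}
```

Whether skipping is acceptable is a policy question (the index would silently lack that binary's full text), which is presumably why the reporter suggests handling it in the datastore layer as a separate issue.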
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)