[ 
https://issues.apache.org/jira/browse/OAK-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112788#comment-15112788
 ] 

Tom Blackford commented on OAK-3813:
------------------------------------

That may well be the case, but I can confirm that, as Alex states, the fragility 
in the indexing still exists in later Oak versions (in this case 1.2.9).

> Exception in datastore leads to async index stop indexing new content
> ---------------------------------------------------------------------
>
>                 Key: OAK-3813
>                 URL: https://issues.apache.org/jira/browse/OAK-3813
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene
>    Affects Versions: 1.2.2
>            Reporter: Alexander Klimetschek
>            Priority: Critical
>
> We are using an S3 based datastore and that (for some other reasons) 
> sometimes starts to miss certain blobs and throws an exception, see below. 
> Unfortunately, it seems that this blocks the indexing of any new content - as 
> the index will try again and again to index that missing binary and fail at 
> the same point.
> It would be great if the indexing process could be more resilient against 
> errors like this. (I think the datastore implementation should probably not 
> propagate that exception to the outside but just log it, but that's a 
> separate issue.)
> This is seen with oak 1.2.2. I had a look at the [latest version on 
> trunk|https://github.com/apache/jackrabbit-oak/blob/d5da738aa6b43424f84063322987b765aead7813/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java#L427-L431]
>  but it seems the behavior has not changed since then.
> {noformat}
> 17.12.2015 20:50:26.418 -0500 *ERROR* [pool-7-thread-5] 
> org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception during job 
> execution of 
> org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@5cc5e2f6 : Error 
> occurred while obtaining InputStream for blobId 
> [2832539c16b1a2e5745370ee89e41ab562436c5f#109419]
> java.lang.RuntimeException: Error occurred while obtaining InputStream for 
> blobId [2832539c16b1a2e5745370ee89e41ab562436c5f#109419]
>       at 
> org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:49)
>       at 
> org.apache.jackrabbit.oak.plugins.segment.SegmentBlob.getNewStream(SegmentBlob.java:84)
>       at 
> org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:216)
>       at 
> org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:264)
>       at 
> org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:350)
>       at 
> org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:356)
>       at org.apache.lucene.store.DataInput.readInt(DataInput.java:84)
>       at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:126)
>       at 
> org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.<init>(Lucene41PostingsReader.java:75)
>       at 
> org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:430)
>       at 
> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:195)
>       at 
> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:244)
>       at 
> org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:116)
>       at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96)
>       at 
> org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:141)
>       at 
> org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:279)
>       at 
> org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3191)
>       at 
> org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3182)
>       at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3155)
>       at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3123)
>       at 
> org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:988)
>       at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:932)
>       at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:894)
>       at 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext.closeWriter(LuceneIndexEditorContext.java:169)
>       at 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:190)
>       at 
> org.apache.jackrabbit.oak.plugins.index.IndexUpdate.leave(IndexUpdate.java:221)
>       at 
> org.apache.jackrabbit.oak.spi.commit.VisibleEditor.leave(VisibleEditor.java:63)
>       at 
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:56)
>       at 
> org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:367)
>       at 
> org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:312)
>       at 
> org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:105)
>       at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: 
> org.apache.jackrabbit.core.data.DataStoreException: Could not length of 
> dataIdentifier 2832539c16b1a2e5745370ee89e41ab562436c5f
>       at 
> org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:465)
>       at 
> org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getInputStream(DataStoreBlobStore.java:297)
>       at 
> org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:47)
>       ... 34 common frames omitted
> Caused by: org.apache.jackrabbit.core.data.DataStoreException: Could not 
> length of dataIdentifier 2832539c16b1a2e5745370ee89e41ab562436c5f
>       at 
> org.apache.jackrabbit.aws.ext.ds.S3Backend.getLength(S3Backend.java:474)
>       at 
> org.apache.jackrabbit.core.data.CachingDataStore.getLength(CachingDataStore.java:669)
>       at 
> org.apache.jackrabbit.core.data.CachingDataStore.getRecord(CachingDataStore.java:467)
>       at 
> org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getDataRecord(DataStoreBlobStore.java:474)
>       at 
> org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:463)
>       ... 36 common frames omitted
> Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Not Found 
> (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 
> E29ADB7F4BE7E12F)
>       at 
> com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1078)
>       at 
> com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:726)
>       at 
> com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:461)
>       at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:296)
>       at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3736)
>       at 
> com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
>       at 
> com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1005)
>       at 
> org.apache.jackrabbit.aws.ext.ds.S3Backend.getLength(S3Backend.java:467)
>       ... 40 common frames omitted
> {noformat}
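The resilience the reporter asks for could, in principle, look something like the sketch below: instead of letting a single unreadable blob abort the whole async indexing run, the extraction step catches the failure, logs it, and continues with the remaining binaries. This is purely illustrative — `FaultTolerantBinaryExtractor` and its `loader` callback are hypothetical names, not part of the Oak API, and this is not how `AsyncIndexUpdate` is actually structured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch, not Oak code: shows the "log and skip" policy the
// reporter suggests, applied per binary rather than per indexing cycle.
public class FaultTolerantBinaryExtractor {

    /**
     * Extracts text from each binary via the supplied loader. A blob whose
     * load throws (e.g. the RuntimeException wrapping an S3 404 seen in the
     * stack trace above) is logged and skipped, so one missing blob cannot
     * stall the entire async index update.
     */
    public static List<String> extractAll(List<String> blobIds,
                                          Function<String, String> loader) {
        List<String> extracted = new ArrayList<>();
        for (String id : blobIds) {
            try {
                extracted.add(loader.apply(id));
            } catch (RuntimeException e) {
                // Log and continue instead of propagating, so the cycle
                // can still commit progress for the readable binaries.
                System.err.println("Skipping unreadable blob " + id
                        + ": " + e.getMessage());
            }
        }
        return extracted;
    }
}
```

With this policy, an indexing cycle over blobs `["a", "missing", "b"]` would index `a` and `b` and merely log the failure for `missing`, rather than failing at the same point on every retry.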



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
