The cluster has 1 ZooKeeper node and 3 Solr nodes. There is only one
collection, with 3 shards. Data is continuously indexed using the SolrJ API.
The system is running on AWS, and I am writing backups to EFS (Elastic File
System).

Observed behavior:
If indexing is not in progress and I take a backup of the cluster using the
Collections API, the backup succeeds and restore works as expected.
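
For reference, here is a minimal sketch of the general shape of the
Collections API BACKUP and RESTORE requests described above. The host,
collection names, backup name, and EFS mount path are hypothetical
placeholders, and the class and helper names are made up for illustration;
only the action names and the name, collection, and location parameters come
from the Collections API itself.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class CollectionBackupUrls {

    // URL-encodes a single query-string component as UTF-8.
    public static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Builds a Collections API URL for the given action and parameters.
    public static String collectionsApiUrl(String solrBaseUrl, String action,
                                           Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        query.add("action=" + action);
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return solrBaseUrl + "/admin/collections?" + query;
    }

    // BACKUP: writes a named backup of a collection under a shared location.
    public static String backupUrl(String solrBaseUrl, String collection,
                                   String backupName, String location) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("name", backupName);
        params.put("collection", collection);
        params.put("location", location);
        return collectionsApiUrl(solrBaseUrl, "BACKUP", params);
    }

    // RESTORE: restores a named backup into a (new) target collection.
    public static String restoreUrl(String solrBaseUrl, String targetCollection,
                                    String backupName, String location) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("name", backupName);
        params.put("collection", targetCollection);
        params.put("location", location);
        return collectionsApiUrl(solrBaseUrl, "RESTORE", params);
    }

    public static void main(String[] args) {
        // Placeholder values only; substitute the real host, names, and path.
        String base = "http://localhost:8983/solr";
        System.out.println(backupUrl(base, "mycollection", "mybackup", "/mnt/efs/backups"));
        System.out.println(restoreUrl(base, "mycollection_restored", "mybackup", "/mnt/efs/backups"));
    }
}
```

The sketch only builds the request URLs; it does not send them, and it says
nothing about whether the resulting backup is consistent while indexing is in
progress.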

snapshotscli.sh also works as expected: if I first take a snapshot of the
index while indexing is in progress and then take a backup, there is no error
during restore.

However, most of the time I get an error when I try to restore the collection
from a backup taken with the Collections API while indexing was still in
progress. The error is always a missing segment, and I can see that the
segment file it is trying to read during restore does not exist in the backup
shard directory.

Also, is there a way to take a snapshot of SolrCloud using the Collections
API? The user guide only documents taking a snapshot of a core using the
Collections API.

2017-09-08 19:47:22.592 WARN (parallelCoreAdminExecutor-5-thread-8-processing-n:ec2-34-201-149-27.compute-1.amazonaws.com:8983_solr t1cloudbackuponefs-r2187461299681393 RESTORECORE) [   ] o.a.s.h.RestoreCore Could not switch to restored index. Rolling back to the current index
org.apache.lucene.index.CorruptIndexException: Unexpected file read error while reading index. (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/solr/data/t1cloud3_shard2_replica0/data/restore.20170908194722131/segments_y")))
    at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:290)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:930)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:118)
    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:93)
    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:248)
    at org.apache.solr.update.DefaultSolrCoreState.changeWriter(DefaultSolrCoreState.java:211)
    at org.apache.solr.update.DefaultSolrCoreState.newIndexWriter(DefaultSolrCoreState.java:220)
    at org.apache.solr.update.DirectUpdateHandler2.newIndexWriter(DirectUpdateHandler2.java:726)
    at org.apache.solr.handler.RestoreCore.doRestore(RestoreCore.java:108)
    at org.apache.solr.handler.admin.RestoreCoreOp.execute(RestoreCoreOp.java:65)
    at org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:384)
    at org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:388)
    at org.apache.solr.handler.admin.CoreAdminHandler.lambda$handleRequestBody$0(CoreAdminHandler.java:182)
    at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.file.NoSuchFileException: /var/solr/data/t1cloud3_shard2_replica0/data/restore.20170908194722131/_4m.si
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
    at java.nio.channels.FileChannel.open(FileChannel.java:287)
    at java.nio.channels.FileChannel.open(FileChannel.java:335)
    at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:238)
    at org.apache.lucene.store.NRTCachingDirectory.openInput(NRTCachingDirectory.java:192)
    at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:137)
    at org.apache.lucene.codecs.lucene62.Lucene62SegmentInfoFormat.read(Lucene62SegmentInfoFormat.java:89)
    at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:357)
    at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:288)
    ... 17 more
