Hello,

I'm experimenting with snapshots to S3, but I'm having no luck. The cluster 
consists of 8 nodes (i2.2xlarge). The index I'm trying to snapshot is 
2.91 TB, with 16 shards and 1 replica. I should perhaps also mention that 
this is running Elasticsearch version 1.1.1.
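
For reference, this is roughly how I registered the repository and started 
the snapshot (the repository and snapshot names match the log output below; 
the bucket and region are just placeholders here, and the nodes have the 
cloud-aws plugin installed for the s3 repository type):

curl -XPUT 'localhost:9200/_snapshot/rf_es_snapshot' -d '{
  "type": "s3",
  "settings": {
    "bucket": "my-snapshot-bucket",
    "region": "us-east-1"
  }
}'

curl -XPUT 'localhost:9200/_snapshot/rf_es_snapshot/cluster_full?wait_for_completion=false'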

Initially, when I start the snapshot, everything looks good, but after a 
while shards start failing. In the logs I find messages like this:
[2015-02-05 07:57:59,029][WARN ][index.merge.scheduler    ] [machine_name] [cluster_day][13] failed to merge
java.io.IOException: Map failed
    at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:849)
    at org.apache.lucene.store.MMapDirectory.map(MMapDirectory.java:283)
    at org.apache.lucene.store.MMapDirectory$MMapIndexInput.<init>(MMapDirectory.java:228)
    at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:195)
    at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:80)
    at org.elasticsearch.index.store.Store$StoreDirectory.openInput(Store.java:473)
    at org.apache.lucene.codecs.lucene46.Lucene46FieldInfosReader.read(Lucene46FieldInfosReader.java:52)
    at org.apache.lucene.index.SegmentReader.readFieldInfos(SegmentReader.java:215)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:95)
    at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:141)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4273)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3743)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
    at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
Caused by: java.lang.OutOfMemoryError: Map failed
    at sun.nio.ch.FileChannelImpl.map0(Native Method)
    at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:846)
    ... 14 more
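
For what it's worth, while the snapshot runs I've been keeping an eye on it 
with the basic snapshot info call (repository and snapshot names as in the 
logs), something like:

curl -XGET 'localhost:9200/_snapshot/rf_es_snapshot/cluster_full'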

After a while this stops happening, but by then the snapshot has failed:
[2015-02-05 13:26:25,232][WARN ][snapshots                ] [machine_name] [[cluster_day][13]] [rf_es_snapshot:cluster_full] failed to create snapshot
org.elasticsearch.index.snapshots.IndexShardSnapshotFailedException: [cluster_day][13] Failed to snapshot
        at org.elasticsearch.index.snapshots.IndexShardSnapshotAndRestoreService.snapshot(IndexShardSnapshotAndRestoreService.java:100)
        at org.elasticsearch.snapshots.SnapshotsService$5.run(SnapshotsService.java:694)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
Caused by: org.elasticsearch.index.engine.EngineClosedException: [cluster_day][13] CurrentState[CLOSED]
        at org.elasticsearch.index.engine.internal.InternalEngine.ensureOpen(InternalEngine.java:900)
        at org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:746)
        at org.elasticsearch.index.engine.internal.InternalEngine.snapshotIndex(InternalEngine.java:1045)
        at org.elasticsearch.index.shard.service.InternalIndexShard.snapshotIndex(InternalIndexShard.java:618)
        at org.elasticsearch.index.snapshots.IndexShardSnapshotAndRestoreService.snapshot(IndexShardSnapshotAndRestoreService.java:83)
        ... 4 more

There have been a number of other exceptions in between, but most seem to 
be related to running out of memory. Should a snapshot really require this 
much memory?
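
My working theory is that the "Map failed" OutOfMemoryError coming out of 
FileChannelImpl.map means mmap itself is failing (too many memory-mapped 
regions, or exhausted address space) rather than the heap filling up, so 
I've started checking the kernel's map limit on the data nodes. This is 
just the check I'm running (standard Linux sysctl; the bootstrap class name 
is what I see in the process command line, and 262144 is only an example 
value, not something I've verified is required):

# current per-process limit on memory-mapped areas
sysctl vm.max_map_count

# number of mappings the Elasticsearch process currently holds
sudo wc -l /proc/$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch | head -1)/maps

# raising the limit temporarily (example value)
sudo sysctl -w vm.max_map_count=262144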

