The Lucene version is 6.3.0 and the filesystem is XFS.
This always happens at 00:02, 06:02, 12:02, and 18:02,
which is very strange.




------------------ Original Message ------------------
From: "251922566"<251922...@qq.com>;
Date: June 5, 2018 (Tue) 6:50 PM
To: "java-user"<java-user@lucene.apache.org>;

Subject: can anybody give some suggestions about this Elasticsearch shard
failure problem? Thanks




Elasticsearch version (bin/elasticsearch --version): 5.1.1

Plugins installed: [] (none)

JVM version (java -version): 1.8.0_77

OS version (uname -a if on a Unix-like system): CentOS Linux release 7.2.1511 
(Core)

Description of the problem including expected versus actual behavior:
When using the update API under high concurrency, the primary shard and the
replica shard both failed.
This has happened many times on 2 machines, so I think this is a bug.
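A quick check that may help narrow this down: the "file is too small (0 bytes)" errors in the logs below name a zero-length segment file, so scanning the node's data paths for other zero-length files could show how widespread the corruption is. A minimal sketch, assuming the two data paths from the log entries below (adjust for your layout):

```shell
# Data directories taken from the MMapIndexInput paths in this report's logs;
# adjust DATA_PATHS for your own node layout.
DATA_PATHS="/mnt/esdata1/nodes/0/indices /mnt/esdata2/nodes/0/indices"

# List every zero-length file under the data paths -- e.g. the 0-byte
# _2kqi.fdx named in the CorruptIndexException below would show up here.
find $DATA_PATHS -type f -size 0 -print 2>/dev/null
```

If zero-length files keep appearing on a 6-hour cycle, it may also be worth checking for a scheduled job (cron, snapshot, or backup script) that touches those paths at :02 past the hour.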

Provide logs (if relevant):

[2018-04-27T12:02:22,797][WARN ][o.e.c.a.s.ShardStateAction] [172.20.3.2] 
[analytics_profile_12014][7] received shard failed for shard id 
[[analytics_profile_12014][7]], allocation id [xIEoF3JaTLWQz6X2KxMWRA], primary 
term [0], message [shard failure, reason [refresh failed]], failure 
[EOFException[read past EOF: 
MMapIndexInput(path="/mnt/esdata2/nodes/0/indices/fzJHAFdQQQO5zPL70D2b6g/7/index/_fqb1.cfe")]]
java.io.EOFException: read past EOF: 
MMapIndexInput(path="/mnt/esdata2/nodes/0/indices/fzJHAFdQQQO5zPL70D2b6g/7/index/_fqb1.cfe")
Suppressed: org.apache.lucene.index.CorruptIndexException: checksum status 
indeterminate: remaining=0, please run checkindex for more details 
(resource=BufferedChecksumIndexInput(MMapIndexInput(path="/mnt/esdata2/nodes/0/indices/fzJHAFdQQQO5zPL70D2b6g/7/index/_fqb1.cfe")))
org.elasticsearch.action.FailedNodeException: Failed node 
[BbfFMNRpRvW5p8LDs3rquQ]
at 
org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction$1.handleException(TransportNodesAction.java:219)
 ~[elasticsearch-5.1.1.jar:5.1.1]
at 
org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:984)
 ~[elasticsearch-5.1.1.jar:5.1.1]
at 
org.elasticsearch.transport.TcpTransport.lambda$handleException$17(TcpTransport.java:1314)
 ~[elasticsearch-5.1.1.jar:5.1.1]
at 
org.elasticsearch.transport.TcpTransport.handleException(TcpTransport.java:1312)
 [elasticsearch-5.1.1.jar:5.1.1]
Caused by: org.elasticsearch.transport.RemoteTransportException: 
[172.20.3.2_1][172.20.3.2:9301][internal:cluster/nodes/indices/shard/store[n]]
Caused by: org.elasticsearch.ElasticsearchException: Failed to list store 
metadata for shard [[analytics_action_12014_201804][15]]
Caused by: org.apache.lucene.index.CorruptIndexException: failed engine 
(reason: [corrupt file (source: [index])]) (resource=preexisting_corruption)
Caused by: java.io.IOException: failed engine (reason: [corrupt file (source: 
[index])])
Caused by: org.apache.lucene.index.CorruptIndexException: compound sub-files 
must have a valid codec header and footer: file is too small (0 bytes) 
(resource=BufferedChecksumIndexInput(MMapIndexInput(path="/mnt/esdata1/nodes/0/indices/cBECbko7SMKP3oXsTGi_kg/15/index/_2kqi.fdx")))
[2018-04-27T12:02:22,800][WARN ][o.e.c.a.s.ShardStateAction] [172.20.3.2] 
[analytics_profile_12014][18] received shard failed for shard id 
[[analytics_profile_12014][18]], allocation id [7TieFxLRRZ-28uOsPFr1yQ], 
primary term [0], message [shard failure, reason [refresh failed]], failure 
[EOFException[read past EOF: 
MMapIndexInput(path="/mnt/esdata2/nodes/0/indices/fzJHAFdQQQO5zPL70D2b6g/18/index/_fhe2.cfe")]]
java.io.EOFException: read past EOF: 
MMapIndexInput(path="/mnt/esdata2/nodes/0/indices/fzJHAFdQQQO5zPL70D2b6g/18/index/_fhe2.cfe")
Suppressed: org.apache.lucene.index.CorruptIndexException: checksum status 
indeterminate: remaining=0, please run checkindex for more details 
(resource=BufferedChecksumIndexInput(MMapIndexInput(path="/mnt/esdata2/nodes/0/indices/fzJHAFdQQQO5zPL70D2b6g/18/index/_fhe2.cfe")))
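The suppressed exceptions above say "please run checkindex for more details". A hedged sketch of doing that, assuming a lucene-core 6.3.0 jar is available locally (the Lucene version this report mentions) and using the shard path from the first log entry; the shard should not be open in Elasticsearch while this runs, and note that CheckIndex's -exorcise option permanently drops corrupt segments, so run it read-only first:

```shell
# Run Lucene's CheckIndex tool against the failing shard's index directory.
# The jar location is an assumption; the path is from the log above.
java -cp lucene-core-6.3.0.jar \
  org.apache.lucene.index.CheckIndex \
  /mnt/esdata2/nodes/0/indices/fzJHAFdQQQO5zPL70D2b6g/7/index
```

The report it prints shows, per segment, whether the checksum and structure are intact, which should say more than "checksum status indeterminate" does.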
