Hi everyone,

We had an OOM event earlier this morning. It caused one of our shards
to lose all its replicas, and its leader is still in a down state. We have
restarted the Java process (Solr), but the leader remains down. Logs
below:

```
Feb 25, 2021 @ 11:46:43.000 2021-02-25 00:46:43.268 WARN
 (updateExecutor-3-thread-1-processing-n:10.0.10.43:8983_solr
x:search-collection-2018-10-30_shard2_5_replica_n1480
c:search-collection-2018-10-30 s:shard2_5 r:core_node1481)
[c:search-collection-2018-10-30 s:shard2_5 r:core_node1481
x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
Feb 25, 2021 @ 11:46:40.000 2021-02-25 00:46:40.759 WARN
 (zkCallback-7-thread-2) [c:search-collection-2018-10-30 s:shard2_5
r:core_node1481 x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
Feb 25, 2021 @ 11:46:35.000 2021-02-25 00:46:35.761 WARN
 (zkCallback-7-thread-2) [c:search-collection-2018-10-30 s:shard2_5
r:core_node1481 x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
Feb 25, 2021 @ 11:46:33.000 2021-02-25 00:46:33.270 WARN
 (updateExecutor-3-thread-2-processing-n:10.0.10.43:8983_solr
x:search-collection-2018-10-30_shard2_5_replica_n1480
c:search-collection-2018-10-30 s:shard2_5 r:core_node1481)
[c:search-collection-2018-10-30 s:shard2_5 r:core_node1481
x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
Feb 25, 2021 @ 11:46:30.000 2021-02-25 00:46:30.759 WARN
 (zkCallback-7-thread-2) [c:search-collection-2018-10-30 s:shard2_5
r:core_node1481 x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
Feb 25, 2021 @ 11:46:25.000 2021-02-25 00:46:25.761 WARN
 (zkCallback-7-thread-2) [c:search-collection-2018-10-30 s:shard2_5
r:core_node1481 x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
Feb 25, 2021 @ 11:46:23.000 2021-02-25 00:46:23.279 WARN
 (updateExecutor-3-thread-1-processing-n:10.0.10.43:8983_solr
x:search-collection-2018-10-30_shard2_5_replica_n1480
c:search-collection-2018-10-30 s:shard2_5 r:core_node1481)
[c:search-collection-2018-10-30 s:shard2_5 r:core_node1481
x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
```

Questions:
1. Is there anything we can do to force this core to go live?
2. If the core is unrecoverable, is there a way to clean up the core so
that we can reindex only that shard?
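
In case it helps with question 1, here is a minimal sketch of how the replica states for the affected shard could be read from the Collections API `CLUSTERSTATUS` action. The helper names are hypothetical, and the base URL in the usage note is only a guess derived from the node name in the logs, so treat this as an illustration rather than exactly what we run:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def clusterstatus_url(base_url, collection, shard):
    """Build a CLUSTERSTATUS request URL limited to one collection and shard."""
    query = urlencode({
        "action": "CLUSTERSTATUS",
        "collection": collection,
        "shard": shard,
        "wt": "json",
    })
    return f"{base_url}/admin/collections?{query}"


def replica_states(status, collection, shard):
    """Map each replica (core node name) to a (state, is_leader) tuple.

    `status` is the parsed JSON body of a CLUSTERSTATUS response.
    """
    replicas = status["cluster"]["collections"][collection]["shards"][shard]["replicas"]
    return {
        name: (info.get("state"), info.get("leader") == "true")
        for name, info in replicas.items()
    }


def fetch_replica_states(base_url, collection, shard):
    """Query a live node and summarise the shard's replicas (needs network access)."""
    with urlopen(clusterstatus_url(base_url, collection, shard), timeout=10) as resp:
        return replica_states(json.load(resp), collection, shard)
```

For example, `fetch_replica_states("http://10.0.10.43:8983/solr", "search-collection-2018-10-30", "shard2_5")` would be the call for this shard, assuming the node is reachable over plain HTTP at that address.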

Any other advice would be great too :)

Ash
