Hi Team,
We are using SolrCloud 8.5.1 with an external ZooKeeper ensemble.
Our SolrCloud cluster is able to index documents, and we are able to query
them fine.
What we are not able to do is delete a replica or refresh/reload a Solr
collection, and Solr is not able to elect a leader unless we call the
force election API. These requests time out. Navigating to Cloud -> Tree
also times out.
Based on the above, we suspect the issue is with ZooKeeper. The disk on
the ZooKeeper hosts is not full, and the ZooKeeper logs are clean too.
When we call the following API:
solr/admin/zookeeper?_=1643093184112&wt=json
it returns a huge JSON payload of around 29 MB. I see a lot of entries like:
"children":[{
"text":"async_ids","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids"},
"children":[{
"text":"mn-.auto_add_replicas","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-.auto_add_replicas"},
"children":[{
"text":"10b5b41dacc37T5k2wrt070k8u35h1pn0c2g71e","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-.auto_add_replicas%2F10b5b41dacc37T5k2wrt070k8u35h1pn0c2g71e"}},{
"text":"10b5ca7fbcaaaT5k2wrt070k8u35h1pn0c2g75c","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-.auto_add_replicas%2F10b5ca7fbcaaaT5k2wrt070k8u35h1pn0c2g75c"}},{
"text":"14eaadc9927efT719odunodrwduibr6kj4maeye","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-.auto_add_replicas%2F14eaadc9927efT719odunodrwduibr6kj4maeye"}},{
"text":"25ce75a10fda7T719odunodrwduibr6kj4mghfr","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-.auto_add_replicas%2F25ce75a10fda7T719odunodrwduibr6kj4mghfr"}},{
"text":"31e5214376d15T3t5raecs7i0z3b8d65njpu7yo","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-.auto_add_replicas%2F31e5214376d15T3t5raecs7i0z3b8d65njpu7yo"}},{
"text":"3938f7e8a80cT5k2wrt070k8u35h1pn0c2faut","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-.auto_add_replicas%2F3938f7e8a80cT5k2wrt070k8u35h1pn0c2faut"}},{
"text":"467afc30c27cT5k2wrt070k8u35h1pn0c2fauu","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-.auto_add_replicas%2F467afc30c27cT5k2wrt070k8u35h1pn0c2fauu"}},{
"text":"8f05cf4cdd0fT5k2wrt070k8u35h1pn0c2fd1u","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-.auto_add_replicas%2F8f05cf4cdd0fT5k2wrt070k8u35h1pn0c2fd1u"}},{
"text":"8f09c7a0ccb7T5k2wrt070k8u35h1pn0c2fd1v","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-.auto_add_replicas%2F8f09c7a0ccb7T5k2wrt070k8u35h1pn0c2fd1v"}},{
"text":"c4481ed67da0T5k2wrt070k8u35h1pn0c2febs","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-.auto_add_replicas%2Fc4481ed67da0T5k2wrt070k8u35h1pn0c2febs"}},{
"text":"c4534d2ba96bT5k2wrt070k8u35h1pn0c2febx","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-.auto_add_replicas%2Fc4534d2ba96bT5k2wrt070k8u35h1pn0c2febx"}}]},{
"text":"mn-1633033814359","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-1633033814359"}},{
"text":"mn-1633033834807","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-1633033834807"}},{
"text":"mn-1633033845201","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-1633033845201"}},{
"text":"mn-1633033855382","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-1633033855382"}},{
"text":"mn-1633033865539","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-1633033865539"}},{
"text":"mn-1633033875814","a_attr":{
"href":"admin/zookeeper?detail=true&path=%2Foverseer%2Fasync_ids%2Fmn-1633033875814"}},{
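To see which znodes account for most of that payload, the child counts per node can be tallied. This is only a minimal sketch: it assumes the response has been parsed as JSON and that every node carries "text" and "children" keys as in the excerpt above. The function names and the small sample are made up for illustration.

```python
import json


def child_counts(nodes, parent=""):
    """Yield (path, number_of_direct_children) for every node in the tree."""
    for node in nodes:
        text = node.get("text", "")
        # Build a path from the node names; a root node named "/" is kept as-is.
        path = text if text.startswith("/") else parent.rstrip("/") + "/" + text
        children = node.get("children", [])
        yield path, len(children)
        yield from child_counts(children, path)


def largest_nodes(nodes, parent="", top=10):
    """Return the `top` nodes with the most direct children, largest first."""
    ranked = sorted(child_counts(nodes, parent), key=lambda pc: pc[1], reverse=True)
    return ranked[:top]


# Hypothetical sample shaped like the excerpt above (ids shortened to three).
sample = json.loads("""
[{"text": "async_ids", "children": [
    {"text": "mn-.auto_add_replicas", "children": [
        {"text": "10b5b41dacc37T5k2wrt070k8u35h1pn0c2g71e"},
        {"text": "10b5ca7fbcaaaT5k2wrt070k8u35h1pn0c2g75c"},
        {"text": "14eaadc9927efT719odunodrwduibr6kj4maeye"}]},
    {"text": "mn-1633033814359"},
    {"text": "mn-1633033834807"}]}]
""")

for path, count in largest_nodes(sample, parent="/overseer", top=3):
    print(count, path)
```

For the real payload, the list of nodes would be loaded from the saved response instead of the inline sample; where exactly that list sits in the response is something I have not verified.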
I am not sure whether this is a problem. If anyone has seen the same
behavior, could you please share the next steps to follow to recover
from this state?
Regards,
Nikhilesh