On 4/12/2021 1:08 AM, Rekha Sekhar wrote:
> We have a SolrCloud setup in Kubernetes with 2 Solr instances and 3
> ZooKeeper instances, with 1 shard. Each Solr and ZooKeeper instance is
> configured with 8G of persistent storage. The memory allocated for Solr
> is 16G, with a 10G heap. A maximum of 2.5 million records are indexed.
> There is a scheduler client which calls Solr with the URL
> /update/json?wt=json&commit=true to do the add/update/delete operations.
> Occasionally a huge update/delete of 1 million records happens, which
> calls the API (/update/json?wt=json&commit=true) with 500 documents at a
> time, from multiple threads. Everything worked fine for 1 week, but
> suddenly we saw errors in solr.log that put Solr into an error state,
> and I had to restart one of the Solr nodes.

So you're issuing a manual commit with every batch of 500 documents and
sending those batches in parallel?
That's a LOT of commit operations. Commits are just about all Solr will
be doing with that kind of setup, which will leave very few system
resources for handling updates or queries.
I would remove all commits from the client side and go with an automatic
server-side setup like the following in solrconfig.xml:
<autoCommit>
  <maxTime>120000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>60000</maxTime>
</autoSoftCommit>
You can look at the example solrconfig.xml files to figure out where
these elements go. In later versions of Solr I think they can be found
in the updateHandler section.
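As a rough sketch of the placement, assuming a solrconfig.xml derived from the stock example configs (yours may differ), the two elements sit inside the updateHandler element. The updateLog entry shown here is copied from the stock example as an assumption about your config; keep whatever your file already has there.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Transaction log, as found in the stock example config
       (assumed here; leave your existing entry unchanged). -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>

  <!-- Hard commit every 120 seconds: flushes to disk but does not
       open a new searcher, so it is relatively cheap. -->
  <autoCommit>
    <maxTime>120000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit every 60 seconds: this is what makes new
       documents visible to queries. -->
  <autoSoftCommit>
    <maxTime>60000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With this in place, the client would send its batches to /update/json without the commit=true parameter, and changes would become visible to searches within roughly a minute.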
That setup will make commits happen at much more reasonable intervals,
which might clear up the whole problem.
If that setup doesn't help, a screenshot like the ones mentioned here
can be very helpful for us to make determinations about your memory setup:
https://cwiki.apache.org/confluence/display/solr/SolrPerformanceProblems#SolrPerformanceProblems-Askingforhelponamemory/performanceissue
Thanks,
Shawn