I'm late to the dance but FWIW, we hit similar swap-like issues when we upgraded from CentOS 7.6 to CentOS 7.9 (this was Solr 8.3) - some of the Solr nodes would end up reading from disk like crazy, and query response times suffered accordingly. At one point we had half the nodes (holding one set of replicas) on 7.6 and the other half (holding the second set of replicas) on 7.9, and disk reads and I/O waits were an order of magnitude higher on 7.9, all other things being equal.
We never really solved it: after countless weeks of testing various configurations, we threw up our hands and started migrating everything to Amazon Linux 2 (there were other reasons for that, but this was a definite driver). We also have some servers still running Red Hat Enterprise Linux 7.9, so far without issues, but those are also slated for migration in the coming weeks.

On Tue, Oct 26, 2021 at 8:11 AM Paul Russell <paul.russ...@qflow.com> wrote:

> I have a current Solr cluster running Solr 6.6 on RHEL 6 servers. All Solr
> instances use a 25 GB JVM heap on a RHEL 6 server configured with 64 GB of
> memory, managing a 900 GB collection. Measured query response time
> averages about 100 ms.
>
> I am attempting to move the cluster to new RHEL 7 servers with the same
> configuration (8 cores / 64 GB memory) and am having performance issues.
>
> On the RHEL 7 servers the kswapd0 process is consuming up to 30% of the CPU
> and query response times are being measured at 500-1000 ms.
>
> I tried setting vm.swappiness to both 0 and 1 and have been unable to
> change the behavior. If I trim the Solr JVM heap to 16 GB, response times
> get better and the GC logs show the JVM is operating correctly.
>
> Has anyone else had a similar issue? I have tried upgrading to Solr 7.7.2
> as part of the process and that hasn't helped.
>
> Any suggestions?
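Not a fix, but worth spelling out the arithmetic that usually goes with this kind of kswapd churn: Lucene mmaps the index, so every GB the heap reserves is a GB the OS page cache can't use for index data. Below is a minimal sketch of that back-of-the-envelope check; it assumes Linux (/proc/meminfo) and just plugs in the heap and index figures quoted above, so treat the numbers as illustrative rather than a tuning recommendation.

```python
#!/usr/bin/env python3
"""Rough look at how much RAM is left for the OS page cache once the
Solr heap is carved out. A sketch only -- it ignores other processes
and uses the heap/index sizes mentioned in this thread."""

HEAP_GB = 25    # -Xmx on the problem nodes (16 reportedly behaved better)
INDEX_GB = 900  # on-disk size of the collection

def meminfo_kb(field: str) -> int:
    # /proc/meminfo lines look like "MemTotal:   65805428 kB"
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)

total_gb = meminfo_kb("MemTotal") / (1024 * 1024)
cache_gb = total_gb - HEAP_GB  # optimistic: assumes nothing else is running

print(f"RAM total:           {total_gb:6.1f} GB")
print(f"Heap reserved:       {HEAP_GB:6.1f} GB")
print(f"Left for page cache: {cache_gb:6.1f} GB "
      f"(~{100 * cache_gb / INDEX_GB:.1f}% of the {INDEX_GB} GB index)")
```

Running that on a 64 GB box just makes the trade-off visible: only a small fraction of a 900 GB collection can ever stay cached, so anything that changes how the kernel manages that cache between OS versions shows up directly as disk reads.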