Hi All,

I've noticed that disk usage on some of my Solr nodes is increasing over time. After checking the output of lsof I found hundreds of references to deleted index files still being held by Solr. These totaled 24GB on a 16GB index. Restarting Solr obviously fixes this, but it is not an ideal solution.

We are running Solr 5.4.0 on OpenJDK 1.8.0_91. We are using the Concurrent Mark Sweep GC, although I've also seen the same problem on nodes using the G1 GC. Our update handler has autoCommit and autoSoftCommit enabled (at different intervals). We are using SolrCloud and have multiple shards with 2 nodes each in our collections.

I've not seen any pattern in whether this appears on leaders or replicas, and not all of my nodes exhibit the problem. Our usage pattern does involve a lot of churn in our index, with the majority of documents being updated or deleted every day.

Searching JIRA and the web in general, I could only find references to this sort of problem when running Solr in Tomcat. Can anyone suggest a reason why this might be happening, or a way I can manage it without needing to restart Solr?

Example lsof output:
java 1100 s123 DEL REG 202,3 8919406 /home/s123/solr/data/uk_shard2_replica1/data/index/_3m9s.fdt
java 1100 s123 DEL REG 202,3 8919159 /home/s123/solr/data/uk_shard2_replica1/data/index/_3mnk.tvd
java 1100 s123 DEL REG 202,3 8919150 /home/s123/solr/data/uk_shard2_replica1/data/index/_3mnk_Lucene50_0.tim
java 1100 s123 DEL REG 202,3 8919094 /home/s123/solr/data/uk_shard2_replica1/data/index/_3mq1_Lucene50_0.tim
java 1100 s123 DEL REG 202,3 8919103 /home/s123/solr/data/uk_shard2_replica1/data/index/_3mq1.tvd
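
In case it helps anyone check their own nodes, here is a rough sketch of one way to total up this kind of entry on Linux. It assumes the DEL entries above correspond to deleted files that the JVM still has memory-mapped, so that they also show up in /proc/<pid>/maps with a " (deleted)" suffix. The function name and the "/index/" path filter are just illustrative choices, and the totals are mapped address ranges (page-rounded), so treat them as approximate rather than exact file sizes.

```python
import sys
from collections import defaultdict

DELETED_SUFFIX = " (deleted)"


def deleted_mapped_bytes(maps_text, path_fragment="/index/"):
    """Sum mapped bytes per deleted file found in /proc/<pid>/maps text.

    path_fragment is a hypothetical filter used here to keep only
    index files; adjust or pass "" to include every deleted mapping.
    """
    totals = defaultdict(int)
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        addr_range, path = fields[0], fields[5]
        if not path.endswith(DELETED_SUFFIX):
            continue
        path = path[: -len(DELETED_SUFFIX)]
        if path_fragment not in path:
            continue
        start, end = (int(x, 16) for x in addr_range.split("-"))
        totals[path] += end - start
    return dict(totals)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python deleted_maps.py <pid>
    with open(f"/proc/{sys.argv[1]}/maps") as f:
        totals = deleted_mapped_bytes(f.read())
    for path, size in sorted(totals.items()):
        print(f"{size:>14}  {path}")
    print(f"{sum(totals.values()):>14}  total")
```

Run against the Solr PID (1100 in the output above) as the same user or as root. A file mapped in several chunks is summed under a single path.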

Regards,
Gavin.
