Thanks, Kent, for the info. We are not doing any faceting, sorting, or much else. My guess is that most of the memory increase is just the data structures created when parts of the frq and prx files get read into memory. Per shard, our frq files are about 77GB and our prx files are about 260GB, and we are running 3 shards per machine. I suspect that the document cache and query result cache don't take up that much space, but I will try a run with those caches set to 0, just to see.
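For a rough sense of scale, here is a back-of-the-envelope sketch in Python using the figures from this message. It assumes the 20GB and 28GB figures are JVM heap sizes, that there is one heap per machine, and that everything outside the heap is available to the OS disk cache; none of that is measured, and the function names are just illustrative.

```python
# Figures quoted in this message, in GB.
FRQ_GB_PER_SHARD = 77
PRX_GB_PER_SHARD = 260
SHARDS_PER_MACHINE = 3
TOTAL_RAM_GB = 74


def postings_gb_per_machine(frq_gb, prx_gb, shards):
    """Total size of the frq and prx files on one machine."""
    return (frq_gb + prx_gb) * shards


def os_cache_fraction(total_ram_gb, heap_gb, postings_gb):
    """Fraction of the postings data that fits in RAM left over after the heap.

    Assumption: all non-heap RAM is usable as OS disk cache, which ignores
    JVM overhead and any other processes on the machine.
    """
    free_gb = total_ram_gb - heap_gb
    return free_gb / postings_gb


postings = postings_gb_per_machine(
    FRQ_GB_PER_SHARD, PRX_GB_PER_SHARD, SHARDS_PER_MACHINE
)

# Compare the two heap sizes we have tried (assumed to be JVM heap sizes).
for heap_gb in (20, 28):
    frac = os_cache_fraction(TOTAL_RAM_GB, heap_gb, postings)
    print(
        f"heap={heap_gb}GB: {TOTAL_RAM_GB - heap_gb}GB left for OS cache, "
        f"about {frac:.1%} of {postings}GB of frq+prx data"
    )
```

Under these assumptions the OS cache can hold only around 5% of the frq and prx data at either heap size, so every GB added to the heap comes out of an already small disk cache.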
We have dual 4-core processors and 74GB total memory. We want to leave a significant amount of memory free for OS disk caching. We tried increasing the memory from 20GB to 28GB and adding the -XX:MaxGCPauseMillis=1000 flag, but that seemed to have no effect. I'm currently testing the ConcurrentMarkSweep collector, and that's looking much better, although I don't understand why it has sized the Eden space down into the 20MB range. However, I am very new to Java memory management. Does anyone know whether, when using ConcurrentMarkSweep, it's better to let the JVM size the Eden space or to give it some hints?

Once we have some decent JVM settings that we can put into production, I'll be testing termIndexInterval with Solr 1.4.1 on our test server.

Tom

-----Original Message-----
From: Grant Ingersoll [mailto:gsing...@apache.org]

> What are your current GC settings? Also, I guess I'd look at ways you can
> reduce the heap size needed.
>> Caching, field type choices, faceting choices.
>> Also could try playing with the termIndexInterval which will load fewer terms
>> into memory at the cost of longer seeks.
>> At some point, though, you just may need more shards and the resulting
>> smaller indexes. How many CPU cores do you have on each machine?