Hello, I'm trying to move a VuFind installation from an ailing physical server into a virtualized environment, and I'm running into performance problems. VuFind is a Solr 1.4.1-based application with fairly large and complex records (many stored fields, many words per record). My particular installation contains about a million records in the index, with a total index size around 6GB.
The virtual environment has more RAM and better CPUs than the old physical box, and I am satisfied that my Java environment is well-tuned. My index is optimized. Searches that hit the cache respond very well. The problem is that non-cached searches are very slow - the more keywords I add, the slower they get, to the point of taking 6-12 seconds to come back with results on a quiet box and well over a minute under stress testing. (The old box still took a while for equivalent searches, but it was about twice as fast as the new one.)
My gut feeling is that disk access while reading the index is the bottleneck here, but I know little about the specifics of Solr's internals, so it's entirely possible that my gut is wrong. Outside testing does show that the virtual environment's disk performance is not as good as the old physical server's, especially when multiple processes are trying to access the same file simultaneously.
So, two basic questions:
1.) Would you agree that I'm dealing with a disk bottleneck, or are there some other factors I should be considering? Are there any good diagnostics I should be looking at?
2.) If the problem is disk access, is there anything I can tune on the Solr side to alleviate the problems?
Thanks,
Demian