Hello,

I'm trying to move a VuFind installation from an ailing physical server into a 
virtualized environment, and I'm running into performance problems.  VuFind is 
a Solr 1.4.1-based application with fairly large and complex records (many 
stored fields, many words per record).  My particular installation contains 
about a million records in the index, with a total index size around 6GB.

The virtual environment has more RAM and better CPUs than the old physical box, 
and I am satisfied that my Java environment is well-tuned.  My index is 
optimized.  Searches that hit the cache respond very well.  The problem is that 
non-cached searches are very slow - the more keywords I add, the slower they 
get, to the point of taking 6-12 seconds to come back with results on a quiet 
box and well over a minute under stress testing.  (The old box still took a 
while for equivalent searches, but it was about twice as fast as the new one.)

My gut feeling is that disk access reading the index is the bottleneck here, 
but I know little about the specifics of Solr's internals, so it's entirely 
possible that my gut is wrong.  Outside testing does show that the virtual 
environment's disk performance is not as good as the old physical server's, 
especially when multiple processes are trying to access the same file 
simultaneously.

So, two basic questions:


1.)    Would you agree that I'm dealing with a disk bottleneck, or are there 
some other factors I should be considering?  Any good diagnostics I should be 
looking at?

2.)    If the problem is disk access, is there anything I can tune on the Solr 
side to alleviate the problems?

Thanks,
Demian
