On Mon, 2010-11-15 at 06:35 +0100, lu.rongbin wrote:
> In addition, my index has only two stored fields, id and price, and the other
> fields are indexed. I increased the document and query caches. The EC2
> m2.4xlarge instance has 8 cores and 68G memory. The indexes total about 100G.
Looking at http://aws.amazon.com/ec2/instance-types/ I can see that Amazon recommends using "EBS to get improved storage I/O performance for disk bound applications". As Lucene/Solr is very often I/O bound (or more precisely random access I/O bound), you might consider the EBS option.

I found this article that looks very relevant: http://www.coreyhulen.org/?p=326
It is about Cassandra (a database), but I'm guessing that the I/O pattern is fairly similar to Lucene/Solr, with a lot of random access reads.

Extrapolating wildly, it would seem that disk I/O latency is a problem with Amazon's cloud, at least compared with the obvious choice of using SSD on a local machine. If this holds true, some things you could try would be better warming of your searches, holding (part of) your index in RAM, switching to EBS or ... moving away from the cloud.

All this is assuming that it really is I/O that is your problem. Have you looked at CPU-load vs. I/O wait while issuing a batch of queries?

Disclaimer: I have no experience with Amazon's cloud service.
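To make the CPU-load vs. I/O wait check concrete, here is a rough sketch of one way to do it, assuming the instance runs Linux and exposes /proc/stat. It samples the aggregate CPU counters before and after a batch of queries and reports how the elapsed CPU time was split. The function names and the run_queries callback are my own placeholders, not part of Solr or any Amazon tool; tools such as iostat or vmstat give you the same numbers without writing any code.

```python
def parse_cpu_line(line):
    """Return the first eight counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Order: user nice system idle iowait irq softirq steal.
    # The trailing guest columns are already included in user/nice, so skip them.
    return [int(value) for value in fields[1:9]]


def cpu_breakdown(before, after):
    """Percentage of CPU time spent busy, in I/O wait and idle between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return {"busy": 0.0, "iowait": 0.0, "idle": 0.0}
    idle, iowait = deltas[3], deltas[4]
    busy = total - idle - iowait
    return {
        "busy": 100.0 * busy / total,
        "iowait": 100.0 * iowait / total,
        "idle": 100.0 * idle / total,
    }


def read_sample(path="/proc/stat"):
    """Read the current aggregate CPU counters (Linux only)."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())


def measure(run_queries):
    """Run a caller-supplied query batch (hypothetical) and report the CPU split."""
    before = read_sample()
    run_queries()
    after = read_sample()
    return cpu_breakdown(before, after)
```

You would pass measure() a function that sends your batch of queries to Solr. A high iowait share with a low busy share would point at the disks rather than the CPU. Note that the numbers cover the whole machine, so anything else running during the batch is counted too.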