Re: my index has 500 million docs, how to improve Solr search performance?

Toke Eskildsen Mon, 15 Nov 2010 02:01:42 -0800

On Mon, 2010-11-15 at 06:35 +0100, lu.rongbin wrote:
> In addition, my index has only two stored fields, id and price; the other
> fields are indexed only. I increased the document and query caches. The EC2
> m2.4xlarge instance has 8 cores and 68 GB of memory. The total index size
> is about 100 GB.


Looking at http://aws.amazon.com/ec2/instance-types/ I can see that
Amazon recommends using "EBS to get improved storage I/O performance for
disk bound applications". As Lucene/Solr is very often I/O bound (or
more precisely random access I/O bound), you might consider the EBS
option.

I found this article that looks very relevant:
http://www.coreyhulen.org/?p=326
It is about Cassandra (a database), but I'm guessing that the I/O
pattern is fairly similar to Lucene/Solr, with a lot of random access reads.

Extrapolating wildly, it would seem that disk I/O latency is a problem
with Amazon's cloud, at least compared with the obvious choice of using
SSD on a local machine. If this holds true, some things you could try
would be better warming of your searches, holding (part of) your index
in RAM, switching to EBS or ... moving away from the cloud.
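
As a rough illustration of the "holding (part of) your index in RAM" idea:
on Linux, simply reading the index files once pulls them into the OS page
cache, as far as free memory allows. With 68 GB of RAM and a 100 GB index,
only part of it will stay cached. The sketch below is an assumption on my
part, not something tested against your setup; the function name
warm_page_cache and the index path passed on the command line are
hypothetical.

```python
import os
import sys


def warm_page_cache(index_dir, chunk_size=1 << 20):
    """Read every file under index_dir once and return the bytes read.

    The data is discarded; the point is only that the OS keeps recently
    read blocks in its page cache (assumption: Linux-style caching and
    enough free RAM for at least part of the index).
    """
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in sorted(files):
            with open(os.path.join(root, name), "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total


if __name__ == "__main__":
    # Hypothetical usage: python warm.py /path/to/solr/data/index
    print(warm_page_cache(sys.argv[1]), "bytes read")
```

This only helps until the cache is evicted, so it would have to be repeated
after a restart or a large index update.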


All this is assuming that it really is I/O that is your problem. Have
you looked at CPU-load vs. I/O wait while issuing a batch of queries?
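
If you do not have iostat or top at hand, a minimal sketch for comparing
CPU load with I/O wait is to sample the aggregate "cpu" line of /proc/stat
before and after the query batch. This assumes a Linux instance; the helper
names below are made up for the example, and the 10 second window is
arbitrary.

```python
import time

# Order of the first counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def read_cpu_times(path="/proc/stat"):
    with open(path) as f:
        return parse_cpu_line(f.readline())


def cpu_breakdown(before, after):
    """Percentages of busy, iowait and idle time between two samples."""
    deltas = {k: after[k] - before[k] for k in before}
    total = sum(deltas.values())
    if total <= 0:
        return {"busy": 0.0, "iowait": 0.0, "idle": 0.0}
    busy = total - deltas["idle"] - deltas["iowait"]
    return {
        "busy": 100.0 * busy / total,
        "iowait": 100.0 * deltas["iowait"] / total,
        "idle": 100.0 * deltas["idle"] / total,
    }


if __name__ == "__main__":
    before = read_cpu_times()
    time.sleep(10)  # issue the batch of queries during this window
    after = read_cpu_times()
    print(cpu_breakdown(before, after))
```

A high iowait share with low busy time would point towards the disks; the
opposite would suggest the CPU is the limit and the I/O advice above is
beside the point.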


Disclaimer: I have no experience with Amazon's cloud service.
