Thanks all.

I have the same index, with a slightly different schema and 200M documents,
installed on 3 r3.xlarge instances (30GB RAM and 600 General Purpose SSD). The
index is about 1.5TB and receives many updates every 5 minutes. Complex queries
with faceting get a response time of 100ms, which is acceptable for us.

Toke Eskildsen,

Is the index updated while you are searching? *No*
Do you do any faceting or other heavy processing as part of a search? *No*
How many hits does a search typically have and how many documents are
returned? *The test measured QTime only, with no documents returned and the
number of hits varying from 50,000 to 50,000,000.*
How many concurrent searches do you need to support? How fast should the
response time be? *Maybe 100 concurrent searches at 100ms, with facets.*

Would splitting the shard into two shards on the same node, so that each shard
sits on a single EBS volume, be better than using LVM?

Thanks

On Mon, Dec 29, 2014 at 2:00 AM, Toke Eskildsen <t...@statsbiblioteket.dk>
wrote:

> Mahmoud Almokadem [prog.mahm...@gmail.com] wrote:
> > We've installed a cluster of one collection of 350M documents on 3
> > r3.2xlarge (60GB RAM) Amazon servers. The size of index on each shard is
> > about 1.1TB and maximum storage on Amazon is 1 TB so we add 2 SSD EBS
> > General purpose (1x1TB + 1x500GB) on each instance. Then we create logical
> > volume using LVM of 1.5TB to fit our index.
>
> Your search speed will be limited by the slowest storage in your group,
> which would be your 500GB EBS. The General Purpose SSD option means (as far
> as I can read at http://aws.amazon.com/ebs/details/#piops) that your
> baseline is 3 IOPS/GB = 1500 IOPS, with bursts of 3000 IOPS. Unfortunately
> they do not say anything about latency.
>
> For comparison, I checked the system logs from a local test with our 21TB
> / 7 billion documents index. It used ~27,000 IOPS during the test, with
> mean search time a bit below 1 second. That was with ~100GB RAM for disk
> cache, which is about ½% of index size. The test was with simple term
> queries (1-3 terms) and some faceting. Back of the envelope: 27,000 IOPS
> for 21TB is ~1300 IOPS/TB. Your indexes are 1.1TB, so 1.1*1300 IOPS ~= 1400
> IOPS.
>
> All else being equal (which is never the case), getting 1-3 second
> response times for a 1.1TB index, when one link in the storage chain is
> capped at a few thousand IOPS, you are using networked storage and you have
> little RAM for caching, does not seem unrealistic. If possible, you could
> try temporarily boosting performance of the EBS, to see if raw IO is the
> bottleneck.
>
> > The response time is between 1 and 3 seconds for simple queries (1 token).
>
> Is the index updated while you are searching?
> Do you do any faceting or other heavy processing as part of a search?
> How many hits does a search typically have and how many documents are
> returned?
> How many concurrent searches do you need to support? How fast should the
> response time be?
>
> - Toke Eskildsen
>
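
For anyone who wants to redo the back-of-the-envelope numbers in the quoted
message with their own figures, here is a minimal sketch in Python. It assumes
the General Purpose SSD baseline is 3 IOPS per GB of volume size (which is what
gives 1500 IOPS for the 500GB volume) and that IOPS demand scales linearly with
index size, which is a rough approximation at best. The function names are
made up for illustration and are not part of any AWS or Solr API.

```python
# Rough IOPS arithmetic from the quoted message.
# Hypothetical helper names; linear scaling is an assumption, not a measurement.

def gp2_baseline_iops(volume_gb, iops_per_gb=3):
    """Baseline IOPS of a General Purpose SSD volume, assuming 3 IOPS per GB."""
    return volume_gb * iops_per_gb


def scaled_iops(ref_iops, ref_index_tb, index_tb):
    """Scale IOPS observed on a reference index linearly to another index size."""
    iops_per_tb = ref_iops / ref_index_tb
    return iops_per_tb * index_tb


# The 500GB EBS volume, the slowest device in the LVM group.
baseline = gp2_baseline_iops(500)

# The 21TB reference test used ~27,000 IOPS; the index here is 1.1TB per shard.
needed = scaled_iops(27_000, 21, 1.1)

print(f"baseline: {baseline} IOPS, estimated demand: {needed:.0f} IOPS")
```

The estimated demand comes out close to the baseline of the smaller volume,
which fits the suggestion that raw IO may be the bottleneck.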
