Hi all, I've seen lots of posts about this, and want to make sure I'm understanding correctly.
Background:

- Our cluster has 6 servers: Dell R720xd, 64 GB RAM, 2x E5-2600 v2 CPUs (2 sockets, 6 cores/socket), 16 TB disk.
- Elasticsearch is configured with 6 shards and 1 replica per index, giving two shards per server. I'm giving ES 32 GB heaps on Java 1.7 with the G1 GC.

I'm concerned about the size of our indices. Right now we store all data in one index per day, with various types within it to separate the data. The indices average about 50 GB/day (not including replicas), so each shard is about 8 GB. We have a LOT more data to index, at least 20x more.

Should I be concerned with indices of that size (~1000 GB) and shards of that size (~160 GB)? Is it merely a question of having enough hardware, or is there more to it?

I'm considering splitting the data under a different indexing strategy, so that each index is smaller but there are more of them. The total amount of data stays the same, so I'm not sure whether that will help. If I'm optimizing for searching, does querying many smaller indices perform better than querying fewer larger ones?

Thank you for your time.

Chris
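For reference, the shard-size figures above can be checked with some quick arithmetic. This is just a back-of-envelope sketch; the 6-primary-shard layout, 50 GB/day volume, and 20x growth factor are the numbers from the post:

```python
# Back-of-envelope shard sizing for a daily-index setup with 6 primary shards.
# Replicas are excluded, matching the 50 GB/day figure quoted above.

def shard_size_gb(daily_index_gb, primary_shards):
    """Average size of one primary shard in a single daily index."""
    return daily_index_gb / primary_shards

today = shard_size_gb(50, 6)        # current volume
future = shard_size_gb(50 * 20, 6)  # projected 20x volume

print(round(today, 1))   # ~8.3 GB per shard today
print(round(future, 1))  # ~166.7 GB per shard at 20x
```

So the ~8 GB and ~160 GB figures in the question line up with the stated daily volumes divided across 6 primaries.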
