I meant tens of shards per node in total, across all indices. So if you have
N nodes and I indices, each with S shards and R replicas, that would be
(I * S * (1 + R)) / N shards per node.
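
As a rough sketch of that arithmetic in Python (the function name and the
example numbers are illustrative assumptions, and it assumes every index has
the same shard and replica counts and that shards are spread evenly):

```python
def shards_per_node(nodes, indices, shards, replicas):
    """Average number of shard copies (primaries plus replicas) per node.

    Assumes all indices share the same settings and that the cluster
    balances shard copies evenly across nodes.
    """
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    total_copies = indices * shards * (1 + replicas)
    return total_copies / nodes


# One daily index on 2 nodes, with 2 shards and 1 replica.
print(shards_per_node(nodes=2, indices=1, shards=2, replicas=1))   # 2.0

# Thirty such daily indices kept on the same 2 nodes.
print(shards_per_node(nodes=2, indices=30, shards=2, replicas=1))  # 60.0
```

With daily indices the count grows with the retention period, so it is the
total across all retained indices that should stay in the tens.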

One shard per node is optimal but doesn't allow for growth: if you add one
more node, there is no extra shard to move onto it, so you cannot spread the
indexing workload. That is why it is common to have a few shards per node:
it lets elasticsearch spread the load when you add a new node to increase
your cluster's capacity.
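
To illustrate the growth point, here is a deliberately simplified sketch. It
assumes shards are placed round-robin and ignores replicas; the real
elasticsearch allocator is more involved, and the function name is made up
for this example.

```python
def primaries_per_node(primary_shards, nodes):
    """Spread primary shards round-robin and return the count on each node.

    A toy placement model, not how elasticsearch actually allocates shards.
    """
    counts = [0] * nodes
    for shard in range(primary_shards):
        counts[shard % nodes] += 1
    return counts


print(primaries_per_node(2, 2))  # [1, 1]     one shard per node
print(primaries_per_node(2, 3))  # [1, 1, 0]  the new node gets no shard
print(primaries_per_node(6, 3))  # [2, 2, 2]  a few shards per node rebalance
```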


On Mon, Aug 25, 2014 at 12:07 AM, Chris Neal <[email protected]>
wrote:

> Adrien,
>
> Thanks so much for the response.  It was very helpful.  I will check out
> those links on capacity planning for sure.
>
> One followup question.  You mention that tens of shards per node would be
> ok.  Do you mean tens of shards from tens of indexes?  Or tens of
> shards for a single index?  Right now I have two servers configured with
> the index getting 2 shards (one per server), and 1 replica (per server).
>
> Chris
>
>
> On Fri, Aug 22, 2014 at 5:58 PM, Adrien Grand <
> [email protected]> wrote:
>
>> Hi Chris,
>>
>> Usually, the concern is less the number of indices than the number of
>> shards, which are the physical units of data storage (an index being a
>> logical view over several shards).
>>
>> Something to beware of is that shards typically have some constant
>> overhead (disk space, file descriptors, memory usage) that does not depend
>> on the amount of data that they store. Although it would be ok to have up
>> to a few tens of shards per node, you should avoid having, e.g., thousands
>> of shards per node.
>>
>> If you plan on always adding a filter for a specific application in your
>> search requests, then splitting by application makes sense since this will
>> make the filter unnecessary at search time: you will just need to query the
>> application-specific index. On the other hand, if you don't filter by
>> application, then splitting the data yourself into smaller indices would be
>> roughly equivalent to storing everything in a single index with a higher
>> number of shards.
>>
>> You might want to check out the following resources that talk about
>> capacity planning:
>>  - http://www.elasticsearch.org/videos/big-data-search-and-analytics/
>>  - http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/capacity-planning.html
>>
>>
>>
>> On Fri, Aug 22, 2014 at 9:08 PM, Chris Neal <[email protected]>
>> wrote:
>>
>>> Hi all,
>>>
>>> As the subject says, I'm wondering about index size vs. number of
>>> indexes.
>>>
>>> I'm indexing many application log files, currently with an index by day
>>> for all logs, which will make a very large index.  For just a few
>>> applications in Development, the index is 55GB a day (across 2 servers).
>>>  In prod with all applications, it will be "much more than that".  1TB a
>>> day maybe?
>>>
>>> I'm wondering if there is value in splitting the indexes by day and by
>>> application, which would produce more indexes per day, but they would be
>>> smaller, vs. value in having a single, mammoth index by day alone.
>>>
>>> Is it just a resource question?  If I have enough RAM/disk/CPU to
>>> support a "mammoth" index, then I'm fine?  Or are there other reasons to
>>> (or to not) split up indexes?
>>>
>>> Very much appreciate your time.
>>> Chris
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/CAND3DphfsYx0LW0M-yvLWGauRSzVWG0etaBkiTrN7zVafq7tMA%40mail.gmail.com
>>> <https://groups.google.com/d/msgid/elasticsearch/CAND3DphfsYx0LW0M-yvLWGauRSzVWG0etaBkiTrN7zVafq7tMA%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> --
>> Adrien Grand
>>
>



-- 
Adrien Grand
