Thanks for raising the questions, I will come back later in more detail. Just a quick note: the idea that "shards scale writes" and "replicas scale reads" is correct, but Elasticsearch is also "elastic", which means it scales out by adding node hardware. The shard/replica scaling pattern finds its limits in the node hardware, because shards and replicas are tied to machines, and those come with hard resource constraints, mostly around disk I/O and memory.
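To illustrate, here is a minimal sketch using the Python client (elasticsearch-py; the host, index name and numbers are just placeholders, not part of this thread): the shard count is fixed when an index is created, while the replica count can be raised on a live index, provided there are nodes to place the extra copies on.

    from elasticsearch import Elasticsearch

    # adjust the host for your cluster
    es = Elasticsearch(["localhost:9200"])

    # Shard count is fixed at index creation time ...
    es.indices.create(index="app1-index1", body={
        "settings": {"number_of_shards": 5, "number_of_replicas": 2}
    })

    # ... but the replica count can be changed later to scale reads,
    # as long as there are enough nodes to host the new replicas.
    es.indices.put_settings(index="app1-index1", body={
        "index": {"number_of_replicas": 3}
    })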
In the end, you can take as a rule of thumb:

- add replicas to scale "read" load
- add new indices (i.e. new shards) to scale "write" load
- and add nodes to scale out the whole cluster, for both read and write load

More later,

Jörg

On Thu, Jun 5, 2014 at 7:17 PM, Todd Nine <[email protected]> wrote:

> Hey Jörg,
>
> Thank you for your response. A few questions/points.
>
> In our use cases, the inability to write or read is considered downtime. Therefore, I cannot disable writes during expansion. Your alias points raise some interesting research I need to do, and I have a few follow-up questions.
>
> Our systems are fully multi-tenant. We currently intend to have 1 index per application. Each application can have a large number of types within its index. Each cluster could potentially hold 1000 or more applications/indexes. Most users will never need more than 5 shards. Some users are huge power users, with billions of documents per type and several large types. These are the users I am concerned with.
>
> From my understanding and experimentation, Elasticsearch has 2 primary mechanisms for tuning performance to handle the load. First is the shard count. The higher the shard count, the more writes you can accept. Each shard has a primary which accepts the write and replicates it to its replicas. For high write throughput, you increase the count of shards to distribute the load across more nodes. For read throughput, you increase the replica count. This gives you higher performance on read, since you now have more than 1 node per shard you can query to get results.
>
> Per your suggestion, rather than copy the documents from an index with 5 shards to an index with 10 shards, I can theoretically create a new index and then add it to the alias. For instance, I envision this in the following way.
>
> Aliases:
> app1-read
> app1-write
>
> Initial creation:
> app1-read -> Index: App1-index1 (5 shards, 2 replicas)
> app1-write -> Index: App1-index1
>
> User begins to have too much data in App1-index1. The 5 shards are causing hotspots. The following actions take place.
>
> 1) Create App1-index2 (5 shards, 2 replicas)
> 2) Update app1-write -> App1-index1 and App1-index2
> 3) Update app1-read -> App1-index1 and App1-index2
>
> I have some uncertainty around this I could use help with.
>
> Once the new index has been added, how are requests routed? For instance, if I have a document "doc1" in App1-index1, and I delete it after adding the new index, is the alias smart enough to update App1-index1, or will it broadcast the operation to both indexes? In other words, if I create an alias with 2 or more indexes, will the alias perform routing, or is it a broadcast to all indexes? How does this scale in practice?
>
> In my current understanding, using an alias on read is simply going to be the same as if you doubled shards. If you have a 10-shard (replication 2) index, this is functionally equivalent to an alias that aggregates 2 indexes with 5 shards and 2 replicas each. All 10 shards (or one of their replicas) would need to be read to aggregate the results, correct?
>
> Lastly, we have a multi-region requirement. We are eventually consistent by design between regions. We want documents written in one region to be replicated to another for query. All documents are immutable. This is by design, so we don't get document version collisions between data centers.
> What are some current mechanisms in use in production environments to replicate indexes across regions? I just can't seem to find any. Rivers were my initial thinking, so regions can pull data from other regions. However, if this isn't a good fit, what is?
>
> Thanks guys!
> Todd
>
> On Thu, Jun 5, 2014 at 9:21 AM, [email protected] <[email protected]> wrote:
>
>> The knapsack plugin does not come with downtime. You can increase shards on the fly by copying an index over to another index (even on another cluster). The index should be write-disabled during the copy, though.
>>
>> Increasing the replica level is a very simple command, no index copy required.
>>
>> It seems you have a slight misconception about controlling replica shards. You cannot start dedicated copy actions from the replicas only. (By setting _preference for search, this works for queries.)
>>
>> Maybe I do not understand your question, but what do you mean by "dual writes"? And why would you "move" an index?
>>
>> Please check the index aliases. The concept of index aliases allows redirecting index names in the API with a simple atomic command.
>>
>> It will be tough to monitor an outgrowing index, since there is no clear indication of the kind "this cluster's capacity is full because the index is too large or overloaded, please add nodes now". In real life, heaps will fill up here and there, latency will increase, and all of a sudden queries or indexing will congest now and then. If you encounter this, you have no time to copy an old index to a new one - the copy process also takes resources, and the cluster may not have enough. You must begin to add nodes well before the capacity limit is reached.
>>
>> Instead of copying an index, which is a burden, you should consider managing a set of indices. If an old index is too small, just start a new one which is bigger, has more shards and spans more nodes, and add it to the existing set of indices. With an index alias you can combine many indices under one index name. This is very powerful.
>>
>> If you cannot estimate the data growth rate, I also recommend using a reasonable number of shards from the very start. Say, if you expect to eventually run ES nodes on 50 servers, then simply start with 50 shards on a small number of servers, and add servers over time. You won't have to bother about shard count for a very long time if you choose such a strategy.
>>
>> Do not think about rivers, they are not built for such use cases. Rivers are designed as a "play tool" for fetching data quickly from external sources, for demo purposes. They are discouraged for serious production use; they are not very reliable if they run unattended.
>>
>> Jörg
>>
>> On Thu, Jun 5, 2014 at 7:33 AM, Todd Nine <[email protected]> wrote:
>>
>>>> 2) https://github.com/jprante/elasticsearch-knapsack might do what you want.
>>>
>>> This won't quite work for us. We can't have any downtime, so it seems like an A/B system is more appropriate. What we're currently thinking is the following.
>>>
>>> Each index has 2 aliases, a read and a write alias.
>>>
>>> 1) Both read and write aliases point to an initial index. Say shard count 5, replication 2 (ES is not our canonical data source, so we're ok with reconstructing search data).
>>>
>>> 2) We detect via monitoring that we're going to outgrow an index. We create a new index with more shards, and potentially a higher replication depending on read load.
>>> We then update the write alias to point to both the old and the new index. All clients will then begin dual writes to both indexes.
>>>
>>> 3) While we're writing to old and new, some process (maybe a river?) will begin copying documents updated before the write alias time from the old index to the new index. Ideally, it would be nice if each replica could copy only its local documents into the new index. We'll want to throttle this as well. Each node will need additional operational capacity to accommodate the dual writes as well as accepting the writes of the "old" documents. I'm concerned that if we push this through too fast, we could cause interruptions of service.
>>>
>>> 4) Once the copy is completed, the read alias is moved to the new index, then the old index is removed from the system.
>>>
>>> Could such a process be implemented as a plugin? If the work can happen in parallel across all nodes containing a shard, we can increase the process's speed dramatically. If we have a single worker, like a river, it might take too long.
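For reference, a minimal sketch of the alias setup and switch described in the quoted thread above, again with the Python client (the index and alias names are the ones from Todd's example, lowercased; treat this as an untested sketch, not a recipe):

    from elasticsearch import Elasticsearch

    es = Elasticsearch(["localhost:9200"])

    # Initial creation: both aliases point at the first index
    # (assumes app1-index1 already exists, as in the earlier sketch).
    es.indices.update_aliases(body={"actions": [
        {"add": {"index": "app1-index1", "alias": "app1-read"}},
        {"add": {"index": "app1-index1", "alias": "app1-write"}},
    ]})

    # When app1-index1 starts to run hot: create the second index
    # and add it to both aliases in one atomic call.
    es.indices.create(index="app1-index2", body={
        "settings": {"number_of_shards": 5, "number_of_replicas": 2}
    })
    es.indices.update_aliases(body={"actions": [
        {"add": {"index": "app1-index2", "alias": "app1-write"}},
        {"add": {"index": "app1-index2", "alias": "app1-read"}},
    ]})

    # Once the old documents have been copied over, drop the old
    # index from both aliases and delete it.
    es.indices.update_aliases(body={"actions": [
        {"remove": {"index": "app1-index1", "alias": "app1-write"}},
        {"remove": {"index": "app1-index1", "alias": "app1-read"}},
    ]})
    es.indices.delete(index="app1-index1")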
