Hi Brad

I agree with what Mark and Zachary have said and will expand on their points.

Firstly, shard and index level operations in ElasticSearch are 
peer-to-peer. Single-shard operations will affect at most 2 nodes: the node 
receiving the request and the node hosting an instance of the shard 
(primary or replica depending on the operation). Multi-shard operations 
(such as searches) will affect from one to (N + 1) nodes, where N is the 
number of shards in the index. 
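
To make that concrete, a get-by-id is routed to a single shard copy whereas 
an unrouted search fans out to every shard. A rough illustration in Python 
(assuming the requests library, a node on localhost:9200, and made-up index 
and type names):

    import requests

    ES = "http://localhost:9200"

    # Get by id: routed to one shard copy, so at most two nodes are
    # involved (the coordinating node plus the node holding that primary
    # or replica).
    requests.get(ES + "/logs-2014-01/event/12345")

    # Unrouted search: fanned out to one copy of every shard, so up to
    # (N + 1) nodes are involved for an index with N shards.
    requests.get(ES + "/logs-2014-01/_search", params={"q": "status:error"})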

So from an index/shard operation perspective there is no reason to split 
into two clusters. The key issue with index/shard operations is whether the 
cluster is able to handle the traffic volume. So if you do decide to split 
into two clusters you will need to look at the load profile for each of 
your client types to determine how much raw processing power you need in 
each cluster. It may be that a 10:20 split between clusters is more optimal 
than a 15:15 split for balancing request traffic, and therefore CPU 
utilisation, across all nodes. If you go with one cluster this is not an 
issue because you can move shards between nodes to balance the request 
traffic.
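
If you do want to nudge that balance by hand, shards can be moved with the 
cluster reroute API. A hedged sketch (index, shard number and node names 
are placeholders):

    import json
    import requests

    ES = "http://localhost:9200"

    # Move shard 3 of a busy index from a hot node to a quieter one.
    command = {"commands": [{"move": {
        "index": "logs-2014-01",
        "shard": 3,
        "from_node": "node-hot",
        "to_node": "node-quiet"
    }}]}
    requests.post(ES + "/_cluster/reroute", data=json.dumps(command))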

Larger clusters also imply more work for the cluster master in managing the 
cluster. This comes down to the number of nodes that the master has to 
communicate with, and manage, and the size of the cluster state. A cluster 
with 30 nodes is not too large for a master to keep track of. There will be 
an increase in network traffic associated with the increase in volume of 
master-to-worker and worker-to-master pings used to detect the 
presence/absence of nodes. This can be offset by increasing the respective 
ping intervals so that nodes ping each other less frequently.
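
The fault detection pings are controlled by the discovery.zen.fd.* settings 
in elasticsearch.yml; something like the following would ping less often 
(the values here are purely illustrative, the defaults are 1s/30s/3):

    discovery.zen.fd.ping_interval: 5s
    discovery.zen.fd.ping_timeout: 60s
    discovery.zen.fd.ping_retries: 3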

In a large cluster it is good practice to have a group of dedicated master 
nodes, say 3, from which the master is elected. These nodes do not host any 
user data, meaning that cluster management is not compromised by high user 
request traffic. 
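
One possible elasticsearch.yml layout for this (values illustrative):

    # On the three dedicated master nodes: master-eligible, hold no data.
    node.master: true
    node.data: false

    # On the worker (data) nodes: hold data, never become master.
    node.master: false
    node.data: true

    # With three master-eligible nodes, require a majority for election.
    discovery.zen.minimum_master_nodes: 2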

The size of the cluster state may be more of an issue. The cluster state 
comprises all of the information about the cluster configuration. The 
cluster state has records for each node, index, document mapping, shard, 
etc. Whenever there is a change to the cluster state it is first made by 
the master which then sends the updated cluster state to each worker node. 
Note that the entire cluster state is sent, not just the changes! It is 
therefore highly desirable to limit the frequency of changes to the 
cluster state, primarily by minimizing dynamic field mapping updates, and 
the overall size of the cluster state, primarily by minimizing the number 
of indices. 
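
One way to keep dynamic mapping updates out of the cluster state is to 
declare the mappings up front and mark the types as strict, so that an 
unexpected field is rejected rather than mapped on the fly. A hedged sketch 
(made-up index and type names, trimmed mapping):

    import json
    import requests

    ES = "http://localhost:9200"

    # "dynamic": "strict" rejects documents containing unmapped fields, so
    # indexing traffic cannot trigger mapping (cluster state) updates.
    mapping = {"event": {
        "dynamic": "strict",
        "properties": {
            "timestamp": {"type": "date"},
            "message":   {"type": "string"}
        }
    }}
    requests.put(ES + "/logs-2014-01/_mapping/event", data=json.dumps(mapping))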

In your proposed model the size of the cluster state associated with the 
set of 60 shared month indices will be larger than that of one set of 60 
dedicated month indices by virtue of having 100 shards rather than 6. 
However, it may not be much bigger, because there will be much more 
metadata associated with defining the index structure, notably the field 
mappings for all document types in the index, than metadata defining the 
shards of the index. So it may well be that the size of the cluster state 
associated with 60 "shared" month indices plus N sets of 60 "dedicated" 
indices is not much more than that of (N + 1) sets of 60 "dedicated" 
indices, in which case there may not be much point in splitting into two 
clusters. A quick way to check this for your actual data model (sketched in 
code below) is to:
  1. Set up an index in ES with mappings for all document types and 6 
shards and 0 replicas,
  2. Retrieve the index metadata JSON using ES admin API,  
  3. Increase the number of replicas to 16 (102 shards total),
  4. Retrieve the index metadata JSON using ES admin API,
  5. Compare the two JSON documents from 2 and 4. 
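
A rough Python sketch of those steps, assuming a throwaway test node on 
localhost:9200 and a made-up index name; it compares the size of the whole 
serialised cluster state, which also captures the growth of the routing 
table:

    import json
    import requests

    ES = "http://localhost:9200"

    def cluster_state_size():
        """Size in bytes of the serialised cluster state JSON."""
        return len(json.dumps(requests.get(ES + "/_cluster/state").json()))

    # 1. Index with 6 shards, 0 replicas; put your full document type
    #    mappings in here, they dominate the index metadata.
    requests.put(ES + "/statetest", data=json.dumps({
        "settings": {"number_of_shards": 6, "number_of_replicas": 0},
        "mappings": {}
    }))
    size_6_shards = cluster_state_size()      # steps 1 and 2

    # 3. Increase replicas to 16, i.e. 6 x (1 + 16) = 102 shards in total.
    requests.put(ES + "/statetest/_settings", data=json.dumps(
        {"index": {"number_of_replicas": 16}}))
    size_102_shards = cluster_state_size()    # step 4

    # 5. Compare.
    print(size_6_shards, size_102_shards, size_102_shards - size_6_shards)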

As stated above, it is desirable to minimize the number of indices. Each 
shard is a Lucene index which consumes memory and requires open file 
descriptors from the OS for segment data files and Lucene index level 
files. You may find yourself running out of memory and/or file descriptors 
if you are not careful. 
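
You can keep an eye on this through the node stats API; for example (a 
sketch, again assuming a local node):

    import requests

    ES = "http://localhost:9200"

    # Open file descriptors per node, from the process section of node stats.
    stats = requests.get(ES + "/_nodes/stats").json()
    for node in stats["nodes"].values():
        print(node["name"], node["process"]["open_file_descriptors"])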

I understand you are looking for a design that will cater for the on-disc 
data volume. Given that your data is split into monthly indices it may well 
be that no one index, either "shared" or "dedicated", will reach that 
volume in one month. There may also be seasonal factors to consider, 
whereby one or two months have much higher volumes than others. I have 
read/heard about cases where a monthly index architecture was implemented 
but later scrapped in favour of a single-index approach because the 
month-to-month variation in volume was detrimental to overall system 
resource utilisation and performance.

In your case, think about whether monthly indices are really appropriate. 
An alternative model is to partition one year's worth of data into a set of 
indices bounded by size rather than time. In this model a new index is 
started on Jan 01 and data is added to it until it reaches some predefined 
size limit, at which point a new index is created to accept new data from 
that point on. This is repeated until year end, at which point you might 
end up with data for Jan/Feb/Mar and half of Apr in index 2014-01, the rest 
of Apr plus May - Oct in index 2014-02, and Nov/Dec in index 2014-03. This 
way you end up using the smallest number of indices within the constraints 
of manageable shard size and overall user data volume. This may also be a 
better approach than indices with 100 shards. It does, however, come at the 
cost of more complexity when it comes to accessing data across multiple 
indices, but this is a one-off development cost rather than an ongoing 
maintenance cost.
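
A minimal sketch of that rolling logic, assuming a made-up write alias and 
an arbitrary size limit; the real thing would live in whatever process 
drives your indexing:

    import json
    import requests

    ES = "http://localhost:9200"
    MAX_PRIMARY_BYTES = 50 * 1024 ** 3   # illustrative per-index size limit

    def primary_store_bytes(index):
        """On-disc size of the primary shards of an index."""
        stats = requests.get(ES + "/" + index + "/_stats").json()
        return stats["indices"][index]["primaries"]["store"]["size_in_bytes"]

    def roll_if_needed(current_index, year, sequence):
        """Create the next size-bounded index (e.g. 2014-02) once the
        current one passes the limit and repoint the write alias at it."""
        if primary_store_bytes(current_index) < MAX_PRIMARY_BYTES:
            return current_index
        next_index = "%d-%02d" % (year, sequence + 1)
        requests.put(ES + "/" + next_index)
        requests.post(ES + "/_aliases", data=json.dumps({"actions": [
            {"remove": {"index": current_index, "alias": "current-write"}},
            {"add":    {"index": next_index,    "alias": "current-write"}}
        ]}))
        return next_index

Searching across the whole year then just means pointing a search alias (or 
a wildcard such as 2014-*) at the full set of indices.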

Also, rather than focusing so much on shard size, look at the number and 
size of the segment files comprising a shard. Remember that Lucene segments 
are essentially immutable (except for marking deletes) after they are 
written to disc. This means that some index management operations may 
reduce down to simply copying/moving one or two segment files across the 
network, rather than all segment files for a given shard. For example, when 
allocating shards to nodes ElasticSearch tries to allocate shards to nodes 
that already have some of that shard's segment data. The new incremental 
backup/restore in ES V1 also takes advantage of this segment immutability. 
With this in mind you might be able to support shards > 5G consisting of 
more segment files of bounded size rather than fewer segment files of 
unbounded size (and ultimately a single 5G segment file).    
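
The segments API shows the segment files behind each shard, and the merge 
policy's max_merged_segment setting (5gb by default) is what bounds how 
large merged segments can grow. A hedged sketch (made-up index name; if 
your version does not allow updating the setting dynamically, set it at 
index creation instead):

    import json
    import requests

    ES = "http://localhost:9200"

    # List the segments behind each shard copy of an index.
    segs = requests.get(ES + "/logs-2014-01/_segments").json()
    for shard_copies in segs["indices"]["logs-2014-01"]["shards"].values():
        for copy in shard_copies:
            for name, seg in copy["segments"].items():
                print(name, seg["num_docs"], seg["size_in_bytes"])

    # Keep merged segments below a chosen ceiling rather than the default.
    requests.put(ES + "/logs-2014-01/_settings", data=json.dumps(
        {"index.merge.policy.max_merged_segment": "2gb"}))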

Hope this helps,
Regards
Mauri

