It seems to me that most people arguing this have trivial scalability 
requirements.  Not trying to be rude by saying that btw.  But shard 
splitting is really the only way to scale from 250GB indexed to 500TB 
indexed.  

On Thursday, December 11, 2014 4:58:42 PM UTC-8, Andrew Selden wrote:
>
> I would agree that shard splitting is not the best approach. Much better 
> to design for expansion by building in layers of indirection into your 
> application through the techniques of over-sharding, index aliasing, and 
> multiple indices.
>

Yes.. all those are lame attempts at shard splitting.

Over-sharding is wasteful. It might not have a significant performance 
impact in practice if you only have a few shards, but if you only add a few 
you're not going to be able to increase your capacity.

Using multiple indexes is just a way to cheat by adding more shards in a 
roundabout fashion, and your runtime query performance will suffer because 
of it.

>
> First, you can allocate more shards than you need when you create the 
> index. If you need 5 shards today, but think you might need 10 shards in 6 
> months, then just create the index with 10 shards. We call this 
> over-sharding. There really is no penalty to doing this within reason. 
>

So you've only given yourself 2x headroom in capacity.  That's not very 
elastic.

With shard splitting you can go from 2x to 10x to 100x without any wasted 
IO in over-indexing.
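
To put rough numbers on that, here is a minimal sketch of how many shards 
you would have to allocate on day one to cover a given growth factor with 
over-sharding alone. The function name is made up for illustration, and it 
assumes each shard holds a fixed amount of data and that the shard count 
cannot change after index creation.

```python
def shards_needed_upfront(shards_today: int, growth_factor: int) -> int:
    """Shards to allocate at index creation so the index can absorb
    growth_factor times today's data without reindexing.

    Assumption: capacity scales linearly with shard count.
    """
    return shards_today * growth_factor


# Starting from the 5 shards needed today in the quoted example.
for factor in (2, 10, 100):
    upfront = shards_needed_upfront(5, factor)
    print(f"{factor}x growth: {upfront} shards on day one")
```

Under those assumptions, 100x growth means carrying 500 shards from the 
start, long before the data exists to fill them.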
 

> Searching against 1 index with 50 shards is exactly the same as searching 
> against 50 indices with one shard. 
>

No, it's not. If the shards are on the same box you're paying a performance 
cost there. If the indexes are small and fit in memory you won't feel it 
that much.
 

> Second, as others have mentioned, use multiple indices and hide them away 
> behind an alias. 
>

If each index has say 20 shards, and you have 10 indexes, then you have 200 
shards to run your query against.  This means queries that use all these 
indexes will get slower and slower.
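
The arithmetic, as a quick sketch. The function name is hypothetical, and it 
assumes a query behind the alias touches every shard of every index, with 
no routing to narrow it down.

```python
def shards_hit_per_query(shards_per_index: int, index_count: int) -> int:
    """Shards a single query fans out to when it targets every index.

    Assumption: no routing or index filtering, so every shard is searched.
    """
    return shards_per_index * index_count


# Fan-out grows linearly as indexes are added behind the alias.
for index_count in (1, 5, 10):
    total = shards_hit_per_query(20, index_count)
    print(f"{index_count} indexes x 20 shards = {total} shards per query")
```

Every index you add behind the alias adds another 20 shards to each of 
those queries.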

The ideal situation is to shard split so that when you need more shards, 
you just split.

If ES had this feature today, no one would be arguing against shard 
splitting. It would just be common practice.  The only issue is that ES 
hasn't implemented it yet so it's not a viable solution.
