It seems to me that most people arguing this have trivial scalability requirements. Not trying to be rude by saying that btw. But shard splitting is really the only way to scale from 250GB indexed to 500TB indexed.
On Thursday, December 11, 2014 4:58:42 PM UTC-8, Andrew Selden wrote:
>
> I would agree that shard splitting is not the best approach. Much better
> to design for expansion by building in layers of indirection into your
> application through the techniques of over-sharding, index aliasing, and
> multiple indices.
>

Yes.. all those are lame attempts at shard splitting. Over-sharding is wasteful. It might not have a significant performance impact in practice if you only add a few extra shards, but if you only add a few you're not going to be able to increase your capacity by much. Using multiple indexes is just a way to cheat by adding more shards in a roundabout fashion, and your runtime query performance will suffer because of it.

>
> First, you can allocate more shards than you need when you create the
> index. If you need 5 shards today, but think you might need 10 shards in 6
> months, then just create the index with 10 shards. We call this
> over-sharding. There really is no penalty to doing this within reason.
>

So you've only given yourself 2x headroom in capacity. That's not very elastic. With shard splitting you can go from 2x to 10x to 100x without any wasted IO in over-indexing.

> Searching against 1 index with 50 shards is exactly the same as searching
> against 50 indices with one shard.
>

No it's not.. if the shards are on the same box you're paying a performance cost there. If the indexes are small and fit in memory you won't feel it that much.

> Second, as others have mentioned, use multiple indices and hide them away
> behind an alias.
>

If each index has, say, 20 shards, and you have 10 indexes, then you have 200 shards to run your query against. This means queries that use all these indexes will get slower and slower as you add indexes.

The ideal situation is to shard split, so that when you need more shards, you just split. If ES had this feature today, no one would be arguing against shard splitting. It would just be common practice.

The only issue is that ES hasn't implemented it yet so it's not a viable solution.
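
To make the splitting argument a bit more concrete, here's a rough sketch of why splitting by a whole factor is cheap while picking an arbitrary new shard count is not. It assumes documents are routed by a stable hash of the id modulo the shard count. The `shard_for` helper and the CRC32 hash are stand-ins I made up for illustration, not the hash ES actually uses.

```python
import zlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Hypothetical routing: stable hash of the id, modulo the shard count."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def moved_fraction(doc_ids, old_shards, new_shards):
    """Fraction of documents whose shard number changes with the shard count."""
    moved = sum(
        1 for d in doc_ids
        if shard_for(d, old_shards) != shard_for(d, new_shards)
    )
    return moved / len(doc_ids)


def split_is_local(doc_ids, old_shards, factor):
    """True if every document in old shard s lands in a new shard that is
    congruent to s modulo old_shards, so each old shard can be split on
    its own without shuffling documents between unrelated shards."""
    new_shards = old_shards * factor
    return all(
        shard_for(d, new_shards) % old_shards == shard_for(d, old_shards)
        for d in doc_ids
    )


doc_ids = [f"doc-{i}" for i in range(10_000)]

# Splitting 5 shards into 10 or 100: each old shard only feeds its own children.
print(split_is_local(doc_ids, 5, 2))
print(split_is_local(doc_ids, 5, 20))

# Going from 5 to 6 shards instead: documents get scattered across shards,
# which in practice means a full reindex.
print(moved_fraction(doc_ids, 5, 6))
```

The point is that when the new shard count is a multiple of the old one, a document in old shard s can only end up in one of the children of s, so a split only has to touch one shard at a time. Any other change in shard count moves documents all over the place.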
