I think that most of these complications go away to a remarkable degree if you adopt Katta-style random assignment of small shards.
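
To make the idea concrete, here is a rough sketch of what random assignment of many small shards could look like. This is not Katta's actual API; `assign_shards`, `reassign_after_failure`, the `replicas` parameter, and the node and shard names are all made up for illustration. It only assumes that a shard is an opaque unit that can be copied whole onto a node.

```python
import random


def assign_shards(shards, nodes, replicas=2, seed=None):
    """Place each shard on `replicas` distinct nodes chosen at random.

    Hypothetical helper, not part of Katta. Returns {node: [shards]}.
    """
    nodes = list(nodes)
    if replicas > len(nodes):
        raise ValueError("replicas cannot exceed the number of nodes")
    rng = random.Random(seed)
    placement = {node: [] for node in nodes}
    for shard in shards:
        # sample() picks distinct nodes, so a shard never lands twice on one node
        for node in rng.sample(nodes, replicas):
            placement[node].append(shard)
    return placement


def reassign_after_failure(placement, dead_node, seed=None):
    """Scatter the dead node's shards over survivors lacking a copy.

    Whole shards are re-homed; no documents are moved and no shard is
    split or merged.
    """
    rng = random.Random(seed)
    survivors = {n: list(s) for n, s in placement.items() if n != dead_node}
    for shard in placement.get(dead_node, []):
        candidates = [n for n, held in survivors.items() if shard not in held]
        if candidates:  # if every survivor already has a copy, nothing to do
            survivors[rng.choice(candidates)].append(shard)
    return survivors


if __name__ == "__main__":
    shards = [f"shard-{i}" for i in range(100)]
    placement = assign_shards(shards, ["a", "b", "c", "d"], replicas=2, seed=1)
    for node, held in sorted(placement.items()):
        print(node, len(held))
```

With many more shards than nodes, the random placement evens out on its own, and a failed node's load is spread over all survivors rather than dumped on one neighbour.
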
The major simplifications there include:

- no need to move individual documents, nor to split or merge shards, no need for search-server to search-server communications

- search servers do search and nothing else

- placement, balance, replication and query balancing policy is factored out of all real-time paths

- real-time updates can be accommodated in the same framework with minimal changes to the shard management layer

- the shard management is completely agnostic to the actual search semantics.

On Thu, Jan 14, 2010 at 9:46 AM, Yonik Seeley <yo...@lucidimagination.com> wrote:

> I'm actually starting to lean toward "slice" instead of "logical shard".
> In the future we'll want to enable overlapping shards I think (due to
> an Amazon Dynamo type of replication, or due to merging shards, etc),
> and a separate word for a logical slice of the index seems desirable.
>

--
Ted Dunning, CTO
DeepDyve