Custom routing is a nice improvement for 4.1, but if I understand you correctly it is probably not what you want to use.

As I understand it, you want a collection with a number of slices, one slice for each day (or other period), and then a kind of "sliding window" where you create a new slice in this collection every day and delete the slice corresponding to the oldest day. It is hard to create and delete slices within a particular collection; it is much easier to delete an entire collection. I therefore suggest you make a collection for each day (or other period) and delete the collection corresponding to the oldest day. We do that in our system, which is based on 4.0, although we use one collection per month.

There is a limit to how much you can put into a single slice/shard before indexing and searching become slower; that is part of the reason for sharding. With a collection-per-day solution you can also put as many documents into each day's collection as you want: it is just a matter of splitting it into enough slices/shards and throwing enough hardware at it. If you don't have a lot of data for each day, you can just have one or two slices/shards per day-collection.
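To make the collection-per-period idea concrete, here is a minimal sketch of the bookkeeping, assuming monthly collections and the Solr Collections API (`action=CREATE` / `action=DELETE`). The base URL, the `docs_` name prefix, and the helper names are hypothetical, and the exact set of CREATE parameters varies between Solr versions, so treat the URLs as illustrative. The code only builds the URLs; it does not call Solr.

```python
from datetime import date
from urllib.parse import urlencode

# Hypothetical values: adjust to your own cluster.
SOLR_BASE = "http://localhost:8983/solr"
PREFIX = "docs"  # hypothetical collection-name prefix


def month_collection(year, month):
    """Name of the collection that holds one month of documents."""
    return f"{PREFIX}_{year:04d}_{month:02d}"


def window(today, months_to_keep):
    """Collection names for the newest `months_to_keep` months, oldest first."""
    index = today.year * 12 + (today.month - 1)  # running month counter
    return [
        month_collection(i // 12, i % 12 + 1)
        for i in range(index - months_to_keep + 1, index + 1)
    ]


def collections_api_url(action, **params):
    """Build a Collections API URL (parameter names assumed, see lead-in)."""
    return f"{SOLR_BASE}/admin/collections?" + urlencode({"action": action, **params})


def roll(existing, today, months_to_keep, num_shards):
    """URLs to call so the set of collections matches the current window."""
    wanted = window(today, months_to_keep)
    urls = [
        collections_api_url("CREATE", name=name, numShards=num_shards)
        for name in wanted
        if name not in existing
    ]
    urls += [
        collections_api_url("DELETE", name=name)
        for name in sorted(existing)
        if name not in wanted
    ]
    return urls
```

Running `roll(...)` from a daily or monthly job and issuing the returned URLs would create the newest collection and drop whichever ones have fallen out of the window, without ever touching individual slices.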

We run our Solr cluster across 10 machines, each with 4 CPU cores and 4 GB of RAM, and we are able to index over 1 billion documents per month into a collection with 40 shards (= 40 slices, because we are not using replication), 4 shards on each Solr node in the cluster. We still do not know how the system will behave once we have, and search across, many collections with 1+ billion documents each (up to 24, since we are supposed to keep data for 2 years before we can throw it away).
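For a rough feel of the scale, the arithmetic behind those numbers can be sketched as below. It assumes, purely for illustration, that all 24 monthly collections end up on the same 10 machines and that "over 1 billion" is taken as exactly 1 billion. The `cross_search_url` helper is hypothetical; it relies on the SolrCloud `collection` request parameter, which takes a comma-separated list of collections to search.

```python
from urllib.parse import urlencode

NODES = 10
SHARDS_PER_COLLECTION = 40
DOCS_PER_MONTH = 1_000_000_000  # lower bound: "over 1 billion"
COLLECTIONS_KEPT = 24           # 2 years of monthly collections

shards_per_node = SHARDS_PER_COLLECTION // NODES          # 4 per collection
docs_per_shard = DOCS_PER_MONTH // SHARDS_PER_COLLECTION  # 25,000,000
total_shards = SHARDS_PER_COLLECTION * COLLECTIONS_KEPT   # 960
total_shards_per_node = total_shards // NODES             # 96 (same-hardware assumption)
total_docs = DOCS_PER_MONTH * COLLECTIONS_KEPT            # 24,000,000,000


def cross_search_url(base, collections, query):
    """Query URL that searches several collections at once (hypothetical helper)."""
    params = {"q": query, "collection": ",".join(collections)}
    # safe="," keeps the collection list readable instead of encoding commas.
    return f"{base}/{collections[0]}/select?" + urlencode(params, safe=",")
```

So each shard receives at least 25 million documents a month, and a search across the full two years would fan out to 960 shards.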

Regards, Per Steffensen

On 12/18/12 8:20 PM, Scott Stults wrote:
I'm going to be building a Solr cluster and I want to have a rolling set of
slices so that I can keep a fixed number of days in my collection. If I
send an update to a particular slice leader, will it always hash the unique
key and (probably) forward the doc to another leader?


Thank you,
Scott

