I am considering using SolrCloud, but I have a use case that I am not sure it covers.
I would like to keep an index up to date in real time, but I would also like to restate the past occasionally. I would restate the past by batch processing historical data.

My idea is to shard the Solr collection by date range, adding more shards as I move forward in time. To restate historical data, a separate process would index one shard's worth of data. (This keeps the servers meant for production search from having to handle the load of historical indexing.) I would then move the index files to the Solr servers and register the newly created index with the server, replacing the existing shard.

Before SolrCloud I was able to do something similar using the core admin, but that did not give me a single search over the entire "collection". I had to query each core manually to cover the full search index.

Essentially, the questions are:

1- Is it possible to shard by date range in this way?
2- Is it possible to swap out the index used by a shard?
3- Is there a different way I should be thinking about this?

Max
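
P.S. To make the date-range sharding concrete, here is a minimal Python sketch of the mapping I have in mind. The one-shard-per-month granularity, the shard naming scheme, and the function names are all assumptions for illustration only; none of this is something SolrCloud provides.

```python
from datetime import date
from typing import List


def shard_for(day: date) -> str:
    """Return the (hypothetical) name of the shard covering the month of `day`."""
    return f"shard_{day.year:04d}_{day.month:02d}"


def shards_between(start: date, end: date) -> List[str]:
    """List the shard names for every month from `start` to `end`, inclusive.

    This is the set of shards a historical restatement over that
    date range would need to rebuild and swap in.
    """
    if end < start:
        raise ValueError("end must not be before start")
    names = []
    year, month = start.year, start.month
    # Walk month by month until we pass the month containing `end`.
    while (year, month) <= (end.year, end.month):
        names.append(f"shard_{year:04d}_{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names
```

Real-time updates would go to whatever shard `shard_for(date.today())` names, while a restatement of, say, last quarter would rebuild the shards returned by `shards_between` offline and then swap them in.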