1M docs/month shouldn't make Solr break a sweat. If it really worries you and you're indexing in a big batch, index during off hours. At the very worst, if you're ingesting them all at once, you might have to throttle the indexing a bit.
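If throttling ever does become necessary, it can be as simple as sending fixed-size batches with a short pause between them. The following is only a rough sketch: `send_batch` is a hypothetical callback standing in for whatever client call posts documents to the collection, and the batch size and pause are illustrative values, not tuned recommendations.

```python
import time
from typing import Callable, Iterable, Iterator, List


def chunks(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` documents."""
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


def index_throttled(
    docs: Iterable[dict],
    send_batch: Callable[[List[dict]], None],  # hypothetical: posts one batch to Solr
    batch_size: int = 1000,                    # illustrative value
    pause_seconds: float = 0.5,                # illustrative value
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send documents in batches, pausing after each one. Returns the count sent."""
    total = 0
    for batch in chunks(docs, batch_size):
        send_batch(batch)
        total += len(batch)
        sleep(pause_seconds)  # the throttle: give the cluster room between batches
    return total
```

Raising `pause_seconds` or lowering `batch_size` slows the ingest; setting the pause to zero removes the throttle entirely.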
Frankly, most of the time acquiring the documents from the system of record is where the bottleneck is, and Solr easily handles the indexing load.

The other advantage is that if you use implicit routing rather than the compositeId router, you can add shards to your collection one at a time as required. For time-series data that's an elegant way to "age out" old documents.

Best,
Erick

On Sat, Jul 1, 2017 at 8:57 AM, mganeshs <mgane...@live.in> wrote:
> Hi Susheel,
>
> Currently we have around 20M documents already, and from now on we are
> expecting 1M documents every month.
> The reason we don't want to go for time-based implicit routing is that all
> documents will end up in the most recent shard, so indexing will be heavy for
> the new shard, whereas older shards will be used just for queries.
> If we use default sharding, the indexing load is distributed across
> all the shards. That's the reason we would like to stick to default
> sharding. But the join is the issue here when default sharding is used :-(
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Allow-Join-over-two-sharded-collection-tp4343443p4343803.html
> Sent from the Solr - User mailing list archive at Nabble.com.
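As a rough illustration of the implicit-routing idea above: with the implicit router, shards can be added and dropped through the Collections API `CREATESHARD` and `DELETESHARD` actions. The sketch below only builds the request URLs and does not call Solr. The base URL, the collection name, and the one-shard-per-month naming scheme are assumptions made up for this example, not anything from the thread.

```python
from urllib.parse import urlencode


def collections_api_url(base_url: str, **params: str) -> str:
    """Build a Collections API URL from a Solr base URL and query parameters."""
    return f"{base_url}/admin/collections?{urlencode(params)}"


def monthly_shard_name(year: int, month: int) -> str:
    """Hypothetical naming scheme: one shard per calendar month."""
    return f"shard_{year:04d}_{month:02d}"


def create_shard_url(base_url: str, collection: str, year: int, month: int) -> str:
    """URL that adds a new monthly shard to an implicitly routed collection."""
    return collections_api_url(
        base_url,
        action="CREATESHARD",
        collection=collection,
        shard=monthly_shard_name(year, month),
    )


def delete_shard_url(base_url: str, collection: str, year: int, month: int) -> str:
    """URL that removes an old monthly shard, i.e. "ages out" its documents."""
    return collections_api_url(
        base_url,
        action="DELETESHARD",
        collection=collection,
        shard=monthly_shard_name(year, month),
    )


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # assumed local Solr instance
    print(create_shard_url(base, "docs", 2017, 8))
    print(delete_shard_url(base, "docs", 2015, 8))
```

Issuing the first URL at the start of each month and the second once a month falls outside the retention window would give the rolling window described above. Documents would still have to be directed to the right shard at index time, for example through a routing field or the `_route_` parameter.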