Unsubscribe

Sent from my iPhone
> On Jul 1, 2017, at 8:02 PM, Susheel Kumar <susheel2...@gmail.com> wrote:
>
> Depending on your use case, people also use collection aliasing for
> time-series data. See below:
>
> https://blog.cloudera.com/blog/2013/10/collection-aliasing-near-real-time-search-for-really-big-data/
>
>> On Sat, Jul 1, 2017 at 7:13 PM, Susheel Kumar <susheel2...@gmail.com> wrote:
>>
>> As Erick said, 1M docs/month isn't a big deal. I have 45+ million docs in
>> one shard, but YMMV depending on other factors.
>>
>> Also, there is a lot of confusion in the terminology. The default routing
>> is compositeId routing. The implicit routing which Erick mentioned is the
>> manual routing. https://issues.apache.org/jira/browse/SOLR-6630
>>
>> Which routing are you suggesting to use? Can you clarify again? Also,
>> what's your exact use case? Do you query older documents, or do you not
>> need to, with most or all of your queries going to the shard with the
>> newer documents?
>>
>> Thanks,
>> Susheel
>>
>> On Sat, Jul 1, 2017 at 12:14 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>
>>> 1M docs/month shouldn't make Solr break a sweat. If it really worries
>>> you and you're indexing in a big batch, index during off hours. At
>>> very worst, if you're ingesting them all at once you might have to
>>> throttle the indexing a bit.
>>>
>>> Frankly, most of the time acquiring the documents from the system of
>>> record is where the bottleneck is and Solr easily handles the indexing
>>> load.
>>>
>>> The other advantage is that if you use implicit routing rather than a
>>> composite ID, you can add shards to your collection one at a time as
>>> required. For time-series data that's an elegant way to "age out" old
>>> documents.
>>>
>>> Best,
>>> Erick
>>>
>>>> On Sat, Jul 1, 2017 at 8:57 AM, mganeshs <mgane...@live.in> wrote:
>>>> Hi Susheel,
>>>>
>>>> Currently we have around 20M documents already, and from now on we
>>>> are expecting 1M documents every month.
>>>>
>>>> The reason we don't want to go for time-based implicit routing is
>>>> that all new documents will end up in the most recent shard, so
>>>> indexing will be heavy for the new shard, whereas older shards will
>>>> be used just for queries. If we have default sharding, then the
>>>> indexing load is distributed across all the shards. That's the reason
>>>> we would like to stick to default sharding. But Join is the issue
>>>> over here when default sharding is used :-(
>>>>
>>>> --
>>>> View this message in context: http://lucene.472066.n3.nabble.com/Allow-Join-over-two-sharded-collection-tp4343443p4343803.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
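For anyone reading this thread in the archive, the two time-series options mentioned above (implicit routing with one shard added per period, and collection aliasing) could be sketched roughly as follows. This only builds Solr Collections API URLs and does not send them. The host, the collection and alias names, and the one-shard-per-month naming scheme are all assumptions made for illustration, not something taken from the thread.

```python
from urllib.parse import urlencode

# Assumption: a local SolrCloud node; adjust host and port for your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def collections_api_url(params):
    """Build a Collections API URL from a dict of query parameters."""
    return SOLR_BASE + "/admin/collections?" + urlencode(params)


def monthly_shard_name(year, month):
    """Hypothetical naming scheme: one shard per calendar month."""
    return "m{:04d}_{:02d}".format(year, month)


def create_collection_url(collection, first_shard):
    """CREATE a collection that uses the implicit (manual) router."""
    return collections_api_url({
        "action": "CREATE",
        "name": collection,
        "router.name": "implicit",
        "shards": first_shard,      # implicit routing needs named shards
        "replicationFactor": 1,
    })


def add_shard_url(collection, shard):
    """CREATESHARD adds one more named shard to an implicit-router collection."""
    return collections_api_url({
        "action": "CREATESHARD",
        "collection": collection,
        "shard": shard,
    })


def alias_url(alias, collections):
    """CREATEALIAS points one alias name at one or more collections."""
    return collections_api_url({
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    })


if __name__ == "__main__":
    # "docs" and "docs_recent" are made-up names for this sketch.
    print(create_collection_url("docs", monthly_shard_name(2017, 7)))
    print(add_shard_url("docs", monthly_shard_name(2017, 8)))
    print(alias_url("docs_recent", ["docs_2017_07", "docs_2017_08"]))
```

With the implicit router, each document has to be directed to a shard explicitly at index time (for example with the `_route_` parameter), which is the reason all new documents land in the newest shard, as mganeshs points out.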