Re: Allow Join over two sharded collection

Erick Erickson Sat, 01 Jul 2017 09:15:45 -0700

1M docs/month shouldn't make Solr break a sweat. If it really worries
you and you're indexing in a big batch, index during off hours. At
very worst, if you're ingesting them all at once you might have to
throttle the indexing a bit.


Frankly, most of the time acquiring the documents from the system of
record is where the bottleneck is and Solr easily handles the indexing
load.

The other advantage is that if you use implicit routing rather than a
composite ID, you can add shards to your collection one at a time as
required. For time-series data, that's an elegant way to "age out" old
documents: you just drop the oldest shard.
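
A rough sketch of what that looks like against the Collections API, in
case it helps. The host, the collection name "events", the routing field
"month_s" and the shard naming scheme are all made up for illustration;
only the CREATE, CREATESHARD and DELETESHARD actions and the router.*
parameters come from Solr itself. The code only builds the URLs, it does
not call Solr.

```python
from urllib.parse import urlencode

# Assumed values for illustration only; adjust to your own cluster.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "events"        # hypothetical collection name
ROUTE_FIELD = "month_s"      # hypothetical field holding the target shard name


def collections_api_url(action, params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{SOLR_BASE}/admin/collections?{query}"


def monthly_shard(year, month):
    """Hypothetical naming scheme: one shard per calendar month."""
    return f"shard_{year:04d}_{month:02d}"


def create_collection_url(first_shard):
    # With router.name=implicit you list shard names instead of numShards.
    return collections_api_url("CREATE", {
        "name": COLLECTION,
        "router.name": "implicit",
        "shards": first_shard,
        "router.field": ROUTE_FIELD,
    })


def add_shard_url(shard):
    # Add one new shard as required, e.g. at the start of each month.
    return collections_api_url("CREATESHARD", {
        "collection": COLLECTION,
        "shard": shard,
    })


def drop_shard_url(shard):
    # "Age out" old documents by deleting the oldest shard.
    return collections_api_url("DELETESHARD", {
        "collection": COLLECTION,
        "shard": shard,
    })


if __name__ == "__main__":
    print(create_collection_url(monthly_shard(2017, 7)))
    print(add_shard_url(monthly_shard(2017, 8)))
    print(drop_shard_url(monthly_shard(2015, 7)))
```

With router.field set like this, each document would carry the name of
its target shard in that field, so new documents land in the current
month's shard.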

Best,
Erick

On Sat, Jul 1, 2017 at 8:57 AM, mganeshs <mgane...@live.in> wrote:
> Hi Susheel,
>
> Currently we have around 20M documents, and from now on we expect
> about 1M new documents every month.
> The reason we don't want to go for time-based implicit routing is that all
> new documents will end up in the most recent shard, so indexing will be heavy
> for the new shard, whereas older shards will be used just for queries.
> If we use default sharding, the indexing load is distributed across
> all the shards. That's the reason we would like to stick to default
> sharding. But Join is the issue here when default sharding is used :-(
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Allow-Join-over-two-sharded-collection-tp4343443p4343803.html
> Sent from the Solr - User mailing list archive at Nabble.com.
