Right. That’s the whole point of hashing the <uniqueKey> in the first place. I’ve never seen much imbalance in how documents are distributed using compositeId, maybe a percent or two.
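A rough way to see how even hash routing is, and how noisy small samples are. This is only a sketch: it uses an MD5 prefix as a stand-in 32-bit hash (Solr itself uses MurmurHash3), splits the hash space into equal contiguous ranges, and the doc ids and function names are made up for illustration.

```python
import hashlib
from collections import Counter


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a doc id to a shard by splitting a 32-bit hash space evenly."""
    # Stand-in hash: first 4 bytes of MD5. Solr uses MurmurHash3 instead.
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    h = int.from_bytes(digest[:4], "big")
    # h is in [0, 2**32), so this lands in [0, num_shards).
    return (h * num_shards) >> 32


def shard_counts(num_docs: int, num_shards: int = 10) -> list:
    """Count how many of the hypothetical ids doc-0..doc-N land on each shard."""
    counts = Counter(shard_for(f"doc-{i}", num_shards) for i in range(num_docs))
    return [counts.get(s, 0) for s in range(num_shards)]


def relative_spread(num_docs: int, num_shards: int = 10) -> float:
    """(largest shard - smallest shard) divided by the ideal shard size."""
    sizes = shard_counts(num_docs, num_shards)
    return (max(sizes) - min(sizes)) / (num_docs / num_shards)


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, shard_counts(n), round(relative_spread(n), 3))
```

Running it prints the per-shard counts for each collection size; the relative spread should shrink noticeably as the number of docs grows.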
Do be aware that you can’t really extrapolate from, say, 100 docs over 10 shards. With such low numbers you can get some anomalies but as you add docs that’ll smooth out.

The only time that’s not true is if you use the “Multi-level composite ID”. Here’s a blog:
https://lucidworks.com/post/multi-level-composite-id-routing-solrcloud/

I _strongly_ recommend you do _not_ use this unless and until you have a demonstrable use-case. Other than time series data (which uses implicit routing anyway), when people try to control what shards a doc lands on they usually cause themselves unnecessary trouble ;)

Best,
Erick

> On Jun 30, 2019, at 3:06 PM, Nawab Zada Asad Iqbal <khi...@gmail.com> wrote:
>
> @Erick
>
> Actually, i thought further and realized what you were saying. I am hoping
> to rely on the murmur3 hash of the routing key to find the destination
> shard.
>
> On Sun, Jun 30, 2019 at 3:32 AM Nawab Zada Asad Iqbal <khi...@gmail.com>
> wrote:
>
>> Hi Erick,
>>
>> I plan to use the composite-id routing. And I can use the same routing
>> part of the key to determine the shard number in ADDREPLICA command (using
>> the route parameter). I think this solution will work for me.
>>
>> Thanks
>> Nawab
>>
>> On Sat, Jun 29, 2019 at 8:55 AM Erick Erickson <erickerick...@gmail.com>
>> wrote:
>>
>>> What’s your basis for thinking that some shard will get more queries?
>>> Unless you’re using implicit routing, you really have no control over
>>> either where docs end up or thus where queries go.
>>>
>>> If you do somehow know some shards get more queries, one strategy is to
>>> simply have more _replicas_ for those shards with the ADDREPLICA
>>> collections API command.
>>>
>>>> On Jun 29, 2019, at 7:00 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>>>>
>>>> On 6/29/2019 12:23 AM, Nawab Zada Asad Iqbal wrote:
>>>>> is it possible to specify different number of replicas for different
>>>>> shards?
>>>>> i.e if I expect some shard to get more queries , i can add more
>>>>> replicas to that shard alone, instead of adding replicas for all the
>>>>> shards.
>>>>
>>>> On initial collection creation, I don't think that is possible -- the
>>>> number of replicas requested will apply to every shard. But you can add
>>>> and remove replicas on shards after collection creation, so this is
>>>> achievable.
>>>>
>>>> Thanks,
>>>> Shawn
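For reference, the per-shard ADDREPLICA call discussed in this thread could be built roughly like this. It is a sketch only: the host, collection name, shard name and route key are hypothetical, the helper function is made up, and it relies on my understanding that the Collections API accepts either shard or _route_ to pick the target shard.

```python
from urllib.parse import urlencode


def add_replica_url(base_url, collection, shard=None, route=None):
    """Build a Collections API ADDREPLICA URL for a single shard.

    Pass exactly one of `shard` (explicit shard name) or `route`
    (a routing key, sent as the _route_ parameter).
    """
    if (shard is None) == (route is None):
        raise ValueError("pass exactly one of shard or route")
    params = {"action": "ADDREPLICA", "collection": collection}
    if shard is not None:
        params["shard"] = shard
    else:
        params["_route_"] = route
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and names, for illustration only.
    print(add_replica_url("http://localhost:8983/solr", "mycoll", shard="shard1"))
    print(add_replica_url("http://localhost:8983/solr", "mycoll", route="tenantA!"))
```

Calling it once per extra replica on the busy shard gives that shard more replicas than the others, which is the strategy described above.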