Re: Document routing with composted router

Shawn Heisey Fri, 09 Apr 2021 16:51:49 -0700

On 4/9/2021 4:38 PM, Natarajan, Rajeswari wrote:

> Trying to understand how Solr co-locates documents with a prefix using
> the composite id router scheme.
>
> Created a collection with 2 shards with the composite id router.
> Published 3 docs: 2 docs with prefix "tenant1!" in the docId field and
> 1 doc with prefix "tenant2!" in the docId. Queried the collection with
> the shards=shard1 and shards=shard2 parameters.
>
> Saw that all 3 documents are placed in shard1 and there are no
> documents on shard2. Is there a certain threshold number of docs that
> must be present in shard1 before shard2 is considered?
>
> According to https://sematext.com/blog/solrcloud-large-tenants-and-routing/ ,
> documents with a first-level prefix will be routed to one shard. Is it
> possible to have the documents of one tenant occupy one shard in a
> collection under the composite id router scheme?

Composite routing like that does not exactly let you choose which shards
will be used.


Here's a relevant quote from the reference guide:

'So "IBM/3!12345" will take 3 bits from the shard key and 29 bits from
the unique doc id, spreading the tenant over 1/8th of the shards in the
collection. Likewise if the num value was 2 it would spread the
documents across 1/4th the number of shards. At query time, you include
the prefix(es) along with the number of bits into your query with the
_route_ parameter (i.e., q=solr&_route_=IBM/3!) to direct queries to
specific shards.'
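
For illustration only, a request like the one in that quote could be
built as sketched here. The host, port, and collection name are made up;
only the q and _route_ parameters come from the quote.

```python
from urllib.parse import urlencode

# Hypothetical Solr base URL and collection name, for illustration only.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycollection/select"


def routed_query_url(query: str, route: str) -> str:
    """Build a select URL that passes the routing prefix via _route_."""
    params = {"q": query, "_route_": route}
    # urlencode percent-encodes "/" and "!" so the prefix survives transport.
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


print(routed_query_url("solr", "IBM/3!"))
```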

The part before the ! is hashed, as is the part after the ! character.
The hash bits are then combined, and that full hash decides which shard
will get the document.

You can't say "use these specific shards" with that capability. The
tenant part just tells Solr to only use a certain reduced number of
shards, but because it utilizes hashing to figure out which shards to
use, there's never any guarantee that tenant1 will choose different
shards from tenant2. So you cannot use this to accomplish your original
goal of determining the index size of a single tenant within a collection.
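
Here is a rough sketch of that bit-combining idea, not Solr's actual
code. It makes several assumptions: CRC32 stands in for the hash Solr
really uses, the default split is taken to be 16 bits from the prefix,
and shards are modeled as equal slices of an unsigned 32-bit range. The
function names are invented for this example.

```python
import zlib

FULL_MASK = 0xFFFFFFFF


def hash32(text: str) -> int:
    """Stand-in 32-bit hash (CRC32). Assumption: Solr uses a different hash."""
    return zlib.crc32(text.encode("utf-8")) & FULL_MASK


def composite_hash(doc_id: str, default_bits: int = 16) -> int:
    """Combine upper bits from the prefix hash with lower bits from the rest."""
    if "!" not in doc_id:
        return hash32(doc_id)
    prefix, rest = doc_id.split("!", 1)
    bits = default_bits
    if "/" in prefix:
        # "IBM/3" means: take 3 bits from the prefix hash.
        prefix, num = prefix.rsplit("/", 1)
        bits = int(num)
    upper_mask = (FULL_MASK << (32 - bits)) & FULL_MASK
    lower_mask = FULL_MASK >> bits
    return (hash32(prefix) & upper_mask) | (hash32(rest) & lower_mask)


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map the hash onto equal slices of the 32-bit range (simplified)."""
    return (composite_hash(doc_id) * num_shards) >> 32


# With 2 shards only the top bit matters, and it comes from the prefix,
# so every document of one tenant shares a shard. Which shard that is
# depends purely on the hash, so two tenants can easily collide.
for doc_id in ("tenant1!doc1", "tenant1!doc2", "tenant2!doc1"):
    print(doc_id, "-> shard index", shard_for(doc_id, 2))
```

In this model each tenant has an even chance of landing in either of
the 2 shards, so seeing both tenants in shard1 is not surprising and
has nothing to do with a document-count threshold.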


Thanks,
Shawn
