On 4/9/2021 4:38 PM, Natarajan, Rajeswari wrote:
Trying to understand how Solr co-locates documents that share a prefix when 
using the compositeId router.

Created a collection with 2 shards and the compositeId router. Published 3 docs: 2 docs with the prefix 
"tenant1!" in the docId field and 1 doc with the prefix "tenant2!" in the docId field.
Queried the collection with the shards=shard1 and shards=shard2 parameters.

Saw that all 3 documents were placed in shard1 and that shard2 has no documents. 
Is there a threshold number of docs that must be present in shard1 before 
shard2 is considered?

According to https://sematext.com/blog/solrcloud-large-tenants-and-routing/, 
documents with a first-level prefix will be routed to one shard. Is it 
possible to have the documents of one tenant occupy one shard of a collection 
with the compositeId router?

Composite routing like that does not exactly let you choose which shards will be used.

Here's a relevant quote from the reference guide:

'So "IBM/3!12345" will take 3 bits from the shard key and 29 bits from the unique doc id, spreading the tenant over 1/8th of the shards in the collection. Likewise if the num value was 2 it would spread the documents across 1/4th the number of shards. At query time, you include the prefix(es) along with the number of bits into your query with the _route_ parameter (i.e., q=solr&_route_=IBM/3!) to direct queries to specific shards.'
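
As a rough illustration of the query-time part of that quote, the request could be built as in the sketch below. The host, port, and collection name (mycollection) are made up for the example; only the q and _route_ parameters come from the quote.

```python
from urllib.parse import urlencode


def route_query_url(base_url: str, query: str, route: str) -> str:
    """Build a select URL that carries the _route_ parameter."""
    # urlencode percent-encodes "/" and "!" so the route survives the URL.
    params = {"q": query, "_route_": route}
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint: adjust host and collection to your own setup.
url = route_query_url(
    "http://localhost:8983/solr/mycollection/select", "solr", "IBM/3!"
)
print(url)
```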

The part before the ! character is hashed, and so is the part after it. Bits from the two hashes are then combined into one 32-bit hash, and that full hash decides which shard will get the document.
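
A minimal sketch of that combination step, under some loudly stated assumptions: zlib.crc32 stands in for Solr's real hash function (MurmurHash3), so the actual values and shard assignments will differ; the default split is taken to be 16 bits from the prefix and 16 bits from the rest of the id; and the shards are assumed to cover equal slices of the hash space. The function names are invented for this example and are not Solr APIs. It only handles a single-level prefix.

```python
import zlib


def _hash32(text: str) -> int:
    # Stand-in hash for illustration only; Solr itself uses MurmurHash3.
    return zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF


def composite_hash(doc_id: str) -> int:
    """Combine prefix-hash bits and id-hash bits into one 32-bit value."""
    if "!" not in doc_id:
        return _hash32(doc_id)
    prefix, rest = doc_id.split("!", 1)
    bits = 16  # assumed default number of bits taken from the prefix hash
    if "/" in prefix:
        # "IBM/3" means: take 3 bits from the prefix, 29 from the rest.
        prefix, num = prefix.rsplit("/", 1)
        bits = int(num)
    upper_mask = (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF
    lower_mask = 0xFFFFFFFF >> bits
    return (_hash32(prefix) & upper_mask) | (_hash32(rest) & lower_mask)


def shard_for(doc_id: str, num_shards: int) -> int:
    # Simplification: equal slices of the unsigned 32-bit hash space.
    return (composite_hash(doc_id) * num_shards) >> 32


for doc in ("tenant1!doc1", "tenant1!doc2", "tenant2!doc1"):
    print(doc, "-> shard index", shard_for(doc, 2))
```

In this sketch, both tenant1 documents always share a shard index because their upper bits come from the same prefix hash. Whether tenant2 lands on the same shard depends only on where its prefix happens to hash, so with 2 shards it is quite possible for both tenants to end up together.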

You can't say "use these specific shards" with that capability. The tenant part just tells Solr to only use a certain reduced number of shards, but because it utilizes hashing to figure out which shards to use, there's never any guarantee that tenant1 will choose different shards from tenant2. So you cannot use this to accomplish your original goal of determining the index size of a single tenant within a collection.

Thanks,
Shawn
