On 3/14/2018 9:26 AM, Chris Ulicny wrote:
We've been looking at using implicit for one of our collections, and there
seems to be some weird behavior that we're not sure whether it was expected
or not.

Is it recommended to use a uniqueKey for implicit routing? Is the following
behavior intended?

We have encountered the following issue. Create a collection with two
shards (S1,S2), implicit routing, with "id" as uniqueKey, and router.field
as "routingfield". If we index

{"id":"id1","routingfield":"S1"}

It goes into shard S1. Then if we need to reindex the document with a
different "routingfield" value:

{"id":"id1","routingfield":"S2"}

It goes into shard S2. However, when you select the document in a query, it
seems that both of those documents exist, but get deduped on return since
selecting all documents only ever returns a single document. Adding [shard]
to the fl list results in the document coming from S1 some of the time and
S2 the rest.

Trying to use /get with just the id results in a NullReferenceException.
Adding the _route_ parameter in works, but both documents can be retrieved.

This is a common misconception about the implicit router. The name accurately describes what the router does, but "implicit" is one of those overloaded English words that is often misunderstood.

A better name for "implicit" would be "manual." By choosing this router, you have told Solr not to worry about routing -- that you will handle it yourself, and that you will make sure every document is unique across all shards. You then indexed the same id to two shards -- intentionally. Solr isn't going to prevent that -- it could only do so by checking the other shards on every update, which would make all indexing a LOT slower.
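To make that concrete, here is a toy in-memory model in Python. It is not Solr code: the two dicts stand in for shards S1 and S2, and index(), reindex() and shards_holding() are hypothetical helpers invented for this illustration. It assumes each shard enforces the uniqueKey only within itself, so the client has to remove the old copy before moving a document to another shard.

```python
# Toy in-memory model of two shards; this is NOT Solr code.
# Assumption: each shard enforces uniqueKey only within itself.
shards = {"S1": {}, "S2": {}}

def index(doc):
    """Manual routing: the document itself names its target shard."""
    shards[doc["routingfield"]][doc["id"]] = doc

def reindex(doc):
    """Client-side fix: drop the id from every other shard, then index."""
    for name, shard in shards.items():
        if name != doc["routingfield"]:
            shard.pop(doc["id"], None)
    index(doc)

def shards_holding(doc_id):
    """Return the sorted names of all shards that contain doc_id."""
    return sorted(name for name, shard in shards.items() if doc_id in shard)

# Reproduce the situation from the question: same id, two routing values.
index({"id": "id1", "routingfield": "S1"})
index({"id": "id1", "routingfield": "S2"})
duplicated = shards_holding("id1")  # the id now lives on both shards

# Moving the document with the client-side cleanup leaves a single copy.
reindex({"id": "id1", "routingfield": "S2"})
fixed = shards_holding("id1")
```

In a real collection the cleanup step would be a delete sent to the old shard before the add; the model only shows why nothing removes the old copy unless the client does it.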

If you want Solr to handle routing for you, then you must use the compositeId router. With that router, you do not get to specify which shard contains your document, and you cannot add shards after the collection is created. Later you can split existing shards (the SPLITSHARD action), but you can't add new ones.
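As a rough sketch of why hash-based routing avoids the duplicate: the shard is derived from the id alone, so a reindexed document lands on the same shard and overwrites the old copy. The Python below is only an illustration. pick_shard() is a hypothetical helper, and zlib.crc32 with a modulo is a stand-in for the real router, which assigns hash ranges to shards. It also assumes the default setup where the hash is taken from the uniqueKey rather than from a router.field.

```python
import zlib

SHARDS = ["S1", "S2"]

def pick_shard(doc):
    """Choose a shard from the id only; other fields are ignored.

    Stand-in hash for illustration: the real compositeId router maps
    ids onto per-shard hash ranges, not crc32 modulo the shard count.
    """
    digest = zlib.crc32(doc["id"].encode("utf-8"))
    return SHARDS[digest % len(SHARDS)]

# Same id, different "routingfield" values: the shard choice is identical.
first = pick_shard({"id": "id1", "routingfield": "S1"})
second = pick_shard({"id": "id1", "routingfield": "S2"})
```

Because both calls return the same shard, the second add replaces the first document instead of creating a second copy elsewhere.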

Thanks,
Shawn
