[ https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334957#comment-15334957 ]
Shikha Somani commented on SOLR-8297:
-------------------------------------

The *Any* option was introduced to support the existing cloud join scenario, i.e. where fromCollection has a single shard. If asserting the behavior of *Any* is the only concern, I will write test cases to verify it thoroughly. Below is a scenario that resembles real-world usage; I will write a test case based on it.

*Scenario*: There are 2 collections in a 2-node cluster:
* product_category: holds values like books, toys, etc. _Single shard_
* sale: holds information about current sales. The sale and product collections are related: the sale collection contains a ‘product key’. _Multiple shards_

*Query*: Find sale information with product information:
{!join from=id to=productKey fromCollection=product_category}

*Cluster information*:
||Node1|| ||Node2|| ||
|Product_category_shard1_replica1|80000000-7fffffff|Product_category_shard1_replica2|80000000-7fffffff|
|Sale_shard1_replica1|0-7fffffff|Sale_shard2_replica1|80000000-ffffffff|

In this scenario, a join between Sale and Product_category is possible only with the “Any” condition; otherwise the range check fails and prevents the join query.

> Allow join query over 2 sharded collections: enhance functionality and
> exception handling
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-8297
>                 URL: https://issues.apache.org/jira/browse/SOLR-8297
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 5.3
>            Reporter: Paul Blanchaert
>
> Enhancement based on SOLR-4905. New Jira issue raised as suggested by Mikhail
> Khludnev.
> A) exception handling:
> The exception "SolrCloud join: multiple shards not yet supported" thrown in
> the function findLocalReplicaForFromIndex of JoinQParserPlugin is not
> triggered correctly: In my use-case, I've a join on a facet.query and when my
> results are only found in 1 shard and the facet.query with the join is
> querying the last replica of the last slice, then the exception is not thrown.
> I believe it's better to verify the nr of slices when we want to verify the
> "multiple shards not yet supported" exception (so exception is thrown when
> zkController.getClusterState().getSlices(fromIndex).size()>1).
> B) functional enhancement:
> I would expect that there is no problem to perform a cross-core join over
> sharded collections when the following conditions are met:
> 1) both collections are sharded with the same replicationFactor and numShards
> 2) router.field of the collections is set to the same "key-field" (collection
> of "fromindex" has router.field = "from" field and collection joined to has
> router.field = "to" field)
> The router.field setup ensures that documents with the same "key-field" are
> routed to the same node.
> So the combination based on the "key-field" should always be available within
> the same node.
> From a user perspective, I believe these assumptions seem to be a "normal"
> use-case in the cross-core join in SolrCloud.
> Hope this helps
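
For illustration only, the range check discussed in the comment can be sketched in plain Java, using the hash ranges from the cluster table. This is not the code from the patch or from JoinQParserPlugin: the class JoinRangeSketch, the HashRange helper, the Match enum and the canJoinLocally method are hypothetical names, and the assumption that a strict check compares hash ranges for equality is mine.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from Solr or the SOLR-8297 patch.
public class JoinRangeSketch {

    /** A shard hash range with inclusive signed 32-bit bounds. */
    static final class HashRange {
        final int min;
        final int max;

        HashRange(int min, int max) {
            this.min = min;
            this.max = max;
        }

        /** Parses hex text such as "80000000-7fffffff", as written in the cluster table. */
        static HashRange parse(String text) {
            String[] parts = text.split("-");
            return new HashRange(Integer.parseUnsignedInt(parts[0], 16),
                                 Integer.parseUnsignedInt(parts[1], 16));
        }

        boolean sameAs(HashRange other) {
            return min == other.min && max == other.max;
        }
    }

    /** Assumed match modes: RANGE requires an identical hash range, ANY skips the range check. */
    enum Match { RANGE, ANY }

    /**
     * Decides whether a "to" shard may join against the "from" replicas on the same node.
     * With ANY, the presence of a local "from" replica is enough.
     */
    static boolean canJoinLocally(HashRange toShard, List<HashRange> localFromShards, Match mode) {
        if (localFromShards.isEmpty()) {
            return false; // nothing to join against on this node
        }
        if (mode == Match.ANY) {
            return true;
        }
        for (HashRange from : localFromShards) {
            if (from.sameAs(toShard)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Node1 from the table: Sale_shard1 next to the single Product_category shard.
        HashRange saleShard1 = HashRange.parse("0-7fffffff");
        List<HashRange> productCategory = Arrays.asList(HashRange.parse("80000000-7fffffff"));

        System.out.println("RANGE: " + canJoinLocally(saleShard1, productCategory, Match.RANGE));
        System.out.println("ANY:   " + canJoinLocally(saleShard1, productCategory, Match.ANY));
    }
}
```

Under these assumptions, the single Product_category shard covers the whole hash space, so its range never equals the range of either Sale shard and only the ANY mode lets the join proceed.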