[jira] [Comment Edited] (SOLR-8297) Allow join query over 2 sharded collections: enhance functionality and exception handling

2018-07-31 Thread jyoti Tiwari (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563802#comment-16563802
 ] 

jyoti Tiwari edited comment on SOLR-8297 at 7/31/18 3:08 PM:
-

Erick:

I believe I am meeting all the conditions for running a Solr join query that 
you gave in points A) exception handling and B) functional enhancement.

Details:

Solr version: 4

Total nodes: 2

Total collections: 2, say engineeringlogs1 and engineeringlogs2

Cores on the first node: engineeringlogs1_shard1_replica1 and 
engineeringlogs2_shard1_replica1

Cores on the second node: engineeringlogs1_shard2_replica1 and 
engineeringlogs2_shard2_replica1

I am trying to join two cores on the same node 
(engineeringlogs1_shard1_replica1 and engineeringlogs2_shard1_replica1).

Query, run against the default core engineeringlogs1_shard1_replica1:

{!join from=Maximum_Battery_Charge to=Check_Battery_Charge 
fromIndex=engineeringlogs2_shard1_replica1}Initial_Battery_Charge: "87%"

However, the query still fails with the following error:

"error": { "metadata": [ "error-class", 
"org.apache.solr.common.SolrException", "root-error-class", 
"org.apache.solr.common.SolrException" ], "msg": "Cross-core join: no such core 
engineeringlogs2_shard1_replica1", "code": 400 } }

Please advise how I can resolve this issue.
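
For illustration, a minimal Python sketch of how the two requests relevant to this report could be built. It makes no network calls. The base URL http://localhost:8983/solr and the helper names are assumptions, not taken from the report. The first URL is the CoreAdmin STATUS call, which lists the cores loaded on one node and so lets you compare them with the name passed in fromIndex. The second is the join itself, sent directly to the "to" core with distrib=false and with the local-params query URL-encoded (note the % in "87%").

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def core_status_url(base_url):
    """URL of the CoreAdmin STATUS call, which lists the cores hosted on one node."""
    return f"{base_url}/admin/cores?" + urlencode({"action": "STATUS", "wt": "json"})


def join_query_url(base_url, to_core, from_core, from_field, to_field, inner_query):
    """URL of a non-distributed cross-core join sent directly to `to_core`."""
    # Doubled braces produce the literal { and } of the local-params syntax.
    q = f"{{!join from={from_field} to={to_field} fromIndex={from_core}}}{inner_query}"
    params = {"q": q, "distrib": "false", "wt": "json"}
    return f"{base_url}/{to_core}/select?" + urlencode(params)


BASE = "http://localhost:8983/solr"  # assumed node address, not from the report

status_url = core_status_url(BASE)
join_url = join_query_url(
    BASE,
    to_core="engineeringlogs1_shard1_replica1",
    from_core="engineeringlogs2_shard1_replica1",
    from_field="Maximum_Battery_Charge",
    to_field="Check_Battery_Charge",
    inner_query='Initial_Battery_Charge:"87%"',
)

print(status_url)
print(join_url)
```

If the core named in fromIndex does not appear in the STATUS response of the node that receives the query, that would be consistent with the "no such core" message, although the report alone does not confirm the cause.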

 



> Allow join query over 2 sharded collections: enhance functionality and 
> exception handling
> -
>
> Key: SOLR-8297
> URL: https://issues.apache.org/jira/browse/SOLR-8297
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.3
>Reporter: Paul Blanchaert
>Priority: Major
> Attachments: SOLR-8297.patch, SOLR-8297_Latest.patch
>
>
> h2. Proposal
> h3. General Idea
> Apply [~shikhasomani]'s range check algorithm to most cases
> h3. Join behavior depending on router types of joined collections
> || to\\from ||CompositeId||Implicit||
> ||CompositeId| shard range check, see table below | allow |
> ||Implicit| allow | shard to shard |
> h3. CompositeId to CompositeId join behavior for a given number of shards
>  
> || to\\from ||single||>1||
> ||single| allow (as is) | allow (range check) |
> ||>1| allow (as is) | per shard range check |
> h3. Rules from the tables above
> * joining between CompositeId and Implicit collections is allowed without 
> checks; it picks up any collocated replica, because users who do that 
> probably understand what they are doing.
> * when both sides are Implicit, join shards by name, i.e. if a request hits 
> collectionTO_shardY_replica2 at a node, the collocated 
> collectionFROM_shardY_replica* is expected.
> * when both sides are CompositeId
> ** from single shard to single shard - no-brainer, just needs a collocated 
> replica;
> ** from multiple shards to single shard - all "from" shards (any of their 
> replicas) are picked for joining 
> ** from single shard to multiple shards - existing SOLR-4905 functionality
> ** from multiple to multiple - generic range check algorithm
> ### check that fromField and toField are the router keys of these collections
> ### take the shard range of the current "to" collection replica (keep in mind 
> that the request is distributed across "to" collection shards)
> ### enumerate the "from" collection shards and find the subset that covers the 
> "to" shard range (this handles any number of shards on both sides)
> ### pick up collocated replicas of this "from" shard subset 
> h3. Caveat 
> this is quite sensitive to shard allocation (and/or replica placement), i.e. 
> a failed "from" replica cannot be collocated with the required "to" shard.  
> h2. Initial 
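
A minimal Python sketch of the "multiple to multiple" range check steps quoted above (take the "to" shard range, find the covering "from" shards, pick their collocated replicas). It is a model only, not the code of the attached patches: the names are hypothetical, ranges are plain inclusive integer pairs rather than Solr's hash ranges, and the router key check from the first step is left out.

```python
from typing import Dict, List, NamedTuple, Tuple

Range = Tuple[int, int]  # inclusive (min, max) of a shard's hash range (simplified)


class Replica(NamedTuple):
    core: str
    shard: str
    node: str


def overlaps(a: Range, b: Range) -> bool:
    return a[0] <= b[1] and b[0] <= a[1]


def covering_shards(to_range: Range, from_shards: Dict[str, Range]) -> List[str]:
    """Names of the "from" shards that together cover `to_range`.

    Raises ValueError if some part of `to_range` is covered by no "from" shard.
    """
    picked = sorted(
        (name for name, r in from_shards.items() if overlaps(r, to_range)),
        key=lambda name: from_shards[name][0],
    )
    next_needed = to_range[0]  # lowest hash value not yet covered
    for name in picked:
        lo, hi = from_shards[name]
        if lo > next_needed:
            break  # gap between consecutive "from" shards
        next_needed = max(next_needed, hi + 1)
    if next_needed <= to_range[1]:
        raise ValueError(f'"from" shards do not cover {to_range}')
    return picked


def collocated_replicas(node: str, shards: List[str],
                        replicas: List[Replica]) -> Dict[str, Replica]:
    """One replica on `node` for every shard in `shards`, else LookupError."""
    found = {}
    for shard in shards:
        local = [r for r in replicas if r.shard == shard and r.node == node]
        if not local:
            raise LookupError(f'no replica of "from" shard {shard} on {node}')
        found[shard] = local[0]
    return found


# Hypothetical layout: the "to" shard on node1 owns the lower half of the range.
FROM_SHARDS = {"shard1": (0x0000, 0x3FFF), "shard2": (0x4000, 0x7FFF),
               "shard3": (0x8000, 0xFFFF)}
REPLICAS = [Replica("from_shard1_replica1", "shard1", "node1"),
            Replica("from_shard2_replica1", "shard2", "node1"),
            Replica("from_shard3_replica1", "shard3", "node2")]

needed = covering_shards((0x0000, 0x7FFF), FROM_SHARDS)
print(needed)
print(collocated_replicas("node1", needed, REPLICAS))
```

The LookupError path corresponds to the caveat in the proposal: the check only succeeds when the needed "from" replicas happen to be placed on the same node as the "to" shard.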

[jira] [Comment Edited] (SOLR-8297) Allow join query over 2 sharded collections: enhance functionality and exception handling

2018-07-31 Thread jyoti Tiwari (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563802#comment-16563802
 ] 

jyoti Tiwari edited comment on SOLR-8297 at 7/31/18 3:06 PM:
-

Erick:

I believe I am meeting all the conditions for running a Solr join query that 
you gave in points A) exception handling and B) functional enhancement.

Details:

Solr version: 4

Total nodes: 2

Total collections: 2, say engineeringlogs1 and engineeringlogs2

Cores on the first node: engineeringlogs1_shard1_replica1 and 
engineeringlogs2_shard1_replica1

Cores on the second node: engineeringlogs1_shard2_replica1 and 
engineeringlogs2_shard2_replica1

I am trying to join two cores on the same node 
(engineeringlogs1_shard1_replica1 and engineeringlogs2_shard1_replica1).

Query, run against the default core engineeringlogs1_shard1_replica1:

{!join from=Maximum_Battery_Charge to=Check_Battery_Charge 
fromIndex=engineeringlogs2_shard1_replica1}

Initial_Battery_Charge: "87%"

However, the query still fails with the following error:

"error": { "metadata": [ "error-class", 
"org.apache.solr.common.SolrException", "root-error-class", 
"org.apache.solr.common.SolrException" ], "msg": "Cross-core join: no such core 
engineeringlogs2_shard1_replica1", "code": 400 } }

Please advise how I can resolve this issue.

 




[jira] [Comment Edited] (SOLR-8297) Allow join query over 2 sharded collections: enhance functionality and exception handling

2018-07-31 Thread jyoti Tiwari (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563522#comment-16563522
 ] 

jyoti Tiwari edited comment on SOLR-8297 at 7/31/18 11:52 AM:
--

Hi Shikha, I need your help with an issue I am currently facing with a Solr 4 
join query.
I am trying to run a cross-core join query on distributed collections in 
SolrCloud, i.e. between engineeringlogs_shard1_replica1 (core1) and 
engineeringlogs2_shard1_replica1 (core2) on one node. When I run the join query 
on collection engineeringlogs2 against engineeringlogs:

{!join from=Maximum_Battery_Charge to=Maximum_Battery_Charge 
fromIndex=engineeringlogs_shard1_replica1}

Time_Diff_Start_End_BC:"1281"

it fails with the error: Cross-core join: no such core 
engineeringlogs_shard1_replica1] with root cause

Could you please tell me whether I can run this cross-core join query on a 
single node with Solr 4.x, whether I need to upgrade Solr, or whether my 
query is wrong?




[jira] [Comment Edited] (SOLR-8297) Allow join query over 2 sharded collections: enhance functionality and exception handling

2016-06-16 Thread Shikha Somani (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334957#comment-15334957
 ] 

Shikha Somani edited comment on SOLR-8297 at 6/16/16 11:30 PM:
---

The *Any* option is introduced to support the existing cloud join scenario, 
i.e. where _fromCollection is singly sharded_. If asserting Any’s behavior is 
the only concern, I will write test cases for thorough verification. Below is a 
scenario that resembles a real-world setup; I will write a test case based on it.

*Scenario*: 
There are 2 collections in a 2-node cluster:
* product_category: It has values like books, toys, etc. _Singly sharded_
* sale: Holds information about current sales. The sale and product_category 
collections are related: the sale collection contains a ‘product key’. _Multi sharded_

*Query*: Find sale information with product information:
{!join from=id to=productKey fromCollection=product_category}

*Cluster information*:

||Node1||Range||Node2||Range||
|Product_category_shard1_replica1|8000-7fff|Product_category_shard1_replica2|8000-7fff|
|Sale_shard1_replica1|0-7fff|Sale_shard2_replica1|8000-|

With this layout, a join between Sale and Product_category is possible only 
with the “Any” mode; otherwise the range check fails and prevents the join 
query.
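
The scenario can be modelled in a few lines of Python. This is only a sketch: the range values in the table are abbreviated, so the model uses symbolic labels (FULL, LOWER_HALF, UPPER_HALF) as stand-ins, and pick_from_core is a hypothetical helper, not part of Solr.

```python
# core name -> (collection, node, symbolic range label)
CORES = {
    "Product_category_shard1_replica1": ("product_category", "Node1", "FULL"),
    "Sale_shard1_replica1": ("sale", "Node1", "LOWER_HALF"),
    "Product_category_shard1_replica2": ("product_category", "Node2", "FULL"),
    "Sale_shard2_replica1": ("sale", "Node2", "UPPER_HALF"),
}


def pick_from_core(to_core, from_collection, mode):
    """Return the local "from" core to join with, or None if `mode` rejects it."""
    _, node, to_range = CORES[to_core]
    for name, (collection, core_node, from_range) in CORES.items():
        if collection != from_collection or core_node != node:
            continue  # other collection, or not collocated with the "to" core
        if mode == "any" or (mode == "exact" and from_range == to_range):
            return name
    return None


for to_core in ("Sale_shard1_replica1", "Sale_shard2_replica1"):
    print(to_core, "exact ->", pick_from_core(to_core, "product_category", "exact"))
    print(to_core, "any   ->", pick_from_core(to_core, "product_category", "any"))
```

In this model each Sale core finds a collocated Product_category replica under "any", while "exact" returns None on both nodes because the singly sharded collection spans the full range and each Sale shard only half of it.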



> Allow join query over 2 sharded collections: enhance functionality and 
> exception handling
> -
>
> Key: SOLR-8297
> URL: https://issues.apache.org/jira/browse/SOLR-8297
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.3
>Reporter: Paul Blanchaert
>
> Enhancement based on SOLR-4905. New Jira issue raised as suggested by Mikhail 
> Khludnev.
> A) exception handling:
> The exception "SolrCloud join: multiple shards not yet supported" thrown in 
> the function findLocalReplicaForFromIndex of JoinQParserPlugin is not 
> triggered correctly: in my use case, I have a join in a facet.query, and when 
> my results are found in only 1 shard and the facet.query with the join is 
> querying the last replica of the last slice, the exception is not thrown.
> I believe it is better to check the number of slices when deciding whether to 
> throw the "multiple shards not yet supported" exception (so the exception is 
> thrown when zkController.getClusterState().getSlices(fromIndex).size()>1).
> B) functional enhancement:
> I would expect no problem performing a cross-core join over 
> sharded collections when the following conditions are met:
> 1) both collections are sharded with the same replicationFactor and numShards
> 2) router.field of the collections is set to the same "key-field" (the 
> collection named in "fromIndex" has router.field = "from" field and the 
> collection joined to has router.field = "to" field)
> The router.field setup ensures that documents with the same "key-field" are 
> routed to the same node. 
> So the combination based on the "key-field" should always be available within 
> the same node.
> From a user perspective, I believe these assumptions describe a "normal" 
> use case for the cross-core join in SolrCloud.
> Hope this helps
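
The two conditions listed under B) can be written as a small precondition check. The Python sketch below is illustrative only: CollectionInfo and join_problems are hypothetical names, and the values would in practice come from the cluster state, which is not modelled here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CollectionInfo:
    name: str
    num_shards: int
    replication_factor: int
    router_field: str


def join_problems(from_coll: CollectionInfo, to_coll: CollectionInfo,
                  from_field: str, to_field: str) -> List[str]:
    """List the conditions from B) that a sharded join would violate."""
    problems = []
    if from_coll.num_shards != to_coll.num_shards:
        problems.append("numShards differs")
    if from_coll.replication_factor != to_coll.replication_factor:
        problems.append("replicationFactor differs")
    if from_coll.router_field != from_field:
        problems.append('router.field of the "from" collection is not the "from" field')
    if to_coll.router_field != to_field:
        problems.append('router.field of the "to" collection is not the "to" field')
    return problems


# Hypothetical collections and field names, for illustration only.
orders = CollectionInfo("orders", num_shards=2, replication_factor=2,
                        router_field="customer_id")
customers = CollectionInfo("customers", num_shards=2, replication_factor=2,
                           router_field="id")

print(join_problems(customers, orders, from_field="id", to_field="customer_id"))
print(join_problems(customers, orders, from_field="name", to_field="customer_id"))
```

An empty list means both conditions hold. Whether that is sufficient for correct results also depends on replica placement, which this check does not look at.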



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-8297) Allow join query over 2 sharded collections: enhance functionality and exception handling

2016-06-13 Thread Shikha Somani (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328320#comment-15328320
 ] 

Shikha Somani edited comment on SOLR-8297 at 6/13/16 9:33 PM:
--

Below are two proposed solutions to “Allow join query over 2 sharded 
collections”, i.e. for fixing the functionality that is broken in Solr 5.x. This 
is not an enhancement to support joins across multiple shards present in the 
same JVM.

*Proposed solution*: Two possible solutions:
*1. Distributed join with Range*: This allows joins with greater flexibility 
by considering the range instead of the shard name when selecting the 
fromCollection replica. The current implementation requires fromCollection to 
be singly sharded; with this solution fromCollection can be singly sharded, 
sharded identically to toCollection, or have a range that overlaps the 
toCollection range.

* *Solution details*: A new parameter “joinMode” will be introduced. This 
parameter governs how the replica is selected based on range.
Possible values of joinMode:
** *Exact*: The “fromCollection” shard range must exactly match the 
“toCollection” shard range present on that node; only then is the join applied 
between the two collections. This is the _default_ value.
** *Overlap*: The shard range of “fromCollection” must overlap with that of 
“toCollection” on the given node. 
** *Any*: This option performs no range check; it picks any replica 
of fromCollection that is present on that node and applies the join.

*2. Non-distributed join*: Works the same way join did in Solr 4.x. The client 
specifies the exact replica of “fromCollection” to join with. It 
is required to pass “distrib=false” in the query parameters.

If either of these solutions is acceptable, I will submit a PR for it.
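
To make the three proposed joinMode values concrete, here is a small Python sketch of the range test each one implies. It is an interpretation of the description above, not the patch: ranges are simplified to inclusive integer pairs and the function name is hypothetical.

```python
from enum import Enum
from typing import Tuple

Range = Tuple[int, int]  # inclusive (min, max), simplified hash range


class JoinMode(Enum):
    EXACT = "exact"      # proposed default
    OVERLAP = "overlap"
    ANY = "any"


def range_allows_join(mode: JoinMode, from_range: Range, to_range: Range) -> bool:
    """Whether a collocated "from" replica may be joined under `mode`."""
    if mode is JoinMode.ANY:
        return True  # no range check at all
    if mode is JoinMode.EXACT:
        return from_range == to_range
    # OVERLAP: the two ranges share at least one hash value
    return from_range[0] <= to_range[1] and to_range[0] <= from_range[1]


full = (0x0000, 0xFFFF)
lower = (0x0000, 0x7FFF)
upper = (0x8000, 0xFFFF)

for mode in JoinMode:
    print(mode.value,
          range_allows_join(mode, full, lower),
          range_allows_join(mode, lower, lower),
          range_allows_join(mode, lower, upper))
```

A singly sharded fromCollection (full range) joined to a multi-sharded toCollection passes under Overlap and Any but not under Exact, which is why Exact as the default is the strictest choice.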




[jira] [Comment Edited] (SOLR-8297) Allow join query over 2 sharded collections: enhance functionality and exception handling

2016-05-05 Thread Susmit Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273650#comment-15273650
 ] 

Susmit Shukla edited comment on SOLR-8297 at 5/6/16 5:44 AM:
-

I had exactly the same requirement as described in B) functional enhancement. I 
implemented it by extending JoinQParserPlugin and registering the parser in 
solrconfig.xml. I don't think the solution is ready for open source yet, for two 
reasons that Eric already mentioned:

- Enabling a sharded join only where both collections are equally sharded and 
replicated on the same router.field, with the same hash range distribution among 
named shards, is a narrow use case 
- The solution is restricted to SolrCloud layouts where corresponding shards of 
the 'from' and 'to' collections run in the same JVM

Initially my implementation was the same as the above patch, but it failed in a 
bigger deployment where multiple shards ran in the same JVM. E.g. it should 
support a join for this layout:

coll1:         coll2:
shard1::8983   shard1::8983
shard2::8983   shard2::8983


I needed to match both shard name and node name for this case to work.
I overrode two methods: findLocalReplicaForFromIndex and createParser.
To get the current shard name: toShardId = 
queryRequest.getCore().getCoreDescriptor().getCloudDescriptor().getShardId();
The queryRequest (SolrQueryRequest) member variable can be set in createParser.

toShardId.equals(slice.getName()) should be an additional condition here: if 
(replica.getNodeName().equals(nodeName) && replica.getState() == 
Replica.State.ACTIVE)
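
The effect of the extra shard-name condition can be shown with a small Python model of the layout above. This is a sketch, not the Java override itself: the Replica tuple, the node name "node-8983", and find_local_replica are hypothetical stand-ins for the Solr classes named in the comment.

```python
from typing import List, NamedTuple, Optional


class Replica(NamedTuple):
    core: str
    slice_name: str
    node_name: str
    active: bool


def find_local_replica(replicas: List[Replica], node_name: str,
                       to_shard_id: Optional[str] = None) -> Optional[Replica]:
    """First active "from" replica on `node_name`.

    When `to_shard_id` is given, the replica's slice name must also match it,
    which is the additional condition described above.
    """
    for replica in replicas:
        if replica.node_name != node_name or not replica.active:
            continue
        if to_shard_id is not None and replica.slice_name != to_shard_id:
            continue
        return replica
    return None


# coll2 ("from" side): both shards run in the same JVM, as in the layout above.
COLL2 = [Replica("coll2_shard1_replica1", "shard1", "node-8983", True),
         Replica("coll2_shard2_replica1", "shard2", "node-8983", True)]

# A request handled by coll1's shard2 on that node:
print(find_local_replica(COLL2, "node-8983"))            # node-name match only
print(find_local_replica(COLL2, "node-8983", "shard2"))  # node and shard name
```

With the node-name check alone, the lookup returns the first replica on the node (shard1 here) even though the request is being served by shard2; adding the shard-name condition selects the corresponding shard.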




[jira] [Comment Edited] (SOLR-8297) Allow join query over 2 sharded collections: enhance functionality and exception handling

2016-05-05 Thread Susmit Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273650#comment-15273650
 ] 

Susmit Shukla edited comment on SOLR-8297 at 5/6/16 5:44 AM:
-

I had the exact same requirement as mentioned in B) functional enhancements. I 
implemented it by extending the JoinQParserPlugin and registering the parser in 
solrconfig.xml. I don't think the solution is ready for open source yet. Two 
reasons for that as Eric already mentioned -

- Enabling sharded join where both collections have to be equally sharded and 
replicated on the same router.field with same hash range distribution among 
named shards is a narrow use case 
- Solution is restricted to solr cloud layout where corresponding shards of 
'from' and 'to' collections run in the same jvm

Initially my impl was same as the above patch but it failed in a bigger 
deployment where multiple shards ran in same jvm. e.g it should support join 
for this layout-

coll1: coll2:
shard1::8983   shard1::8983
shard2::8983   shard2::8983


needed to match both shard name and node name for this case to work
overridden two methods: findLocalReplicaForFromIndex, createParser
to get current shard name - toShardId = 
queryRequest.getCore().getCoreDescriptor().getCloudDescriptor().getShardId();
queryRequest (SolrQueryRequest) member variable can set in the createParser 

toShardId.equals(slice.getName()) should be additional condition here - if 
(replica.getNodeName().equals(nodeName) && replica.getState() == 
Replica.State.ACTIVE)


[jira] [Comment Edited] (SOLR-8297) Allow join query over 2 sharded collections: enhance functionality and exception handling

2016-05-05 Thread Susmit Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273650#comment-15273650
 ] 

Susmit Shukla edited comment on SOLR-8297 at 5/6/16 5:43 AM:
-

I had the exact same requirement as described in B) functional enhancement. I
implemented it by extending JoinQParserPlugin and registering the parser in
solrconfig.xml. I don't think the solution is ready for open source yet, for two
reasons that Erick already mentioned:

- Enabling a sharded join only when both collections are equally sharded and
replicated, on the same router.field, with the same hash range distribution
among named shards, is a narrow use case.
- The solution is restricted to SolrCloud layouts where corresponding shards of
the 'from' and 'to' collections run in the same JVM.

Initially my implementation was the same as the above patch, but it failed in a
bigger deployment where multiple shards ran in the same JVM. For example, it
should support a join for this layout:

coll1:         coll2:
shard1::8983   shard1::8983
shard2::8983   shard2::8983

I needed to match both the shard name and the node name for this case to work.
I overrode two methods: findLocalReplicaForFromIndex and createParser.
To get the current shard name:
toShardId = queryRequest.getCore().getCoreDescriptor().getCloudDescriptor().getShardId();
The queryRequest (SolrQueryRequest) member variable can be set in createParser.

toShardId.equals(slice.getName()) should be an additional condition in this check:
if (replica.getNodeName().equals(nodeName) && replica.getState() == Replica.State.ACTIVE)
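
The difference between the two matching rules can be sketched in plain Java.
This is only an illustration under stated assumptions, not the overridden
findLocalReplicaForFromIndex itself: Replica, Slice and State below are
hypothetical stand-ins for the Solr classes, and the node name and core names
are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LocalReplicaMatch {
    // Hypothetical stand-ins for Solr's Replica.State, Replica and Slice.
    enum State { ACTIVE, DOWN }

    static final class Replica {
        final String coreName;
        final String nodeName;
        final State state;

        Replica(String coreName, String nodeName, State state) {
            this.coreName = coreName;
            this.nodeName = nodeName;
            this.state = state;
        }
    }

    static final class Slice {
        final String name;
        final List<Replica> replicas;

        Slice(String name, Replica... replicas) {
            this.name = name;
            this.replicas = Arrays.asList(replicas);
        }
    }

    // Node name only: every active 'from' replica on this node qualifies,
    // so the choice is ambiguous when one JVM hosts several shards.
    static List<String> matchByNode(List<Slice> fromSlices, String nodeName) {
        List<String> cores = new ArrayList<>();
        for (Slice slice : fromSlices) {
            for (Replica replica : slice.replicas) {
                if (replica.nodeName.equals(nodeName) && replica.state == State.ACTIVE) {
                    cores.add(replica.coreName);
                }
            }
        }
        return cores;
    }

    // Node name plus shard name: only the 'from' replica in the shard that
    // corresponds to the 'to' shard handling the request qualifies.
    static List<String> matchByNodeAndShard(List<Slice> fromSlices, String nodeName, String toShardId) {
        List<String> cores = new ArrayList<>();
        for (Slice slice : fromSlices) {
            if (!toShardId.equals(slice.name)) {
                continue; // the additional shard-name condition
            }
            for (Replica replica : slice.replicas) {
                if (replica.nodeName.equals(nodeName) && replica.state == State.ACTIVE) {
                    cores.add(replica.coreName);
                }
            }
        }
        return cores;
    }

    // 'from' collection with both shards hosted on one node (hypothetical names).
    static List<Slice> sampleFromCollection() {
        String node = "node-a:8983_solr";
        return Arrays.asList(
            new Slice("shard1", new Replica("coll2_shard1_replica1", node, State.ACTIVE)),
            new Slice("shard2", new Replica("coll2_shard2_replica1", node, State.ACTIVE)));
    }

    public static void main(String[] args) {
        List<Slice> from = sampleFromCollection();
        System.out.println(matchByNode(from, "node-a:8983_solr"));
        System.out.println(matchByNodeAndShard(from, "node-a:8983_solr", "shard2"));
    }
}
```

With both shards of the 'from' collection on one node, the node-only rule
yields two candidate cores, while adding the shard name narrows it to the one
that corresponds to the 'to' shard.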


[jira] [Comment Edited] (SOLR-8297) Allow join query over 2 sharded collections: enhance functionality and exception handling

2016-05-03 Thread Susmit Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268311#comment-15268311
 ] 

Susmit Shukla edited comment on SOLR-8297 at 5/3/16 2:09 PM:
-

This patch may not work for multiple shards hosted on the same JVM, since the
node name would be the same for different shards. For example, consider the
configuration below, where maxShardsPerNode=4:

coll1: shard1: :8983
               :7574
       shard2: :8983
               :7574
coll2: shard1: :8983
               :7574
       shard2: :8983
               :7574

To fix it, I had to match the shard name and the node name.


was (Author: shukla.sus...@gmail.com):
This patch will not work for multiple shards hosted on the same jvm since 
nodename for different shards would be same. e.g. consider below configuration 
where maxShardsPerNode=4.

coll1: shard1: :8983
               :7574
       shard2: :8983
               :7574
coll2: shard1: :8983
               :7574
       shard2: :8983
               :7574

> Allow join query over 2 sharded collections: enhance functionality and 
> exception handling
> -
>
> Key: SOLR-8297
> URL: https://issues.apache.org/jira/browse/SOLR-8297
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Affects Versions: 5.3
> Reporter: Paul Blanchaert
>
> Enhancement based on SOLR-4905. New Jira issue raised as suggested by Mikhail 
> Khludnev.
> A) exception handling:
> The exception "SolrCloud join: multiple shards not yet supported" thrown in 
> the function findLocalReplicaForFromIndex of JoinQParserPlugin is not 
> triggered correctly. In my use case, I have a join in a facet.query; when my 
> results are found in only 1 shard and the facet.query with the join is 
> querying the last replica of the last slice, the exception is not thrown.
> I believe it is better to check the number of slices when deciding whether to 
> throw the "multiple shards not yet supported" exception (so the exception is 
> thrown when zkController.getClusterState().getSlices(fromIndex).size()>1).
> B) functional enhancement:
> I would expect no problem performing a cross-core join over sharded 
> collections when the following conditions are met:
> 1) both collections are sharded with the same replicationFactor and numShards
> 2) router.field of the collections is set to the same "key-field" (the 
> collection of "fromIndex" has router.field = "from" field and the collection 
> joined to has router.field = "to" field)
> The router.field setup ensures that documents with the same "key-field" are 
> routed to the same node. 
> So the combination based on the "key-field" should always be available within 
> the same node.
> From a user perspective, I believe these assumptions describe a "normal" 
> use case for the cross-core join in SolrCloud.
> Hope this helps
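
The two conditions listed under B) can be written down as a small check. This
is a minimal sketch in plain Java, assuming a hypothetical CollectionLayout
summary class (it is not a Solr class) and made-up field names. It covers only
conditions 1) and 2) as stated in the issue; it does not verify matching hash
ranges or that corresponding shards share a JVM.

```java
public class ShardedJoinPreconditions {
    // Hypothetical summary of a collection's layout; not a Solr class.
    static final class CollectionLayout {
        final int numShards;
        final int replicationFactor;
        final String routerField;

        CollectionLayout(int numShards, int replicationFactor, String routerField) {
            this.numShards = numShards;
            this.replicationFactor = replicationFactor;
            this.routerField = routerField;
        }
    }

    // Condition 1: same numShards and replicationFactor.
    // Condition 2: the 'from' collection routes on the join's "from" field
    // and the 'to' collection routes on the join's "to" field.
    static boolean meetsJoinConditions(CollectionLayout from, String fromField,
                                       CollectionLayout to, String toField) {
        boolean sameSharding = from.numShards == to.numShards
                && from.replicationFactor == to.replicationFactor;
        boolean routedOnJoinKey = fromField.equals(from.routerField)
                && toField.equals(to.routerField);
        return sameSharding && routedOnJoinKey;
    }

    public static void main(String[] args) {
        // Field names below are illustrative only.
        CollectionLayout from = new CollectionLayout(2, 1, "parent_id");
        CollectionLayout to = new CollectionLayout(2, 1, "id");
        System.out.println(meetsJoinConditions(from, "parent_id", to, "id"));
        System.out.println(meetsJoinConditions(from, "other_field", to, "id"));
    }
}
```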



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-8297) Allow join query over 2 sharded collections: enhance functionality and exception handling

2016-04-29 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264662#comment-15264662
 ] 

Mikhail Khludnev edited comment on SOLR-8297 at 4/29/16 8:13 PM:
-

To be honest, this fix exceeds my understanding of SolrCloud. Can you 
extend the existing {{DistribJoinFromCollectionTest}} to cover this scenario?




