[jira] [Comment Edited] (SOLR-8297) Allow join query over 2 sharded collections: enhance functionality and exception handling

2016-05-05 Thread Susmit Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273650#comment-15273650
 ] 

Susmit Shukla edited comment on SOLR-8297 at 5/6/16 5:44 AM:
-

I had exactly the same requirement as described in B) functional enhancement. I 
implemented it by extending JoinQParserPlugin and registering the parser in 
solrconfig.xml. I don't think the solution is ready for open source yet, for 
two reasons that Eric already mentioned:

- Enabling a sharded join only when both collections are equally sharded and 
replicated, on the same router.field, with the same hash range distribution 
among named shards, is a narrow use case.
- The solution is restricted to a SolrCloud layout where corresponding shards 
of the 'from' and 'to' collections run in the same JVM.

Initially my implementation was the same as the above patch, but it failed in a 
bigger deployment where multiple shards ran in the same JVM. For example, it 
should support a join for this layout:

coll1:          coll2:
shard1::8983    shard1::8983
shard2::8983    shard2::8983


I needed to match both the shard name and the node name for this case to work, 
so I overrode two methods: findLocalReplicaForFromIndex and createParser.

To get the current shard name:
toShardId = 
queryRequest.getCore().getCoreDescriptor().getCloudDescriptor().getShardId();
The queryRequest (SolrQueryRequest) member variable can be set in createParser.

toShardId.equals(slice.getName()) should be an additional condition here:
if (replica.getNodeName().equals(nodeName) && replica.getState() == 
Replica.State.ACTIVE)
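
The combined condition can be sketched in plain Java as below. This is only an 
illustration, not the actual override: ReplicaInfo, State, findLocalCore and the 
node and core names are hypothetical stand-ins for the SolrCloud Slice/Replica 
metadata, so that the example compiles without Solr on the classpath.

```java
import java.util.List;
import java.util.Optional;

/** Sketch only: plain-Java stand-ins for SolrCloud slice/replica metadata. */
final class LocalReplicaSelector {

    /** Hypothetical stand-in for Replica.State. */
    enum State { ACTIVE, DOWN }

    /** Hypothetical flattened view of one replica of the 'from' collection. */
    static final class ReplicaInfo {
        final String sliceName; // shard the replica belongs to, e.g. "shard1"
        final String nodeName;  // node (JVM) hosting the replica
        final String coreName;  // local core that would serve the join
        final State state;

        ReplicaInfo(String sliceName, String nodeName, String coreName, State state) {
            this.sliceName = sliceName;
            this.nodeName = nodeName;
            this.coreName = coreName;
            this.state = state;
        }
    }

    /**
     * Returns the core of the 'from' replica that is active, lives on this node,
     * and belongs to the same shard as the 'to' core handling the request.
     * Matching on node name alone is ambiguous when one JVM hosts several shards.
     */
    static Optional<String> findLocalCore(List<ReplicaInfo> fromReplicas,
                                          String nodeName,
                                          String toShardId) {
        for (ReplicaInfo replica : fromReplicas) {
            if (toShardId.equals(replica.sliceName)
                    && replica.nodeName.equals(nodeName)
                    && replica.state == State.ACTIVE) {
                return Optional.of(replica.coreName);
            }
        }
        return Optional.empty();
    }
}
```

With two shards of the 'from' collection on one node, a lookup for shard2 then 
resolves to the shard2 core instead of whichever replica on that node happens to 
be visited first.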


> Allow join query over 2 sharded collections: enhance functionality and 
> exception handling
> -
>
> Key: SOLR-8297
> URL: https://issues.apache.org/jira/browse/SOLR-8297
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Affects Versions: 5.3
> Reporter: Paul Blanchaert
>
> Enhancement based on SOLR-4905. New Jira issue raised as suggested by Mikhail 
> Khludnev.
> A) exception handling:
> The exception "SolrCloud join: multiple shards not yet supported" thrown in 
> the function findLocalReplicaForFromIndex of JoinQParserPlugin is not 
> triggered correctly: In my use-case, I've a join on a facet.query and when my 
> results are only found in 1 shard and the facet.query with the join is 
> querying the last replica of the last slice, then the exception is not thrown.
> I believe it's better to verify the number of slices when we want to raise the 
> "multiple shards not yet supported" exception (so the exception is thrown when 
> zkController.getClusterState().getSlices(fromIndex).size()>1).
> B) functional enhancement:
> I would expect that there is no problem to perform a cross-core join over 
> sharded collections when the following conditions are met:
> 1) both collections are sharded with the same replicationFactor and numShards
> 2) router.field of the collections is set to the same "key-field" (collection 
> of "fromindex" has router.field = "from" field and collection joined to has 
> router.field = "to" field)
> The router.field setup ensures that documents with the same "key-field" are 
> routed to the same node. 
> So the combination based on the "key-field" should always be available within 
> the same node.
> From a user perspective, I believe these assumptions seem to be a "normal" 
> use-case in the cross-core join in SolrCloud.
> Hope this helps

[jira] [Comment Edited] (SOLR-8297) Allow join query over 2 sharded collections: enhance functionality and exception handling

2016-05-03 Thread Susmit Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268311#comment-15268311
 ] 

Susmit Shukla edited comment on SOLR-8297 at 5/3/16 2:09 PM:
-

This patch may not work when multiple shards are hosted in the same JVM, since 
the node name would be the same for different shards. For example, consider the 
configuration below, where maxShardsPerNode=4:

coll1:
  shard1:
    :8983
    :7574
  shard2:
    :8983
    :7574
coll2:
  shard1:
    :8983
    :7574
  shard2:
    :8983
    :7574

To fix it, I had to match on both the shard name and the node name.


was (Author: shukla.sus...@gmail.com):
This patch will not work when multiple shards are hosted in the same JVM, since 
the node name would be the same for different shards. For example, consider the 
configuration below, where maxShardsPerNode=4:

coll1:
  shard1:
    :8983
    :7574
  shard2:
    :8983
    :7574
coll2:
  shard1:
    :8983
    :7574
  shard2:
    :8983
    :7574

> Allow join query over 2 sharded collections: enhance functionality and 
> exception handling
> -
>
> Key: SOLR-8297
> URL: https://issues.apache.org/jira/browse/SOLR-8297
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Affects Versions: 5.3
> Reporter: Paul Blanchaert
>
> Enhancement based on SOLR-4905. New Jira issue raised as suggested by Mikhail 
> Khludnev.
> A) exception handling:
> The exception "SolrCloud join: multiple shards not yet supported" thrown in 
> the function findLocalReplicaForFromIndex of JoinQParserPlugin is not 
> triggered correctly: In my use-case, I've a join on a facet.query and when my 
> results are only found in 1 shard and the facet.query with the join is 
> querying the last replica of the last slice, then the exception is not thrown.
> I believe it's better to verify the number of slices when we want to raise the 
> "multiple shards not yet supported" exception (so the exception is thrown when 
> zkController.getClusterState().getSlices(fromIndex).size()>1).
> B) functional enhancement:
> I would expect that there is no problem to perform a cross-core join over 
> sharded collections when the following conditions are met:
> 1) both collections are sharded with the same replicationFactor and numShards
> 2) router.field of the collections is set to the same "key-field" (collection 
> of "fromindex" has router.field = "from" field and collection joined to has 
> router.field = "to" field)
> The router.field setup ensures that documents with the same "key-field" are 
> routed to the same node. 
> So the combination based on the "key-field" should always be available within 
> the same node.
> From a user perspective, I believe these assumptions seem to be a "normal" 
> use-case in the cross-core join in SolrCloud.
> Hope this helps
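
The two points in the description above can be illustrated with a small 
stand-alone Java sketch. It is a simplified model under stated assumptions, not 
Solr code: the slice lists are passed in as a plain map instead of being read 
from the cluster state, IllegalStateException stands in for the exception Solr 
would throw, and shardIndexFor uses String.hashCode as a stand-in for the real 
router's hash ranges. All class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;

/** Sketch only: simplified model of the checks discussed in points A and B. */
final class ShardedJoinSketch {

    /**
     * Point A: decide on the slice count of the 'from' collection itself,
     * rather than on which replica happens to be inspected last.
     */
    static void requireSingleSlice(Map<String, List<String>> slicesByCollection,
                                   String fromIndex) {
        List<String> slices = slicesByCollection.getOrDefault(fromIndex, List.of());
        if (slices.size() > 1) {
            throw new IllegalStateException(
                "SolrCloud join: multiple shards not yet supported " + fromIndex);
        }
    }

    /**
     * Point B: with the same number of shards and the same routing function,
     * equal route values map to the same shard index in both collections.
     * The hash used here is a stand-in, not Solr's router.
     */
    static int shardIndexFor(String routeValue, int numShards) {
        return Math.floorMod(routeValue.hashCode(), numShards);
    }
}
```

For example, a 'from' document and a 'to' document that share the same key value 
yield the same result from shardIndexFor when both collections use the same 
numShards, which is the co-location property the proposal relies on. It says 
nothing about which node hosts that shard, so the layout restriction on 
corresponding shards still applies.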



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


