[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
    [ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769634#comment-16769634 ]

Christine Poerschke commented on SOLR-6730:
-------------------------------------------

{quote}... Looks like the feature suggested here could help with this as well, e.g. by adding a "seed" param if you want sticky replicas for a "session". Right?
{quote}
Correct. Thanks for re-surfacing this old ticket. I've opened SOLR-13258 for the stickiness and will close this ticket here out for clarity.

> select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
> -----------------------------------------------------------------------------
>
>                 Key: SOLR-6730
>                 URL: https://issues.apache.org/jira/browse/SOLR-6730
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Christine Poerschke
>            Assignee: Christine Poerschke
>            Priority: Minor
>
> If no shards parameter is supplied with a select request then sub-requests
> will go to a random selection of live solr nodes hosting shards for the
> collection of interest. All sub-requests must complete before results can be
> collated i.e. the slowest sub-request determines how fast the search
> completes.
> Use of optional replicaAffinity can reduce the number of JVMs hit by a given
> search (the more JVMs are hit, the higher the chance of hitting a garbage
> collection pause in one of many JVMs). Preferentially directing requests to
> certain areas of the cloud can also be useful for debugging or when some
> replicas reside on 'faster' machines.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
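The issue description's point that hitting more JVMs raises the chance of running into a garbage collection pause can be made concrete with a small calculation. This is only a sketch: it assumes that each JVM is paused independently with the same probability `p` when a sub-request arrives, and both that assumption and the example value of `p` are hypothetical and not taken from the ticket.

```python
def chance_of_any_pause(p: float, jvms: int) -> float:
    """Probability that at least one of `jvms` JVMs is in a GC pause.

    Assumption (not from the ticket): pauses are independent and every
    JVM is paused with the same probability `p`.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if jvms < 0:
        raise ValueError("jvms must not be negative")
    # Complement of "none of the JVMs is paused".
    return 1.0 - (1.0 - p) ** jvms


if __name__ == "__main__":
    p = 0.01  # hypothetical per-JVM pause probability
    for jvms in (8, 4, 2):
        print(f"{jvms} JVMs hit: {chance_of_any_pause(p, jvms):.4f}")
```

Under these assumptions the chance grows with every additional JVM a search touches, which is the motivation for keeping sub-requests on as few JVMs as possible.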
[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
    [ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765888#comment-16765888 ]

Jan Høydahl commented on SOLR-6730:
-----------------------------------

Hi, commenting on this old issue, as I had a discussion with a customer yesterday regarding replica affinity to utilise caches better. They fire more than 100 requests to draw a dashboard, and you get much better cache utilisation and faster load time with replicationFactor=1 than replicationFactor=3 since you need to warm three times as many caches.

Looks like the feature suggested here could help with this as well, e.g. by adding a "seed" param if you want sticky replicas for a "session". Right?
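One way to picture the "seed" idea for sticky sessions is to derive the seed from a session identifier, so that every request belonging to one dashboard session carries the same value and can be routed to the same replicas. The sketch below is purely illustrative: the function name `session_seed`, the `buckets` parameter and the use of SHA-256 are assumptions and are not part of this ticket or of Solr.

```python
import hashlib


def session_seed(session_id: str, buckets: int) -> int:
    """Map a session identifier to a stable seed in the range [0, buckets).

    Hypothetical helper: a cryptographic digest is used instead of
    Python's built-in hash() because hash() of a str is randomised per
    process, which would break stickiness across client restarts.
    """
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % buckets


if __name__ == "__main__":
    # All requests of the same (made-up) session get the same seed.
    for _ in range(3):
        print(session_seed("dashboard-session-42", 3))
```

Because the seed depends only on the session identifier, the 100+ requests of one dashboard would keep warming and reusing the same caches rather than spreading across every replica.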
[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
    [ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15583110#comment-15583110 ]

Christine Poerschke commented on SOLR-6730:
-------------------------------------------

Noble and I discussed offline re: SOLR-6730 and SOLR-8146 overlaps and differences. I will try to summarise here as bullet points, [~noble.paul] please add or correct if I missed or misunderstood something.

* The use case and motivation for the {{select?replicaAffinity=(node|host)}} part of SOLR-6730 was to reduce the number of JVMs hit by a given search since the more JVMs are hit, the higher the chance of hitting a garbage collection pause in one of many JVMs.
* The use case and motivation for the {{replicaAffinity.hostPriorities=...}} part of SOLR-6730 was to preferentially direct requests from the same user/source to certain areas of the cloud.
** The implementation of the {{replicaAffinity.hostPriorities=...}} approach requires configuration somewhere i.e. a list of which hosts to prioritise.
** No matter where it is stored, maintaining configuration can be cumbersome as collections and hosts change over time.
* The objective of directing requests from the same user/source to certain areas of the cloud can be achieved without configuration, and the objective of reducing the number of JVMs hit by a search can pretty much be achieved that way also.
** Approach outline:
*** Two numeric parameters ('seed' and 'mod') are optionally added to each request.
*** The two parameters 'place' the requests within the cloud, e.g. for {{mod=9}} any seed between 0 and 8 would be valid and {{seed=6}} would 'place' the request with the 7th of 9 replicas, or more realistically the 3rd of 3 replicas.
*** seed-plus-mod placement automatically adjusts when the number of replicas changes i.e. (seed=2,mod=6) would be 3rd-of-6 or 2nd-of-4 or 2nd-of-3 or 1st-of-2 placement.
*** SOLR-6730 here would likely be abandoned in favour of the approach outlined.
* What is common to SOLR-6730 and SOLR-8146:
** optional parameters would support changing of the existing behaviour
** existing behaviour is maintained if the optional parameters are not supplied
* What is different between SOLR-6730 and SOLR-8146:
** point-of-use of the optional parameter is HttpShardHandler\[Factory\] for SOLR-6730
** point-of-use of the optional parameter is CloudSolrClient (and HttpShardHandler\[Factory\]?) for SOLR-8146
* Next steps:
  1. SOLR-8332 to factor HttpShardHandler\[Factory\]'s url shuffling out into a ReplicaListTransformer class
  2. creation of additional ReplicaListTransformer implementations corresponding to the approach outlined above
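The seed-plus-mod placement outlined in the comment above can be sketched in a few lines. The comment gives only worked examples, not a formula, so the proportional scaling used here (`seed * replicas // mod`) is an assumption inferred from those examples. The function name `place` is hypothetical and the code is not taken from Solr.

```python
def place(seed: int, mod: int, num_replicas: int) -> int:
    """Return the zero-based index of the replica a request is 'placed' on.

    Assumption: the position seed/mod is scaled proportionally onto the
    number of replicas currently available and rounded down. This
    reproduces the examples in the comment but is not confirmed there.
    """
    if mod <= 0 or not 0 <= seed < mod:
        raise ValueError("seed must satisfy 0 <= seed < mod")
    if num_replicas <= 0:
        raise ValueError("num_replicas must be positive")
    return (seed * num_replicas) // mod


if __name__ == "__main__":
    # (seed=2, mod=6): 3rd-of-6, 2nd-of-4, 2nd-of-3, 1st-of-2
    for replicas in (6, 4, 3, 2):
        print(f"{place(2, 6, replicas) + 1} of {replicas}")
```

With this scaling the same (seed, mod) pair keeps pointing at roughly the same relative position as replicas are added or removed, so no per-host configuration needs to be maintained.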
[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
    [ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15532530#comment-15532530 ]

Noble Paul commented on SOLR-6730:
----------------------------------

[~cpoerschke] pls take a look at SOLR-8146 it's a more comprehensive strategy for replica affinity
[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
    [ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277183#comment-14277183 ]

Christine Poerschke commented on SOLR-6730:
-------------------------------------------

Am happy to give basic tests for this a go.
[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
    [ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275569#comment-14275569 ]

Mark Miller commented on SOLR-6730:
-----------------------------------

Cool - I like it. I think a feature like this needs at least some basic tests added though. Any volunteers?
[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
    [ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14206629#comment-14206629 ]

ASF GitHub Bot commented on SOLR-6730:
--------------------------------------

GitHub user cpoerschke opened a pull request:

    https://github.com/apache/lucene-solr/pull/104

    select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support

    https://issues.apache.org/jira/i#browse/SOLR-6730

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bloomberg/lucene-solr trunk-replica-affinity-feature

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/104.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #104

commit 66b56265bdefec7eb814bfb533c0ff19bb1dcdff
Author: Christine Poerschke
Date:   2014-08-12T10:32:57Z

    solr: select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support

    This commit also includes changes to reduce SearchHandler's overall use of ShardHandler objects.
    - solr: select?replicaAffinity=(node|host) support, select?replicaAffinity=host&replicaAffinity.hostPriorities=hostA,hostB=1,hostC=2,hostD=2,hostE=3 prioritisation support

    illustration:

    `4-hosts-x-2-ports=8-instances 8-shards 2-replica system`

    http://host1:port1/solr/collection1_shard1_replicaA/
    http://host1:port1/solr/collection1_shard3_replicaA/
    http://host1:port2/solr/collection1_shard5_replicaA/
    http://host1:port2/solr/collection1_shard7_replicaA/

    http://host2:port1/solr/collection1_shard2_replicaA/
    http://host2:port1/solr/collection1_shard4_replicaA/
    http://host2:port2/solr/collection1_shard6_replicaA/
    http://host2:port2/solr/collection1_shard8_replicaA/

    http://host3:port1/solr/collection1_shard1_replicaB/
    http://host3:port1/solr/collection1_shard3_replicaB/
    http://host3:port2/solr/collection1_shard5_replicaB/
    http://host3:port2/solr/collection1_shard7_replicaB/

    http://host4:port1/solr/collection1_shard2_replicaB/
    http://host4:port1/solr/collection1_shard4_replicaB/
    http://host4:port2/solr/collection1_shard6_replicaB/
    http://host4:port2/solr/collection1_shard8_replicaB/

    `.../select` plain will route sub-requests to a random selection of solr cores and so could potentially use all 8 JVM instances

    http://host1:port1/solr/collection1_shard1_replicaA/
    http://host4:port1/solr/collection1_shard2_replicaB/
    http://host3:port1/solr/collection1_shard3_replicaB/
    http://host2:port1/solr/collection1_shard4_replicaA/
    http://host1:port2/solr/collection1_shard5_replicaA/
    http://host4:port2/solr/collection1_shard6_replicaB/
    http://host3:port2/solr/collection1_shard7_replicaB/
    http://host2:port2/solr/collection1_shard8_replicaA/

    `.../select?replicaAffinity=node` will route sub-requests to a random selection of solr cores whilst maintaining node affinity i.e. sub-requests that can go to the same solr instance will go to the same solr instance e.g.

    http://host1:port1/solr/collection1_shard1_replicaA/
    http://host4:port1/solr/collection1_shard2_replicaB/
    http://host1:port1/solr/collection1_shard3_replicaA/
    http://host4:port1/solr/collection1_shard4_replicaB/
    http://host3:port2/solr/collection1_shard5_replicaB/
    http://host2:port2/solr/collection1_shard6_replicaA/
    http://host3:port2/solr/collection1_shard7_replicaB/
    http://host2:port2/solr/collection1_shard8_replicaA/

    `.../select?replicaAffinity=host` will route sub-requests to a random selection of solr cores whilst maintaining host affinity i.e. sub-requests that can go to the same host machine will go to the same host machine e.g.

    http://host1:port1/solr/collection1_shard1_replicaA/
    http://host2:port1/solr/collection1_shard2_replicaA/
    http://host1:port1/solr/collection1_shard3_replicaA/
    http://host2:port1/solr/collection1_shard4_replicaA/
    http://host1:port2/solr/collection1_shard5_replicaA/
    http://host2:port2/solr/collection1_shard6_replicaA/
    http://host1:port2/solr/collection1_shard7_replicaA/
    http://host2:port2/solr/collection1_shard8_replicaA/

    `.../select?replicaAffinity=host&replicaAffinity=node` will route sub-requests to a random selection of solr cores whilst maintaining first host affinity and secondly node affinity (the latter clearly only applies if multiple JVMs on a given machine contain the same shard).

    If `replicaAffinity=host` is requested then optional `replicaAffinity.hostPriorities` are supported:

    `.../select?replicaAffinity=host&replicaAffinity.hostPriori