[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
[ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769634#comment-16769634 ]

Christine Poerschke commented on SOLR-6730:
---

{quote}... Looks like the feature suggested here could help with this as well, e.g. by adding a "seed" param if you want sticky replicas for a "session". Right? {quote}

Correct. Thanks for re-surfacing this old ticket. I've opened SOLR-13258 for the stickiness and will close this ticket here out for clarity.

> select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
> -
>
> Key: SOLR-6730
> URL: https://issues.apache.org/jira/browse/SOLR-6730
> Project: Solr
> Issue Type: New Feature
> Reporter: Christine Poerschke
> Assignee: Christine Poerschke
> Priority: Minor
>
> If no shards parameter is supplied with a select request then sub-requests will go to a random selection of live solr nodes hosting shards for the collection of interest. All sub-requests must complete before results can be collated i.e. the slowest sub-request determines how fast the search completes.
>
> Use of optional replicaAffinity can reduce the number of JVMs hit by a given search (the more JVMs are hit, the higher the chance of hitting a garbage collection pause in one of many JVMs). Preferentially directing requests to certain areas of the cloud can also be useful for debugging or when some replicas reside on 'faster' machines.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
[ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765888#comment-16765888 ]

Jan Høydahl commented on SOLR-6730:
---

Hi, commenting on this old issue, as I had a discussion with a customer yesterday regarding replica affinity to utilise caches better. They fire more than 100 requests to draw a dashboard, and you get much better cache utilisation and faster load time with replicationFactor=1 than replicationFactor=3 since you need to warm three times as many caches.

Looks like the feature suggested here could help with this as well, e.g. by adding a "seed" param if you want sticky replicas for a "session". Right?
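To illustrate the "sticky replicas for a session" idea raised in the comment above, here is a minimal client-side sketch. It only shows how a stable seed could be derived from a session identifier so that every request of one dashboard load carries the same value. The function name `seed_for_session`, the `mod` argument, and the choice of CRC32 as the hash are assumptions made for this illustration; none of them comes from the ticket or from Solr.

```python
import zlib


def seed_for_session(session_id: str, mod: int) -> int:
    """Map a session identifier to a stable seed in the range [0, mod).

    Hypothetical helper for illustration only; not part of Solr.
    """
    if mod <= 0:
        raise ValueError("mod must be positive")
    # crc32 is deterministic across processes and restarts, unlike
    # Python's built-in hash() for strings, which is salted per process.
    return zlib.crc32(session_id.encode("utf-8")) % mod


# Every request issued for the same session yields the same seed, so a
# seed-aware router could keep sending that session to the same replica
# and reuse its warmed caches.
seeds = {seed_for_session("dashboard-session-42", 3) for _ in range(100)}
print(len(seeds))
```

Different sessions would still spread over the available seeds, so load is shared across replicas while each individual session stays put.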
[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
[ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15583110#comment-15583110 ]

Christine Poerschke commented on SOLR-6730:
---

Noble and I discussed offline re: SOLR-6730 and SOLR-8146 overlaps and differences. I will try to summarise here as bullet points, [~noble.paul] please add or correct if I missed or misunderstood something.

* The use case and motivation for the {{select?replicaAffinity=(node|host)}} part of SOLR-6730 was to reduce the number of JVMs hit by a given search since the more JVMs are hit, the higher the chance of hitting a garbage collection pause in one of many JVMs.
* The use case and motivation for the {{replicaAffinity.hostPriorities=...}} part of SOLR-6730 was to preferentially direct requests from the same user/source to certain areas of the cloud.
** The implementation of the {{replicaAffinity.hostPriorities=...}} approach requires configuration somewhere i.e. a list of which hosts to prioritise.
** No matter where it is stored, maintaining configuration can be cumbersome as collections and hosts change over time.
* The objective of directing requests from the same user/source to certain areas of the cloud can be achieved without configuration, and the objective of reducing the number of JVMs hit by a search can pretty much be achieved that way also.
** Approach outline:
*** Two numeric parameters ('seed' and 'mod') are optionally added to each request.
*** The two parameters 'place' the requests within the cloud, e.g. for {{mod=9}} any seed between 0 and 8 would be valid and {{seed=6}} would 'place' the request with the 7th of 9 replicas, or more realistically the 3rd of 3 replicas.
*** seed-plus-mod placement automatically adjusts when the number of replicas changes i.e. (seed=2,mod=6) would be 3rd-of-6 or 2nd-of-4 or 2nd-of-3 or 1st-of-2 placement.
*** SOLR-6730 here would likely be abandoned in favour of the approach outlined.
* What is common to SOLR-6730 and SOLR-8146:
** optional parameters would support changing of the existing behaviour
** existing behaviour is maintained if the optional parameters are not supplied
* What is different between SOLR-6730 and SOLR-8146:
** point-of-use of the optional parameter is HttpShardHandler\[Factory\] for SOLR-6730
** point-of-use of the optional parameter is CloudSolrClient (and HttpShardHandler\[Factory\]?) for SOLR-8146
* Next steps:
  1. SOLR-8332 to factor HttpShardHandler\[Factory\]'s url shuffling out into a ReplicaListTransformer class
  2. creation of additional ReplicaListTransformer implementations corresponding to the approach outlined above
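The seed-plus-mod placement outlined in the comment above can be sketched as follows. The comment gives worked examples but no formula, so the integer scaling used here, floor(seed * replicas / mod), is an assumption chosen because it reproduces every example quoted there. The function name `place` is hypothetical and is not a Solr API.

```python
def place(seed: int, mod: int, num_replicas: int) -> int:
    """Return the 0-based index of the replica selected by (seed, mod).

    Assumed rule: scale the relative position seed/mod onto the current
    number of replicas and round down. Hypothetical helper, not Solr code.
    """
    if mod <= 0 or not 0 <= seed < mod:
        raise ValueError("seed must satisfy 0 <= seed < mod")
    if num_replicas <= 0:
        raise ValueError("num_replicas must be positive")
    return seed * num_replicas // mod


# The examples from the comment, printed as 1-based positions.
for n in (6, 4, 3, 2):
    print(f"seed=2, mod=6, replicas={n}: position {place(2, 6, n) + 1} of {n}")
for n in (9, 3):
    print(f"seed=6, mod=9, replicas={n}: position {place(6, 9, n) + 1} of {n}")
```

Because the result depends only on the ratio seed/mod and the current replica count, no host list needs to be configured or maintained, which is the advantage over {{replicaAffinity.hostPriorities}} that the comment describes.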
[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
[ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15532530#comment-15532530 ]

Noble Paul commented on SOLR-6730:
--

[~cpoerschke] pls take a look at SOLR-8146 it's a more comprehensive strategy for replica affinity
[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
[ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277183#comment-14277183 ]

Christine Poerschke commented on SOLR-6730:
---

Am happy to give basic tests for this a go.
[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
[ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275569#comment-14275569 ]

Mark Miller commented on SOLR-6730:
---

Cool - I like it. I think a feature like this needs at least some basic tests added though. Any volunteers?
[jira] [Commented] (SOLR-6730) select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
[ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14206629#comment-14206629 ]

ASF GitHub Bot commented on SOLR-6730:
--

GitHub user cpoerschke opened a pull request:

https://github.com/apache/lucene-solr/pull/104

select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support

https://issues.apache.org/jira/i#browse/SOLR-6730

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bloomberg/lucene-solr trunk-replica-affinity-feature

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/104.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #104

commit 66b56265bdefec7eb814bfb533c0ff19bb1dcdff
Author: Christine Poerschke cpoersc...@bloomberg.net
Date: 2014-08-12T10:32:57Z

solr: select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support

This commit also includes changes to reduce SearchHandler's overall use of ShardHandler objects.
- solr: select?replicaAffinity=(node|host) support, select?replicaAffinity=host&replicaAffinity.hostPriorities=hostA,hostB=1,hostC=2,hostD=2,hostE=3 prioritisation support

illustration:

`4-hosts-x-2-ports=8-instances 8-shards 2-replica system`

http://host1:port1/solr/collection1_shard1_replicaA/
http://host1:port1/solr/collection1_shard3_replicaA/
http://host1:port2/solr/collection1_shard5_replicaA/
http://host1:port2/solr/collection1_shard7_replicaA/
http://host2:port1/solr/collection1_shard2_replicaA/
http://host2:port1/solr/collection1_shard4_replicaA/
http://host2:port2/solr/collection1_shard6_replicaA/
http://host2:port2/solr/collection1_shard8_replicaA/
http://host3:port1/solr/collection1_shard1_replicaB/
http://host3:port1/solr/collection1_shard3_replicaB/
http://host3:port2/solr/collection1_shard5_replicaB/
http://host3:port2/solr/collection1_shard7_replicaB/
http://host4:port1/solr/collection1_shard2_replicaB/
http://host4:port1/solr/collection1_shard4_replicaB/
http://host4:port2/solr/collection1_shard6_replicaB/
http://host4:port2/solr/collection1_shard8_replicaB/

`.../select` plain will route sub-requests to a random selection of solr cores and so could potentially use all 8 JVM instances

http://host1:port1/solr/collection1_shard1_replicaA/
http://host4:port1/solr/collection1_shard2_replicaB/
http://host3:port1/solr/collection1_shard3_replicaB/
http://host2:port1/solr/collection1_shard4_replicaA/
http://host1:port2/solr/collection1_shard5_replicaA/
http://host4:port2/solr/collection1_shard6_replicaB/
http://host3:port2/solr/collection1_shard7_replicaB/
http://host2:port2/solr/collection1_shard8_replicaA/

`.../select?replicaAffinity=node` will route sub-requests to a random selection of solr cores whilst maintaining node affinity i.e. sub-requests that can go to the same solr instance will go to the same solr instance e.g.

http://host1:port1/solr/collection1_shard1_replicaA/
http://host4:port1/solr/collection1_shard2_replicaB/
http://host1:port1/solr/collection1_shard3_replicaA/
http://host4:port1/solr/collection1_shard4_replicaB/
http://host3:port2/solr/collection1_shard5_replicaB/
http://host2:port2/solr/collection1_shard6_replicaA/
http://host3:port2/solr/collection1_shard7_replicaB/
http://host2:port2/solr/collection1_shard8_replicaA/

`.../select?replicaAffinity=host` will route sub-requests to a random selection of solr cores whilst maintaining host affinity i.e. sub-requests that can go to the same host machine will go to the same host machine e.g.

http://host1:port1/solr/collection1_shard1_replicaA/
http://host2:port1/solr/collection1_shard2_replicaA/
http://host1:port1/solr/collection1_shard3_replicaA/
http://host2:port1/solr/collection1_shard4_replicaA/
http://host1:port2/solr/collection1_shard5_replicaA/
http://host2:port2/solr/collection1_shard6_replicaA/
http://host1:port2/solr/collection1_shard7_replicaA/
http://host2:port2/solr/collection1_shard8_replicaA/

`.../select?replicaAffinity=host&replicaAffinity=node` will route sub-requests to a random selection of solr cores whilst maintaining first host affinity and secondly node affinity (the latter clearly only applies if multiple JVMs on a given machine contain the same shard).

If `replicaAffinity=host` is requested then optional `replicaAffinity.hostPriorities` are supported: