Thanks for the feedback. I ended up posting a patch to JIRA (SOLR-2132 <https://issues.apache.org/jira/browse/SOLR-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel>), although I've made a few changes since that patch. Our initial tests already show a 10% improvement in the 90% line (90th percentile) for response times, which translates to a 50% improvement in the average time.
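For anyone following the thread without reading the patch, here is a rough standalone sketch of the idea. It is not the code from SOLR-2132 and it does not touch SearchHandler or any Solr API; the class name, the `query` and `replica` helpers, and the plain `String` responses are all made up for illustration, and it only relies on `java.util.concurrent`. Every replica in every shard set gets the same request, the first successful response per set is kept, and the remaining futures in that set are cancelled.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RedundantShardRequests {

    // Hypothetical helper: shardSets maps a set name (e.g. "solr1") to one request per replica.
    // Returns the first successful response per set; a set whose replicas all fail is left out.
    static Map<String, String> query(Map<String, List<Callable<String>>> shardSets,
                                     ExecutorService pool) throws InterruptedException {
        CompletionService<String> completion = new ExecutorCompletionService<>(pool);
        Map<Future<String>, String> setOf = new HashMap<>();          // future -> owning set
        Map<String, List<Future<String>>> pending = new HashMap<>();  // set -> its futures
        int outstanding = 0;
        for (Map.Entry<String, List<Callable<String>>> e : shardSets.entrySet()) {
            List<Future<String>> futures = new ArrayList<>();
            for (Callable<String> replica : e.getValue()) {
                Future<String> f = completion.submit(replica);
                setOf.put(f, e.getKey());
                futures.add(f);
                outstanding++;
            }
            pending.put(e.getKey(), futures);
        }

        Map<String, String> responses = new HashMap<>();
        // Stop as soon as every set has answered, or nothing is left to wait for.
        while (responses.size() < shardSets.size() && outstanding > 0) {
            Future<String> done = completion.take();
            outstanding--;
            String set = setOf.get(done);
            if (done.isCancelled() || responses.containsKey(set)) {
                continue; // a sibling replica already answered for this set
            }
            try {
                responses.put(set, done.get());
                for (Future<String> sibling : pending.get(set)) {
                    sibling.cancel(true); // no effect on futures that already finished
                }
            } catch (ExecutionException failed) {
                // This replica failed; keep waiting for another replica in the same set.
            }
        }
        return responses;
    }

    // Hypothetical stand-in for a shard request: sleeps, then returns a canned response.
    static Callable<String> replica(String name, long delayMs) {
        return () -> {
            Thread.sleep(delayMs);
            return "response from " + name;
        };
    }

    public static void main(String[] args) throws InterruptedException {
        Callable<String> failing = () -> { throw new IOException("solr2a is down"); };
        Map<String, List<Callable<String>>> shardSets = new HashMap<>();
        shardSets.put("solr1", Arrays.asList(replica("solr1a", 50), replica("solr1b", 2000)));
        shardSets.put("solr2", Arrays.asList(failing, replica("solr2b", 50)));
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            System.out.println(query(shardSets, pool));
        } finally {
            pool.shutdownNow();
        }
    }
}
```

In the demo, solr1 should be answered by the fast replica without waiting for the slow one, and solr2 should fall back to solr2b after solr2a fails. The trade-off is the one Hoss describes below: every request is executed once per replica, so CPU and network use go up in exchange for better worst-case latency.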
It would be nice to know more about the current plans for SolrCloud and its future development road map. I've seen a few threads on here asking for more information, but it doesn't seem like a popular subject. I'll keep an eye on it though.

Cheers,
Mike

On Wed, Sep 29, 2010 at 2:46 PM, Chris Hostetter <[email protected]> wrote:

> : 4. The first shard from a set (solr1a, solr1b) to successfully return is
> : honored, and the other requests (solr1b, if solr1a responds first, for
> : instance) are removed/ignored
> : 5. The response is completed and returned as soon as one shard from each set
> : responds
>
> It seems like a useful feature to me ... i know some folks who have
> (non Solr/Lucene based) custom search infrastructures that do roughly
> the same thing.
>
> : 1. What are the known disadvantages to such a strategy? (we've thought of a
> : few, like sets being out of sync, but they don't bother us too much)
>
> you wind up burning a lot of CPU, but that's not a disadvantage as much as
> it is a trade-off -- the whole point of doing something like this is that
> you'd rather burn CPU (and waste network IO) in order to improve your
> worst case latency.
>
> : 2. What would this type of a feature be called? This way I can open a Jira
> : ticket for it
>
> no idea ... "redundant shard requests" comes to mind.
>
> : 3. Is there a preferred way to do this? My current patch (which I can post
> : soon) works in the HTTPClient portion of SearchHandler. I keep a hash map of
> : the shard sets and cancel the Future<ShardResponse>'s in the corresponding
> : set when each response comes back.
> ...
> : P.S I'd like to write a test for this feature but it wasn't clear from the
> : distributed test how to do so. Could somebody point me in the right
> : direction (an existing test, perhaps) for how to accomplish this?
>
> I don't really have a good answer for either of those questions, but the
> one thing i can suggest is that you take a look at the SolrCloud branch
> and think about how this functionality would integrate with that (both in
> terms of implementation and in how SolrCloud unit tests work)
>
> As you mentioned: the current approach in SolrCloud is to load balance
> against identical shards on multiple nodes in the cluster, but that's not
> contradictory with your idea: they can work in conjunction with each other
> (ie: imagine "shard1" has four physical instances: "shard1Ax", "shard1Ay",
> "shard1Bq" and "shard1Bp" ... a request for "shard1" could trigger two
> "redundant parallel shard requests" for "shard1A" and "shard1B" and each
> of those requests could then load balance between the respective
> underlying physical shards).
>
> -Hoss
>
> --
> http://lucenerevolution.org/ ... October 7-8, Boston
> http://bit.ly/stump-hoss ... Stump The Chump!
