On Thu, Jan 14, 2010 at 2:43 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:
> On Thu, Jan 14, 2010 at 1:58 PM, Chris Hostetter
> <hossman_luc...@fucit.org> wrote:
>> : parameter we use for this. Suggestions? logicalshards=shard1,shard2?
>> : lshards=shard1,shard2? slice=shard1,shard2? It doesn't seem like it
>> : would be easy to reuse the "shards" parameter for this since it refers
>> : to physical shard addresses.
>>
>> I haven't been following the SolrCloud stuff much, but from a client
>> perspective is there really any difference between asking for a physical
>> shard, vs asking for a logical shard (or slice name)? ... shouldn't the
>> latter case just result in a resolution from logical->physical w/o
>> requiring the client code to know/care whether the String they have is a
>> physical shard URL, or a slice name.
>
> That might be doable... but we would need to be able to tell the difference.
> Perhaps we could always require a slash in a physical address
> (localhost/context) and prohibit it in slice names?
>
> But... I think there's still a potentially bigger difference: today,
> if shards is set, it means it's a distributed search (and shards is
> removed for sub-requests). But the slice of the index being requested
> may not have a one-to-one mapping with a full request on a solr core.
> And shards may be able to move around, and so it seems important to be
> able to declare what part of the index you're looking for when you're
> querying a shard.
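
For illustration only, the slash convention floated above (a physical address always contains a slash, a slice name never does) could be sketched roughly as follows. The function name classify_shards is hypothetical and is not part of Solr; this is just one way the check might look.

```python
def classify_shards(shards_param):
    """Split a comma-separated "shards" value into physical addresses and slice names.

    Assumption (from the proposal above, not an existing Solr rule):
    an entry containing "/" is a physical address, anything else is a slice name.
    """
    physical, logical = [], []
    for entry in shards_param.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing comma
        if "/" in entry:
            physical.append(entry)
        else:
            logical.append(entry)
    return physical, logical


physical, logical = classify_shards("localhost:8983/solr,shard_200911,shard_200912")
print(physical)
print(logical)
```

A mixed list is accepted here on purpose; whether mixing physical and logical entries in one request should be allowed is a separate question.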

If we want to go this route for parameters (allowing use of both
physical or logical shards in the shards param), I've updated the wiki
with one way to do it:

"""
The presence of "shards" is what currently signals that a request is
distributed, and distrib search removes this param for sub-requests.
But with future micro-sharding or having a single core support
multiple shards, the request will need to contain what shards are
being requested. Reusing "shards" for this (per Hoss' suggestion) by
allowing either physical urls or logical shards (slices) would require
that either

 * a) The search component detect when it has all of the shards
   requested, and turn it into a non-distributed request (any error here
   could easily result in an infinite request loop until deadlock). It
   seems better to return a specific error if this node no longer
   contains the shard being queried in a non-distrib search.
 * b) Use a different distrib=true flag to indicate if this is a
   distributed search. This isn't back compatible though? Unless we
   also consider any request where shards contains a url to be
   distributed.

http://localhost:8983/solr/collection1/select?shards=shard_200911,shard_200912,shard_201001&distrib=true

If we adopt "distrib=true" then it should replace "shards=auto" in the
other example URLs
"""

So the top-level distributed request shown above would resolve to
potentially multiple sub-requests of the form

http://localhost:1234/solr/collection1/select?shards=shard_200911
(note, distrib=true has been removed)
http://localhost:1235/solr/collection1/select?shards=shard_200912
http://localhost:1236/solr/collection1/select?shards=shard_201001

-Yonik
http://www.lucidimagination.com
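
A rough sketch of the fan-out described above, assuming option b) (an explicit distrib=true flag). The slice-to-node table and the function build_sub_requests are hypothetical, made up for illustration; in practice the mapping would come from whatever cluster state ends up being available, and none of this is existing Solr code.

```python
from urllib.parse import urlencode

# Hypothetical slice -> node mapping, using the hosts from the example URLs.
SLICE_LOCATIONS = {
    "shard_200911": "localhost:1234",
    "shard_200912": "localhost:1235",
    "shard_201001": "localhost:1236",
}


def build_sub_requests(params, locations, path="/solr/collection1/select"):
    """Turn a top-level distributed request into one sub-request URL per slice.

    Each sub-request names exactly one slice in "shards" and drops
    "distrib", so the receiving node treats it as non-distributed.
    """
    if params.get("distrib") != "true":
        raise ValueError("not a distributed request (distrib=true missing)")

    slices = [s for s in params.get("shards", "").split(",") if s]
    # Carry over every other parameter unchanged.
    passthrough = {k: v for k, v in params.items() if k not in ("distrib", "shards")}

    urls = []
    for name in slices:
        if name not in locations:
            # Fail with a specific error rather than guessing a location.
            raise KeyError("no known location for slice: " + name)
        query = urlencode({**passthrough, "shards": name})
        urls.append("http://" + locations[name] + path + "?" + query)
    return urls


top_level = {"shards": "shard_200911,shard_200912,shard_201001", "distrib": "true"}
for url in build_sub_requests(top_level, SLICE_LOCATIONS):
    print(url)
```

Because distrib is removed before the sub-requests go out, a receiving node never re-distributes the request, which is what avoids the request loop mentioned under a).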