I don't disagree with any of your points. If anything, I think the problem
is more egregious than you characterize it (especially in the case of
requests for specific replicas that have been (re)moved -- IIRC not only
does the Solr node return 404, but before doing that it forcibly refreshes
the entire cluster state to see if there is a "new" collection it doesn't
know about with that name)

The one thing I think you may be overlooking is in your comment that you'd
like to see requests to specific cores go away -- presumably because you
feel like shard specificity is enough for sub-requests?

But being able to target a specific core with requests is kind of important
for diagnosing bugs/discrepancies. Even in a perfectly functioning system,
features like shards.preference depend on being able to route a request to
a specific replica on a node -- not just any replica of that shard (ie:
prefer PULL replicas)

I don't have any strong objections to your "matrixized" path param, but I
would suggest two alternative strawmen:

* Long Term Strawman *

In a "Post V2 API" type world, it seems like what we should probably be
doing is switching to completely different path prefix(es) for requests
targeting a specific shard/replica?

We already have "/api/c/<collection-name>/<handler-name>" -- it seems like
ideally /api/c/* should *require* that the next portion of the path be an
actual collection name, and when sub-requests are made, or when clients
want to route requests to specific replicas, those requests should go to
some *new* paths (that don't have the baggage of resolving/proxying
collection level requests)

Perhaps
 - /api/s/<collection-name>/<shard-name>/...
   "any replica of <shard-name> available on this solr node"
 - /api/r/<collection-name>/<replica-name>/...
   "the specific replica <replica-name> if it's on this solr node"

* Short Term / Backcompat Strawman *

Would (optional) HTTP headers like "X-Solr-Collection", & "X-Solr-Replica"
be easier to adopt than matrixizing the URL path?

If those headers don't exist, then the existing logic can all still run.
If those headers do exist, then Solr can compare the values of those
headers with the path info to help optimize away some of the existing "Is
this path a collection name or a core name" type logic (and/or narrow down
which shard to pick from if it is a collection name)

I'm not suggesting that these headers would *override* the path, just serve
as hints to reduce the "search space" in HttpSolrCall...

Example #0
  GET /solr/yak/select?...

  * no hints what yak is
  * all existing heuristics apply

Example #1
  GET /solr/foo/select?...
  X-Solr-Replica: foo

  * foo is expected to be the name of a specific (local) replica
  * if a SolrCore named foo doesn't exist on the current node, just return
    404, don't bother looking for a collection named foo

Example #2
  GET /solr/bar/select?...
  X-Solr-Collection: bar

  * bar is expected to be the name of a collection
  * if bar isn't a valid collection name, just return 404, don't bother
    checking for a local SolrCore named bar

Example #3
  GET /solr/bar/select?...
  X-Solr-Collection: bar
  X-Solr-Replica: foo

  * bar is expected to be the name of a collection
  * if bar isn't a valid collection name, just return 404, don't bother
    checking for a local SolrCore named bar
  * foo is expected to be the name of a specific (local) replica of the
    collection named bar
  * if a SolrCore named foo doesn't exist on the current node *OR* if a
    SolrCore named foo does exist, but isn't a replica of collection bar,
    just return 404, don't bother picking an arbitrary replica of
    collection bar

Example #4
  GET /solr/yak/select?...
  X-Solr-Collection: bar
  X-Solr-Replica: foo

  * neither hint matches path, return 404


: Date: Mon, 9 Dec 2024 23:42:45 -0500
: From: David Smiley <dsmi...@apache.org>
: Reply-To: dev@solr.apache.org
: To: dev@solr.apache.org
: Subject: SolrCloud request routing URL structure
:
: In a number of circumstances, CloudSolrClient and various parts of Solr
: (distributed search, distributed indexing), will create a request routed to
: a specific core in SolrCloud.
: But the routing is almost always done
: plainly, with the URL path like /solr/foo/handler and SolrCloud
: (specifically HttpSolrCall) doesn't know if "foo" is a core or a
: collection, so it tries both. Sometimes, the core once existed but doesn't
: any longer due to replica rebalancing activities. The 404 response code is
: rather sad. Depending on who the caller is and whether the request had a
: payload (e.g. indexing), it may or may not know how to retry with an
: updated ClusterState or even know if its ClusterState is stale. Payloads
: are not retry-able. If the request somehow had clarity on the intended
: shard, at least, Solr could then handle it locally or proxy it to a
: suitable node, and use response headers containing a hint to the caller
: that it might want to get a new ClusterState.
:
: A partial fix is for such requests to always add the "collection" parameter
: when routing to a core. However, it's only suitable when any core of the
: collection is a reasonable substitute if the preferred/original core
: doesn't resolve. That'd work for indexing since it routes by payload
: content, but not distributed-search (isShard=true) that demands a
: particular shard.
:
: I'm not a fan of the choice of the very existence of the "collection"
: parameter either[1]. I strongly think important routing information,
: particularly the collection you are talking to (!), should be in the path.
: A naively written proxy might have a security issue if its developers
: didn't know that a request to a collection can be pointed at another that
: wasn't intended to be accessible.
:
: I'd rather see a more holistic elegant refactoring instead of adding
: another parameter. Here's a straw-man proposal that uses URL matrix
: parameters to parameterize the routing before/separate from query
: parameters.
: I'll show some examples (assume SolrCloud mode)
:
: Existing scenarios:
: /solr/collectionName/handlerName
: /solr/aliasName/handlerName
: /solr/collection1,collection2,collection3/handlerName
: /solr/coreName/handlerName (would like this to go away in SolrCloud)
: New scenarios:
: /solr/collectionName;s=shardName/handlerName
: /solr/collectionName;s=shardName;r=replicaName/handlerName
: /solr/collectionName;s=shardName;leader=true/handlerName
:
: If matrix parameters are present (presence of a semicolon), SolrCloud can
: know collectionName is a collection name (and not an alias or a core). "s"
: means shard name, "r" means replica name (which might rarely be used[2]).
: The single-char choices are the same as used in our logging pattern for
: MDC. "leader=true" for the leader of course. Matrix parameters are
: extensible; we might see fit to add "x" for the core name or other
: parameters similar to that of shards.preference[3]
:
: Any thoughts on this?
:
: Java variable name parameters might use the term "collSpec" or something to
: indicate that the input isn't necessarily a collection.
:
: [1] "collection" param was added as part of SOLR-4497 for Collection
: Aliasing but it wasn't necessary. Years later when aliasing was improved
: (by me), the path component supported a comma delimited list. But
: "collection" should probably have been deprecated. If you think not; what
: am I missing?
: [2] Specifying the replica *on a specific node* is probably always
: redundant since there is very likely exactly one or zero replicas for the
: shard. If there's more than one, either will do (they are replicas). It
: could be interesting if the client could detect the redundancy and then be
: more specific only then but that's probably unnecessary. I bet tests
: overload replicas per shard on a node, however.
: [3]
: https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-distributed-requests.html#shards-preference-parameter
:
: ~ David Smiley
: Apache Lucene/Solr Search Developer
: http://www.linkedin.com/in/davidwsmiley
:

-Hoss
http://www.lucidworks.com/
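
PS: to make the header-hint rules in Examples #0 through #4 concrete, here
is a rough sketch of the decision logic as a standalone Python function.
It is not HttpSolrCall's actual code: the function name resolve, the
local_cores and collections inputs, and the tuple return values are all
hypothetical, picked only for illustration. One case the examples don't
cover (path matches the replica hint while a collection hint is also
present) is assumed here to be allowed.

```python
from typing import Mapping, Optional, Set, Tuple

# Header names from the strawman, lower-cased for lookup.
COLLECTION_HEADER = "x-solr-collection"
REPLICA_HEADER = "x-solr-replica"

# Hypothetical result shape: (kind, name). Not a real Solr type.
Decision = Tuple[str, Optional[str]]
NOT_FOUND: Decision = ("not_found", None)


def resolve(path_name: str,
            headers: Mapping[str, str],
            local_cores: Mapping[str, str],
            collections: Set[str]) -> Decision:
    """Apply the hint rules from Examples #0 through #4.

    local_cores maps each core on this node to the collection it belongs
    to; collections is the set of known collection names. Both stand in
    for whatever state the node really has (an assumption of this sketch).
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    hints = {k.lower(): v for k, v in headers.items()}
    coll = hints.get(COLLECTION_HEADER)
    replica = hints.get(REPLICA_HEADER)

    # Example #0: no hints, fall back to the existing heuristics.
    if coll is None and replica is None:
        return ("legacy", path_name)

    # Example #4: hints never override the path; one of them must match it.
    if path_name not in (coll, replica):
        return NOT_FOUND

    # Examples #2 and #3: an unknown collection is a 404, no core lookup.
    if coll is not None and coll not in collections:
        return NOT_FOUND

    if replica is not None:
        owner = local_cores.get(replica)
        # Example #1: no such local core is a 404, no collection lookup.
        if owner is None:
            return NOT_FOUND
        # Example #3: the core must be a replica of the hinted collection.
        if coll is not None and owner != coll:
            return NOT_FOUND
        return ("core", replica)

    # Example #2: collection-level request; picking a replica is left
    # to the existing routing logic.
    return ("collection", coll)
```

The point of the sketch is that every hinted request ends in either a
single targeted lookup or an immediate 404; only the unhinted case
(Example #0) ever reaches the "is it a core or a collection" guessing.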