I'd like to move forward on this soon. A colleague expressed concern that matrix params are obscure and so perhaps should be avoided. However, to me they seem well suited to the problem. Also, I estimate this path will have relatively low impact on the codebase, since we can continue to have Replica.getXXXUrl methods that return a base URL (thus without params). The URL (without params) will encode sufficient information for HttpSolrCall.
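To make the idea a bit more concrete, here is a rough sketch of how the collection path segment could be split into a name plus matrix params. This is only an illustration under my own assumptions: the CollSpec class and its parse method are hypothetical names (borrowing the "collSpec" term from the original proposal), not existing Solr code, and it ignores URL decoding and validation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for a parsed path segment such as "coll;s=shard1;leader=true". */
final class CollSpec {
  final String name;                 // text before the first ';'
  final Map<String, String> params;  // matrix params, in order of appearance

  private CollSpec(String name, Map<String, String> params) {
    this.name = name;
    this.params = params;
  }

  /** True if at least one matrix param was present in the segment. */
  boolean hasMatrixParams() {
    return !params.isEmpty();
  }

  /** Splits a single path segment on ';' into the name and its key=value params. */
  static CollSpec parse(String segment) {
    String[] parts = segment.split(";");
    Map<String, String> params = new LinkedHashMap<>();
    for (int i = 1; i < parts.length; i++) {
      if (parts[i].isEmpty()) {
        continue; // tolerate stray semicolons like "coll;;s=x"
      }
      int eq = parts[i].indexOf('=');
      if (eq < 0) {
        params.put(parts[i], ""); // a param with no value
      } else {
        params.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
      }
    }
    return new CollSpec(parts[0], params);
  }
}
```

With something like this, a segment such as "collectionName;s=shardName;leader=true" would yield the name "collectionName" with "s" and "leader" params, while a plain "collectionName" would yield no params and so would go through the existing resolution logic.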
On Tue, Dec 10, 2024 at 5:36 PM Chris Hostetter <hossman_luc...@fucit.org> wrote:
>
> I don't disagree with any of your points. If anything I think the problem
> is more egregious than you characterize it (especially in the case of
> requests for specific replicas that have been (re)moved -- IIRC not only
> does the Solr node return 404, but before doing that it forcibly refreshes
> the entire cluster state to see if there is a "new" collection it doesn't
> know about with that name)
>
> The one thing I think you may be overlooking is in your comment that you'd
> like to see requests to specific cores go away -- presumably because
> you feel like shard specificity is enough for sub-requests? But
> being able to target a specific core with requests is kind of important
> for diagnosing bugs/discrepancies. Even in a perfectly functioning
> system, features like shards.preference depend on being able to route a
> request to a specific replica on a node -- not just any replica of that
> shard (ie: prefer PULL replicas)
>
> I don't have any strong objections to your "matrixized" path param, but I
> would suggest two alternative strawmen:
>
> * Long Term Strawman *
>
> In a "Post V2 API" type world, it seems like what we should probably be
> doing is switching to a completely different path prefix(es) for requests
> targeting a specific shard/replica?
>
> We already have "/api/c/<collection-name>/<handler-name>" -- it seems like
> ideally /api/c/* should *require* that the next portion of the path be an
> actual collection name, and when sub-requests are made, or when clients
> want to route requests to specific replicas, those requests should go to
> some *new* paths (that don't have the baggage of resolving/proxying
> collection level requests)
>
> Perhaps
> - /api/s/<collection-name>/<shard-name>/...
>   "any replica of <shard-name> available on this solr node"
>
> - /api/r/<collection-name>/<replica-name>/...
>   "the specific replica <replica-name> if it's on this solr node"
>
> * Short Term / Backcompat Strawman *
>
> Would (optional) HTTP headers like "X-Solr-Collection",
> & "X-Solr-Replica" be easier to adopt than matrixizing
> the URL path?
>
> If those headers don't exist, then the existing logic can all still run.
>
> If those headers do exist, then solr can compare the values of those
> headers with the path info to help optimize away some of the existing "Is
> this path a collection name or a core name" type logic (and/or narrow down
> which shard to pick from if it is a collection name)
>
> I'm not suggesting that these headers would *override* the path, just
> serve as hints to reduce the "search space" in HttpSolrCall...
>
> Example #0
>
> GET /solr/yak/select?...
>
> * no hints what yak is
> * all existing heuristics apply
>
> Example #1
>
> GET /solr/foo/select?...
> X-Solr-Replica: foo
>
> * foo is expected to be the name of a specific (local) replica
> * if a SolrCore named foo doesn't exist on the current node,
>   just return 404, don't bother looking for a collection named foo
>
> Example #2
>
> GET /solr/bar/select?...
> X-Solr-Collection: bar
>
> * bar is expected to be the name of a collection
> * if bar isn't a valid collection name, just return 404,
>   don't bother checking for a local SolrCore named bar
>
> Example #3
>
> GET /solr/bar/select?...
> X-Solr-Collection: bar
> X-Solr-Replica: foo
>
> * bar is expected to be the name of a collection
> * if bar isn't a valid collection name, just return 404,
>   don't bother checking for a local SolrCore named bar
> * foo is expected to be the name of a specific (local) replica
>   of the collection named bar
> * if a SolrCore named foo doesn't exist on the current node *OR*
>   if a SolrCore named foo does exist, but isn't a replica of
>   collection bar, just return 404, don't bother picking an
>   arbitrary replica of collection bar
>
> Example #4
>
> GET /solr/yak/select?...
> X-Solr-Collection: bar
> X-Solr-Replica: foo
>
> * neither hint matches path, return 404
>
>
> : Date: Mon, 9 Dec 2024 23:42:45 -0500
> : From: David Smiley <dsmi...@apache.org>
> : Reply-To: dev@solr.apache.org
> : To: dev@solr.apache.org
> : Subject: SolrCloud request routing URL structure
> :
> : In a number of circumstances, CloudSolrClient and various parts of Solr
> : (distributed search, distributed indexing), will create a request routed to
> : a specific core in SolrCloud. But the routing is almost always done
> : plainly, with the URL path like /solr/foo/handler and SolrCloud
> : (specifically HttpSolrCall) doesn't know if "foo" is a core or a
> : collection, so it tries both. Sometimes, the core once existed but doesn't
> : any longer due to replica rebalancing activities. The 404 response code is
> : rather sad. Depending on who the caller is and whether the request had a
> : payload (e.g. indexing), it may or may not know how to retry with an
> : updated ClusterState or even know if its ClusterState is stale. Payloads
> : are not retry-able. If the request somehow had clarity on the intended
> : shard, at least, Solr could then handle it locally or proxy it to a
> : suitable node, and use response headers containing a hint to the caller
> : that it might want to get a new ClusterState.
> :
> : A partial fix is for such requests to always add the "collection" parameter
> : when routing to a core. However, it's only suitable when any core of the
> : collection is a reasonable substitute if the preferred/original core
> : doesn't resolve. That'd work for indexing since it routes by payload
> : content, but not distributed-search (isShard=true) that demands a
> : particular shard.
> :
> : I'm not a fan of the choice of the very existence of the "collection"
> : parameter either[1]. I strongly think important routing information,
> : particularly the collection you are talking to (!), should be in the path.
> : A naively written proxy might have a security issue if its developers
> : didn't know that a request to a collection can be pointed at another that
> : wasn't intended to be accessible.
> :
> : I'd rather see a more holistic elegant refactoring instead of adding
> : another parameter. Here's a straw-man proposal that uses URL matrix
> : parameters to parameterize the routing before/separate from query
> : parameters. I'll show some examples (assume SolrCloud mode)
> :
> : Existing scenarios:
> : /solr/collectionName/handlerName
> : /solr/aliasName/handlerName
> : /solr/collection1,collection2,collection3/handlerName
> : /solr/coreName/handlerName (would like this to go away in SolrCloud)
> : New scenarios:
> : /solr/collectionName;s=shardName/handlerName
> : /solr/collectionName;s=shardName;r=replicaName/handlerName
> : /solr/collectionName;s=shardName;leader=true/handlerName
> :
> : If matrix parameters are present (presence of a semicolon), SolrCloud can
> : know collectionName is a collection name (and not an alias or a core). "s"
> : means shard name, "r" means replica name (which might rarely be used[2]).
> : The single-char choices are the same as used in our logging pattern for
> : MDC. "leader=true" for the leader of course. Matrix parameters are
> : extensible; we might see fit to add "x" for the core name or other
> : parameters similar to that of shards.preference[3]
> :
> : Any thoughts on this?
> :
> : Java variable name parameters might use the term "collSpec" or something to
> : indicate that the input isn't necessarily a collection.
> :
> : [1] "collection" param was added as part of SOLR-4497 for Collection
> : Aliasing but it wasn't necessary. Years later when aliasing was improved
> : (by me), the path component supported a comma delimited list. But
> : "collection" should probably have been deprecated. If you think not; what
> : am I missing?
> : [2] Specifying the replica *on a specific node* is probably always
> : redundant since there is very likely exactly one or zero replicas for the
> : shard. If there's more than one, either will do (they are replicas). It
> : could be interesting if the client could detect the redundancy and then be
> : more specific only then but that's probably unnecessary. I bet tests
> : overload replicas per shard on a node, however.
> : [3]
> : https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-distributed-requests.html#shards-preference-parameter
> :
> : ~ David Smiley
> : Apache Lucene/Solr Search Developer
> : http://www.linkedin.com/in/davidwsmiley
> :
>
> -Hoss
> http://www.lucidworks.com/