I'd like to move forward on this soon.

A colleague expressed concern that matrix-params are obscure and so perhaps
should be avoided.  However, to me they seem well suited to the problem.
Also, I estimate this path will have relatively low impact on the codebase,
since we can continue to have Replica.getXXXUrl methods that give a base
URL (thus without params).  The URL (without params) will encode sufficient
information for HttpSolrCall.
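
To make the matrix-param form concrete, here is a rough sketch of how the
first path segment after /solr/ might be split into a name plus "s", "r" and
"leader" params.  The CollSpec class and its methods are hypothetical names
for illustration only (nothing like this exists in Solr today), and the
sketch assumes the segment has already been percent-decoded.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical value class for illustration; not an existing Solr API. */
final class CollSpec {
  final String name;                 // text before the first ';'
  final boolean hasMatrixParams;     // true if the segment contained a ';'
  final Map<String, String> params;  // e.g. s=shardName, r=replicaName, leader=true

  private CollSpec(String name, boolean hasMatrixParams, Map<String, String> params) {
    this.name = name;
    this.hasMatrixParams = hasMatrixParams;
    this.params = Collections.unmodifiableMap(params);
  }

  /** Parses one path segment such as "collectionName;s=shardName;leader=true". */
  static CollSpec parse(String segment) {
    int semi = segment.indexOf(';');
    String name = semi < 0 ? segment : segment.substring(0, semi);
    Map<String, String> params = new LinkedHashMap<>();
    if (semi >= 0) {
      for (String part : segment.substring(semi + 1).split(";")) {
        if (part.isEmpty()) {
          continue; // tolerate a stray or trailing ';'
        }
        int eq = part.indexOf('=');
        if (eq < 0) {
          params.put(part, ""); // bare flag with no value
        } else {
          params.put(part.substring(0, eq), part.substring(eq + 1));
        }
      }
    }
    return new CollSpec(name, semi >= 0, params);
  }

  String shard() { return params.get("s"); }

  String replica() { return params.get("r"); }

  boolean leaderOnly() { return "true".equals(params.get("leader")); }

  public static void main(String[] args) {
    CollSpec spec = CollSpec.parse("collectionName;s=shardName;leader=true");
    System.out.println(spec.name + " shard=" + spec.shard() + " leader=" + spec.leaderOnly());
  }
}
```

Under the proposal, hasMatrixParams being true would tell HttpSolrCall that
the name is a collection, so the alias and core lookups could be skipped.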

On Tue, Dec 10, 2024 at 5:36 PM Chris Hostetter <hossman_luc...@fucit.org>
wrote:

>
> I don't disagree with any of your points.  If anything I think the problem
> is more egregious than you characterize it (especially in the case of
> requests for specific replicas that have been (re)moved -- IIRC not only
> does the Solr node return 404, but before doing that it forcibly refreshes
> the entire cluster state to see if there is a "new" collection it doesn't
> know about with that name)
>
> The one thing I think you may be overlooking is in your comment that you'd
> like to see requests to specific cores go away -- presumably because
> you feel like shard specificity is enough for sub-requests?  But
> being able to target a specific core with requests is kind of important
> for diagnosing bugs/discrepancies.  Even in a perfectly functioning
> system, features like shards.preference depend on being able to route a
> request to a specific replica on a node -- not just any replica of that
> shard (ie: prefer PULL replicas)
>
>
> I don't have any strong objections to your "matrixized" path param, but I
> would suggest two alternative strawmen:
>
> * Long Term Strawman *
>
> In a "Post V2 API" type world, it seems like what we should probably be
> doing is switching to a completely different path prefix(es) for requests
> targeting a specific shard/replica?
>
> We already have "/api/c/<collection-name>/<handler-name>" -- it seems like
> ideally /api/c/* should *require* that the next portion of the path be an
> actual collection name, and when sub-requests are made, or when clients
> want to route requests to specific replicas, those requests should go to
> some *new* paths (that don't have the baggage of resolving/proxying
> collection level requests)
>
> Perhaps
>  - /api/s/<collection-name>/<shard-name>/...
>    "any replica of <shard-name> available on this solr node"
>
>  - /api/r/<collection-name>/<replica-name>/...
>    "the specific replica <replica-name> if it's on this solr node"
>
>
>
> * Short Term / Backcompat Strawman *
>
> Would (optional) HTTP headers like "X-Solr-Collection"
> & "X-Solr-Replica" be easier to adopt than matrixizing
> the URL path?
>
> If those headers don't exist, then the existing logic can all still run.
>
> If those headers do exist, then solr can compare the values of those
> headers with the path info to help optimize away some of the existing "Is
> this path a collection name or a core name" type logic (and/or narrow down
> which shard to pick from if it is a collection name)
>
> I'm not suggesting that these headers would *override* the path, just
> serve as hints to reduce the "search space" in HttpSolrCall...
>
> Example #0
>
>  GET /solr/yak/select?...
>
>  * no hints what yak is
>  * all existing heuristics apply
>
> Example #1
>
>  GET /solr/foo/select?...
>  X-Solr-Replica: foo
>
>  * foo is expected to be the name of a specific (local) replica
>  * if a SolrCore named foo doesn't exist on the current node,
>    just return 404, don't bother looking for a collection named foo
>
> Example #2
>
>  GET /solr/bar/select?...
>  X-Solr-Collection: bar
>
>  * bar is expected to be the name of a collection
>  * if bar isn't a valid collection name, just return 404,
>    don't bother checking for a local SolrCore named bar
>
> Example #3
>
>  GET /solr/bar/select?...
>  X-Solr-Collection: bar
>  X-Solr-Replica: foo
>
>  * bar is expected to be the name of a collection
>  * if bar isn't a valid collection name, just return 404,
>    don't bother checking for a local SolrCore named bar
>  * foo is expected to be the name of a specific (local) replica
>    of the collection named bar
>  * if a SolrCore named foo doesn't exist on the current node *OR*
>    if a SolrCore named foo does exist, but isn't a replica of
>    collection bar, just return 404, don't bother picking an
>    arbitrary replica of collection bar
>
> Example #4
>
>  GET /solr/yak/select?...
>  X-Solr-Collection: bar
>  X-Solr-Replica: foo
>
>  * neither hint matches path, return 404
>
>
>
>
> : Date: Mon, 9 Dec 2024 23:42:45 -0500
> : From: David Smiley <dsmi...@apache.org>
> : Reply-To: dev@solr.apache.org
> : To: dev@solr.apache.org
> : Subject: SolrCloud request routing URL structure
> :
> : In a number of circumstances, CloudSolrClient and various parts of Solr
> : (distributed search, distributed indexing), will create a request routed to
> : a specific core in SolrCloud.  But the routing is almost always done
> : plainly, with the URL path like /solr/foo/handler and SolrCloud
> : (specifically HttpSolrCall) doesn't know if "foo" is a core or a
> : collection, so it tries both.  Sometimes, the core once existed but doesn't
> : any longer due to replica rebalancing activities.  The 404 response code is
> : rather sad.  Depending on who the caller is and whether the request had a
> : payload (e.g. indexing), it may or may not know how to retry with an
> : updated ClusterState or even know if its ClusterState is stale.  Payloads
> : are not retry-able.  If the request somehow had clarity on the intended
> : shard, at least, Solr could then handle it locally or proxy it to a
> : suitable node, and use response headers containing a hint to the caller
> : that it might want to get a new ClusterState.
> :
> : A partial fix is for such requests to always add the "collection" parameter
> : when routing to a core.  However, it's only suitable when any core of the
> : collection is a reasonable substitute if the preferred/original core
> : doesn't resolve.  That'd work for indexing since it routes by payload
> : content, but not distributed-search (isShard=true) that demands a
> : particular shard.
> :
> : I'm not a fan of the choice of the very existence of the "collection"
> : parameter either[1].  I strongly think important routing information,
> : particularly the collection you are talking to (!), should be in the path.
> : A naively written proxy might have a security issue if its developers
> : didn't know that a request to a collection can be pointed at another that
> : wasn't intended to be accessible.
> :
> : I'd rather see a more holistic elegant refactoring instead of adding
> : another parameter.  Here's a straw-man proposal that uses URL matrix
> : parameters to parameterize the routing before/separate from query
> : parameters.  I'll show some examples (assume SolrCloud mode)
> :
> :   Existing scenarios:
> : /solr/collectionName/handlerName
> : /solr/aliasName/handlerName
> : /solr/collection1,collection2,collection3/handlerName
> : /solr/coreName/handlerName  (would like this to go away in SolrCloud)
> :   New scenarios:
> : /solr/collectionName;s=shardName/handlerName
> : /solr/collectionName;s=shardName;r=replicaName/handlerName
> : /solr/collectionName;s=shardName;leader=true/handlerName
> :
> : If matrix parameters are present (presence of a semicolon), SolrCloud can
> : know collectionName is a collection name (and not an alias or a core).  "s"
> : means shard name, "r" means replica name (which might rarely be used[2]).
> : The single-char choices are the same as used in our logging pattern for
> : MDC.  "leader=true" for the leader of course.  Matrix parameters are
> : extensible; we might see fit to add "x" for the core name or other
> : parameters similar to that of shards.preference[3]
> :
> : Any thoughts on this?
> :
> : Java variable name parameters might use the term "collSpec" or something to
> : indicate that the input isn't necessarily a collection.
> :
> : [1] "collection" param was added as part of SOLR-4497 for Collection
> : Aliasing but it wasn't necessary.  Years later when aliasing was improved
> : (by me), the path component supported a comma delimited list.  But
> : "collection" should probably have been deprecated.  If you think not; what
> : am I missing?
> : [2] Specifying the replica *on a specific node* is probably always
> : redundant since there is very likely exactly one or zero replicas for the
> : shard.  If there's more than one, either will do (they are replicas).  It
> : could be interesting if the client could detect the redundancy and then be
> : more specific only then but that's probably unnecessary.  I bet tests
> : overload replicas per shard on a node, however.
> : [3] https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-distributed-requests.html#shards-preference-parameter
> :
> : ~ David Smiley
> : Apache Lucene/Solr Search Developer
> : http://www.linkedin.com/in/davidwsmiley
> :
>
> -Hoss
> http://www.lucidworks.com/
>
>
>
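
For comparison, the header-hint examples (#0 through #4) quoted above could
be sketched roughly as follows.  RoutingHints, Decision and resolve are
hypothetical names, the Set and Map arguments are stand-ins for real cluster
state and core container lookups, and the rule that at least one hint must
equal the path name is my assumption generalizing Example #4.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the header-hint checks; not existing Solr code. */
final class RoutingHints {
  enum Decision { LEGACY_HEURISTICS, LOCAL_CORE, COLLECTION, NOT_FOUND }

  /**
   * @param pathName       the name in /solr/<pathName>/...
   * @param collectionHint value of X-Solr-Collection, or null if absent
   * @param replicaHint    value of X-Solr-Replica, or null if absent
   * @param collections    known collection names (stand-in for cluster state)
   * @param localCores     local core name mapped to the collection it belongs to
   */
  static Decision resolve(String pathName, String collectionHint, String replicaHint,
      Set<String> collections, Map<String, String> localCores) {
    if (collectionHint == null && replicaHint == null) {
      return Decision.LEGACY_HEURISTICS; // Example #0: no hints, existing logic runs
    }
    // Assumption: at least one hint must equal the path name (Example #4 otherwise).
    if (!pathName.equals(collectionHint) && !pathName.equals(replicaHint)) {
      return Decision.NOT_FOUND;
    }
    if (collectionHint != null && !collections.contains(collectionHint)) {
      return Decision.NOT_FOUND; // Examples #2, #3: don't look for a local core
    }
    if (replicaHint == null) {
      return Decision.COLLECTION; // Example #2
    }
    String owner = localCores.get(replicaHint);
    if (owner == null) {
      return Decision.NOT_FOUND; // Examples #1, #3: no such local core
    }
    if (collectionHint != null && !owner.equals(collectionHint)) {
      return Decision.NOT_FOUND; // Example #3: core exists but in another collection
    }
    return Decision.LOCAL_CORE; // Examples #1, #3
  }

  public static void main(String[] args) {
    Set<String> collections = new HashSet<>(Arrays.asList("bar"));
    Map<String, String> localCores = new HashMap<>();
    localCores.put("foo", "bar");
    System.out.println(resolve("bar", "bar", "foo", collections, localCores));
  }
}
```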
