Re: SolrCloud request routing URL structure

Jason Gerlowski Fri, 10 Jan 2025 06:34:14 -0800

Hi,

Sorry for the late reply on this thread.  I've been trying to take
time to understand the problem a little better before chiming in.
(FWIW I think this would be a great discussion for the Meetup next
week?)


I think much of what you described makes sense in a v1 context, but v2
has some nice building blocks for solving this, if not a complete
solution already.

In particular, the v2 paths avoid much of the ambiguity present in v1.
e.g. "/solr/???/select" is replaced in v2 with
"/collections/someCollName/select" and "/cores/someCoreName/select".
There's not currently a "shard-level" path for querying, but other
APIs currently exist at the
"/collections/someColl/shards/someShardName" path, so it's only a
small leap to offering querying there.  Together these paths could be
used much as Hoss described in the "Long Term Strawman" section of his
response.
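
To make that concrete, here's a rough sketch (not actual Solr code) of
how the target level could be read off a v2-style path with no cluster
state lookup at all.  The class, enum, and method names are made up for
illustration, and treating ".../shards/someShardName/..." as a query
target is the proposed extension, not something that exists today.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Solr's actual routing code.
final class V2PathSketch {
  enum Level { COLLECTION, SHARD, CORE, UNKNOWN }

  /** Classifies a v2-style path by its leading segments alone. */
  static Level classify(String path) {
    // Collect non-empty segments so leading or doubled slashes are ignored.
    List<String> seg = new ArrayList<>();
    for (String s : path.split("/")) {
      if (!s.isEmpty()) {
        seg.add(s);
      }
    }
    if (seg.size() >= 2 && seg.get(0).equals("cores")) {
      return Level.CORE;
    }
    if (seg.size() >= 2 && seg.get(0).equals("collections")) {
      // "/collections/<coll>/shards/<shard>/..." names one shard (assumed shape).
      if (seg.size() >= 4 && seg.get(2).equals("shards")) {
        return Level.SHARD;
      }
      return Level.COLLECTION;
    }
    return Level.UNKNOWN;
  }

  public static void main(String[] args) {
    System.out.println(classify("/collections/someColl/shards/someShardName/select"));
  }
}
```

The point is only that the first path segment says what kind of name
follows, so nothing has to be tried as both a core and a collection.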

Of course, v2 completion remains pretty distant and we may want a
solution in the interim.  Matrix-params and Hoss's "header hint" idea
both seem like reasonable approaches in that regard, though I'd
personally lean towards the header-based approach.
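
For illustration, here's a rough sketch of how I read the header-hint
rules in Hoss's Examples #0 through #4 (quoted below).  It is not
HttpSolrCall code: the class, enum, and parameter names are
hypothetical, and like the examples it treats the replica hint as the
name of a SolrCore on the receiving node.  The case where the path
matches the replica hint and a collection hint is also present isn't
covered by the examples; the sketch assumes the core must then belong
to the hinted collection.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the "header hint" idea; not Solr's HttpSolrCall.
final class HeaderHintSketch {
  enum Outcome { LEGACY_HEURISTICS, LOCAL_CORE, COLLECTION, NOT_FOUND }

  /**
   * pathName is the "foo" in /solr/foo/select.  The hints are the values of
   * X-Solr-Collection and X-Solr-Replica, or null when the header is absent.
   * localCores maps each core on this node to the collection it belongs to.
   */
  static Outcome resolve(String pathName, String collectionHint, String replicaHint,
      Set<String> collections, Map<String, String> localCores) {
    // Example #0: no hints, so the existing heuristics apply unchanged.
    if (collectionHint == null && replicaHint == null) {
      return Outcome.LEGACY_HEURISTICS;
    }
    // Example #4: hints never override the path; one of them must match it.
    if (!pathName.equals(collectionHint) && !pathName.equals(replicaHint)) {
      return Outcome.NOT_FOUND;
    }
    // Examples #2 and #3: a hinted collection must actually exist.
    if (collectionHint != null && !collections.contains(collectionHint)) {
      return Outcome.NOT_FOUND;
    }
    if (replicaHint == null) {
      return Outcome.COLLECTION;
    }
    // Examples #1 and #3: the hinted core must be local, and must belong
    // to the hinted collection when one was given.
    String owner = localCores.get(replicaHint);
    if (owner == null) {
      return Outcome.NOT_FOUND;
    }
    if (collectionHint != null && !owner.equals(collectionHint)) {
      return Outcome.NOT_FOUND;
    }
    return Outcome.LOCAL_CORE;
  }

  public static void main(String[] args) {
    Set<String> collections = Set.of("bar");
    Map<String, String> localCores = Map.of("foo", "bar");
    System.out.println(resolve("bar", "bar", "foo", collections, localCores));
  }
}
```

Every NOT_FOUND branch is decided from information the node already
has, which is the appeal: no fallback lookups and no forced cluster
state refresh just to rule a name out.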

Best,

Jason


On Thu, Jan 2, 2025 at 9:46 AM David Smiley <[email protected]> wrote:
>
> I'd like to move forward on this soon.
>
> A colleague expressed concern about matrix-params being obscure, so perhaps
> they should be avoided because of that.  However, to me their use seems
> well suited to the problem.  Also, I estimate this path will have
> relatively low impact on the codebase with respect to continuing to have
> Replica.getXXXUrl methods that give a base URL (thus without params).  The
> URL (without params) will encode the information that HttpSolrCall
> needs.
>
> On Tue, Dec 10, 2024 at 5:36 PM Chris Hostetter <[email protected]>
> wrote:
>
> >
> > I don't disagree with any of your points.  If anything i think the problem
> > is more egregious than you characterize it (especially in the case of
> > requests for specific replicas that have been (re)moved -- IIRC not only
> > does the Solr node return 404, but before doing that it forcibly refreshes
> > the entire cluster state to see if there is a "new" collection it doesn't
> > know about with that name)
> >
> > The one thing i think you may be overlooking is in your comment that you'd
> > like to see requests to specific cores go away -- presumably because
> > you feel like shard specificity is enough for sub-requests?  But
> > being able to target a specific core with requests is kind of important
> > for diagnosing bugs/discrepancies.  Even in a perfectly functioning
> > system, features like shards.preference depend on being able to route a
> > request to a specific replica on a node -- not just any replica of that
> > shard (ie: prefer PULL replicas)
> >
> >
> > I don't have any strong objections to your "matrixized" path param, but I
> > would suggest two alternative strawmen:
> >
> > * Long Term Strawman *
> >
> > In a "Post V2 API" type world, it seems like what we should probably be
> > doing is switching to a completely different path prefix(es) for requests
> > targeting a specific shard/replica?
> >
> > We already have "/api/c/<collection-name>/<handler-name>" -- it seems like
> > ideally /api/c/* should *require* that the next portion of the path be an
> > actual collection name, and when sub-requests are made, or when clients
> > want to route requests to specific replicas, those requests should go to
> > some *new* paths (that don't have the baggage of resolving/proxying
> > collection level requests)
> >
> > Perhaps
> >  - /api/s/<collection-name>/<shard-name>/...
> >    "any replica of <shard-name> available on this solr node"
> >
> >  - /api/r/<collection-name>/<replica-name>/...
> >    "the specific replica <replica-name> if it's on this solr node"
> >
> >
> >
> > * Short Term / Backcompat Strawman *
> >
> > Would (optional) HTTP headers like "X-Solr-Collection"
> > & "X-Solr-Replica" be easier to adopt than matrixizing
> > the URL path?
> >
> > If those headers don't exist, then the existing logic can all still run.
> >
> > If those headers do exist, then solr can compare the values of those
> > headers with the path info to help optimize away some of the existing "Is
> > this path a collection name or a core name" type logic (and/or narrow down
> > which shard to pick from if it is a collection name)
> >
> > I'm not suggesting that these headers would *override* the path, just
> > serve as hints to reduce the "search space" in HttpSolrCall...
> >
> > Example #0
> >
> >  GET /solr/yak/select?...
> >
> >  * no hints what yak is
> >  * all existing heuristics apply
> >
> > Example #1
> >
> >  GET /solr/foo/select?...
> >  X-Solr-Replica: foo
> >
> >  * foo is expected to be the name of a specific (local) replica
> >  * if a SolrCore named foo doesn't exist on the current node,
> >    just return 404, don't bother looking for a collection named foo
> >
> > Example #2
> >
> >  GET /solr/bar/select?...
> >  X-Solr-Collection: bar
> >
> >  * bar is expected to be the name of a collection
> >  * if bar isn't a valid collection name, just return 404,
> >    don't bother checking for a local SolrCore named bar
> >
> > Example #3
> >
> >  GET /solr/bar/select?...
> >  X-Solr-Collection: bar
> >  X-Solr-Replica: foo
> >
> >  * bar is expected to be the name of a collection
> >  * if bar isn't a valid collection name, just return 404,
> >    don't bother checking for a local SolrCore named bar
> >  * foo is expected to be the name of a specific (local) replica
> >    of the collection named bar
> >  * if a SolrCore named foo doesn't exist on the current node *OR*
> >    if a SolrCore named foo does exist, but isn't a replica of
> >    collection bar, just return 404, don't bother picking an
> >    arbitrary replica of collection bar
> >
> > Example #4
> >
> >  GET /solr/yak/select?...
> >  X-Solr-Collection: bar
> >  X-Solr-Replica: foo
> >
> >  * neither hint matches path, return 404
> >
> >
> >
> >
> > : Date: Mon, 9 Dec 2024 23:42:45 -0500
> > : From: David Smiley <[email protected]>
> > : Reply-To: [email protected]
> > : To: [email protected]
> > : Subject: SolrCloud request routing URL structure
> > :
> > : In a number of circumstances, CloudSolrClient and various parts of Solr
> > : (distributed search, distributed indexing), will create a request routed to
> > : a specific core in SolrCloud.  But the routing is almost always done
> > : plainly, with the URL path like /solr/foo/handler and SolrCloud
> > : (specifically HttpSolrCall) doesn't know if "foo" is a core or a
> > : collection, so it tries both.  Sometimes, the core once existed but doesn't
> > : any longer due to replica rebalancing activities.  The 404 response code is
> > : rather sad.  Depending on who the caller is and whether the request had a
> > : payload (e.g. indexing), it may or may not know how to retry with an
> > : updated ClusterState or even know if its ClusterState is stale.  Payloads
> > : are not retry-able.  If the request somehow had clarity on the intended
> > : shard, at least, Solr could then handle it locally or proxy it to a
> > : suitable node, and use response headers containing a hint to the caller
> > : that it might want to get a new ClusterState.
> > :
> > : A partial fix is for such requests to always add the "collection" parameter
> > : when routing to a core.  However, it's only suitable when any core of the
> > : collection is a reasonable substitute if the preferred/original core
> > : doesn't resolve.  That'd work for indexing since it routes by payload
> > : content, but not distributed-search (isShard=true) that demands a
> > : particular shard.
> > :
> > : I'm not a fan of the choice of the very existence of the "collection"
> > : parameter either[1].  I strongly think important routing information,
> > : particularly the collection you are talking to (!), should be in the path.
> > : A naively written proxy might have a security issue if its developers
> > : didn't know that a request to a collection can be pointed at another that
> > : wasn't intended to be accessible.
> > :
> > : I'd rather see a more holistic elegant refactoring instead of adding
> > : another parameter.  Here's a straw-man proposal that uses URL matrix
> > : parameters to parameterize the routing before/separate from query
> > : parameters.  I'll show some examples (assume SolrCloud mode)
> > :
> > :   Existing scenarios:
> > : /solr/collectionName/handlerName
> > : /solr/aliasName/handlerName
> > : /solr/collection1,collection2,collection3/handlerName
> > : /solr/coreName/handlerName  (would like this to go away in SolrCloud)
> > :   New scenarios:
> > : /solr/collectionName;s=shardName/handlerName
> > : /solr/collectionName;s=shardName;r=replicaName/handlerName
> > : /solr/collectionName;s=shardName;leader=true/handlerName
> > :
> > : If matrix parameters are present (presence of a semicolon), SolrCloud can
> > : know collectionName is a collection name (and not an alias or a core).  "s"
> > : means shard name, "r" means replica name (which might rarely be used[2]).
> > : The single-char choices are the same as used in our logging pattern for
> > : MDC.  "leader=true" for the leader of course.  Matrix parameters are
> > : extensible; we might see fit to add "x" for the core name or other
> > : parameters similar to that of shards.preference[3]
> > :
> > : Any thoughts on this?
> > :
> > : Java variable name parameters might use the term "collSpec" or something to
> > : indicate that the input isn't necessarily a collection.
> > :
> > : [1] "collection" param was added as part of SOLR-4497 for Collection
> > : Aliasing but it wasn't necessary.  Years later when aliasing was improved
> > : (by me), the path component supported a comma delimited list.  But
> > : "collection" should probably have been deprecated.  If you think not; what
> > : am I missing?
> > : [2] Specifying the replica *on a specific node* is probably always
> > : redundant since there is very likely exactly one or zero replicas for the
> > : shard.  If there's more than one, either will do (they are replicas).  It
> > : could be interesting if the client could detect the redundancy and then be
> > : more specific only then but that's probably unnecessary.  I bet tests
> > : overload replicas per shard on a node, however.
> > : [3] https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-distributed-requests.html#shards-preference-parameter
> > :
> > : ~ David Smiley
> > : Apache Lucene/Solr Search Developer
> > : http://www.linkedin.com/in/davidwsmiley
> > :
> >
> > -Hoss
> > http://www.lucidworks.com/
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> >
> >
