This discussion is relevant to a conundrum I face involving HealthCheckRequest. It has a little hack to detect that it's being used with CloudSolrClient so that it can induce a failure, because it's not supposed to be used in a node-ambiguous way; it only has meaning when directed at a specific node. I need to remove the hack as it interferes with other work I'm doing. That means I'll also need to remove the "negative test" that checks for the failure: org.apache.solr.handler.admin.HealthCheckHandlerTest#testHealthCheckHandlerWithCloudClient.
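To make the constraint concrete, here is a minimal sketch of a guard that a cluster-aware client could run before routing a request, instead of a hack inside the request class. It is plain Java with no Solr dependency. It assumes node-scoped v2 endpoints can be recognized by an `/api/node` path prefix, which is how the thread quoted further down describes that path. The class and method names are hypothetical and not part of SolrJ.

```java
// Hypothetical helper, NOT part of SolrJ: classifies v2 API paths by prefix.
final class NodeScopedPathGuard {

    // Assumption: every node-scoped v2 endpoint lives under this prefix.
    private static final String NODE_PREFIX = "/api/node";

    private NodeScopedPathGuard() {}

    /** True for "/api/node" and anything below it, but not for "/api/nodes/...". */
    static boolean isNodeScoped(String path) {
        return path.equals(NODE_PREFIX) || path.startsWith(NODE_PREFIX + "/");
    }

    /** A check a cluster-aware client could run before picking a node for the request. */
    static void requireClusterRoutable(String path) {
        if (isNodeScoped(path)) {
            throw new IllegalArgumentException(
                "Node-scoped endpoint " + path + " must be sent to one specific node");
        }
    }

    public static void main(String[] args) {
        // Cluster-scoped path: the guard lets it through.
        requireClusterRoutable("/api/cluster/properties");
        try {
            // Node-scoped path: the guard rejects it.
            requireClusterRoutable("/api/node/logging/messages");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The prefix check compares against `/api/node/` with the trailing slash so that a plural `/api/nodes/...` path would not be mistaken for a node-scoped one.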
For everyone's sake, it would be ideal if the path were readily interpretable as a "cluster" endpoint or a "node" endpoint. This would help users (including us!) immediately tell the two apart just by looking at a path. And it would help CloudSolrClient notice that the request it's asked to invoke is a "node" endpoint and then fail with an IllegalArgumentException. V2 offers an opportunity to make clear distinctions, even if only for V2.

On Thu, Mar 20, 2025 at 4:40 PM Jason Gerlowski <gerlowsk...@gmail.com> wrote:

> Hey Christos,
>
> Sorry again for the delay; some replies inline.
>
> > For '/api/node' how can I tell which node I am interacting with when
> > running in cloud mode? Am I connecting to a specific node via a different
> > hostname + port, or am I connecting with a node through a load balancer /
> > zookeeper?
>
> There's no load-balancing or distribution of `/api/node` APIs within
> Solr, even in SolrCloud mode. So if your `/api/node` request goes
> right to Solr, you can be pretty certain that it'll be served by
> whatever host+port you put in the URL.
>
> Of course, administrators often put an external load-balancer in front
> of Solr - which really complicates the use of these
> non-proxied/distributed, `/api/node`-style APIs. But there's no
> distribution of `/api/node` requests by Solr itself.
>
> > In my suggestions before I try to eliminate the
> > difference between the two modes, standalone and cloud, at API level, so
> > that clients always interact the same way with Solr regardless of the
> > mode.
>
> That makes sense to me. And I *think* it's possible, at least at the
> API level. That is - I can't think of any functionality offered by
> both modes that are exposed through different APIs depending on
> "mode". Of course, there's a lot of APIs that will error out if you
> try them in standalone.
> We can definitely be more consistent in how
> that is surfaced - I love your suggestion of using the "501" status
> code as a way to indicate those cases.
>
> I have a feeling I'm missing your main concern/point though. If I'm
> right about that, feel free to pick a specific example to lead the
> discussion - that might be a good way to proceed.
>
> > What I try to figure out is how a client like the new Admin UI could
> > interact with the API in both scenarios, standalone and cloud mode, without
> > having to handle each mode separately
>
> I might be reading too much into what you mean by "without having to
> handle each mode separately"...so if I'm reading that too literally,
> just ignore my comments below.
>
> As I said above, you shouldn't have huge issues with this at the API
> syntax level. But on the "conceptual" front, I'm less sure. The
> modes share a lot, but ultimately differ hugely in the abstractions
> that they offer, the limitations that they have, etc.
>
> Take "cores" as an example. The "list-cores" API has the same syntax
> in both modes. But the meaning of a core itself is hugely different
> between the two: in standalone it's the main abstraction, in SolrCloud
> it's essentially an implementation detail.
>
> And on the "limitation" side of that: standalone nodes only know about
> themselves, they have no way to know of other nodes in the cluster.
> So in standalone mode there's no way to know about cores on other
> nodes in the cluster; whereas SolrCloud doesn't have that limitation
> at all and could paint a much fuller picture by sending "list-core"
> calls to all of the nodes.
>
> That's all to say - the modes are just very very different.
> I'm all
> for avoiding special-handling, but it might not always be
> possible/practical : (
>
> Best,
>
> Jason
>
> On Wed, Mar 5, 2025 at 1:35 PM Christos Malliaridis
> <malliari...@apache.org> wrote:
> >
> > > '/api/node' is reserved for APIs that only impact the receiving node
> > > (and aren't otherwise proxied or distributed)
> >
> > That makes sense to me. In my suggestions before I try to eliminate the
> > difference between the two modes, standalone and cloud, at API level, so
> > that clients always interact the same way with Solr regardless of the
> > mode.
> >
> > > removing the "/cluster" bit of the path might
> > > mislead as many users as it helps.
> >
> > Eliminating '/api/cluster' may not be necessary, if we consider "cluster" a
> > resource. By my definition a cluster is just a collection of nodes, so the
> > same as '/api/nodes'. But having a cluster as an explicit resource in our
> > RESTful API would still make sense, since interacting with the nodes
> > resource collection (like with '/api/nodes/properties') could introduce
> > potential naming conflicts. That's why I was considering only
> > '/api/properties'. But I believe '/api/cluster/properties' could work as
> > well, and having "cluster" and "node" as resources is fine too. Not sure if
> > there are also cases where there could be multiple clusters under the same
> > hostname in Solr?
> >
> > > But there are some obstacles IMO - the biggest one being
> > > limitations in Solr's featureset as it stands today.
> >
> > I believe this could easily be handled by responses like "501 Not
> > Implemented" if an endpoint is not supported in a specific mode. This would
> > also not influence a different structure of the endpoints I believe?
> >
> > For '/api/node' how can I tell which node I am interacting with when
> > running in cloud mode?
> > Am I connecting to a specific node via a different
> > hostname + port, or am I connecting with a node through a load balancer /
> > zookeeper?
> >
> > What I try to figure out is how a client like the new Admin UI could
> > interact with the API in both scenarios, standalone and cloud mode, without
> > having to handle each mode separately or rely on implementation details
> > like zookeeper.
> >
> >
> > On Mon, Feb 17, 2025 at 6:01 PM Jason Gerlowski <gerlowsk...@gmail.com>
> > wrote:
> >
> > > Hey Christos,
> > >
> > > Thanks for raising this!
> > >
> > > > without having worked on the API before and without participating in any
> > > > prior discussions
> > >
> > > Quick summary of past discussions and decisions - not defending them
> > > necessarily, but important context:
> > >
> > > '/api/node' is reserved for APIs that only impact the receiving node
> > > (and aren't otherwise proxied or distributed): node-healthcheck,
> > > status-checking on node-level asynchronous operations, fetching
> > > node-specific info (like the node's public-key, environment variables,
> > > etc.), debug operations like triggering a thread-dump. '/api/cluster'
> > > has APIs that are only available in "SolrCloud" mode, and that
> > > (secondarily) have to do with cluster topology/state: cluster
> > > properties, setting node-roles, cross-node rebalancing, package and
> > > filestore operations, etc.
> > >
> > > That's not to say that we've gotta stick with those semantics; if
> > > there's consensus in another direction we should act while v2 is still
> > > "experimental".
> > >
> > > > What prevents us from moving towards that direction?
> > >
> > > I love these API suggestions, from a purely aesthetic/cosmetic
> > > perspective. But there are some obstacles IMO - the biggest one being
> > > limitations in Solr's featureset as it stands today.
> > >
> > > Take "cluster properties" and "node roles" as examples.
> > > I agree that
> > > it'd be great to offer them in standalone as well as SolrCloud, and to
> > > change the API path to suit. But that'd be a massive effort, and
> > > while those gaps exist removing the "/cluster" bit of the path might
> > > mislead as many users as it helps.
> > >
> > > Best,
> > >
> > > Jason
> > >
> > > On Fri, Feb 14, 2025 at 12:15 PM Christos Malliaridis
> > > <malliari...@apache.org> wrote:
> > > >
> > > > Hello everyone,
> > > >
> > > > I am looking into the v2 API and I was wondering what our final design will
> > > > look like in terms of single- and multi-node setups.
> > > >
> > > > The main question I am trying to answer for myself is "Do we need to
> > > > distinguish between the operation mode at API endpoints"?
> > > >
> > > > From what I can see in the API proposals and current state, some API
> > > > endpoints operate inside a cluster context (like /api/cluster/properties),
> > > > some inside a node context (like /api/node/logging), some other in cluster
> > > > node context (like /api/cluster/nodes/{nodeName}/roles), and some in no
> > > > context (which is I believe cluster and node, depending on operation mode?,
> > > > like in /api/aliases/{aliasName}/properties).
> > > >
> > > > From a consumer's point of view, this may be a bit irritating, and without
> > > > having worked on the API before and without participating in any prior
> > > > discussions, I would believe that it could be simplified.
> > > >
> > > > Looking into where we are now and what we may expect from Solr in the
> > > > future, we may not have to distinguish between the operation modes at the
> > > > API endpoints. I am not aware of the historical background or any
> > > > constraints that probably apply, so please educate me.
> > > >
> > > > From what little I know, the following changes would make sense to me:
> > > > - GET /api/cluster/properties could be just GET /api/properties
> > > >   - it would get the properties of Solr. If it is a cluster, whether it is
> > > >     a single node or multiple, it should not make a difference
> > > > - GET /api/node/logging/messages could be
> > > >   /api/nodes/{nodeId}/logging/messages
> > > >   - It would get the log messages of a specific node. For single node
> > > >     setups, the node ID is always the same, for multi-nodes it would have
> > > >     different node IDs
> > > > - PUT /api/logging/levels could be added to reflect a cluster-wide log
> > > >   level configuration, which seems to be missing in the v2 API at the moment
> > > >   of writing
> > > > - GET /api/cluster/nodes/{nodeId}/roles could be /api/nodes/{nodeName}/roles
> > > >   - it would return the roles of a specific node (if the roles are per node
> > > >     configured)
> > > > - GET /api/aliases/{aliasName}/properties would stay as is, as it is
> > > >   node-independent and therefore a nice and simple endpoint
> > > >
> > > > This way we would reduce the complexity of our API (for the consumers) and
> > > > make it more intuitive. Additionally, the consumers would not need to know
> > > > whether there are multiple nodes or a single node running Solr, and will
> > > > always have a "collection of nodes", even if that collection contains only
> > > > a single node at times. And when scaling from one node to multiple nodes,
> > > > no changes at the consumer's side are required (which I'm not sure if this
> > > > is currently the case).
> > > >
> > > > What prevents us from moving towards that direction?
> > > >
> > > > Best,
> > > > Christos
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> > > For additional commands, e-mail: dev-h...@solr.apache.org