Re: First class support for node roles

Timothy Potter Tue, 02 Nov 2021 16:46:57 -0700

I'm not missing the point of the query coordinator, but I actually
didn't realize that an empty Solr node would forward the top-level
request onward instead of just being the query controller itself? That
actually seems like a bug vs. a feature, IMO any node that receives
the top-level query should just be the coordinator, what stops it?


Anyway, it sounds to me like you guys have your minds made up
regardless of feedback.

Btw ~ I only mentioned the Zookeeper part b/c it's in your SIP as a
specific role, not sure why you took that as me wanting to discuss the
embedded ZK in your SIP?

On Tue, Nov 2, 2021 at 5:13 PM Ishan Chattopadhyaya
<[email protected]> wrote:
>
> Hi Tim,
> Here are my responses inline.
>
> On Wed, Nov 3, 2021 at 3:22 AM Timothy Potter <[email protected]> wrote:
>>
>> I'm just not convinced this feature is even needed and the SIP is not
>> convincing that "There is no proper alternative today."
>
>
> There are no proper alternatives today, just hacks. On 8x, we have two 
> different deprecated frameworks to stop replicas from being placed on a node (1. 
> rule based replica placement, 2. autoscaling framework). On 9x, we have a new 
> autoscaling framework, which I don't even think is fully implemented. And, 
> there's definitely no way to have a node act as a query coordinator without 
> having data on it.
>
>>
>>
>> 1) Just b/c Elastic and Vespa have a concept of node roles, doesn't
>> mean Solr needs this.
>
>
> Solr needs this. That Elastic has such concepts is a coincidence, and also means 
> we have an opportunity to catch up with them; they have these concepts for a 
> reason.
>
>>
>> Also, some of Elastic's roles overlap with
>> concepts Solr already has in a different form, i.e data_hot sounds
>> like NRT and data_warm sounds a lot like our Pull Replica Type
>
>
> I think that is beyond the scope of this SIP.
>
>>
>>
>> 2) You can achieve the "coordinator" role with auto-scaling rules
>> pre-9.x and with the AffinityPlacementPlugin (heck, it even has a node
>> type built in: 
>> .requestNodeSystemProperty(AffinityPlacementConfig.NODE_TYPE_SYSPROP)).
>> Simply build your replica placement rules such that no replicas land
>> on "coordinator" nodes. And you can already route queries by
>> node.sysprop using shards.preference.
>
>
> I think you missed the whole point of the query coordinator. Please refer to 
> this https://issues.apache.org/jira/browse/SOLR-15715.
> Let me summarize the main difference between what (I think) you refer to and 
> what is proposed in SOLR-15715.
>
> With your suggestion, we'll have a node that doesn't host any replicas. And 
> you suggest queries landing on such nodes be routed using shards.preference? 
> Well, in such a case, these queries will be forwarded/proxied to a random 
> node hosting a replica of the collection and that node then acts as the 
> coordinator. This situation is no better than sending the query directly to 
> that particular node.
>
> What is proposed in SOLR-15715 is a query aggregation functionality. There 
> will be pseudo replicas (aware of the configset) on this coordinator node 
> that handle the request themselves, send shard requests to data hosting 
> replicas, collect and merge the responses, and send the result back to the user. 
> This merge step is usually extremely memory intensive, and it would be good 
> to serve these off stateless nodes (that host no data).
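The aggregation step described above can be sketched roughly as follows. This is only an illustration of scatter-gather merging, not the SOLR-15715 implementation: `ScoredDoc`, `CoordinatorSketch`, and the use of a plain `Function` to stand in for a shard request are all hypothetical names, not Solr classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

// Hypothetical types for illustration only; none of these are Solr classes.
final class ScoredDoc {
    final String id;
    final double score;

    ScoredDoc(String id, double score) {
        this.id = id;
        this.score = score;
    }
}

final class CoordinatorSketch {
    /**
     * Sends the query to one data-hosting replica per shard (each modelled
     * here as a Function), buffers every partial result, and merges them
     * into a global top-N. The coordinator holds no index data itself; the
     * buffering and sorting is where the memory cost of the merge sits.
     */
    static List<ScoredDoc> scatterGather(
            String query,
            List<Function<String, List<ScoredDoc>>> shards,
            int rows) {
        List<ScoredDoc> merged = new ArrayList<>();
        for (Function<String, List<ScoredDoc>> shard : shards) {
            merged.addAll(shard.apply(query)); // per-shard top-N
        }
        // Highest score first.
        merged.sort(Comparator.comparingDouble((ScoredDoc d) -> d.score).reversed());
        return new ArrayList<>(merged.subList(0, Math.min(rows, merged.size())));
    }
}
```

In this sketch the node running `scatterGather` needs no replica of its own, which is the property the coordinator role is after.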
>
>>
>>
>> 3) Dedicated overseer role? I thought we were removing the overseer?!?
>> Also, we already have the ability to run the overseer on specific
>> nodes w/o a new framework, so this doesn't really convince me we need
>> a new framework.
>
>
> There's absolutely no change proposed to the "overseer" role. What users need 
> on production clusters are nodes dedicated for overseer operations, and for 
> that the current "overseer" role suffices, together with some functionality 
> to not place replicas on such nodes.
>
>>
>>
>> 4) We will indeed need to decide which nodes host embedded Zookeeper's
>> but I'd argue that solution hasn't been designed entirely and we
>> probably don't need a formal node role framework to determine which
>> nodes host embedded ZKs. Moreover, embedded ZK seems more like a small
>> cluster thing and anyone running a large cluster will probably have a
>> dedicated ZK ensemble as they do today. The node role thing seems like
>> it's intended for large clusters and my gut says few will use embedded
>> ZK for large clusters.
>
>
> This SIP is not the right place for this discussion. There's a separate SIP 
> for this.
>
>>
>>
>> 5) You can also achieve a lot of "node role" functionality in query
>> routing using the shards.preference parameter.
>>
>
> That doesn't solve the purpose behind 
> https://issues.apache.org/jira/browse/SOLR-15715.
>
>>
>> At the very least, the SIP needs to list specific use cases that
>> require this feature that are not achievable with the current features
>> before getting bogged down in the impl. details.
>
>
> The coordinator role is the biggest motivation for introducing the concept of 
> roles. However, in addition to what is proposed in SOLR-15715, a coordinator 
> node can later on also be used as a node for users to run streaming 
> expressions on, do bulk indexing on (impl details for this to come later, 
> don't want distraction here).
>
>>
>>
>> Tim
>>
>> On Tue, Nov 2, 2021 at 3:20 PM Gus Heck <[email protected]> wrote:
>> >
>> > I think there are things not yet accounted for. Time I spent yesterday is 
>> > biting me today. Pls give a couple days.
>> >
>> > On Tue, Nov 2, 2021 at 11:28 AM Jason Gerlowski <[email protected]> 
>> > wrote:
>> >>
>> >> Hey Ishan,
>> >>
>> >> I appreciate you writing up the SIP!  Here's some notes/questions I
>> >> had as I was reading through your writeup and this mail thread.
>> >> ("----" separators between thoughts, hopefully that helps.)
>> >>
>> >> ----
>> >>
>> >> I'll add my vote to what Jan, Gus, Ilan, and Houston already
>> >> suggested: roles should default to "all-on".  I see the downsides
>> >> you're worried about with that approach (esp. around 'overseer'), but
>> >> they may be mitigatable, at least in part.
>> >>
>> >> > [mail thread] User wants this node Solr101 to be a dedicated overseer, 
>> >> > but for that to happen, he/she would need to restart all the data nodes 
>> >> > with -Dnode.roles=data
>> >>
>> >> Sure, if roles can only be specified at startup.  But that may be a
>> >> self-imposed constraint.
>> >>
>> >> An API to change a node's roles would remove the need for a restart
>> >> and make it easy for users to achieve the semantics they want.  You
>> >> decided you want a dedicated overseer N nodes into your cluster
>> >> deployment?  Deploy node 'N' with the 'overseer' role, and toggle the
>> >> overseer role off on the remainder.
>> >>
>> >> Now, I understand that you don't want roles to change at runtime, but
>> >> I haven't seen you get much into "why", beyond saying "it is very
>> >> risky to have nodes change roles while they are up and running."  Can
>> >> you expand a bit on the risks you're worried about?  If you're
>> >> explicit about them here maybe someone can think of a clever way to
>> >> address them?
>> >>
>> >> > Hence, if those nodes are "assumed to have all roles", then just by 
>> >> > virtue of upgrading to this new version, new capabilities will be 
>> >> > turned on for the entire cluster, whether or not the user opted for 
>> >> > such a capability. This is totally undesirable.
>> >>
>> >> Obviously "roles" refer to much bigger chunks of functionality than
>> >> usual, so in a sense defaulting roles on is scarier.  But in a sense
>> >> you're describing something that's an inherent part of software
>> >> releases.  Releases expose new features that are typically on by
>> >> default.  A new default-on role in 9.1 might hurt a user, but there's
>> >> no fundamental difference between that and a change to backups or
>> >> replication or whatever in the same release.
>> >>
>> >> I don't mean to belittle the difference in scope - I get your concern.
>> >> But IMO this is something to address with good release notes and
>> >> documentation.  Designing for admins who don't do even cursory
>> >> research before an upgrade ties both our hands behind our back as a
>> >> project.
>> >>
>> >> ----
>> >>
>> >> > [SIP] Internal representation in ZK ... Implementation details like 
>> >> > these can be fleshed out in the PR
>> >>
>> >> IMO this is important enough to flesh out as part of the SIP, at least
>> >> in broad strokes.  It affects backcompat, SolrJ client design, etc.
>> >>
>> >> ----
>> >>
>> >> > [SIP] GET /api/cluster/roles?node=node1
>> >>
>> >> Woohoo - way to include a v2 API definition!
>> >>
>> >> AFAIR, the v2 API has a /nodes path defined - I wonder whether "GET
>> >> /nodes/someNode/roles" wouldn't be a more intuitive endpoint for the
>> >> "get the roles this node has" functionality.  Though I leave that for
>> >> your consideration.
>> >>
>> >> ----
>> >>
>> >> Looking forward to your responses and seeing the SIP progress!  It's a
>> >> really cool, promising idea IMO.
>> >>
>> >> Best,
>> >>
>> >> Jason
>> >>
>> >> On Tue, Nov 2, 2021 at 11:21 AM Ishan Chattopadhyaya
>> >> <[email protected]> wrote:
>> >> >
>> >> > Are there any unaddressed outstanding concerns that we should hold up 
>> >> > the SIP for?
>> >> >
>> >> > On Mon, 1 Nov, 2021, 10:31 pm Ishan Chattopadhyaya, 
>> >> > <[email protected]> wrote:
>> >> >>>
>> >> >>> >> Agree. However, I disagree with ideas where "query analysis" has a 
>> >> >>> >> role of its own. Where would that lead us to? Separate roles for
>> >> >>> >> nodes that do "faceting" or "spell correction" etc.? But anyway, 
>> >> >>> >> that is for discussion when we add future roles. This is beyond 
>> >> >>> >> this SIP.
>> >> >>
>> >> >>
>> >> >> > I am not asking you to implement every possible role of course :). 
>> >> >> > As a note I know a company that is running an entire separate
>> >> >> > cluster to offload and better serve highlighting on a subset of 
>> >> >> > large docs, so YES I think there are people who may want such fine 
>> >> >> > grained control.
>> >> >>
>> >> >> Cool, I think we can discuss adding any additional roles (for 
>> >> >> highlighting?) on a case by case basis at a later point.
>> >> >>
>> >> >>
>> >> >> On Mon, Nov 1, 2021 at 10:25 PM Ishan Chattopadhyaya 
>> >> >> <[email protected]> wrote:
>> >> >>>
>> >> >>> > Boiling it down the idea I'm proposing is that roles required for 
>> >> >>> > back compatibility get explicitly added on startup, if not by the 
>> >> >>> > user then by the code. This is more flexible than assuming that no 
>> >> >>> > role means every role, because then every new feature that has a 
>> >> >>> > role will end up on legacy clusters which are also not back 
>> >> >>> > compatible.
>> >> >>>
>> >> >>> +1, I totally agree. I even said so, when I said: "This is why I was 
>> >> >>> advocating that 1) we assume the "data" as a default, 2) not assume 
>> >> >>> overseer to be implicitly defined (because of the way overseer role 
>> >> >>> is written today), 3) not assume any future roles to be true by 
>> >> >>> default."
>> >> >>>
>> >> >>> So, basically, I'm proposing that the "roles required for back 
>> >> >>> compatibility" (that should be explicitly added on startup) be just 
>> >> >>> the ["data"] role, and not the "overseer" role (due to the way 
>> >> >>> overseer role is currently defined, i.e. it is "preferred overseer").
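A minimal sketch of the defaulting rule proposed here, assuming roles arrive as a comma-separated `node.roles` value as used elsewhere in this thread. The class and method names (`NodeRolesSketch`, `resolve`) are hypothetical and not part of the SIP.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; not a Solr class.
final class NodeRolesSketch {
    /** Roles added by the code when the user specifies none: only "data". */
    static final Set<String> IMPLICIT_ROLES = Collections.singleton("data");

    /**
     * Resolves the effective roles for a node from the raw node.roles value.
     * A missing or empty value yields only the back-compat role; an explicit
     * list yields exactly the roles listed, so "overseer" is never implied.
     */
    static Set<String> resolve(String nodeRolesProp) {
        Set<String> roles = new LinkedHashSet<>();
        if (nodeRolesProp != null) {
            for (String raw : nodeRolesProp.split(",")) {
                String role = raw.trim().toLowerCase(Locale.ROOT);
                if (!role.isEmpty()) {
                    roles.add(role);
                }
            }
        }
        return roles.isEmpty() ? IMPLICIT_ROLES : roles;
    }
}
```

Under the competing "no role == all roles" proposal, only `IMPLICIT_ROLES` would differ: it would hold every role known to the running version, which is what makes new roles switch on after an upgrade.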
>> >> >>>
>> >> >>> On Mon, Nov 1, 2021 at 10:19 PM Gus Heck <[email protected]> wrote:
>> >> >>>>
>> >> >>>> Very sorry, I don't mean to sound offended. Frustrated yes, offended no 
>> >> >>>> :)... the most difficult thing about communication is the illusion 
>> >> >>>> it has occurred :)
>> >> >>>>
>> >> >>>> If you read back just a few emails you'll see where I talk about 
>> >> >>>> roles being applied on startup. Boiling it down the idea I'm 
>> >> >>>> proposing is that roles required for back compatibility get 
>> >> >>>> explicitly added on startup, if not by the user then by the code. 
>> >> >>>> This is more flexible than assuming that no role means every role, 
>> >> >>>> because then every new feature that has a role will end up on legacy 
>> >> >>>> clusters which are also not back compatible.
>> >> >>>>
>> >> >>>> There are points where I said all roles rather than back 
>> >> >>>> compatibility roles because I was thinking about back compatibility 
>> >> >>>> specifically, but you can't know that if I don't say that can you :).
>> >> >>>>
>> >> >>>> On Mon, Nov 1, 2021 at 12:39 PM Ishan Chattopadhyaya 
>> >> >>>> <[email protected]> wrote:
>> >> >>>>>
>> >> >>>>> > If you read more closely, my way can provide full back 
>> >> >>>>> > compatibility. To say or imply it doesn't isn't helping. Perhaps 
>> >> >>>>> > you need to re-read?
>> >> >>>>>
>> >> >>>>> I understand e-mails are frustrating, and I'm trying my best. 
>> >> >>>>> Please don't be offended, and kindly point me to the exact part you 
>> >> >>>>> want me to re-read.
>> >> >>>>>
>> >> >>>>> On Mon, Nov 1, 2021 at 10:05 PM Gus Heck <[email protected]> wrote:
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> On Mon, Nov 1, 2021 at 12:22 PM Ishan Chattopadhyaya 
>> >> >>>>>> <[email protected]> wrote:
>> >> >>>>>>>
>> >> >>>>>>> >    Positive - They denote the existence of a capability
>> >> >>>>>>>
>> >> >>>>>>> Agree, the SIP already reflects this.
>> >> >>>>>>>
>> >> >>>>>>> >   Absolute - Absence/Presence binary identification of a 
>> >> >>>>>>> > capability; no implications, no assumptions
>> >> >>>>>>>
>> >> >>>>>>> Disagree, we need backcompat handling on nodes running without 
>> >> >>>>>>> any roles. There has to be an implicit assumption as to what 
>> >> >>>>>>> roles those nodes are assumed to have. My proposal is that only 
>> >> >>>>>>> the "data" role be assumed, but not the "overseer" role. For any 
>> >> >>>>>>> future roles ("coordinator", "zookeeper" etc.), the decision as 
>> >> >>>>>>> to what the absence of any role implies should be left to the 
>> >> >>>>>>> implementation of that future role. Documentation should clearly 
>> >> >>>>>>> reflect these implicit assumptions.
>> >> >>>>>>>
>> >> >>>>>>
>> >> >>>>>> If you read more closely, my way can provide full back 
>> >> >>>>>> compatibility. To say or imply it doesn't isn't helping. Perhaps 
>> >> >>>>>> you need to re-read?
>> >> >>>>>>
>> >> >>>>>>>
>> >> >>>>>>> >    Focused - Do one thing per role
>> >> >>>>>>>
>> >> >>>>>>> Agree. However, I disagree with ideas where "query analysis" has 
>> >> >>>>>>> a role of its own. Where would that lead us to? Separate roles 
>> >> >>>>>>> for nodes that do "faceting" or "spell correction" etc.? But 
>> >> >>>>>>> anyway, that is for discussion when we add future roles. This is 
>> >> >>>>>>> beyond this SIP.
>> >> >>>>>>>
>> >> >>>>>>
>> >> >>>>>> I am not asking you to implement every possible role of course :). 
>> >> >>>>>> As a note I know a company that is running an entire separate 
>> >> >>>>>> cluster to offload and better serve highlighting on a subset of 
>> >> >>>>>> large docs, so YES I think there are people who may want such fine 
>> >> >>>>>> grained control.
>> >> >>>>>>
>> >> >>>>>>>
>> >> >>>>>>> >    Accessible - It should be dead simple to determine the 
>> >> >>>>>>> > members of a role, avoid parsing blobs of json, avoid 
>> >> >>>>>>> > calculating implications, avoid consulting other resources 
>> >> >>>>>>> > after listing nodes with the role
>> >> >>>>>>>
>> >> >>>>>>> Agree. I'm open to any implementation details that make it easy. 
>> >> >>>>>>> There should be a reasonable API to return these node roles, with 
>> >> >>>>>>> ability to filter by role or filter by node.
>> >> >>>>>>>
>> >> >>>>>>> >    Independent - One role should not require other roles to be 
>> >> >>>>>>> > present
>> >> >>>>>>>
>> >> >>>>>>> Do we need to have this hard and fast requirement upfront? There 
>> >> >>>>>>> might be situations where this is desirable. I feel we can 
>> >> >>>>>>> discuss on a case by case basis whenever a future role is added.
>> >> >>>>>>>
>> >> >>>>>>> >    Persistent - roles should not be lost across reboot
>> >> >>>>>>>
>> >> >>>>>>> Agree.
>> >> >>>>>>>
>> >> >>>>>>> >    Immutable - roles should not change while the node is running
>> >> >>>>>>>
>> >> >>>>>>> Agree
>> >> >>>>>>>
>> >> >>>>>>> >    Lively - A node with a capability may not be presently 
>> >> >>>>>>> > providing that capability.
>> >> >>>>>>>
>> >> >>>>>>> I don't understand, can you please elaborate?
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> Specifically imagine the case where there are 100 nodes:
>> >> >>>>>> 1-100 ==> DATA
>> >> >>>>>> 101-103 ==> OVERSEER
>> >> >>>>>> 104-106 ==> ZOOKEEPER
>> >> >>>>>>
>> >> >>>>>> But you won't have 3 overseers... you'll want only one of those to 
>> >> >>>>>> be providing overseer functionality and the other two to be 
>> >> >>>>>> capable, but not providing (so that if the current overseer goes 
>> >> >>>>>> down a new one can be assigned).
>> >> >>>>>>
>> >> >>>>>> Then you decide you'd like 5 Zookeepers. You start nodes 107-108 
>> >> >>>>>> with that role, but you probably want to ensure that zookeepers 
>> >> >>>>>> require some sort of command for them to actually join the 
>> >> >>>>>> zookeeper cluster (i.e. /admin?action=ZKADD&nodes=node107,node18) 
>> >> >>>>>> ... to do that the nodes need to be up. But oh look I typoed 
>> >> >>>>>> 108... we want that to fail... how? because 18 does not have the 
>> >> >>>>>> capability to become a zookeeper.
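The capability check in this example could look roughly like the sketch below. The ZKADD action above is itself only illustrative, and everything here (`CapabilityCheckSketch`, `nodesLacking`, the node-to-roles map) is a hypothetical name chosen for the sketch rather than an existing Solr API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; not a Solr class.
final class CapabilityCheckSketch {
    /**
     * Returns the requested nodes that do not advertise the given role.
     * Nodes missing from the map (down, or never started) count as lacking it.
     */
    static List<String> nodesLacking(
            String role,
            List<String> requested,
            Map<String, Set<String>> rolesByNode) {
        List<String> missing = new ArrayList<>();
        for (String node : requested) {
            Set<String> roles = rolesByNode.get(node);
            if (roles == null || !roles.contains(role)) {
                missing.add(node);
            }
        }
        return missing;
    }

    /** Rejects the whole command if any requested node lacks the role. */
    static void requireRole(
            String role,
            List<String> requested,
            Map<String, Set<String>> rolesByNode) {
        List<String> missing = nodesLacking(role, requested, rolesByNode);
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException(
                    "Nodes without role '" + role + "': " + missing);
        }
    }
}
```

With node107 and node108 started with the zookeeper role and node18 holding only data, a request naming node107 and node18 would be rejected because of node18, which is the failure the typo is meant to trigger.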
>> >> >>>>>>
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> On Mon, Nov 1, 2021 at 9:30 PM Ishan Chattopadhyaya 
>> >> >>>>>>> <[email protected]> wrote:
>> >> >>>>>>>>
>> >> >>>>>>>> > Ilan: A node not having node.roles defined should be assumed 
>> >> >>>>>>>> > to have all roles. Not only data. I don't see a reason to 
>> >> >>>>>>>> > special case this one or any role.
>> >> >>>>>>>> > Gus: There should be no "assumptions" Nothing to figure out. A 
>> >> >>>>>>>> > node has a role or not. For back compatibility reasons, all 
>> >> >>>>>>>> > roles would be assumed on startup if none specified.
>> >> >>>>>>>> > Jan: No role == all roles. Explicit list of roles = exactly 
>> >> >>>>>>>> > those roles.
>> >> >>>>>>>>
>> >> >>>>>>>> Problem with this approach is mainly to do with backcompat.
>> >> >>>>>>>>
>> >> >>>>>>>> 1. Overseer backcompat:
>> >> >>>>>>>> If we don't make any modifications to how overseer works and 
>> >> >>>>>>>> adopt this approach (as quoted), then imagine this situation:
>> >> >>>>>>>>
>> >> >>>>>>>> Solr1-100: No roles param (assumed to be "data,overseer").
>> >> >>>>>>>> Solr101: -Dnode.roles=overseer (intention: dedicated overseer)
>> >> >>>>>>>>
>> >> >>>>>>>> User wants this node Solr101 to be a dedicated overseer, but for 
>> >> >>>>>>>> that to happen, he/she would need to restart all the data nodes 
>> >> >>>>>>>> with -Dnode.roles=data. This will cause unnecessary disruption 
>> >> >>>>>>>> to running clusters where a dedicated overseer is needed. Keep 
>> >> >>>>>>>> in mind, if a user needs a dedicated overseer, he's likely in an 
>> >> >>>>>>>> emergency situation and restarting the whole cluster might not 
>> >> >>>>>>>> be viable for him/her.
>> >> >>>>>>>>
>> >> >>>>>>>> 2. Future roles might not be compatible with this "assumed to 
>> >> >>>>>>>> have all roles" idea:
>> >> >>>>>>>> Take the proposed "zookeeper" role for example. Today, regular 
>> >> >>>>>>>> nodes are not supposed to have embedded ZK running on them. By 
>> >> >>>>>>>> introducing this artificial limitation ("assumed to have all 
>> >> >>>>>>>> roles"), we constrain adoption of all future roles to 
>> >> >>>>>>>> necessarily require a full cluster restart.
>> >> >>>>>>>>
>> >> >>>>>>>> Keep in mind newer Solr versions can introduce new capabilities 
>> >> >>>>>>>> and roles. Imagine we have a role that is defined in a new Solr 
>> >> >>>>>>>> version (and there's functionality to go with that role), and 
>> >> >>>>>>>> user upgrades to that version. However, his/her nodes all were 
>> >> >>>>>>>> started with no node.roles param. Hence, if those nodes are 
>> >> >>>>>>>> "assumed to have all roles", then just by virtue of upgrading to 
>> >> >>>>>>>> this new version, new capabilities will be turned on for the 
>> >> >>>>>>>> entire cluster, whether or not the user opted for such a 
>> >> >>>>>>>> capability. This is totally undesirable.
>> >> >>>>>>>>
>> >> >>>>>>>> > Gus: I actually don't want a coordinator to do more work, I 
>> >> >>>>>>>> > would prefer small focused roles with names that accurately 
>> >> >>>>>>>> > describe their function. In that light, COORDINATOR might be 
>> >> >>>>>>>> > too nebulous. How about AGGREGATOR role? (what I was thinking 
>> >> >>>>>>>> > of would better be called a QUERY_ANALYSIS role)
>> >> >>>>>>>>
>> >> >>>>>>>> If you want to do specific things like query analysis or query 
>> >> >>>>>>>> aggregation or bulk indexing etc, all of those can be done on 
>> >> >>>>>>>> COORDINATOR nodes (as is the case in Elasticsearch). Having tens 
>> >> >>>>>>>> of "small focused roles" defined as first class concepts 
>> >> >>>>>>>> would be confusing to the user. As a remedy to your situation 
>> >> >>>>>>>> where you want the coordinator role to also do query-analysis 
>> >> >>>>>>>> for shards, one possible solution is to send such a query to a 
>> >> >>>>>>>> coordinator node with a parameter like 
>> >> >>>>>>>> "coordinator.query_analysis=true", and then the coordinator, 
>> >> >>>>>>>> instead of blindly hitting remote shards, also does some extra 
>> >> >>>>>>>> work on behalf of the shards.
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> On Mon, Nov 1, 2021 at 9:01 PM Ishan Chattopadhyaya 
>> >> >>>>>>>> <[email protected]> wrote:
>> >> >>>>>>>>>
>> >> >>>>>>>>> > If we make collections role-aware for example (replicas of 
>> >> >>>>>>>>> > that collection can only be
>> >> >>>>>>>>> > placed on nodes with a specific role, in addition to the 
>> >> >>>>>>>>> > other role based constraints),
>> >> >>>>>>>>> > the set of roles should be user extensible and not fixed.
>> >> >>>>>>>>> > If collections are not role aware, the constraints introduced 
>> >> >>>>>>>>> > by roles apply to all collections
>> >> >>>>>>>>> > equally which might be insufficient if a user needs for 
>> >> >>>>>>>>> > example a heavily used collection to
>> >> >>>>>>>>> > only be placed on more powerful nodes.
>> >> >>>>>>>>>
>> >> >>>>>>>>> I feel node roles and role-aware collections are orthogonal 
>> >> >>>>>>>>> topics. What you describe above can be achieved by the 
>> >> >>>>>>>>> autoscaling+replica placement framework where the placement 
>> >> >>>>>>>>> plugins take the node roles as one of the inputs.
>> >> >>>>>>>>>
>> >> >>>>>>>>> > It does impact the design from early on: the set of roles 
>> >> >>>>>>>>> > need to be expandable by a user
>> >> >>>>>>>>> > by creating a collection with new roles for example (consumed 
>> >> >>>>>>>>> > by placement plugins) and be
>> >> >>>>>>>>> > able to start nodes with new (arbitrary) roles. Should such 
>> >> >>>>>>>>> > roles follow some naming syntax to
>> >> >>>>>>>>> > differentiate them from built in roles? To be able to fail on 
>> >> >>>>>>>>> > typos on roles - that otherwise can be
>> >> >>>>>>>>> > crippling and hard to debug. This implies in any case that 
>> >> >>>>>>>>> > the current design can't assume all
>> >> >>>>>>>>> > roles are known at compile time or define them in a Java enum.
>> >> >>>>>>>>>
>> >> >>>>>>>>> I think this should be achieved by something different from 
>> >> >>>>>>>>> roles. Something like node labels (user defined) which can then 
>> >> >>>>>>>>> be used in a replica placement plugin to assign replicas. I see 
>> >> >>>>>>>>> roles as more closely associated with kinds of functionality a 
>> >> >>>>>>>>> node is designated for. Therefore, I feel that replica 
>> >> >>>>>>>>> placements and user defined node labels are out of scope for 
>> >> >>>>>>>>> this SIP. It can be added later in a separate SIP, without 
>> >> >>>>>>>>> being at odds with this proposal.
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>> On Mon, Nov 1, 2021 at 8:42 PM Jan Høydahl 
>> >> >>>>>>>>> <[email protected]> wrote:
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> > 1. nov. 2021 kl. 14:46 skrev Ilan Ginzburg 
>> >> >>>>>>>>>> > <[email protected]>:
>> >> >>>>>>>>>> > A node not having node.roles defined should be assumed to 
>> >> >>>>>>>>>> > have all roles. Not only data. I don't see a reason to 
>> >> >>>>>>>>>> > special case this one or any role.
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> +1, make it simple and transparent. No role == all roles. 
>> >> >>>>>>>>>> Explicit list of roles = exactly those roles.
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> > (Gus) See my comment above, but maybe preference is 
>> >> >>>>>>>>>> > something handled as a feature of the role rather than via 
>> >> >>>>>>>>>> > role designation?
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> Yea, we always need an overseer, so that feature can decide to 
>> >> >>>>>>>>>> use its list of nodes as a preference if it so chooses.
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> Aside: I think it makes it easier if we always prefix Solr 
>> >> >>>>>>>>>> env.vars and sys.props with "SOLR_" or "solr.", i.e. 
>> >> >>>>>>>>>> -Dsolr.node.roles=foo. That way we can get away from having to 
>> >> >>>>>>>>>> have explicit code in bin/solr, bin/solr.cmd and SolrCLI to 
>> >> >>>>>>>>>> manage every single property. Instead we can parse all ENVs 
>> >> >>>>>>>>>> and Props with the solr prefix in our bootstrap code. And we 
>> >> >>>>>>>>>> can by convention allow e.g. docker run -e SOLR_NODE_ROLES=foo 
>> >> >>>>>>>>>> solr:9 and it would be the same thing...
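The prefix convention suggested here could be handled generically in bootstrap code along these lines. This is a sketch under the assumption that the mapping is simply lower-casing and replacing underscores with dots; `SolrEnvSketch` and its methods are hypothetical names, and the real bootstrap code may need special cases this ignores.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; not a Solr class.
final class SolrEnvSketch {
    /** Maps e.g. SOLR_NODE_ROLES to solr.node.roles. */
    static String toSysProp(String envName) {
        return envName.toLowerCase(Locale.ROOT).replace('_', '.');
    }

    /**
     * Collects every SOLR_-prefixed environment variable as a solr.-prefixed
     * property, so no per-property handling is needed in the start scripts.
     * Variables without the prefix are ignored.
     */
    static Map<String, String> fromEnv(Map<String, String> env) {
        Map<String, String> props = new TreeMap<>();
        for (Map.Entry<String, String> e : env.entrySet()) {
            if (e.getKey().startsWith("SOLR_")) {
                props.put(toSysProp(e.getKey()), e.getValue());
            }
        }
        return props;
    }
}
```

A caller could pass `System.getenv()` to `fromEnv` at startup and copy the result into system properties that are not already set.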
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> Jan
>> >> >>>>>>>>>> ---------------------------------------------------------------------
>> >> >>>>>>>>>> To unsubscribe, e-mail: [email protected]
>> >> >>>>>>>>>> For additional commands, e-mail: [email protected]
>> >> >>>>>>>>>>
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> --
>> >> >>>>>> http://www.needhamsoftware.com (work)
>> >> >>>>>> http://www.the111shift.com (play)
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> --
>> >> >>>> http://www.needhamsoftware.com (work)
>> >> >>>> http://www.the111shift.com (play)
>> >>
>> >>
>> >
>> >
>> > --
>> > http://www.needhamsoftware.com (work)
>> > http://www.the111shift.com (play)
>>
>>

