Are there any unaddressed outstanding concerns that we should hold up the SIP for?

On Mon, 1 Nov, 2021, 10:31 pm Ishan Chattopadhyaya, <ichattopadhy...@gmail.com> wrote:

>> Agree. However, I disagree with ideas where "query analysis" has a role of its own. Where would that lead us to? Separate roles for nodes that do "faceting" or "spell correction" etc.? But anyway, that is for discussion when we add future roles. This is beyond this SIP.
>
> I am not asking you to implement every possible role of course :). As a note I know a company that is running an entire separate cluster to offload and better serve highlighting on a subset of large docs, so YES I think there are people who may want such fine grained control.
>
> Cool, I think we can discuss adding any additional roles (for highlighting?) on a case by case basis at a later point.
>
> On Mon, Nov 1, 2021 at 10:25 PM Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
>
>> > Boiling it down the idea I'm proposing is that roles required for back compatibility get explicitly added on startup, if not by the user then by the code. This is more flexible than assuming that no role means every role, because then every new feature that has a role will end up on legacy clusters which are also not back compatible.
>>
>> +1, I totally agree. I even said so, when I said: "This is why I was advocating that 1) we assume the "data" as a default, 2) not assume overseer to be implicitly defined (because of the way overseer role is written today), 3) not assume any future roles to be true by default."
>>
>> So, basically, I'm proposing that the "roles required for back compatibility" (that should be explicitly added on startup) be just the ["data"] role, and not the "overseer" role (due to the way overseer role is currently defined, i.e. it is "preferred overseer").
>>
>> On Mon, Nov 1, 2021 at 10:19 PM Gus Heck <gus.h...@gmail.com> wrote:
>>
>>> Very sorry don't mean to sound offended, Frustrated yes offended no :)...
>>> the most difficult thing about communication is the illusion it has occurred :)
>>>
>>> If you read back just a few emails you'll see where I talk about roles being applied on startup. Boiling it down the idea I'm proposing is that roles required for back compatibility get explicitly added on startup, if not by the user then by the code. This is more flexible than assuming that no role means every role, because then every new feature that has a role will end up on legacy clusters which are also not back compatible.
>>>
>>> There are points where I said all roles rather than back compatibility roles because I was thinking about back compatibility specifically, but you can't know that if I don't say that can you :).
>>>
>>> On Mon, Nov 1, 2021 at 12:39 PM Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
>>>
>>>> > If you read more closely, my way can provide full back compatibility. To say or imply it doesn't isn't helping. Perhaps you need to re-read?
>>>>
>>>> I understand e-mails are frustrating, and I'm trying my best. Please don't be offended, and kindly point me to the exact part you want me to re-read.
>>>>
>>>> On Mon, Nov 1, 2021 at 10:05 PM Gus Heck <gus.h...@gmail.com> wrote:
>>>>
>>>>> On Mon, Nov 1, 2021 at 12:22 PM Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
>>>>>
>>>>>> > Positive - They denote the existence of a capability
>>>>>>
>>>>>> Agree, the SIP already reflects this.
>>>>>>
>>>>>> > Absolute - Absence/Presence binary identification of a capability; no implications, no assumptions
>>>>>>
>>>>>> Disagree, we need backcompat handling on nodes running without any roles. There has to be an implicit assumption as to what roles are those nodes assumed to have. My proposal is that only the "data" role be assumed, but not the "overseer" role.
>>>>>> For any future roles ("coordinator", "zookeeper" etc.), this decision as to what absence of any role implies should be left to the implementation of that future role. Documentation should reflect clearly about these implicit assumptions.
>>>>>>
>>>>> If you read more closely, my way can provide full back compatibility. To say or imply it doesn't isn't helping. Perhaps you need to re-read?
>>>>>
>>>>>> > Focused - Do one thing per role
>>>>>>
>>>>>> Agree. However, I disagree with ideas where "query analysis" has a role of its own. Where would that lead us to? Separate roles for nodes that do "faceting" or "spell correction" etc.? But anyway, that is for discussion when we add future roles. This is beyond this SIP.
>>>>>>
>>>>> I am not asking you to implement every possible role of course :). As a note I know a company that is running an entire separate cluster to offload and better serve highlighting on a subset of large docs, so YES I think there are people who may want such fine grained control.
>>>>>
>>>>>> > Accessible - It should be dead simple to determine the members of a role, avoid parsing blobs of json, avoid calculating implications, avoid consulting other resources after listing nodes with the role
>>>>>>
>>>>>> Agree. I'm open to any implementation details that make it easy. There should be a reasonable API to return these node roles, with ability to filter by role or filter by node.
>>>>>>
>>>>>> > Independent - One role should not require other roles to be present
>>>>>>
>>>>>> Do we need to have this hard and fast requirement upfront? There might be situations where this is desirable. I feel we can discuss on a case by case basis whenever a future role is added.
>>>>>>
>>>>>> > Persistent - roles should not be lost across reboot
>>>>>>
>>>>>> Agree.
>>>>>>
>>>>>> > Immutable - roles should not change while the node is running
>>>>>>
>>>>>> Agree
>>>>>>
>>>>>> > Lively - A node with a capability may not be presently providing that capability.
>>>>>>
>>>>>> I don't understand, can you please elaborate?
>>>>>>
>>>>> Specifically imagine the case where there are 100 nodes:
>>>>> 1-100 ==> DATA
>>>>> 101-103 ==> OVERSEER
>>>>> 104-106 ==> ZOOKEEPER
>>>>>
>>>>> But you won't have 3 overseers... you'll want only one of those to be *providing* overseer functionality and the other two to be *capable*, but not providing (so that if the current overseer goes down a new one can be assigned).
>>>>>
>>>>> Then you decide you'd like 5 Zookeepers. You start nodes 107-108 with that role, but you probably want to ensure that zookeepers require some sort of command for them to actually join the zookeeper cluster (i.e. /admin?action=ZKADD&nodes=node107,node18) ... to do that the nodes need to be up. But oh look I typoed 108... we want that to fail... how? because 18 does not have the *capability* to become a zookeeper.
>>>>>
>>>>>> On Mon, Nov 1, 2021 at 9:30 PM Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
>>>>>>
>>>>>>> > Ilan: A node not having node.roles defined should be assumed to have all roles. Not only data. I don't see a reason to special case this one or any role.
>>>>>>> > Gus: There should be no "assumptions" Nothing to figure out. A node has a role or not. For back compatibility reasons, all roles would be assumed on startup if none specified.
>>>>>>> > Jan: No role == all roles. Explicit list of roles = exactly those roles.
>>>>>>>
>>>>>>> Problem with this approach is mainly to do with backcompat.
>>>>>>>
>>>>>>> *1. Overseer backcompat:*
>>>>>>> If we don't make any modifications to how overseer works and adopt this approach (as quoted), then imagine this situation:
>>>>>>>
>>>>>>> Solr1-100: No roles param (assumed to be "data,overseer").
>>>>>>> Solr101: -Dnode.roles=overseer (intention: dedicated overseer)
>>>>>>>
>>>>>>> User wants this node Solr101 to be a dedicated overseer, but for that to happen, he/she would need to restart all the data nodes with -Dnode.roles=data. This will cause unnecessary disruption to running clusters where a dedicated overseer is needed. Keep in mind, if a user needs a dedicated overseer, he's likely in an emergency situation and restarting the whole cluster might not be viable for him/her.
>>>>>>>
>>>>>>> *2. Future roles might not be compatible with this "assumed to have all roles" idea:*
>>>>>>> Take the proposed "zookeeper" role for example. Today, regular nodes are not supposed to have embedded ZK running on them. By introducing this artificial limitation ("assumed to have all roles"), we constrain adoption of all future roles to necessarily require a full cluster restart.
>>>>>>>
>>>>>>> Keep in mind newer Solr versions can introduce new capabilities and roles. Imagine we have a role that is defined in a new Solr version (and there's functionality to go with that role), and user upgrades to that version. However, his/her nodes all were started with no node.roles param. Hence, if those nodes are "assumed to have all roles", then just by virtue of upgrading to this new version, new capabilities will be turned on for the entire cluster, whether or not the user opted for such a capability. This is totally undesirable.
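
To make the upgrade concern above concrete, here is a minimal sketch of the two defaulting policies being debated. It is illustrative Python, not Solr code: the function names, the role sets, and the idea that an upgrade adds a "zookeeper" role are all assumptions made for the example. Only the node.roles parameter and the role names come from the thread.

```python
# Hypothetical sketch: these names are illustrative, not Solr APIs.
KNOWN_ROLES_OLD = frozenset({"data", "overseer"})
# Assumption for the example: an upgrade introduces one new role.
KNOWN_ROLES_NEW = KNOWN_ROLES_OLD | {"zookeeper"}


def parse_roles(param):
    """Parse a node.roles style value; None or blank means 'not specified'."""
    if param is None or not param.strip():
        return None
    return {role.strip() for role in param.split(",") if role.strip()}


def resolve_all_roles(param, known_roles):
    """Policy 'no role == all roles': unset expands to every known role."""
    explicit = parse_roles(param)
    return set(known_roles) if explicit is None else explicit


def resolve_backcompat_only(param, backcompat_roles=("data",)):
    """Policy 'no role == only the roles needed for back compatibility'."""
    explicit = parse_roles(param)
    return set(backcompat_roles) if explicit is None else explicit


# A legacy node started without node.roles, before and after an upgrade.
print(sorted(resolve_all_roles(None, KNOWN_ROLES_OLD)))
print(sorted(resolve_all_roles(None, KNOWN_ROLES_NEW)))
print(sorted(resolve_backcompat_only(None)))
print(sorted(resolve_backcompat_only("overseer")))
```

Under the first policy the same unconfigured node gains the new role purely by upgrading, which is the behaviour objected to above. Under the second policy it stays a "data" node until the operator opts in, and an explicit list is always taken as exactly those roles.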
>>>>>>>
>>>>>>> > Gus: I actually don't want a coordinator to do more work, I would prefer small focused roles with names that accurately describe their function. In that light, COORDINATOR might be too nebulous. How about AGGREGATOR role? (what I was thinking of would better be called a QUERY_ANALYSIS role)
>>>>>>>
>>>>>>> If you want to do specific things like query analysis or query aggregation or bulk indexing etc, all of those can be done on COORDINATOR nodes (as is the case in Elasticsearch). Having tens of "small focused roles" defined as first class concepts would be confusing to the user. As a remedy to your situation where you want the coordinator role to also do query-analysis for shards, one possible solution is to send such a query to a coordinator node with a parameter like "coordinator.query_analysis=true", and then the coordinator, instead of blindly hitting remote shards, also does some extra work on behalf of the shards.
>>>>>>>
>>>>>>> On Mon, Nov 1, 2021 at 9:01 PM Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
>>>>>>>
>>>>>>>> > If we make collections role-aware for example (replicas of that collection can only be placed on nodes with a specific role, in addition to the other role based constraints), the set of roles should be user extensible and not fixed. If collections are not role aware, the constraints introduced by roles apply to all collections equally which might be insufficient if a user needs for example a heavily used collection to only be placed on more powerful nodes.
>>>>>>>>
>>>>>>>> I feel node roles and role-aware collections are orthogonal topics.
>>>>>>>> What you describe above can be achieved by the autoscaling+replica placement framework where the placement plugins take the node roles as one of the inputs.
>>>>>>>>
>>>>>>>> > It does impact the design from early on: the set of roles need to be expandable by a user by creating a collection with new roles for example (consumed by placement plugins) and be able to start nodes with new (arbitrary) roles. Should such roles follow some naming syntax to differentiate them from built in roles? To be able to fail on typos on roles - that otherwise can be crippling and hard to debug. This implies in any case that the current design can't assume all roles are known at compile time or define them in a Java enum.
>>>>>>>>
>>>>>>>> I think this should be achieved by something different from roles. Something like node *labels* (user defined) which can then be used in a replica placement plugin to assign replicas. I see roles as more closely associated with kinds of functionality a node is designated for. Therefore, I feel that replica placements and user defined node labels are out of scope for this SIP. They can be added later in a separate SIP, without being at odds with this proposal.
>>>>>>>>
>>>>>>>> On Mon, Nov 1, 2021 at 8:42 PM Jan Høydahl <jan....@cominvent.com> wrote:
>>>>>>>>
>>>>>>>>> > On 1 Nov 2021 at 14:46, Ilan Ginzburg <ilans...@gmail.com> wrote:
>>>>>>>>> > A node not having node.roles defined should be assumed to have all roles. Not only data. I don't see a reason to special case this one or any role.
>>>>>>>>>
>>>>>>>>> +1, make it simple and transparent. No role == all roles. Explicit list of roles = exactly those roles.
>>>>>>>>>
>>>>>>>>> > (Gus) See my comment above, but maybe preference is something handled as a feature of the role rather than via role designation?
>>>>>>>>>
>>>>>>>>> Yea, we always need an overseer, so that feature can decide to use its list of nodes as a preference if it so chooses.
>>>>>>>>>
>>>>>>>>> Aside: I think it makes it easier if we always prefix Solr env.vars and sys.props with "SOLR_" or "solr.", i.e. -Dsolr.node.roles=foo. That way we can get away from having to have explicit code in bin/solr, bin/solr.cmd and SolrCLI to manage every single property. Instead we can parse all ENVs and Props with the solr prefix in our bootstrap code. And we can by convention allow e.g. docker run -e SOLR_NODE_ROLES=foo solr:9 and it would be the same thing...
>>>>>>>>>
>>>>>>>>> Jan
>>>>>
>>>>> --
>>>>> http://www.needhamsoftware.com (work)
>>>>> http://www.the111shift.com (play)
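
Jan's aside about a common "SOLR_" / "solr." prefix could look roughly like the sketch below. It is illustrative Python rather than Solr bootstrap code. The function names are hypothetical, the upper-case-with-underscores to lower-case-with-dots mapping is an assumption (the thread only says the two forms should be "the same thing"), and letting a system property override an environment variable is likewise an assumed precedence.

```python
def env_name_to_prop(name):
    """Assumed mapping: SOLR_NODE_ROLES -> solr.node.roles."""
    return name.lower().replace("_", ".")


def collect_solr_settings(env, sys_props):
    """Gather every SOLR_* env var and solr.* property into one dict.

    env and sys_props are plain dicts standing in for the process
    environment and the JVM system properties.
    """
    settings = {}
    for name, value in env.items():
        if name.startswith("SOLR_"):
            settings[env_name_to_prop(name)] = value
    for name, value in sys_props.items():
        if name.startswith("solr."):
            # Assumption: an explicit system property wins over the env var.
            settings[name] = value
    return settings


# Equivalent of: docker run -e SOLR_NODE_ROLES=foo solr:9
print(collect_solr_settings({"SOLR_NODE_ROLES": "foo", "PATH": "/usr/bin"}, {}))

# Equivalent of: -Dsolr.node.roles=foo
print(collect_solr_settings({}, {"solr.node.roles": "foo"}))
```

Both calls yield the same single setting, which is the point of the convention: no per-property handling is needed in the start scripts, only one prefix scan at bootstrap.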