Jan, +1 to using -Dsolr.node.roles instead of -Dnode.roles.

On Mon, Nov 1, 2021 at 10:18 PM Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
>
> > But I assume that a new feature in 9.x that introduces a new role can
> > also decide for some alternative back-compat logic to support rolling
> > restart if it is needed.
>
> IMHO, having a per-feature enable/disable flag would be an ugly user
> experience.
>
> Imagine telling users that for the newly introduced "zookeeper" role, you
> need to start nodes with:
>
> -Dnode.roles=zookeeper and -Dembedded.zk=true
>
> instead of
>
> -Dnode.roles=zookeeper
> (itself enables the functionality needed for that role).
>
> On Mon, Nov 1, 2021 at 9:49 PM Jan Høydahl <jan....@cominvent.com> wrote:
>
>> I think it is safe to assume that small clusters, say 1-5 nodes, will most
>> often want to have all features on all nodes as the cluster is too small to
>> specialize, and then the default is perfect.
>> For large clusters we should recommend explicitly specifying roles during
>> the 9.0 upgrade. So if you have 100 nodes, you would likely have assigned
>> the overseer role to a handful of nodes when upgrading to 9.0.
>> And for every new feature in 9.x you will explicitly decide whether to
>> use it and what nodes should have the role.
>>
>> But I assume that a new feature in 9.x that introduces a new role can
>> also decide for some alternative back-compat logic to support rolling
>> restart if it is needed.
>>
>> Jan
>>
>> On Nov 1, 2021, at 17:00, Ishan Chattopadhyaya
>> <ichattopadhy...@gmail.com> wrote:
>>
>> > Ilan: A node not having node.roles defined should be assumed to have
>> > all roles. Not only data. I don't see a reason to special case this one
>> > or any role.
>> > Gus: There should be no "assumptions". Nothing to figure out. A node has
>> > a role or not. For back compatibility reasons, all roles would be assumed
>> > on startup if none specified.
>> > Jan: No role == all roles. Explicit list of roles = exactly those roles.
>>
>> The problem with this approach is mainly to do with backcompat.
>>
>> *1. Overseer backcompat:*
>> If we don't make any modifications to how overseer works and adopt this
>> approach (as quoted), then imagine this situation:
>>
>> Solr1-100: No roles param (assumed to be "data,overseer").
>> Solr101: -Dnode.roles=overseer (intention: dedicated overseer)
>>
>> The user wants this node Solr101 to be a dedicated overseer, but for that
>> to happen, he/she would need to restart all the data nodes with
>> -Dnode.roles=data. This will cause unnecessary disruption to running
>> clusters where a dedicated overseer is needed. Keep in mind, if a user
>> needs a dedicated overseer, he/she is likely in an emergency situation and
>> restarting the whole cluster might not be viable for him/her.
>>
>> *2. Future roles might not be compatible with this "assumed to have all
>> roles" idea:*
>> Take the proposed "zookeeper" role for example. Today, regular nodes are
>> not supposed to have embedded ZK running on them. By introducing this
>> artificial limitation ("assumed to have all roles"), we constrain adoption
>> of all future roles to necessarily require a full cluster restart.
>>
>> Keep in mind newer Solr versions can introduce new capabilities and
>> roles. Imagine we have a role that is defined in a new Solr version (and
>> there's functionality to go with that role), and the user upgrades to that
>> version. However, his/her nodes all were started with no node.roles param.
>> Hence, if those nodes are "assumed to have all roles", then just by virtue
>> of upgrading to this new version, new capabilities will be turned on for
>> the entire cluster, whether or not the user opted for such a capability.
>> This is totally undesirable.
>>
>> > Gus: I actually don't want a coordinator to do more work, I would
>> > prefer small focused roles with names that accurately describe their
>> > function. In that light, COORDINATOR might be too nebulous. How about an
>> > AGGREGATOR role?
>> > (what I was thinking of would better be called a QUERY_ANALYSIS role)
>>
>> If you want to do specific things like query analysis or query
>> aggregation or bulk indexing etc., all of those can be done on COORDINATOR
>> nodes (as is the case in ElasticSearch). Having tens of "small focused
>> roles" defined as first class concepts would be confusing to the user. As a
>> remedy to your situation where you want the coordinator role to also do
>> query analysis for shards, one possible solution is to send such a query to
>> a coordinator node with a parameter like "coordinator.query_analysis=true",
>> and then the coordinator, instead of blindly hitting remote shards, also
>> does some extra work on behalf of the shards.
>>
>>
>> On Mon, Nov 1, 2021 at 9:01 PM Ishan Chattopadhyaya
>> <ichattopadhy...@gmail.com> wrote:
>>
>>> > If we make collections role-aware for example (replicas of that
>>> > collection can only be placed on nodes with a specific role, in
>>> > addition to the other role based constraints), the set of roles should
>>> > be user extensible and not fixed.
>>> > If collections are not role aware, the constraints introduced by roles
>>> > apply to all collections equally, which might be insufficient if a user
>>> > needs for example a heavily used collection to only be placed on more
>>> > powerful nodes.
>>>
>>> I feel node roles and role-aware collections are orthogonal topics. What
>>> you describe above can be achieved by the autoscaling+replica placement
>>> framework where the placement plugins take the node roles as one of the
>>> inputs.
>>>
>>> > It does impact the design from early on: the set of roles needs to be
>>> > expandable by a user by creating a collection with new roles for
>>> > example (consumed by placement plugins) and be able to start nodes with
>>> > new (arbitrary) roles. Should such roles follow some naming syntax to
>>> > differentiate them from built in roles?
>>> > To be able to fail on typos on roles - that otherwise can be
>>> > crippling and hard to debug. This implies in any case that the current
>>> > design can't assume all roles are known at compile time or define them
>>> > in a Java enum.
>>>
>>> I think this should be achieved by something different from roles.
>>> Something like node *labels* (user defined) which can then be used in a
>>> replica placement plugin to assign replicas. I see roles as more closely
>>> associated with kinds of functionality a node is designated for. Therefore,
>>> I feel that replica placements and user defined node labels are out of
>>> scope for this SIP. They can be added later in a separate SIP, without
>>> being at odds with this proposal.
>>>
>>>
>>> On Mon, Nov 1, 2021 at 8:42 PM Jan Høydahl <jan....@cominvent.com>
>>> wrote:
>>>
>>>> > On Nov 1, 2021, at 14:46, Ilan Ginzburg <ilans...@gmail.com> wrote:
>>>> > A node not having node.roles defined should be assumed to have all
>>>> > roles. Not only data. I don't see a reason to special case this one or
>>>> > any role.
>>>>
>>>> +1, make it simple and transparent. No role == all roles. Explicit list
>>>> of roles = exactly those roles.
>>>>
>>>> > (Gus) See my comment above, but maybe preference is something handled
>>>> > as a feature of the role rather than via role designation?
>>>>
>>>> Yea, we always need an overseer, so that feature can decide to use its
>>>> list of nodes as a preference if it so chooses.
>>>>
>>>>
>>>> Aside: I think it makes it easier if we always prefix Solr env.vars and
>>>> sys.props with "SOLR_" or "solr.", i.e. -Dsolr.node.roles=foo. That way we
>>>> can get away from having to have explicit code in bin/solr, bin/solr.cmd
>>>> and SolrCLI to manage every single property. Instead we can parse all ENVs
>>>> and Props with the solr prefix in our bootstrap code. And we can by
>>>> convention allow e.g. docker run -e SOLR_NODE_ROLES=foo solr:9 and it would
>>>> be the same thing...
>>>>
>>>> Jan
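
For illustration, here is a minimal Java sketch of two conventions discussed in this thread: Jan's "No role == all roles. Explicit list of roles = exactly those roles." rule, and the suggested mapping from a SOLR_-prefixed env var to a solr.-prefixed sys prop. Everything named here (NodeRolesSketch, envToSysProp, resolveRoles, the role sets in main) is hypothetical and is not Solr's API. Rejecting unknown roles is also an assumption, taken from Ilan's point about failing on typos.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not Solr's actual implementation.
class NodeRolesSketch {

  // Assumed convention from the thread: SOLR_NODE_ROLES -> solr.node.roles.
  // Returns null for names that do not carry the SOLR_ prefix.
  static String envToSysProp(String envName) {
    if (envName == null || !envName.startsWith("SOLR_")) {
      return null;
    }
    return envName.toLowerCase(Locale.ROOT).replace('_', '.');
  }

  // No roles param -> every role this version knows; explicit list -> exactly
  // those roles. knownRoles stands in for whatever the running version defines.
  static Set<String> resolveRoles(String rolesProp, Set<String> knownRoles) {
    if (rolesProp == null || rolesProp.trim().isEmpty()) {
      return new LinkedHashSet<>(knownRoles);
    }
    Set<String> roles = new LinkedHashSet<>();
    for (String raw : rolesProp.split(",")) {
      String role = raw.trim().toLowerCase(Locale.ROOT);
      if (role.isEmpty()) {
        continue;
      }
      if (!knownRoles.contains(role)) {
        // Assumption: fail fast on typos rather than silently ignoring them.
        throw new IllegalArgumentException("Unknown role: " + role);
      }
      roles.add(role);
    }
    return roles;
  }

  public static void main(String[] args) {
    Set<String> v90 = new LinkedHashSet<>(Arrays.asList("data", "overseer"));
    Set<String> v9x = new LinkedHashSet<>(Arrays.asList("data", "overseer", "zookeeper"));

    System.out.println(envToSysProp("SOLR_NODE_ROLES")); // solr.node.roles
    System.out.println(resolveRoles("overseer", v90));   // [overseer]
    System.out.println(resolveRoles(null, v90));         // [data, overseer]
    System.out.println(resolveRoles(null, v9x));         // [data, overseer, zookeeper]
  }
}
```

The last two calls show the backcompat concern raised by Ishan: under this rule, a node started without a roles param picks up the new "zookeeper" role simply because the version it runs now knows about it.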