I think we should establish a known set of default roles. Other roles are non-default roles. New features can be added as default or non-default as required. Default roles are explicitly loaded at startup if no other information is provided. I'd even be fine with a config that said +zookeeper -overseer so long as this resulted in an explicit set of roles easily identified at runtime and not requiring runtime code level logic more complex than list.contains() (conceptually, exact implementation may vary).
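To illustrate what I mean by "an explicit set of roles", here is a minimal sketch, not Solr code. The class name RoleResolver, the method resolve, and the choice of "data" and "overseer" as defaults are all hypothetical, picked only for the example. It expands a spec like "+zookeeper -overseer" against a default set once at startup, so that everything afterwards is a plain contains() check.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper (not part of Solr): turns a role spec into an explicit role set. */
final class RoleResolver {

    // Assumed defaults, for illustration only.
    static final Set<String> DEFAULT_ROLES = Set.of("data", "overseer");

    /**
     * Resolves a whitespace-separated spec such as "+zookeeper -overseer"
     * against the given defaults. A null or blank spec yields the defaults.
     */
    static Set<String> resolve(String spec, Set<String> defaults) {
        Set<String> roles = new LinkedHashSet<>(defaults);
        if (spec == null || spec.isBlank()) {
            return Collections.unmodifiableSet(roles);
        }
        for (String token : spec.trim().split("\\s+")) {
            char sign = token.charAt(0);
            // Every token must be a sign followed by a non-empty role name.
            if (token.length() < 2 || (sign != '+' && sign != '-')) {
                throw new IllegalArgumentException("Expected +role or -role, got: " + token);
            }
            String name = token.substring(1);
            if (sign == '+') {
                roles.add(name);      // opt in to a non-default role
            } else {
                roles.remove(name);   // opt out of a default role
            }
        }
        // The result is fixed after startup; callers only ever ask contains().
        return Collections.unmodifiableSet(roles);
    }
}
```

With the assumed defaults, RoleResolver.resolve("+zookeeper -overseer", RoleResolver.DEFAULT_ROLES) gives a set containing "data" and "zookeeper" but not "overseer", and a missing spec gives exactly the defaults. The +/- syntax exists only at configuration time; nothing downstream needs to know about it.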
As for weak roles, those shouldn't exist (going forward). Roles might, however, have associated config that expresses a preference order, precedence, or whatever other complex behavior among role members they like, but role membership should be binary. As I noted before, it's not necessary to rework everything into roles at once, so the above statement about overseer doesn't apply until we create a "new role system role" for it. Until the new role is created, it's just basic functionality not associated with a role, and applicable to all machines.

To make an explicit proposal, maybe add the following to the SIP:

*Roles During Application Lifecycle*

1) Roles may be configured for a node (means TBD) before a node is started.
2) On startup the node checks for configured roles.
3) a) If configured roles are found, export the configured list such that it is visible to code (probably zk, impl TBD).
3) b) If no roles are configured, export the default set of roles such that they are visible to code (method identical to 3) a)).
4) Node completes any other necessary startup and publishes itself in live_nodes.

*Usage of roles in code*

1) Roles will be checked in publicly published configuration (i.e. zk), and a watch set to detect any change.
2) Roles will not be checked by loading config from disk or caching disk config in memory. (zk ONLY source of truth)

Thoughts? (the changeability of roles can

On Wed, Nov 17, 2021 at 2:15 PM Ilan Ginzburg <ilans...@gmail.com> wrote:

> Current Overseer role defaults to all nodes can, and that changes only when some nodes get the Overseer role.
>
> This is a default value per node based on the presence of the role on any node of the cluster.
> But the Overseer role is "weak" in that nodes not having the role can become overseer even if some nodes did get the role (but are down for example).
>
> Shall the SIP explore the notion of default when a role is not defined for ANY node in the cluster rather than only when a role is not defined on a per node basis? It might solve some of the problems we have.
>
> Ilan
>
> On Wed, Nov 17, 2021 at 7:08 PM Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
> >
> > On Wed, Nov 17, 2021 at 8:12 PM Jan Høydahl <jan....@cominvent.com> wrote:
> >>
> >> I think your VOTE is premature as several design decisions are obviously not landed. That may be the reason there are no votes yet, and I'm not going to vote either.
> >>
> >> If "-Dsolr.node.roles" parameter is not passed, it is implicitly assumed to be "-Dsolr.node.roles=data" (due to backcompat reasons and also so that those who don't use the role feature don't need any extra parameters).
> >>
> >> I'm not sold on making such a complex rule for what roles are enabled and treating data role differently from other roles.
> >
> > As I've said before, we can't say "all roles are on by default" since we can't foresee the future to decide whether a future role should be enabled by default. As of today, we have two roles: "data" and "overseer", and "data" is enabled by default on all nodes, and "overseer" (which stands for preferred overseer) is disabled by default on all nodes. Hence, I mentioned that if a node hasn't started with the "solr.node.roles" parameter, we should assume it is solr.node.roles=data.
> >
> >>
> >> It's fine to require certain upgrade steps for 9.0.
> >
> > Forcing everyone to explicitly use the -Dsolr.node.roles=data parameter to start their nodes, irrespective of whether they want to use the roles feature, doesn't seem like a reasonable idea.
> >
> >>
> >> We should keep role config 1:1 and dead simple, i.e. WYSIWYG and no roles means all roles. Then handle back-compat in more targeted ways like we have done for certain features before such as HTTP1 vs HTTP2.
> >>
> >> If a coordinator node is started with "data" role also, it fails to start up with a message indicating a node cannot both be coordinator and data node.
> >>
> >> Such custom complex rules don't make sense to me. If you want a single node to handle both data, zookeeper, overseer, coordination, streaming-expressions, sql, foo and bar, then fine, why block it?
> >
> > The coordinator role's implementation is outside the scope of this SIP. I propose that any future role (zk, coordination, sql, foo, bar...) be free to choose its own implementation or constraints. We can discuss this at the time of introduction of those roles.
> >
> >>
> >> Users will start in that mode and then separate out certain nodes for certain workloads as they grow their clusters.
> >>
> >> Jan
> >>
> >> On 15 Nov 2021 at 16:36, Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
> >>
> >> Thanks Jan, I've updated the SIP document with all the applicable changes with a link to this thread (which contains the summary at the end).
> >> I'll initiate the vote thread now. Thanks to everyone for contributing.
> >>
> >> On Mon, Nov 15, 2021 at 6:53 PM Jan Høydahl <jan....@cominvent.com> wrote:
> >>>
> >>> Thanks for trying to summarize and drive the work Ishan.
> >>>
> >>> I'd like to add
> >>>
> >>> Scope of SIP
> >>> Ishan: Role API and config
> >>> Jan: Role API, config, and impact of one real role e.g. the "data" role, to exemplify and justify the role infrastructure
> >>>
> >>> According to SIP process the next step is not implementation, but rather to iterate the SIP text to something you believe would pass a vote. It's hard to stitch together all these emails and mini summaries into a meaningful whole.
> >>>
> >>> Jan
> >>>
> >>> On 15 Nov 2021 at 05:28, Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
> >>>
> >>> Thanks to everyone for the feedback.
> >>>
> >>> Here's an attempt to summarize broad topics discussed.
> >>>
> >>> No negative roles
> >>> Everyone agrees
> >>>
> >>> Roles on/off by default?
> >>> Jason+(Ilan,Houston?): All roles to be on by default
> >>> Gus,Ishan,Noble: Only those roles to be on by default that are needed for backcompat
> >>>
> >>> Which branch to target?
> >>> Jan,Ishan,Noble: New feature to be added to 9x branch
> >>>
> >>> Need for roles?
> >>> Tim: new concept of nodes unnecessary since everything that's proposed can be achieved using changes to new autoscaling framework and replica placement plugins.
> >>> Ishan,Noble: A first class concept of roles is important so that this functionality is expected to work, irrespective of whatever custom placement plugins users deploy (since placement plugins don't support chaining).
> >>>
> >>> Roles for collections?
> >>> Ilan: Role aware collections
> >>> Ishan: This can be implemented separately later using node roles and placement plugins.
> >>>
> >>> Configuration
> >>> Sysprops vs solr.xml+sysprops vs envvars:
> >>> Shawn: Solr.xml and/or envvars
> >>> Houston,Ilan: Sysprops and/or envvars
> >>> Ishan,Noble: Sysprops
> >>> Jan: SIP-11
> >>>
> >>> Outstanding issues
> >>> Shawn: Color of the bikeshed ;-)
> >>>
> >>> Please let me know if I missed something here. If there are no further strong objections, we can proceed to the implementation phase. There's already a draft/WIP PR in the works: https://github.com/apache/solr/pull/403
> >>>
> >>> Thanks,
> >>> Ishan
> >>>
> >>> On Fri, Nov 12, 2021 at 11:38 PM Gus Heck <gus.h...@gmail.com> wrote:
> >>>>
> >>>> Yeah we should only be looking for and only be reporting (if we choose to report to the user) a specific set of env variables. Anything else should be ignored. Should be an enum or constants somewhere listing what solr cares about, and we should ignore or be blind to anything else.
> >>>>
> >>>> Perhaps we'd like to have a ConfigParams (or whatever) enum that has methods returning the env, sysprop, bin/solr arg, configFile and zkLocation that can be used to provide each possible configuration option (for things that are single value or short list, obviously an entire schema probably would not be settable by sysprop :) )?
> >>>>
> >>>> The return type of those methods could be Optional<> since we won't have all of those for everything any time soon, and not all of them will make sense in all cases.
> >>>>
> >>>> zkLocation is a bit tricky and nebulous since it's probably a zk path and a JSON path or Xpath combined and relative to the chroot which itself is a potential config param, some stuff to think through there.
> >>>>
> >>>> On Thu, Nov 11, 2021 at 3:49 PM Ilan Ginzburg <ilans...@gmail.com> wrote:
> >>>>>
> >>>>> Houston made a very valid comment back then on the placement plugin support of environment variables (dropped as a consequence).
> >>>>>
> >>>>> https://issues.apache.org/jira/browse/SOLR-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286680#comment-17286680
> >>>>>
> >>>>> It could be possible to unintentionally leak node data that should be kept secret if Solr is allowed to freely access (random?) environment variables as part of configuration.
> >>>>>
> >>>>> Something to keep in mind.
> >>>>>
> >>>>> Ilan
> >>>>>
> >>>>> On Thu 11 Nov 2021 at 20:12, Eric Pugh <ep...@opensourceconnections.com> wrote:
> >>>>>>
> >>>>>> Agreed!
> >>>>>>
> >>>>>> I’ve noticed that in the Play Framework, you can configure everything via a property based configuration file, however it makes it easy to override the property file via another one, or via ENV variables:
> >>>>>>
> >>>>>> db.default.username="smui"
> >>>>>> db.default.username=${?SMUI_DB_USER}
> >>>>>>
> >>>>>> Which turns out to be very liberating!
> >>>>>>
> >>>>>> On Nov 11, 2021, at 2:09 PM, Jan Høydahl <jan....@cominvent.com> wrote:
> >>>>>>
> >>>>>> +1 to a roundup of env and props across the board. I think SIP 11 is on the track of something. But can be done independent of this.
> >>>>>>
> >>>>>> Jan Høydahl
> >>>>>>
> >>>>>> On 11 Nov 2021 at 17:44, Gus Heck <gus.h...@gmail.com> wrote:
> >>>>>>
> >>>>>> I guess all I mean is that it shouldn't be only sysprops. Enabling sysprops, Env vars etc seems fine but we need to clearly document precedence among any/all options. What is convenient varies from case to case and in a perfect world what I'd like to see is full support across each style (files, zk, props, env vars) with consistent and obvious naming and well documented resolution order.
> >>>>>>
> >>>>>> What I don't like is a little bit of env vars for some stuff, props for others, files for yet more stuff and some unclear aggregation of that showing up in zk... (or maybe some of it not showing up anywhere code could check it...)
> >>>>>>
> >>>>>> On Thu, Nov 11, 2021 at 11:07 AM Houston Putman <houstonput...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> I agree with Jan, when thinking about making Solr as cloud friendly as possible EnvVars and (to a lesser extent) sysProps are much preferable to having a setting in the solr.xml.
> >>>>>>> This is because it's easier to customize EnvVars per-node, while customizing a config file is much harder, as those tend to be static and shared across a whole environment.
> >>>>>>>
> >>>>>>> Also thanks for linking that SIP Jan, very applicable.
> >>>>>>>
> >>>>>>> - Houston
> >>>>>>>
> >>>>>>> On Fri, Nov 5, 2021 at 5:19 PM Jan Høydahl <jan....@cominvent.com> wrote:
> >>>>>>>>
> >>>>>>>> Thinking of these roles as labels, I think sysProps and envVars are the two universal methods, and nothing wrong with that.
> >>>>>>>> I keep trying to think cloud native and container, so having excellent 1st class support for env.vars for such configs is a priority to me.
> >>>>>>>> Most tools, CI-environments etc have built-in support for env.vars, and so it makes sense to me.
> >>>>>>>>
> >>>>>>>> See https://cwiki.apache.org/confluence/display/SOLR/SIP-11+Uniform+cluster-level+configuration+API for some interesting ideas around cluster/node level config.
> >>>>>>>>
> >>>>>>>> See
> >>>>>>>>
> >>>>>>>> On 5 Nov 2021 at 15:04, Gus Heck <gus.h...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Agree, better to use something other than sysprops. An arg in the start script would be friendlier than -D props which generally are irritatingly verbose and expose too much implementation.
> >>>>>>>>
> >>>>>>>> We lack a config file per level. solr.xml does double duty as global and per-node depending on how it's used (zk or filesystem).
> >>>>>>>>
> >>>>>>>> Config file names are confusing too. Our file names are legacy of non-cloud mode I think, and we really should at some point (10.x?)
> >>>>>>>> rework configs to be cluster.xml, node.xml, collection.xml (formerly solrconfig.xml) and schema.xml (and maybe support something other than xml, but that's not nearly as important as clarity in naming, and having features)
> >>>>>>>>
> >>>>>>>> But this is all straying way off topic and should have its own SIP if someone seems to have time for it :)
> >>>>>>>>
> >>>>>>>> On Thu, Nov 4, 2021 at 6:07 PM Shawn Heisey <elyog...@elyograg.org> wrote:
> >>>>>>>>>
> >>>>>>>>> On 11/4/21 2:51 PM, Noble Paul wrote:
> >>>>>>>>> > The SIP can be boiled down to the following
> >>>>>>>>> >
> >>>>>>>>> > * *Tag a node with a label (role) using a system property*
> >>>>>>>>> > * *Use the placement plugin to whitelist/block list certain nodes*
> >>>>>>>>> > * *Publish the roles through an API*
> >>>>>>>>>
> >>>>>>>>> In general, for Solr, do we like the idea of having things controlled by system properties?
> >>>>>>>>>
> >>>>>>>>> I would think solr.xml would be the right place to configure this, except that people can and probably do put solr.xml in zookeeper, which would mean every system would have the SAME solr.xml, and we're back to system properties as a way to customize solr.xml on each system.
> >>>>>>>>>
> >>>>>>>>> I have never used system properties to configure Solr. When I customize the config, I will often remove property substitutions from it and go with explicit settings. My general opinion about system properties is that if they're going to be used, they should DIRECTLY configure the application, not be sent in via property substitution in a config file. I've never liked the way our default configs use that paradigm. It means you cannot look at the config and know exactly how things are configured, without finding out whether system properties have been set.
> >>>>>>>>>
> >>>>>>>>> What color do others think that bikeshed should be painted?
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Shawn
> >>>>>>>>>
> >>>>>>>>> ---------------------------------------------------------------------
> >>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> >>>>>>>>> For additional commands, e-mail: dev-h...@solr.apache.org
> >>>>>>
> >>>>>> _______________________
> >>>>>> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com
> >>>>>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed

--
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)