Ack, I meant to write "default +zookeeper -overseer"; any config should be explicit.
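
To make the "explicit" part concrete, here is a minimal sketch (not Solr code) of how a spec like the one above could be resolved into a plain set of roles at startup. The class name RoleResolver, the whitespace-separated token syntax, and the default set {data, overseer} are all assumptions for illustration; the SIP defines none of them.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of Solr or the SIP. */
final class RoleResolver {

    // Assumed default set; which roles are "default" is still under discussion.
    static final Set<String> DEFAULT_ROLES = Set.of("data", "overseer");

    /**
     * Resolves a spec such as "default +zookeeper -overseer" into an explicit,
     * read-only set of role names. A null or blank spec yields the defaults.
     */
    static Set<String> resolve(String spec) {
        Set<String> roles = new LinkedHashSet<>();
        if (spec == null || spec.isBlank()) {
            roles.addAll(DEFAULT_ROLES);
            return Collections.unmodifiableSet(roles);
        }
        for (String token : spec.trim().split("\\s+")) {
            if (token.equals("default")) {
                roles.addAll(DEFAULT_ROLES);          // start from the known defaults
            } else if (token.startsWith("+") && token.length() > 1) {
                roles.add(token.substring(1));        // opt in to a role
            } else if (token.startsWith("-") && token.length() > 1) {
                roles.remove(token.substring(1));     // opt out of a role
            } else {
                throw new IllegalArgumentException("Unrecognized role token: " + token);
            }
        }
        return Collections.unmodifiableSet(roles);
    }
}
```

Under these assumptions, RoleResolver.resolve("default +zookeeper -overseer") yields the set {data, zookeeper}, and any later check is just roles.contains("overseer"). The modifiers are applied once, up front, so nothing downstream has to know they ever existed.
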
On Wed, Nov 17, 2021 at 7:43 PM Gus Heck <gus.h...@gmail.com> wrote:
> I think we should establish a known set of default roles. Other roles are
> non-default roles. New features can be added as default or non-default as
> required. Default roles are explicitly loaded at startup if no other
> information is provided. I'd even be fine with a config that
> said +zookeeper -overseer so long as this resulted in an explicit set of
> roles easily identified at runtime and not requiring runtime code level
> logic more complex than list.contains() (conceptually, exact implementation
> may vary).
>
> As for weak roles, those shouldn't exist (going forward). Roles however
> might have associated config that can do things like express a preference
> order or precedence or whatever other complex behavior among role members
> they like, but role membership should be binary.
>
> As I noted before it's not necessary to rework everything into roles at
> once, so the above statement about overseer doesn't apply till we create a
> "new role system role" for it. Until the new role is created, it's just
> basic functionality not associated with a role, and applicable to all
> machines.
>
> To make an explicit proposal, maybe add the following to the SIP:
>
> *Roles During Application Lifecycle*
> 1) Roles may be configured for a node (means TBD) before a node is started.
> 2) On startup the node checks for configured roles.
> 3) a) If configured roles are found, export the configured list such that
> it is visible to code (probably zk, impl TBD)
> 3) b) If no roles are configured, export the default set of roles such that
> they are visible to code (method identical to 3) a) )
> 4) Node completes any other necessary startup and publishes itself in
> live_nodes.
>
> *Usage of roles in code*
> 1) Roles will be checked in publicly published configuration (i.e. zk),
> and a watch set to detect any change.
> 2) Roles will not be checked by loading config from disk or caching disk
> config in memory. (zk ONLY source of truth)
>
> Thoughts?
>
> (the changeability of roles can
>
>
> On Wed, Nov 17, 2021 at 2:15 PM Ilan Ginzburg <ilans...@gmail.com> wrote:
>
>> The current Overseer role defaults to every node being able to become
>> overseer, and that changes only when some nodes get the Overseer role.
>>
>> This is a default value per node based on the presence of the role on
>> any node of the cluster.
>> But the Overseer role is "weak" in that nodes not having the role can
>> become overseer even if some nodes did get the role (but are down for
>> example).
>>
>> Shall the SIP explore the notion of default when a role is not defined
>> for ANY node in the cluster rather than only when a role is not
>> defined on a per node basis? It might solve some of the problems we
>> have.
>>
>> Ilan
>>
>>
>> On Wed, Nov 17, 2021 at 7:08 PM Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
>> >
>> > On Wed, Nov 17, 2021 at 8:12 PM Jan Høydahl <jan....@cominvent.com> wrote:
>> >>
>> >> I think your VOTE is premature as several design decisions are obviously not landed. That may be the reason there are no votes yet, and I'm not going to vote either.
>> >>
>> >> If "-Dsolr.node.roles" parameter is not passed, it is implicitly assumed to be "-Dsolr.node.roles=data" (due to backcompat reasons and also so that those who don't use the role feature don't need any extra parameters).
>> >>
>> >> I'm not sold on making such a complex rule for what roles are enabled and treating data role differently from other roles.
>> >
>> > As I've said before, we can't say "all roles are on by default" since we can't foresee the future to decide whether a future role should be enabled by default.
>> > As of today, we have two roles: "data" and "overseer", and "data" is enabled by default on all nodes, and "overseer" (which stands for preferred overseer) is disabled by default on all nodes. Hence, I mentioned that if a node hasn't started with the "solr.node.roles" parameter, we should assume it is solr.node.roles=data.
>> >
>> >> It's fine to require certain upgrade steps for 9.0.
>> >
>> > Forcing everyone to explicitly use the -Dsolr.node.roles=data parameter to start their nodes, irrespective of whether they want to use the roles feature, doesn't seem like a reasonable idea.
>> >
>> >> We should keep role config 1:1 and dead simple, i.e. WYSIWYG and no roles means all roles. Then handle back-compat in more targeted ways like we have done for certain features before such as HTTP1 vs HTTP2.
>> >>
>> >> If a coordinator node is started with "data" role also, it fails to startup with a message indicating a node cannot both be coordinator and data node.
>> >>
>> >> Such custom complex rules don't make sense to me. If you want a single node to handle both data, zookeeper, overseer, coordination, streaming-expressions, sql, foo and bar, then fine, why block it?
>> >
>> > The coordinator role's implementation is outside the scope of this SIP. I propose that any future role (zk, coordination, sql, foo, bar...) be free to choose its own implementation or constraints. We can discuss this at the time of introduction of those roles.
>> >
>> >> Users will start in that mode and then separate out certain nodes for certain workloads as they grow their clusters.
>> >>
>> >> Jan
>> >>
>> >> On 15 Nov 2021 at 16:36, Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
>> >>
>> >> Thanks Jan, I've updated the SIP document with all the applicable changes with a link to this thread (which contains the summary at the end).
>> >> I'll initiate the vote thread now. Thanks to everyone for contributing.
>> >>
>> >> On Mon, Nov 15, 2021 at 6:53 PM Jan Høydahl <jan....@cominvent.com> wrote:
>> >>>
>> >>> Thanks for trying to summarize and drive the work Ishan.
>> >>>
>> >>> I'd like to add
>> >>>
>> >>> Scope of SIP
>> >>> Ishan: Role API and config
>> >>> Jan: Role API, config, and impact of one real role e.g. the "data" role, to exemplify and justify the role infrastructure
>> >>>
>> >>> According to SIP process the next step is not implementation, but rather to iterate the SIP text to something you believe would pass a vote. It's hard to stitch together all these emails and mini summaries into a meaningful whole.
>> >>>
>> >>> Jan
>> >>>
>> >>> On 15 Nov 2021 at 05:28, Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:
>> >>>
>> >>> Thanks to everyone for the feedback.
>> >>>
>> >>> Here's an attempt to summarize broad topics discussed.
>> >>>
>> >>> No negative roles
>> >>> Everyone agrees
>> >>>
>> >>> Roles on/off by default?
>> >>> Jason+(Ilan,Houston?): All roles to be on by default
>> >>> Gus,Ishan,Noble: Only those roles to be on by default that are needed for backcompat
>> >>>
>> >>> Which branch to target?
>> >>> Jan,Ishan,Noble: New feature to be added to 9x branch
>> >>>
>> >>> Need for roles?
>> >>> Tim: new concept of nodes unnecessary since everything that's proposed can be achieved using changes to new autoscaling framework and replica placement plugins.
>> >>> Ishan,Noble: A first class concept of roles is important so that this functionality is expected to work, irrespective of whatever custom placement plugins users deploy (since placement plugins don't support chaining).
>> >>>
>> >>> Roles for collections?
>> >>> Ilan: Role aware collections
>> >>> Ishan: This can be implemented separately later using node roles and placement plugins.
>> >>>
>> >>> Configuration
>> >>> Sysprops vs solr.xml+sysprops vs envvars:
>> >>> Shawn: Solr.xml and/or envvars
>> >>> Houston,Ilan: Sysprops and/or envvars
>> >>> Ishan,Noble: Sysprops
>> >>> Jan: SIP-11
>> >>>
>> >>> Outstanding issues
>> >>> Shawn: Color of the bikeshed ;-)
>> >>>
>> >>> Please let me know if I missed something here. If there are no further strong objections, we can proceed to the implementation phase. There's already a draft/WIP PR in the works: https://github.com/apache/solr/pull/403
>> >>>
>> >>> Thanks,
>> >>> Ishan
>> >>>
>> >>> On Fri, Nov 12, 2021 at 11:38 PM Gus Heck <gus.h...@gmail.com> wrote:
>> >>>>
>> >>>> Yeah we should only be looking for and only be reporting (if we choose to report to the user) a specific set of env variables. Anything else should be ignored. There should be an enum or constants somewhere listing what solr cares about, and we should ignore or be blind to anything else.
>> >>>>
>> >>>> Perhaps we'd like to have a ConfigParams (or whatever) enum that has methods returning the env, sysprop, bin/solr arg, configFile and zkLocation that can be used to provide each possible configuration option (for things that are single value or short list, obviously an entire schema probably would not be settable by sysprop :) )?
>> >>>>
>> >>>> The return type of those methods could be Optional<>() since we won't have all of those for everything any time soon, and not all of them will make sense in all cases.
>> >>>>
>> >>>> zkLocation is a bit tricky and nebulous since it's probably a zk path and a JSON path or Xpath combined and relative to the chroot which itself is a potential config param, some stuff to think through there.
>> >>>>
>> >>>> On Thu, Nov 11, 2021 at 3:49 PM Ilan Ginzburg <ilans...@gmail.com> wrote:
>> >>>>>
>> >>>>> Houston made a very valid comment back then on the placement plugin support of environment variables (dropped as a consequence).
>> >>>>>
>> >>>>> https://issues.apache.org/jira/browse/SOLR-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286680#comment-17286680
>> >>>>>
>> >>>>> It could be possible to unintentionally leak node data that should be kept secret if Solr is allowed to freely access (random?) environment variables as part of configuration.
>> >>>>>
>> >>>>> Something to keep in mind.
>> >>>>>
>> >>>>> Ilan
>> >>>>>
>> >>>>>
>> >>>>> On Thu 11 Nov 2021 at 20:12, Eric Pugh <ep...@opensourceconnections.com> wrote:
>> >>>>>>
>> >>>>>> Agreed!
>> >>>>>>
>> >>>>>> I’ve noticed that in the Play Framework, you can configure everything via a property based configuration file, however it makes it easy to override the property file via another one, or via ENV variables:
>> >>>>>>
>> >>>>>> db.default.username="smui"
>> >>>>>> db.default.username=${?SMUI_DB_USER}
>> >>>>>>
>> >>>>>> Which turns out to be very liberating!
>> >>>>>>
>> >>>>>>
>> >>>>>> On Nov 11, 2021, at 2:09 PM, Jan Høydahl <jan....@cominvent.com> wrote:
>> >>>>>>
>> >>>>>> +1 to a roundup of env and props across the board. I think SIP 11 is on the track of something. But can be done independent of this.
>> >>>>>>
>> >>>>>> Jan Høydahl
>> >>>>>>
>> >>>>>> On 11 Nov 2021 at 17:44, Gus Heck <gus.h...@gmail.com> wrote:
>> >>>>>>
>> >>>>>>
>> >>>>>> I guess all I mean is that it shouldn't be only sysprops. Enabling sysprops, Env vars etc seems fine but we need to clearly document precedence among any/all options. What is convenient varies from case to case and in a perfect world what I'd like to see is full support across each style (files, zk, props, env vars) with consistent and obvious naming and well documented resolution order.
>> >>>>>>
>> >>>>>> What I don't like is a little bit of env vars for some stuff, props for others, files for yet more stuff and some unclear aggregation of that showing up in zk...
>> >>>>>> (or maybe some of it not showing up anywhere code could check it...)
>> >>>>>>
>> >>>>>> On Thu, Nov 11, 2021 at 11:07 AM Houston Putman <houstonput...@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>> I agree with Jan, when thinking about making Solr as cloud friendly as possible EnvVars and (to a lesser extent) sysProps are much preferable to having a setting in the solr.xml.
>> >>>>>>> This is because it's easier to customize EnvVars per-node, while customizing a config file is much harder, as those tend to be static and shared across a whole environment.
>> >>>>>>>
>> >>>>>>> Also thanks for linking that SIP Jan, very applicable.
>> >>>>>>>
>> >>>>>>> - Houston
>> >>>>>>>
>> >>>>>>> On Fri, Nov 5, 2021 at 5:19 PM Jan Høydahl <jan....@cominvent.com> wrote:
>> >>>>>>>>
>> >>>>>>>> Thinking of these roles as labels, I think sysProps and envVars are the two universal methods, and nothing wrong with that.
>> >>>>>>>> I keep trying to think cloud native and container, so having excellent 1st class support for env.vars for such configs is a priority to me.
>> >>>>>>>> Most tools, CI-environments etc have built-in support for env.vars, and so it makes sense to me.
>> >>>>>>>>
>> >>>>>>>> See https://cwiki.apache.org/confluence/display/SOLR/SIP-11+Uniform+cluster-level+configuration+API for some interesting ideas around cluster/node level config.
>> >>>>>>>>
>> >>>>>>>> See
>> >>>>>>>>
>> >>>>>>>> On 5 Nov 2021 at 15:04, Gus Heck <gus.h...@gmail.com> wrote:
>> >>>>>>>>
>> >>>>>>>> Agree, better to use something other than sysprops. An arg in the start script would be friendlier than -D props which generally are irritatingly verbose and expose too much implementation.
>> >>>>>>>>
>> >>>>>>>> We lack a config file per level. solr.xml does double duty as global and per-node depending on how it's used (zk or filesystem).
>> >>>>>>>>
>> >>>>>>>> Config file names are confusing too.
>> >>>>>>>> Our file names are a legacy of non-cloud mode I think, and we really should at some point (10.x?) rework configs to be cluster.xml, node.xml, collection.xml (formerly solrconfig.xml) and schema.xml (and maybe support something other than xml, but that's not nearly as important as clarity in naming, and having features)
>> >>>>>>>>
>> >>>>>>>> But this is all straying way off topic and should have its own SIP if someone seems to have time for it :)
>> >>>>>>>>
>> >>>>>>>> On Thu, Nov 4, 2021 at 6:07 PM Shawn Heisey <elyog...@elyograg.org> wrote:
>> >>>>>>>>>
>> >>>>>>>>> On 11/4/21 2:51 PM, Noble Paul wrote:
>> >>>>>>>>> > The SIP can be boiled down to the following
>> >>>>>>>>> >
>> >>>>>>>>> > * *Tag a node with a label (role) using a system property*
>> >>>>>>>>> > * *Use the placement plugin to whitelist/block list certain nodes*
>> >>>>>>>>> > * *Publish the roles through an API*
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> In general, for Solr, do we like the idea of having things controlled by system properties?
>> >>>>>>>>>
>> >>>>>>>>> I would think solr.xml would be the right place to configure this, except that people can and probably do put solr.xml in zookeeper, which would mean every system would have the SAME solr.xml, and we're back to system properties as a way to customize solr.xml on each system.
>> >>>>>>>>>
>> >>>>>>>>> I have never used system properties to configure Solr. When I customize the config, I will often remove property substitutions from it and go with explicit settings. My general opinion about system properties is that if they're going to be used, they should DIRECTLY configure the application, not be sent in via property substitution in a config file. I've never liked the way our default configs use that paradigm.
>> >>>>>>>>> It means you cannot look at the config and know exactly how things are configured, without finding out whether system properties have been set.
>> >>>>>>>>>
>> >>>>>>>>> What color do others think that bikeshed should be painted?
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Shawn
>> >>>>>>>>>
>> >>>>>>>>> ---------------------------------------------------------------------
>> >>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>> >>>>>>>>> For additional commands, e-mail: dev-h...@solr.apache.org
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> http://www.needhamsoftware.com (work)
>> >>>>>>>> http://www.the111shift.com (play)
>> >>>>>>>>
>> >>>>>>
>> >>>>>> _______________________
>> >>>>>> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com
>> >>>>>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>> >>>>>> This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.
>> >>>>>>

--
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)