Ack, meant to write "default +zookeeper -overseer". Any config should be
explicit.
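
To make "explicit" concrete, here is a rough sketch of how an expression like "default +zookeeper -overseer" could be expanded into a plain set of role names at startup, so that runtime code never needs more than a contains() check. The class name RoleSpec, the +/- token syntax, and the contents of DEFAULT_ROLES are all assumptions for illustration; nothing here is existing Solr code or a settled design.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: expands a role expression into an explicit role set. */
final class RoleSpec {
    // Assumed default set, for illustration only; the thread has not settled on one.
    static final Set<String> DEFAULT_ROLES = Set.of("data", "overseer");

    /** Tokens: "default" adds the defaults, "+x" adds role x, "-x" removes role x. */
    static Set<String> resolve(String spec) {
        Set<String> roles = new TreeSet<>();
        for (String token : spec.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank spec yields an empty set
            }
            if (token.equals("default")) {
                roles.addAll(DEFAULT_ROLES);
            } else if (token.startsWith("+")) {
                roles.add(token.substring(1));
            } else if (token.startsWith("-")) {
                roles.remove(token.substring(1));
            } else {
                roles.add(token); // a bare name is treated like "+name"
            }
        }
        return roles;
    }

    public static void main(String[] args) {
        // prints [data, zookeeper]
        System.out.println(resolve("default +zookeeper -overseer"));
    }
}
```

The point of resolving once is that whatever gets published is the final list, and the +/- syntax never leaks into code that checks roles.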

On Wed, Nov 17, 2021 at 7:43 PM Gus Heck <gus.h...@gmail.com> wrote:

> I think we should establish a known set of default roles. Other roles are
> non-default roles. New features can be added as default or non-default as
> required. Default roles are explicitly loaded at startup if no other
> information is provided. I'd even be fine with a config that
> said +zookeeper -overseer so long as this resulted in an explicit set of
> roles easily identified at runtime and not requiring runtime code level
> logic more complex than list.contains() (conceptually, exact implementation
> may vary).
>
> As for weak roles, that shouldn't exist (going forward). Roles however
> might have associated config that can do things like express a preference
> order or precedence or whatever other complex behavior among role members
> they like, but role membership should be binary.
>
> As I noted before it's not necessary to rework everything into roles at
> once, so the above statement about overseer doesn't apply till we create a
> "new role system role" for it. Until the new role is created, it's just
> basic functionality not associated with a role, and applicable to all
> machines.
>
> To make an explicit proposal, maybe add the following to the SIP:
>
> *Roles During Application Lifecycle*
> 1) Roles may be configured for a node (means TBD) before a node is started.
> 2) On startup the node checks for configured roles.
> 3) a) If configured roles are found export the configured list such that
> it is visible to code (probably zk, impl TBD)
> 3) b) If no roles are configured export the default set of roles such that
> they are visible to code (method identical to 3) a) )
> 4) Node completes any other necessary startup and publishes itself in
> live_nodes.
>
> *Usage of roles in code*
> 1) Roles will be checked in publicly published configuration (i.e. zk),
> and a watch set to detect any change.
> 2) Roles will not be checked by loading config from disk or caching disk
> config in memory. (zk ONLY source of truth)
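
A minimal sketch of lifecycle steps 2 and 3 plus the runtime check, assuming an in-memory map as a stand-in for the published location (zk in the proposal). PublishedRoles, publishOnStartup, hasRole, and the DEFAULT_ROLES contents are hypothetical names chosen for illustration; the watch for changes mentioned in the usage list is not modelled here.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for the published role store (ZooKeeper in the proposal). */
final class PublishedRoles {
    // Assumed defaults, for illustration only.
    static final Set<String> DEFAULT_ROLES = Set.of("data");

    private final Map<String, Set<String>> rolesByNode = new ConcurrentHashMap<>();

    /** Steps 2-3: export the configured roles, or the default set if none are configured. */
    void publishOnStartup(String nodeName, Set<String> configuredRoles) {
        Set<String> effective = configuredRoles.isEmpty() ? DEFAULT_ROLES : configuredRoles;
        rolesByNode.put(nodeName, Set.copyOf(effective));
    }

    /** Runtime check: a contains() on the published set, never on local disk config. */
    boolean hasRole(String nodeName, String role) {
        return rolesByNode.getOrDefault(nodeName, Set.of()).contains(role);
    }
}
```

Either branch of step 3 ends in the same published form, so readers of the store cannot tell (and need not care) whether a node's roles were configured or defaulted.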
>
> Thoughts?
>
> (the changeability of roles can
>
>
> On Wed, Nov 17, 2021 at 2:15 PM Ilan Ginzburg <ilans...@gmail.com> wrote:
>
>> By default, all nodes can currently take the Overseer role, and that
>> changes only when some nodes are given the Overseer role.
>>
>> This is a default value per node based on the presence of the role on
>> any node of the cluster.
>> But the Overseer role is "weak" in that nodes not having the role can
>> become overseer even if some nodes did get the role (but are down for
>> example).
>>
>> Shall the SIP explore the notion of default when a role is not defined
>> for ANY node in the cluster rather than only when a role is not
>> defined on a per node basis? It might solve some of the problems we
>> have.
>>
>> Ilan
>>
>>
>> On Wed, Nov 17, 2021 at 7:08 PM Ishan Chattopadhyaya
>> <ichattopadhy...@gmail.com> wrote:
>> >
>> >
>> >
>> > On Wed, Nov 17, 2021 at 8:12 PM Jan Høydahl <jan....@cominvent.com>
>> wrote:
>> >>
>> >> I think your VOTE is premature as several design decisions are
>> obviously not landed. That may be the reason there are no votes yet, and
>> I'm not going to vote either.
>> >>
>> >>
>> >> If "-Dsolr.node.roles" parameter is not passed, it is implicitly
>> assumed to be "-Dsolr.node.roles=data" (due to backcompat reasons and also
>> so that those who don't use the role feature don't need any extra
>> parameters).
>> >>
>> >>
>> >> I'm not sold on making such a complex rule for what roles are enabled
>> and treating data role differently from other roles.
>> >
>> >
>> > As I've said before, we can't say "all roles are on by default"
>> since we can't foresee the future to decide whether a future role should be
>> enabled by default. As of today, we have two roles: "data" and "overseer",
>> and "data" is enabled by default on all nodes, and "overseer" (which stands
>> for preferred overseer) is disabled by default on all nodes. Hence, I
>> mentioned that if a node hasn't been started with the "solr.node.roles"
>> parameter, we should assume it is solr.node.roles=data.
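
The fallback described here could be sketched roughly as follows. The property name "solr.node.roles" and the default "data" come from this thread; the class NodeRolesProperty, the comma-separated value format, and the method names are assumptions for illustration and not the actual implementation.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical reader for the node roles system property. */
final class NodeRolesProperty {
    static final String PROP = "solr.node.roles"; // property name as used in the thread
    static final String DEFAULT_VALUE = "data";   // default proposed for back-compat

    /** Parses a comma-separated value (assumed format); null or blank means the default. */
    static Set<String> parse(String raw) {
        String value = (raw == null || raw.isBlank()) ? DEFAULT_VALUE : raw;
        Set<String> roles = new TreeSet<>();
        for (String role : value.split(",")) {
            if (!role.isBlank()) {
                roles.add(role.trim());
            }
        }
        return roles;
    }

    /** Reads the JVM system property, e.g. set with -Dsolr.node.roles=... */
    static Set<String> fromSystemProperties() {
        return parse(System.getProperty(PROP));
    }
}
```

With this shape, a node started without the parameter and a node started with the value "data" end up with the same role set.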
>> >
>> >>
>> >> It's fine to require certain upgrade steps for 9.0.
>> >
>> >
>> > Forcing everyone to explicitly use the -Dsolr.node.roles=data parameter to
>> start their nodes, irrespective of whether they want to use the roles
>> feature, doesn't seem like a reasonable idea.
>> >
>> >>
>> >> We should keep role config 1:1 and dead simple, i.e. WYSIWYG and no
>> roles means all roles. Then handle back-compat in more targeted ways like
>> we have done for certain features before such as HTTP1 vs HTTP2.
>> >>
>> >> If a coordinator node is started with "data" role also, it fails to
>> startup with a message indicating a node cannot both be coordinator and
>> data node.
>> >>
>> >>
>> >> Such custom complex rules don't make sense to me. If you want a single
>> node to handle both data, zookeeper, overseer, coordination,
>> streaming-expressions, sql, foo and bar, then fine, why block it?
>> >
>> >
>> > The coordinator role's implementation is outside the scope of this SIP.
>> I propose that any future role (zk, coordination, sql, foo, bar...) be free
>> to choose its own implementation or constraints. We can discuss this at the
>> time of introduction of those roles.
>> >
>> >>
>> >> Users will start in that mode and then separate out certain nodes for
>> certain workloads as they grow their clusters.
>> >>
>> >> Jan
>> >>
>> >> 15. nov. 2021 kl. 16:36 skrev Ishan Chattopadhyaya <
>> ichattopadhy...@gmail.com>:
>> >>
>> >> Thanks Jan, I've updated the SIP document with all the applicable
>> changes with a link to this thread (which contains the summary at the end).
>> >> I'll initiate the vote thread now. Thanks to everyone for contributing.
>> >>
>> >> On Mon, Nov 15, 2021 at 6:53 PM Jan Høydahl <jan....@cominvent.com>
>> wrote:
>> >>>
>> >>> Thanks for trying to summarize and drive the work Ishan.
>> >>>
>> >>> I'd like to add
>> >>>
>> >>> Scope of SIP
>> >>> Ishan: Role API and config
>> >>> Jan: Role API, config, and impact of one real role e.g. the "data"
>> role, to exemplify and justify the role infrastructure
>> >>>
>> >>> According to SIP process the next step is not implementation, but
>> rather to iterate the SIP text to something you believe would pass a vote.
>> It's hard to stitch together all these emails and mini summaries into a
>> meaningful whole.
>> >>>
>> >>> Jan
>> >>>
>> >>> 15. nov. 2021 kl. 05:28 skrev Ishan Chattopadhyaya <
>> ichattopadhy...@gmail.com>:
>> >>>
>> >>> Thanks to everyone for the feedback.
>> >>>
>> >>> Here's an attempt to summarize broad topics discussed.
>> >>>
>> >>> No negative roles
>> >>> Everyone agrees
>> >>>
>> >>> Roles on/off by default?
>> >>> Jason+(Ilan,Houston?): All roles to be on by default
>> >>> Gus,Ishan,Noble: Only those roles to be on by default that are needed
>> for backcompat
>> >>>
>> >>> Which branch to target?
>> >>> Jan,Ishan,Noble: New feature to be added to 9x branch
>> >>>
>> >>> Need for roles?
>> >>> Tim: new concept of nodes unnecessary since everything that's
>> proposed can be achieved using changes to new autoscaling framework and
>> replica placement plugins.
>> >>> Ishan,Noble: A first class concept of roles is important so that this
>> functionality is expected to work, irrespective of whatever custom
>> placement plugins users deploy (since placement plugins don't support
>> chaining).
>> >>>
>> >>> Roles for collections?
>> >>> Ilan: Role aware collections
>> >>> Ishan: This can be implemented separately later using node roles and
>> placement plugins.
>> >>>
>> >>> Configuration
>> >>> Sysprops vs solr.xml+sysprops vs envvars:
>> >>> Shawn: Solr.xml and/or envvars
>> >>> Houston,Ilan: Sysprops and/or envvars
>> >>> Ishan,Noble: Sysprops
>> >>> Jan: SIP-11
>> >>>
>> >>> Outstanding issues
>> >>> Shawn: Color of the bikeshed ;-)
>> >>>
>> >>> Please let me know if I missed something here. If there are no
>> further strong objections, we can proceed to the implementation phase.
>> There's already a draft/WIP PR in the works:
>> https://github.com/apache/solr/pull/403
>> >>>
>> >>> Thanks,
>> >>> Ishan
>> >>>
>> >>> On Fri, Nov 12, 2021 at 11:38 PM Gus Heck <gus.h...@gmail.com> wrote:
>> >>>>
>> >>>> Yeah we should only be looking for and only be reporting (if we
>> choose to report to the user) a specific set of env variables. Anything
>> else should be ignored. There should be an enum or constants somewhere listing
>> what solr cares about, and we should ignore or be blind to anything else.
>> >>>>
>> >>>> Perhaps we'd like to have a ConfigParams (or whatever) enum that has
>> methods returning the env, sysprop, bin/solr arg, configFile and zkLocation
>> that can be used to provide each possible configuration option (for things
>> that are single value or short list, obviously an entire schema probably
>> would not be settable by sysprop :) )?
>> >>>>
>> >>>> The return type of those methods could be Optional<> since we
>> won't have all of those for everything any time soon, and not all of them
>> will make sense in all cases.
>> >>>>
>> >>>> zkLocation is a bit tricky and nebulous since it's probably a zk
>> path and a JSON path or Xpath combined and relative to the chroot which
>> itself is a potential config param, some stuff to think through there.
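
One possible shape for such an enum, as a sketch only. ConfigParam and its constants are hypothetical: the sysprop "solr.node.roles" is the one discussed in this thread, while SOLR_NODE_ROLES, EXAMPLE_LIMIT, and "--example-limit" are made-up names. The configFile and zkLocation accessors are left out for brevity.

```java
import java.util.Optional;

/** Hypothetical enum listing the only settings the application looks at. */
enum ConfigParam {
    // All names except the "solr.node.roles" sysprop are invented for illustration.
    NODE_ROLES("SOLR_NODE_ROLES", "solr.node.roles", null),
    EXAMPLE_LIMIT(null, "example.limit", "--example-limit");

    private final String env;
    private final String sysProp;
    private final String scriptArg;

    ConfigParam(String env, String sysProp, String scriptArg) {
        this.env = env;
        this.sysProp = sysProp;
        this.scriptArg = scriptArg;
    }

    /** Environment variable name, if this setting supports one. */
    Optional<String> env() {
        return Optional.ofNullable(env);
    }

    /** System property name, if this setting supports one. */
    Optional<String> sysProp() {
        return Optional.ofNullable(sysProp);
    }

    /** Start-script argument, if this setting supports one. */
    Optional<String> scriptArg() {
        return Optional.ofNullable(scriptArg);
    }
}
```

Code that reads the environment would then iterate over ConfigParam.values() and look up only the names returned by env(), staying blind to everything else.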
>> >>>>
>> >>>> On Thu, Nov 11, 2021 at 3:49 PM Ilan Ginzburg <ilans...@gmail.com>
>> wrote:
>> >>>>>
>> >>>>> Houston made a very valid comment back then on the placement plugin
>> support of environment variables (dropped as a consequence).
>> >>>>>
>> >>>>>
>> https://issues.apache.org/jira/browse/SOLR-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286680#comment-17286680
>> >>>>>
>> >>>>> It could be possible to unintentionally leak node data that should
>> be kept secret if Solr is allowed to freely access (random?) environment
>> variables as part of configuration.
>> >>>>>
>> >>>>> Something to keep in mind.
>> >>>>>
>> >>>>> Ilan
>> >>>>>
>> >>>>>
>> >>>>> On Thu 11 Nov 2021 at 20:12, Eric Pugh <
>> ep...@opensourceconnections.com> wrote:
>> >>>>>>
>> >>>>>> Agreed!
>> >>>>>>
>> >>>>>> I’ve noticed that in the Play Framework, you can configure
>> everything via a property based configuration file, however it makes it
>> easy to override the property file via another one, or via ENV variables:
>> >>>>>>
>> >>>>>> db.default.username="smui"
>> >>>>>> db.default.username=${?SMUI_DB_USER}
>> >>>>>>
>> >>>>>> Which turns out to be very liberating!
>> >>>>>>
>> >>>>>>
>> >>>>>> On Nov 11, 2021, at 2:09 PM, Jan Høydahl <jan....@cominvent.com>
>> wrote:
>> >>>>>>
>> >>>>>> +1 to a roundup of env and props across the board. I think SIP 11
>> is on the track of something. But can be done independent of this.
>> >>>>>>
>> >>>>>> Jan Høydahl
>> >>>>>>
>> >>>>>> 11. nov. 2021 kl. 17:44 skrev Gus Heck <gus.h...@gmail.com>:
>> >>>>>>
>> >>>>>> I guess all I mean is that it shouldn't be only sysprops. Enabling
>> sysprops, Env vars etc seems fine but we need to clearly document
>> precedence among any/all options. What is convenient varies from case to
>> case and in a perfect world what I'd like to see is full support across
>> each style (files, zk, props, env vars) with consistent and obvious naming
>> and well documented resolution order.
>> >>>>>>
>> >>>>>> What I don't like is a little bit of env vars for some stuff,
>> props for others, files for yet more stuff and some unclear aggregation of
>> that showing up in zk... (or maybe some of it not showing up anywhere code
>> could check it...)
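
A documented resolution order could be as simple as "first source that yields a value wins". The sketch below assumes that rule; ConfigResolver and firstDefined are hypothetical names, and the order in which sources are listed is exactly the part that would need to be agreed on and documented.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical resolver: consults sources in a fixed, documented order. */
final class ConfigResolver {
    /** Returns the first non-blank value, checking sources from highest to lowest precedence. */
    static Optional<String> firstDefined(List<Supplier<String>> sourcesInOrder) {
        for (Supplier<String> source : sourcesInOrder) {
            String value = source.get();
            if (value != null && !value.isBlank()) {
                return Optional.of(value.trim());
            }
        }
        return Optional.empty();
    }
}
```

A caller might pass, in order, a lookup of the system property, then the environment variable, then the value from a config file, so that the same list doubles as the documentation of precedence.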
>> >>>>>>
>> >>>>>> On Thu, Nov 11, 2021 at 11:07 AM Houston Putman <
>> houstonput...@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>> I agree with Jan, when thinking about making Solr as cloud
>> friendly as possible EnvVars and (to a lesser extent) sysProps are much
>> preferable to having a setting in the solr.xml.
>> >>>>>>> This is because it's easier to customize EnvVars per-node, while
>> customizing a config file is much harder, as those tend to be static and
>> shared across a whole environment.
>> >>>>>>>
>> >>>>>>> Also thanks for linking that SIP Jan, very applicable.
>> >>>>>>>
>> >>>>>>> - Houston
>> >>>>>>>
>> >>>>>>> On Fri, Nov 5, 2021 at 5:19 PM Jan Høydahl <jan....@cominvent.com>
>> wrote:
>> >>>>>>>>
>> >>>>>>>> Thinking of these roles as labels, I think sysProps and envVars
>> are the two universal methods, and nothing wrong with that.
>> >>>>>>>> I keep trying to think cloud native and container, so having
>> excellent 1st class support for env.vars for such configs is a priority to
>> me.
>> >>>>>>>> Most tools, CI-environments etc have built-in support for
>> env.vars, and so it makes sense to me.
>> >>>>>>>>
>> >>>>>>>> See
>> https://cwiki.apache.org/confluence/display/SOLR/SIP-11+Uniform+cluster-level+configuration+API
>> for some interesting ideas around cluster/node level config.
>> >>>>>>>>
>> >>>>>>>> See
>> >>>>>>>>
>> >>>>>>>> 5. nov. 2021 kl. 15:04 skrev Gus Heck <gus.h...@gmail.com>:
>> >>>>>>>>
>> >>>>>>>> Agreed, better to use something other than sysprops. An arg in the
>> start script would be friendlier than -D props which generally are
>> irritatingly verbose and expose too much implementation.
>> >>>>>>>>
>> >>>>>>>> We lack a config file per level. solr.xml does double duty as
>> global and per-node depending on how it's used (zk or filesystem).
>> >>>>>>>>
>> >>>>>>>> Config file names are confusing too. Our file names are legacy
>> of non-cloud mode I think, and we really should at some point (10.x?)
>> rework configs to be cluster.xml, node.xml, collection.xml (formerly
>> solrconfig.xml) and schema.xml (and maybe support something other than xml,
>> but that's not nearly as important as clarity in naming, and having
>> features)
>> >>>>>>>>
>> >>>>>>>> But this is all straying way off topic and should have its own
>> SIP if someone seems to have time for it :)
>> >>>>>>>>
>> >>>>>>>> On Thu, Nov 4, 2021 at 6:07 PM Shawn Heisey <
>> elyog...@elyograg.org> wrote:
>> >>>>>>>>>
>> >>>>>>>>> On 11/4/21 2:51 PM, Noble Paul wrote:
>> >>>>>>>>> > The SIP can be boiled down to the following
>> >>>>>>>>> >
>> >>>>>>>>> > * *Tag a node with a label (role) using a system property*
>> >>>>>>>>> > ** Use the placement plugin to whitelist/block list certain
>> nodes*
>> >>>>>>>>> > ** Publish the roles through an API*
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> In general, for Solr, do we like the idea of having things
>> controlled by
>> >>>>>>>>> system properties?
>> >>>>>>>>>
>> >>>>>>>>> I would think solr.xml would be the right place to configure
>> this,
>> >>>>>>>>> except that people can and probably do put solr.xml in
>> zookeeper, which
>> >>>>>>>>> would mean every system would have the SAME solr.xml, and we're
>> back to
>> >>>>>>>>> system properties as a way to customize solr.xml on each system.
>> >>>>>>>>>
>> >>>>>>>>> I have never used system properties to configure Solr.  When I
>> customize
>> >>>>>>>>> the config, I will often remove property substitutions from it
>> and go
>> >>>>>>>>> with explicit settings.  My general opinion about system
>> properties is
>> >>>>>>>>> that if they're going to be used, they should DIRECTLY
>> configure the
>> >>>>>>>>> application, not be sent in via property substitution in a
>> config file.
>> >>>>>>>>> I've never liked the way our default configs use that
>> paradigm.  It
>> >>>>>>>>> means you cannot look at the config and know exactly how things
>> are
>> >>>>>>>>> configured, without finding out whether system properties have
>> been set.
>> >>>>>>>>>
>> >>>>>>>>> What color do others think that bikeshed should be painted?
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Shawn
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> ---------------------------------------------------------------------
>> >>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>> >>>>>>>>> For additional commands, e-mail: dev-h...@solr.apache.org
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> http://www.needhamsoftware.com (work)
>> >>>>>>>> http://www.the111shift.com (play)
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> http://www.needhamsoftware.com (work)
>> >>>>>> http://www.the111shift.com (play)
>> >>>>>>
>> >>>>>>
>> >>>>>> _______________________
>> >>>>>> Eric Pugh | Founder & CEO | OpenSource Connections, LLC |
>> 434.466.1467 | http://www.opensourceconnections.com | My Free/Busy
>> >>>>>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>> >>>>>> This e-mail and all contents, including attachments, is considered
>> to be Company Confidential unless explicitly stated otherwise, regardless
>> of whether attachments are marked as such.
>> >>>>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> http://www.needhamsoftware.com (work)
>> >>>> http://www.the111shift.com (play)
>> >>>
>> >>>
>> >>
>>
>>
>>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>


-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)
