Quite late to the party, but with all this great back and forth I felt like I had to join in :)
I believe SolrCloud uses ZooKeeper to manage most of its configuration files. When searching, I was only able to find this (https://cwiki.apache.org/confluence/display/solr/Using+ZooKeeper+to+Manage+Configuration+Files). I wasn't able to find any initial discussion on their architecture. If we can find more, we may still be able to learn from them.

Also, on the idea of passing a username/password to a Stellar function or to some shell script: we may want to do it a bit differently, or at least give the option to do it differently. I know supplying the username/password directly is easy when testing and playing around, but it probably isn't going to be allowed for a user in production. Maybe we can also support a credentials file and eventually support encrypting sensitive values in configs?

Thanks,
JJ

On Sun, Jan 15, 2017 at 1:26 PM, Michael Miklavcic <[email protected]> wrote:

> Ha, I was betrayed by copy/paste in Chrome.
>
> On Thu, Jan 12, 2017 at 7:24 PM, Matt Foley <[email protected]> wrote:
>
>> Mike, could you try again on the image, please, making sure it is a simple format (gif, png, or jpeg)? It got munched, at least in my viewer. Thanks.
>>
>> Casey, responding to some of the questions you raised:
>>
>> I’m going to make a rather strong statement: we already have a service “to intermediate and handle config update/retrieval”. Furthermore, it:
>> - Correctly handles the problems of distributed services running on multi-node clusters. (That’s a HARD problem, people, and we shouldn’t try to reinvent the wheel.)
>> - Correctly handles Kerberos security. (That’s kinda hard too, or at least a lot of work.)
>> - Does automatic versioning of configurations, and allows viewing, comparing, and reverting historical configs.
>> - Has a capable REST API for all of those things.
>> It doesn’t natively integrate Zookeeper storage of configs, but there is a natural place to specify copy to/from Zookeeper for the files desired.
>>
>> It is Ambari. And we should commit to it, rather than try to re-create such features. Because it has a good REST API, it is perfectly feasible to implement Stellar functions that call it. GUI configuration tools can also use the Ambari APIs, or better yet be integrated in an “Ambari View”. (E.g., see the “Yarn Capacity Scheduler Configuration Tool” example in the Ambari documentation, under “Using Ambari Views”.)
>>
>> The arguments are: parsimony, sufficiency, not reinventing the wheel, and not spending weeks and weeks of developer time over the next year reinventing the wheel while getting details wrong multiple times…
>>
>> Okay, off soapbox.
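For a sense of what wrapping that REST API in a Stellar function would involve, here is a minimal sketch of pulling a config type's current values out of Ambari. The host, cluster name, credentials, and the "metron-env" config type are placeholder assumptions, and the requests library is used for the HTTP calls:

```python
# A minimal sketch of reading a config type out of Ambari over its REST API,
# the kind of call a Stellar function or GUI tool could wrap. Host, cluster
# name, credentials, and the "metron-env" config type are placeholders.
import requests

AMBARI = "http://ambari-host:8080/api/v1"
CLUSTER = "metron_cluster"   # placeholder cluster name
AUTH = ("admin", "admin")    # better: load from a credentials file than hard-code

def current_config(config_type):
    # 1. Ask the cluster which tag (version) of this config type is currently desired.
    desired = requests.get(
        "{0}/clusters/{1}".format(AMBARI, CLUSTER),
        params={"fields": "Clusters/desired_configs"},
        auth=AUTH,
    ).json()["Clusters"]["desired_configs"][config_type]

    # 2. Fetch the properties stored under that tag. Older tags stay readable,
    #    which is where Ambari's config history comes from.
    items = requests.get(
        "{0}/clusters/{1}/configurations".format(AMBARI, CLUSTER),
        params={"type": config_type, "tag": desired["tag"]},
        auth=AUTH,
    ).json()["items"]
    return items[0]["properties"]

if __name__ == "__main__":
    print(current_config("metron-env"))
```

Writes work the same way in reverse: a PUT against the cluster resource with a new tag creates a new config version, which is roughly where the viewing, comparing, and reverting of historical configs mentioned above come from.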
>> Casey asked what the config update behavior of Ambari is, and how it will interact with changes made from outside Ambari. The following is from my experience working with the Ambari Mpack for Metron. I am not otherwise an Ambari expert, so tomorrow I’ll get it reviewed by an Ambari development engineer.
>>
>> Ambari-server runs on one node, and Ambari-agent runs on every node. Ambari-server has a private set of py, xml, and template files, which together are used both to generate the Ambari configuration GUI, with defaults, and to generate configuration files (of any needed filetype) for the various Stack components. Ambari-server also has a database where it stores the schema related to these files, so even if you reach in and edit Ambari’s files, it will error out if the set of parameters or parameter names changes. The historical information about configuration changes is also stored in the db.
>>
>> For each component (and in the case of Metron, for each topology), there is a python file which controls the logic for these actions, among others:
>> - Install
>> - Start / stop / restart / status
>> - Configure
>>
>> It is actually up to this python code (which we wrote for the Metron Mpack) what happens in each of these API calls. But the current code, and I believe this is typical of Ambari-managed components, performs a “Configure” action whenever you press the “Save” button after changing a component config in Ambari, and also on each Install and Start or Restart.
>>
>> The Configure action consists of approximately the following sequence (see disclaimer above :-)
>> - Recreate the generated config files, using the template files and the actual configuration most recently set in Ambari
>>   o Note this is also under the control of python code that we wrote, and this is the appropriate place to push to ZK if desired.
>> - Propagate those config files to each Ambari-agent, with a command to set them locally
>> - The Ambari-agents on each node receive the files and write them to the specified locations on local storage
>>
>> Ambari-server then whines that the updated services should be restarted, but does not initiate that action itself (unless, of course, the initiating action was a Start command from the administrator).
>>
>> Make sense? It’s all quite straightforward in concept; there’s just an awful lot of stuff wrapped around that to make it all go smoothly and handle the problems when it doesn’t.
>>
>> There’s additional complexity in that the Ambari-agent also caches (on each node) both the template files and COMPILED forms of the python files (.pyc) involved in transforming them. The pyc files incorporate some amount of additional info regarding parameter values, but I’m not sure of the form. I don’t think that changes the above in any practical way unless you’re trying to cheat Ambari by reaching in and editing its files directly. In that case, you also need to whack the pyc files (on each node) to force the data to be reloaded from Ambari-server. Best solution: don’t cheat.
>>
>> Also, there may be circumstances under which the Ambari-agent will detect changes and re-write the latest version it knows of the config files, even without a Save or Start action at the Ambari-server. I’m not sure of this and need to check with Ambari developers. It may no longer happen, although I’m pretty sure change detection/reversion was a feature of early versions of Ambari.
>>
>> Hope this helps,
>> --Matt
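The shape of the python file Matt refers to is roughly the following. This is a heavily simplified sketch rather than the actual Metron mpack code: the class name, file paths, template name, and the zk_load_configs.sh flags are illustrative assumptions, while the Script subclass with install/configure/start hooks is the general Ambari pattern.

```python
# A simplified sketch of an Ambari service script, NOT the actual Metron
# mpack code: class name, file paths, template name, and the
# zk_load_configs.sh invocation are illustrative assumptions.
from resource_management.core.resources.system import Execute, File
from resource_management.core.source import Template
from resource_management.libraries.script.script import Script


class EnrichmentTopology(Script):
    def install(self, env):
        self.install_packages(env)
        self.configure(env)

    def configure(self, env):
        # Re-render the config file from its template, using the values most
        # recently saved in Ambari, and write it to local disk on this node.
        File("/usr/metron/config/zookeeper/global.json",
             content=Template("global.json.j2"),  # template shipped with the mpack
             owner="metron")
        # The natural place to push the regenerated configs up to ZooKeeper,
        # e.g. by shelling out to zk_load_configs.sh (paths/flags assumed).
        Execute("/usr/metron/bin/zk_load_configs.sh --mode PUSH "
                "--input_dir /usr/metron/config/zookeeper --zk_quorum zk1:2181",
                user="metron")

    def start(self, env):
        self.configure(env)  # Ambari re-runs configure() on each start/restart
        # actual `storm jar ...` topology submission elided from this sketch

    def stop(self, env):
        pass  # topology kill elided

    def status(self, env):
        pass  # status check elided


if __name__ == "__main__":
    EnrichmentTopology().execute()
```

The point that matters for this thread is that configure() runs on every Save, Install, and Start/Restart, so whatever that hook writes to disk and pushes to ZooKeeper will overwrite any out-of-band edits at those moments.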
>> ================================================
>> From: Michael Miklavcic <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Thursday, January 12, 2017 at 3:59 PM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: [DISCUSS] Ambari Metron Configuration Management consequences and call to action
>>
>> Hi Casey,
>>
>> Thanks for starting this thread. I believe you are correct in your assessment of the 4 options for updating configs in Metron. When using more than one of these options we can get into a split-brain scenario. A basic example is updating the global config on disk and pushing it with zk_load_configs.sh. Later, if a user decides to restart Ambari, the cached version stored by Ambari (it's in the MySQL or other database backing Ambari) will be written out to disk in the defined config directory, and subsequently loaded using zk_load_configs.sh under the hood. Any global configuration modified outside of Ambari will be lost at this point. This is obviously undesirable, but I also like the purpose and utility exposed by the multiple config management interfaces we currently have available. I also agree that a service would be best.
>>
>> For reference, here's my understanding of the current configuration loading mechanisms and their deps.
>>
>> <image>
>>
>> Mike
>>
>> On Thu, Jan 12, 2017 at 3:08 PM, Casey Stella <[email protected]> wrote:
>>
>> In the course of discussion on the PR for METRON-652 <https://github.com/apache/incubator-metron/pull/415>, something that I should definitely have understood better came to light, and I thought it was worth bringing to the attention of the community for clarification/discussion: just how we manage configs.
>>
>> Currently (assuming the management UI that Ryan Merriman submitted), configs are managed/adjusted via a couple of different mechanisms:
>>
>> - zk_load_configs.sh: pushed and pulled from disk to zookeeper
>> - Stellar REPL: pushed and pulled via the CONFIG_GET/CONFIG_PUT functions
>> - Ambari: initialized via the zk_load_configs script, and then some of them are managed directly (global config) and some indirectly (sensor-specific configs).
>>   - NOTE: Upon service restart, it may or may not overwrite changes on disk or on zookeeper. *Can someone more knowledgeable than me about this describe precisely the semantics that we can expect on service restart for Ambari? What gets overwritten on disk and what gets updated in Ambari?*
>> - The Management UI: manages some of the configs. *RYAN: Which configs do we support here and which don't we support here?*
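All of these mechanisms ultimately read and write the same znodes, so a push or pull boils down to something like the sketch below. It assumes the default /metron/topology/global znode that zk_load_configs.sh and the Stellar CONFIG_GET/CONFIG_PUT functions operate on, and uses the kazoo client library; the quorum address and the edited property are placeholders.

```python
# A sketch of what a config "push" and "pull" amount to at the ZooKeeper
# level, using the kazoo client library. The /metron/topology/global znode is
# the assumed default location; quorum address and edited property are placeholders.
import json
from kazoo.client import KazooClient

ZK_QUORUM = "node1:2181"                  # placeholder quorum
GLOBAL_ZNODE = "/metron/topology/global"  # assumed default global-config znode

def pull_global(zk):
    """Read the global config currently live in ZooKeeper."""
    data, _stat = zk.get(GLOBAL_ZNODE)
    return json.loads(data.decode("utf-8"))

def push_global(zk, config):
    """Overwrite the global config in ZooKeeper. Whatever was there before
    (a UI edit, a REPL CONFIG_PUT, an Ambari restart) is simply replaced."""
    zk.ensure_path(GLOBAL_ZNODE)
    zk.set(GLOBAL_ZNODE, json.dumps(config, indent=2).encode("utf-8"))

if __name__ == "__main__":
    zk = KazooClient(hosts=ZK_QUORUM)
    zk.start()
    cfg = pull_global(zk)
    cfg["es.date.format"] = "yyyy.MM.dd.HH"  # example edit
    push_global(zk, cfg)
    zk.stop()
```

Whoever writes last wins, and nothing at the znode level records which mechanism wrote it, which is exactly the clobbering scenario Mike describes above.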
>> As you can see, we have a mishmash of mechanisms to update and manage the configuration for Metron in zookeeper. In the beginning, the approach was just to edit configs on disk and push/pull them via zk_load_configs. Configs could be historically managed using source control, etc. As we got more and more components managing the configs, we haven't taken care that they all work with each other in an expected way (I believe these are true... correct me if I'm wrong):
>>
>> - If configs are modified in the management UI or the Stellar REPL and someone forgets to pull the configs from zookeeper to disk before they do a push via zk_load_configs, they will clobber the configs in zookeeper with old configs.
>> - If the global config is changed on disk and the Ambari service restarts, it'll get reset to the original global config.
>> - *Ryan, in the management UI, if someone changes the zookeeper configs from outside, are those configs reflected immediately in the UI?*
>>
>> It seems to me that we have a couple of options here:
>>
>> - A service to intermediate and handle config update/retrieval and track historical changes, so these different mechanisms can use a common component for config management/tracking, and refactor the existing mechanisms to use that service
>> - Standardize on exactly one component to manage the configs and regress the others (that's a verb, right? nicer than delete.)
>>
>> I happen to like the service approach, myself, but I wanted to put it up for discussion, and hopefully someone will volunteer to design such a thing.
>>
>> To frame the debate, I want us to keep in mind a couple of things that may or may not be relevant to the discussion:
>>
>> - We will eventually be moving to support kerberos, so there should at least be a path to use kerberos for any solution, IMO.
>> - There is value in each of the different mechanisms in place now. If there weren't, then they wouldn't have been created. Before we try to make this a "there can be only one" argument, I'd like to hear very good arguments.
>>
>> Finally, I'd appreciate it if some people might answer the questions I have in bold there. Hopefully this discussion, if nothing else happens, will result in fodder for proper documentation of the ins and outs of each of the components bulleted above.
>>
>> Best,
>>
>> Casey
