What kind of data will ultimately be gathered? Are we just considering performance counters, or also custom monitors that can aggregate data of varying types so that our users can create their own alerting? (I'm thinking Splunk-esque custom log digestion.) What platform(s) are being targeted for monitoring? Sorry if this is already documented elsewhere. What rights does the agent need to access the data? (I'm not a *nix guru, but the same principles apply: least privilege is best.)
For the data that is being collected, how is it protected in transit as well as at rest? I don't know how customizable we want to be, but is user management being considered? Can the admin define who can see what data, or define "Team X can see their services and define their own counters/metrics, but not other teams'"? Or is the thinking more along the lines of one central team that manages all of the monitoring and is the POC for defining alerts?

Much like Chris, I had many thoughts in my head and rewrote this a few times.

________________________________
From: Chris Thistlethwaite <chr...@apache.org>
Sent: Sunday, June 24, 2018 5:11 PM
To: dev@warble.apache.org
Subject: Re: The general idea, design and goal of Apache Warble

I completely agree, easy setup is key. This is starting to sound like Etherpad: out of the box you get DirtyDB and then can pipe it to something bigger/better if you want to.

This email has been rewritten a few times as I type out ideas, flesh them out a bit, then realize they don't make any sense.

-Chris T.

On Sun, 2018-06-24 at 09:53 -0500, Daniel Gruno wrote:
> On 06/23/2018 11:59 AM, Chris Lambertus wrote:
> >
> > This is a good summary. As we move forward with the design specs,
> > I’d also like to see some comparisons with existing off-the-shelf
> > monitoring systems (zabbix, nagios, etc.,) to show that Warble is
> > first and foremost easy to set up. This is typically the biggest
> > barrier to entry for folks trying to set up a monitoring solution.
> > Zabbix and Nagios specifically require a fairly steep learning
> > curve to get any useful data.
>
> The potential turnkey aspect of Warble is very much on my mind. I want
> it to be as easy as downloading, running a setup script, and things
> should be operational.
> While we can't "compete" against other software
> products (this is very anathema to ASF projects), we can highlight what
> we think are Warble's strengths and mebbe do a matrix comparison of
> features, pros and cons etc.
>
> > > As for the database, I'm leaning towards using ElasticSearch for the
> > > permanent storage, and possibly Redis for the ephemeral lookup cache for
> > > the alerter.
> >
> > If it can be made turn-key, I’m OK with ES, but it’s a heavy
> > solution for small scale monitoring. It may be worth considering a
> > tiered approach with something as simple as sqlite as the
> > “default” backend with ES for a more enterprise-sized deployment.
>
> I think this is a reasonable approach. For the initial development, we
> can definitely start with something simple like sqlite, and as
> development continues, we can add in the more advanced stuff that would
> require a proper timeseries/aggregated database system once we get
> closer to scoping out the visualization aspects of gathered data (that
> is, beyond the basic alerting and up/down stats).
>
> > Any thoughts on being backwards-compatible with existing nagios
> > check scripts? There’s a fairly broad ecosystem there that might be
> > leveraged.
>
> This is a tricky question, and my immediate reaction is that this is not
> something _I_ would focus on, as building an additional API endpoint for
> Nagios could take quite a while. Instead, I'll be working on building
> as simple and 'modular' a base class for tests as possible, so anyone can
> quickly build a test for whatever they want to test for, or quickly port
> from other systems to Warble. This is not to say that the Nagios idea is
> bad, but rather that I would prefer starting out with what I know.
>
> To give an idea of what a test class currently looks like, I have an
> example class at https://paste.apache.org/Ji05 - all tests are shaped
> the same way, and have an init and a run function, which rely on a
> bunch of common libraries to quickly do stuff. You pass test parameters
> to the test class, and it spits out a generic report object. We
> prooobably can make wrappers for nagios and other systems in the
> future.
>
> > -Chris
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@warble.apache.org
> > For additional commands, e-mail: dev-h...@warble.apache.org
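
To make the test shape described in the thread a bit more concrete (an init and a run function, test parameters in, a generic report object out), here is a minimal Python sketch. It is not the class behind the paste link and not Warble's actual API: the names `BaseTest`, `Report`, `check`, and `TCPConnectTest`, as well as the report fields, are assumptions made up for illustration.

```python
import socket
import time
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Report:
    """Hypothetical generic report object that every test returns."""
    test: str
    ok: bool
    elapsed_ms: float
    details: Dict[str, Any] = field(default_factory=dict)


class BaseTest:
    """Hypothetical base class: parameters go into __init__, run() returns a Report."""
    name = "base"

    def __init__(self, **params: Any) -> None:
        # Test parameters are kept as a plain dict so any test can define its own.
        self.params = params

    def check(self) -> Dict[str, Any]:
        """Subclasses do the real work here and raise an exception on failure."""
        raise NotImplementedError

    def run(self) -> Report:
        # Time the check and turn any exception into a failed report.
        start = time.monotonic()
        try:
            details = self.check()
            ok = True
        except Exception as exc:
            details = {"error": str(exc)}
            ok = False
        elapsed_ms = (time.monotonic() - start) * 1000.0
        return Report(self.name, ok, elapsed_ms, details)


class TCPConnectTest(BaseTest):
    """Example test (assumed, not from the thread): can we open a TCP connection?"""
    name = "tcp_connect"

    def check(self) -> Dict[str, Any]:
        host = self.params["host"]
        port = int(self.params["port"])
        timeout = float(self.params.get("timeout", 5.0))
        with socket.create_connection((host, port), timeout=timeout):
            return {"host": host, "port": port}
```

With something along these lines, `TCPConnectTest(host="example.org", port=443).run()` would return a `Report` whose `ok` field says whether the connection succeeded. A wrapper for Nagios check scripts could in principle be one more subclass whose `check` runs the script and maps its exit code onto the report, though that is speculation on top of the thread rather than anything decided in it.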