Re: The general idea, design and goal of Apache Warble
On 06/24/2018 11:04 PM, Michael Andescavage wrote:

> What kind of data will ultimately be gathered? Are we just considering performance counters, or custom monitors that can aggregate data of varying types so that our users can create their own alerting? (I'm thinking Splunk-esque custom log digestion.) What platform(s) are being targeted for monitoring? Sorry if this is documented already elsewhere. What rights does the agent need to access the data? (I'm not a *unix guru, but the same principles apply: least privilege is best.)

My thinking right now is that this belongs in one or more specific emails surrounding the nodes/agents and the server, not so much the general idea in this thread, so I won't go into complete detail here. Suffice it to say, I personally envision both generic (standardized) data and custom data in various ways.

> For the data that is being collected, how is the data protected in transit as well as at rest?

The initial thought (I'll touch on that very soon in a new thread) is that we rely on both encryption (HTTPS) and signing of results with an asymmetric keypair. The same keypair would be used in reverse when sending test parameters to a node/agent (this time encrypted instead of just signed). We could add encryption of the data once stored; I think sqlite et al. have encryption options, but it's not a thing I personally want to invest too much time in before we need to.

> I don't know how customizable we want to be, but is user management being considered? Can the admin define who can see what data, or define "Team X can see their services and define their own counters/metrics, but not other teams"? Or is the thinking more along the lines of there's one central team who manages all of the monitoring and they are the POC for defining alerts?

It's definitely something I've been pondering, and something PMB did very well (IMHO).
For starters, I think we should just get a basic system going with normal users, super users, and robits (the nodes/agents have pseudo-accounts via their API key). Once we have that, we can expand it to teams and roles. I would love to generate a final design that would allow people to offer commercial installations of Warble in a multi-tenant sense. I'll put a pin in this discussion for later.

With regards,
Daniel.
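The starting account model described in the reply above (normal users, super users, and nodes/agents that get pseudo-accounts via their API key) could be sketched roughly as follows. This is only an illustration, not Warble's actual code: the names `Kind`, `Account`, `register_node`, `authenticate_node` and `can_manage_accounts` are hypothetical, and storing a SHA-256 hash of the API key rather than the key itself is an assumption that the thread does not discuss.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Kind(Enum):
    """The three account kinds mentioned in the thread."""
    USER = "user"
    SUPERUSER = "superuser"
    NODE = "node"  # pseudo-account for a node/agent


@dataclass
class Account:
    name: str
    kind: Kind
    api_key_hash: str = ""  # only used by node pseudo-accounts (assumption)


def hash_key(key: str) -> str:
    """Hash an API key so the plain key need not be stored (assumption)."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def register_node(name: str) -> Tuple[Account, str]:
    """Create a node pseudo-account and return it with its one-time plain key."""
    key = secrets.token_urlsafe(32)
    return Account(name, Kind.NODE, hash_key(key)), key


def authenticate_node(account: Account, presented_key: str) -> bool:
    """Check a presented API key against the stored hash in constant time."""
    if account.kind is not Kind.NODE:
        return False
    return hmac.compare_digest(account.api_key_hash, hash_key(presented_key))


def can_manage_accounts(account: Account) -> bool:
    """In this sketch, only super users may administer other accounts."""
    return account.kind is Kind.SUPERUSER
```

Teams and roles, which the reply defers to later, would presumably extend `Account` rather than replace it, but that is a guess about a design that had not been written yet.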
Re: The general idea, design and goal of Apache Warble
What kind of data will ultimately be gathered? Are we just considering performance counters, or custom monitors that can aggregate data of varying types so that our users can create their own alerting? (I'm thinking Splunk-esque custom log digestion.) What platform(s) are being targeted for monitoring? Sorry if this is documented already elsewhere. What rights does the agent need to access the data? (I'm not a *unix guru, but the same principles apply: least privilege is best.)

For the data that is being collected, how is the data protected in transit as well as at rest?

I don't know how customizable we want to be, but is user management being considered? Can the admin define who can see what data, or define "Team X can see their services and define their own counters/metrics, but not other teams"? Or is the thinking more along the lines of there's one central team who manages all of the monitoring and they are the POC for defining alerts?

Much like Chris, I have many thoughts in my head and rewrote this a few times.
Re: The general idea, design and goal of Apache Warble
I completely agree, easy setup is key. This is starting to sound like Etherpad: out of the box you get DirtyDB, and then you can pipe it to something bigger/better if you want to.

This email has been rewritten a few times as I type out ideas, flesh them out a bit, then realize they don't make any sense.

-Chris T.
Re: The general idea, design and goal of Apache Warble
On 06/23/2018 11:59 AM, Chris Lambertus wrote:

> This is a good summary. As we move forward with the design specs, I’d also like to see some comparisons with existing off-the-shelf monitoring systems (Zabbix, Nagios, etc.) to show that Warble is first and foremost easy to set up. This is typically the biggest barrier to entry for folks trying to set up a monitoring solution. Zabbix and Nagios specifically require a fairly steep learning curve to get any useful data.

The potential turnkey aspect of Warble is very much on my mind. I want it to be as easy as downloading and running a setup script, after which things should be operational. While we can't "compete" against other software products (this is very much anathema to ASF projects), we can highlight what we think are Warble's strengths and mebbe do a matrix comparison of features, pros and cons etc.

> > As for the database, I'm leaning towards using ElasticSearch for the permanent storage, and possibly Redis for the ephemeral lookup cache for the alerter.
>
> If it can be made turn-key, I’m OK with ES, but it’s a heavy solution for small scale monitoring. It may be worth considering a tiered approach with something as simple as sqlite as the “default” backend with ES for a more enterprise-sized deployment.

I think this is a reasonable approach. For the initial development, we can definitely start with something simple like sqlite, and as development continues, we can add in the more advanced stuff that would require a proper timeseries/aggregated database system once we get closer to scoping out the visualization aspects of gathered data (that is, beyond the basic alerting and up/down stats).

> Any thoughts on being backwards-compatible with existing nagios check scripts? There’s a fairly broad ecosystem there that might be leveraged.

This is a tricky question, and my immediate reaction is that this is not something _I_ would focus on, as building an additional API endpoint for Nagios could take quite a while. Instead, I'll be working on building as simple and 'modular' a base class for tests as possible, so anyone can quickly build a test for whatever they want to test for, or quickly port from other systems to Warble. This is not to say that the Nagios idea is bad, but rather that I would prefer starting out with what I know.

To give an idea of what a test class currently looks like, I have an example class at https://paste.apache.org/Ji05 - all tests are shaped the same way, and have an init and a run function, which rely on a bunch of common libraries to quickly do stuff. You pass test parameters to the test class, and it spits out a generic report object. We prooobably can make wrappers for nagios and other systems in the future.

> -Chris
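To make the shape described above a bit more concrete (every test has an init and a run function, takes test parameters, and returns a generic report object), here is a minimal sketch. It is not the class from the paste link, whose contents are not reproduced in this thread; `Report`, `BaseTest`, `check` and `TCPConnectTest`, along with all field and parameter names, are hypothetical and chosen only for illustration.

```python
import socket
import time
from dataclasses import dataclass, field


@dataclass
class Report:
    """Generic report object; the field names are illustrative assumptions."""
    test: str
    success: bool
    duration_ms: float
    details: dict = field(default_factory=dict)


class BaseTest:
    """Hypothetical base class: each test is built from params and then run."""
    name = "base"

    def __init__(self, params: dict):
        # Test parameters as handed to the node/agent.
        self.params = params

    def check(self) -> dict:
        """Do the actual probing; subclasses override this."""
        raise NotImplementedError

    def run(self) -> Report:
        """Time the check and wrap its outcome in a generic report."""
        start = time.monotonic()
        try:
            details = self.check()
            ok = True
        except Exception as exc:  # any failure becomes a failed report
            details = {"error": str(exc)}
            ok = False
        elapsed_ms = (time.monotonic() - start) * 1000.0
        return Report(self.name, ok, elapsed_ms, details)


class TCPConnectTest(BaseTest):
    """Example test: can a TCP connection be opened to host:port?"""
    name = "tcp_connect"

    def check(self) -> dict:
        host = self.params["host"]
        port = int(self.params["port"])
        timeout = float(self.params.get("timeout", 5.0))
        with socket.create_connection((host, port), timeout=timeout):
            return {"host": host, "port": port}
```

Under these assumptions, `TCPConnectTest({"host": "example.org", "port": 443}).run()` would return a `Report`. A wrapper for Nagios-style checks could in principle be one more subclass whose `check` runs an external script, though that is speculation beyond what the thread commits to.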
Re: The general idea, design and goal of Apache Warble
> On Jun 21, 2018, at 3:46 PM, Daniel Gruno wrote:
>
> Apache Warble is a turnkey monitoring solution with a modular, threaded design, allowing for both public telemetry (data pull) from easily deployable nodes around the world, as well as internal monitoring (data push) on machines, with a centralized master server for both managing, collecting, aggregating, alerting on and visualizing the data.

This is a good summary. As we move forward with the design specs, I’d also like to see some comparisons with existing off-the-shelf monitoring systems (Zabbix, Nagios, etc.) to show that Warble is first and foremost easy to set up. This is typically the biggest barrier to entry for folks trying to set up a monitoring solution. Zabbix and Nagios specifically require a fairly steep learning curve to get any useful data.

> As for the database, I'm leaning towards using ElasticSearch for the permanent storage, and possibly Redis for the ephemeral lookup cache for the alerter.

If it can be made turn-key, I’m OK with ES, but it’s a heavy solution for small scale monitoring. It may be worth considering a tiered approach with something as simple as sqlite as the “default” backend with ES for a more enterprise-sized deployment.

Any thoughts on being backwards-compatible with existing nagios check scripts? There’s a fairly broad ecosystem there that might be leveraged.

-Chris

-
To unsubscribe, e-mail: dev-unsubscr...@warble.apache.org
For additional commands, e-mail: dev-h...@warble.apache.org
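The tiered approach suggested in this message (sqlite as the "default" backend, with ES for larger deployments) could be sketched as a small storage interface with a pluggable implementation. This is a rough illustration under stated assumptions: `ResultStore`, `SQLiteStore`, `open_store`, the table layout and the config keys are all hypothetical, and the ElasticSearch tier is deliberately left unimplemented here rather than guessed at.

```python
import sqlite3
import time
from typing import Optional, Protocol


class ResultStore(Protocol):
    """Interface any backend tier would implement (hypothetical)."""

    def save(self, node: str, test: str, success: bool) -> None: ...

    def latest(self, node: str, test: str) -> Optional[bool]: ...


class SQLiteStore:
    """Default tier: a single sqlite database, no external services needed."""

    def __init__(self, path: str):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS results ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "ts REAL NOT NULL, node TEXT NOT NULL, "
            "test TEXT NOT NULL, success INTEGER NOT NULL)"
        )

    def save(self, node: str, test: str, success: bool) -> None:
        # The connection context manager commits on success, rolls back on error.
        with self.db:
            self.db.execute(
                "INSERT INTO results (ts, node, test, success) VALUES (?, ?, ?, ?)",
                (time.time(), node, test, int(success)),
            )

    def latest(self, node: str, test: str) -> Optional[bool]:
        row = self.db.execute(
            "SELECT success FROM results WHERE node = ? AND test = ? "
            "ORDER BY id DESC LIMIT 1",
            (node, test),
        ).fetchone()
        return None if row is None else bool(row[0])


def open_store(config: dict) -> ResultStore:
    """Pick a backend from config; sqlite is the out-of-the-box default."""
    backend = config.get("backend", "sqlite")
    if backend == "sqlite":
        return SQLiteStore(config.get("path", "warble.db"))
    # A larger tier (e.g. ElasticSearch) would be added here later.
    raise ValueError(f"backend {backend!r} is not implemented in this sketch")
```

Keeping the alerter and the rest of the server coded against `ResultStore` only is what would let a small install stay on sqlite while a larger one swaps in a heavier backend without touching callers.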