Guys, thanks for all that! Can we, for a second, abstract this discussion from technology and start by lining up the scenarios we want to achieve? Then we can pick the software that will let us achieve all or most of those scenarios with the least amount of work and maintenance.
So, my scenarios:

I want to see the health of Docker services
I want to see when a message queue becomes saturated
I want to see when RAM usage exceeds 70%
I want to see when my network causes tons of retransmissions
I want to see when one of the nodes is down

Did I miss anything? Which software stack would allow me to see these things?

Cheers,
Michal

On 24 July 2016 at 09:09, Mathias Ewald <[email protected]> wrote:
> I think Sensu is the best monitoring approach out there atm. Nagios / Icinga
> are way too static and scale badly imho. The kinds of checks you proposed are
> quite interesting. I would suggest wrapping a Sensu check around Tempest, but
> that's going too far for the first cycle.
>
> The two stacks (Sensu + Uchiwa and TICK) only really overlap in metrics
> collection, which can be done via Sensu and Telegraf. I don't know if it
> makes sense to have both ... I definitely think we need Sensu though, simply
> to monitor service availability and other thresholds and events which aren't
> covered in TICK, as not everything is time series data, and to have the
> alerting. With Sensu only, we don't have insight into performance and trends;
> with TICK only, we lack alerting on events and non-performance metric data
> (Is Keystone up? etc.)
>
> I think it won't hurt to develop these two stacks in parallel, and maybe
> we'll join them together in a chain as I described earlier.
>
> 2016-07-24 14:25 GMT+02:00 Dave Walker <[email protected]>:
>>
>> Thanks Mathias,
>>
>> I'm not tied to Sensu.. anything can really fill that gap in my mind.
>> You've done a good job at outlining the steps involved. I created a
>> blueprint with the steps I had in mind[0]
>>
>> For this cycle, I wanted to keep it simple so it was easily achievable. I
>> only planned to have some basic up/down for each node and throw the
>> performance data on the floor.
>>
>> I wanted to include the option to include local configs, as JSON blobs.
>> Some of the things I was thinking of as local config:
>> - daily checkouts: can instances be built with networking?
>> - remaining resources count (i.e., does each subnet have X remaining IP
>>   addresses available?)
>> - Is Ceph healthy?
>>
>> So, these things aren't really interesting as performance over time.. which
>> means the intention does differ. However, I do agree that both stacks could
>> achieve both objectives.
>>
>> I've essentially got much of this working locally, but it would require
>> about a day of cleaning up for submission... but if your work can achieve
>> the objectives above, I'm happy to discontinue... or help make your stack
>> pluggable.
>>
>> [0] https://blueprints.launchpad.net/kolla/+spec/sensu
>>
>> --
>> Kind Regards,
>> Dave Walker
>>
>> On 24 July 2016 at 11:56, Mathias Ewald <[email protected]> wrote:
>>>
>>> Monitoring is a difficult topic, as the number of options regarding the
>>> toolset and mechanisms is very high. We had some chats about it in IRC that
>>> uncovered even more options than I thought existed :D I believe Dave's view
>>> on Sensu is generally correct, in that Sensu is more directed at monitoring
>>> in the form of "is X running/working". It of course has the ability to
>>> transport metrics, too, but lacks good dashboarding capabilities for
>>> performance data. One setup I could imagine is
>>>
>>> 1. Sensu Client to collect checks and metrics
>>> 2. RabbitMQ for transport
>>> 3. Sensu Server to receive, evaluate, alarm and write metrics to InfluxDB
>>> 4. Uchiwa as a dashboard for Sensu
>>> 5. InfluxDB to store metrics
>>> 6. Grafana to dashboard metrics
>>>
>>> So Sensu could be used as a replacement for (or in addition to) a metrics
>>> collection daemon like Collectd or what I decided to use: Telegraf. For my
>>> implementation, this means I will add a parameter to make Telegraf optional.
>>> This way, someone else may implement the rest of the stack and the user can
>>> decide which one to use.
>>>
>>> What do you think?
>>>
>>> Mathias
>>>
>>> 2016-07-23 21:51 GMT+02:00 Stephen Hindle <[email protected]>:
>>>>
>>>> My understanding was that Sensu could produce metrics?
>>>> And Kapacitor can do alerting for the TICK stack stuff mewald is
>>>> doing...
>>>> I really don't see them as that different?
>>>>
>>>> On Fri, Jul 22, 2016 at 5:19 PM, Dave Walker <[email protected]> wrote:
>>>> > Yes, this is my thought.
>>>> >
>>>> > The scope of the Sensu work is: "Is this thing working?" (with the
>>>> > reference being up/down)
>>>> > But the scope of Grafana and friends is: "How hard is this working?"
>>>> > (but no alerting)
>>>> >
>>>> > They are certainly complementary.... However, Sensu can throw data at
>>>> > a Grafana stack (aiui).. but I fear that is too much to achieve this
>>>> > cycle.
>>>> >
>>>> > --
>>>> > Kind Regards,
>>>> > Dave Walker
>>>> >
>>>> > On 23 July 2016 at 00:11, Fox, Kevin M <[email protected]> wrote:
>>>> >>
>>>> >> I think those are two different, complementary things.
>>>> >>
>>>> >> One's metrics and the other is monitoring. You probably want both at
>>>> >> the same time.
>>>> >>
>>>> >> Thanks,
>>>> >> Kevin
>>>> >> ________________________________________
>>>> >> From: Steven Dake (stdake) [[email protected]]
>>>> >> Sent: Friday, July 22, 2016 3:52 PM
>>>> >> To: OpenStack Development Mailing List (not for usage questions)
>>>> >> Subject: Re: [openstack-dev] [kolla] Monitoring tooling
>>>> >>
>>>> >> Thanks for pointing that out. Brain out to lunch today it appears :(
>>>> >>
>>>> >> I think choices are a good thing even though they increase our
>>>> >> implementation footprint. Anyone opposed to implementing both with
>>>> >> something in globals.yml like
>>>> >> monitoring: grafana or
>>>> >> monitoring: sensu
>>>> >>
>>>> >> Comments, questions or concerns welcome.
>>>> >>
>>>> >> Regards
>>>> >> -steve
>>>> >>
>>>> >> On 7/22/16, 3:42 PM, "Stephen Hindle" <[email protected]> wrote:
>>>> >>
>>>> >> >Don't forget mewald's implementation as well - we now have 2
>>>> >> >monitoring options for kolla :-)
>>>> >> >
>>>> >> >On Fri, Jul 22, 2016 at 3:15 PM, Steven Dake (stdake)
>>>> >> ><[email protected]> wrote:
>>>> >> >> Hi folks,
>>>> >> >>
>>>> >> >> At the midcycle we decided to push off implementing Monitoring
>>>> >> >> until post Newton. The rationale for this decision was that the
>>>> >> >> core review team has enough on their plates and nobody was super
>>>> >> >> keen to implement any monitoring solution given our other
>>>> >> >> priorities.
>>>> >> >>
>>>> >> >> Like all good things, communities produce new folks that want to
>>>> >> >> do new things, and Sensu was proposed as Kolla's monitoring
>>>> >> >> solution (at least the first one). A developer that has done some
>>>> >> >> good work has shown up to do the job as well :) I have heard good
>>>> >> >> things about Sensu, minus the fact that it is implemented in Ruby
>>>> >> >> and I fear it may end up causing our gate a lot of hassle.
>>>> >> >>
>>>> >> >> https://review.openstack.org/#/c/341861/
>>>> >> >>
>>>> >> >> Anyway I think we can work through the gate problem.
>>>> >> >>
>>>> >> >> Does anyone have any better suggestion? I'd like to unblock
>>>> >> >> Dave's work, which is blocked on a -2 pending a complete discussion
>>>> >> >> of our monitoring solution. Note we may end up implementing more
>>>> >> >> than one down the road.
>>>> >> >>
>>>> >> >> Sensu is just where the original interest was.
>>>> >> >>
>>>> >> >> Please provide feedback, even if you don't have a preference,
>>>> >> >> whether you're a core reviewer or not.
>>>> >> >>
>>>> >> >> My take is we can merge this work in non-priority order, and if it
>>>> >> >> makes the end of the cycle, fantastic; if not, we can release it in
>>>> >> >> Ocata.
>>>> >> >>
>>>> >> >> Regards
>>>> >> >> -steve
>>>> >> >>
>>>> >> >> __________________________________________________________________________
>>>> >> >> OpenStack Development Mailing List (not for usage questions)
>>>> >> >> Unsubscribe:
>>>> >> >> [email protected]?subject:unsubscribe
>>>> >> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>> >> >
>>>> >> >--
>>>> >> >Stephen Hindle - Senior Systems Engineer
>>>> >> >480.807.8189 480.807.8189
>>>> >> >www.limelight.com Delivering Faster Better
>>>> >> >
>>>> >> >Join the conversation
>>>> >> >
>>>> >> >at Limelight Connect
>>>> >> >
>>>> >> >--
>>>> >> >The information in this message may be confidential. It is intended
>>>> >> >solely for the addressee(s). If you are not the intended recipient,
>>>> >> >any disclosure, copying or distribution of the message, or any action
>>>> >> >or omission taken by you in reliance on it, is prohibited and may be
>>>> >> >unlawful. Please immediately contact the sender if you have received
>>>> >> >this message in error.
>>>
>>> --
>>> Mobile: +49 176 10567592
>>> E-Mail: [email protected]
>>>
>>> evoila GmbH
>>> Wilhelm-Theodor-Römheld-Str. 34
>>> 55130 Mainz
>>> Germany
>>>
>>> Managing Director: Johannes Hiemer
>>>
>>> Amtsgericht Mainz HRB 42719
>>>
>>> This e-mail may contain confidential and/or privileged information. If
>>> you are not the intended recipient (or have received this e-mail in error)
>>> please notify the sender immediately and destroy this e-mail. Any
>>> unauthorised copying, disclosure or distribution of the material in this
>>> e-mail is strictly forbidden.
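
As a rough illustration of the threshold scenarios listed at the top of this message (for example, RAM usage exceeding 70%), here is a minimal sketch of a check written in the Nagios-style plugin convention that Sensu checks also follow: exit status 0 means OK, 1 means WARNING and 2 means CRITICAL. Nothing in it comes from the thread itself: the function names, the 90% critical threshold and the choice of /proc/meminfo (Linux only) as the data source are all assumptions made for the example.

```python
"""Sketch of a RAM-usage check using Nagios-style exit codes (0/1/2)."""

# Exit codes understood by Nagios-style check runners.
OK, WARNING, CRITICAL = 0, 1, 2


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def ram_used_percent(fields):
    """Percentage of RAM in use, derived from MemTotal and MemAvailable."""
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


def evaluate(used_percent, warn=70.0, crit=90.0):
    """Map a usage percentage to (exit_code, message).

    warn=70.0 mirrors the "RAM exceeds 70%" scenario; crit=90.0 is an
    assumed value chosen only for this sketch.
    """
    if used_percent > crit:
        return CRITICAL, "CRITICAL: RAM at %.1f%%" % used_percent
    if used_percent > warn:
        return WARNING, "WARNING: RAM at %.1f%%" % used_percent
    return OK, "OK: RAM at %.1f%%" % used_percent


def main(path="/proc/meminfo"):
    """Read memory statistics, print a status line, return the exit code."""
    with open(path) as handle:
        fields = parse_meminfo(handle.read())
    code, message = evaluate(ram_used_percent(fields))
    print(message)
    return code
```

A real check script would finish by passing the return value of main() to sys.exit(), so that whichever monitoring agent runs it can read the status from the exit code. Keeping evaluate() separate from the data collection means the same thresholds logic could be reused for the other scenarios, such as queue depth or retransmission counts.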
