Good evening,
On 04.11.2014 at 17:06, Darko Hojnik wrote:
No flames, blame, or anything like that - I write this honestly.
What confuses me most is that we have been running a Nagios 3 installation
for several years where we are able to filter notifications. I believed
Icinga 1.x is not merely a fork of Nagios - it is the legitimate successor of
Nagios - and that Icinga 2 is the improvement that makes the software holy,
mighty, and absolutely awesome.
notification_options    u,c,r   ; Send notifications about unknown, critical, and recovery events
notification_interval   120     ; Re-notify about service problems every hour
notification_period     24x7    ; Notifications can be sent out at any time
So my problem, if I have understood it correctly, seems to be that Icinga 2
cannot distinguish between the states Unknown, Critical, and Warning when it
comes to recovery notifications. It will always send a recovery notification
when a service switches from a bad state to OK - even if you have configured
notifications to be sent for the Critical state only and want nothing to do
with the Warning state in your notifications.
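For context, the state and type filtering in Icinga 2.x is expressed on the
Notification object itself via its states and types attributes. A minimal
sketch (the command, user, and custom variable names here are placeholders for
illustration, not taken from a real setup):

  apply Notification "mail-oncall" to Service {
    command = "mail-service-notification"   // placeholder NotificationCommand
    users = [ "oncall" ]                    // placeholder User object

    // Roughly mirrors Nagios' notification_options u,c,r:
    states = [ OK, Critical, Unknown ]      // OK is needed for recoveries
    types = [ Problem, Recovery ]
    period = "24x7"                         // TimePeriod from the sample config
    interval = 2h                           // re-notification interval

    assign where service.vars.notify_oncall == true
  }

The bug discussed in this thread is exactly about how such a states filter
interacts with recovery notifications.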
So if Icinga has no functionality for filtering notifications by state, what
is the point of Icinga? OK, the first mission-critical task - collecting data
about the infrastructure - works like a charm. But being unable to define
exactly which kinds of notifications should be sent and which should not
makes the project, in my point of view, absolutely useless.
My conclusion: I have learned the new syntax, and after testing and
configuring Icinga 2 for the last three weeks, my current state is that I am
98% ready to roll out Icinga 2 to 500 servers. And yes, it frustrates me
totally that my work may have been useless, because with this showstopper the
project would be cancelled completely.
Trivago would like to be one of the first companies willing to bring Icinga 2
into their infrastructure as a mission-critical application. We are willing to
share our experience, and sometimes code could also be written for this
project at one of our hackathons. In short, my setup works with three nodes
and a master above them. The master only does reporting and stores data in a
PostgreSQL database, because the master has only the ido-postgresql and
notification modules activated. So all issues are stored in a database. I
cannot believe that this project is unable to implement something that is so
important. And I don't think that I am alone with this issue.
Thanks all for reading.
In short - your post is offending, insulting and what not. The Icinga
project is not "useless" as you call it.
You've encountered a bug, and we were on the way to understanding it; I was
getting the idea that I had been mistaken. Until you wrote that line:
"makes the project in my point of view absolutely useless."
I'm not interested in your personal frustration at work, or whatever is
causing you not to think about how others might feel about your harsh words.
So let's cut the bullshit.
Late at night on Tuesday, I installed Nagios 3.5.1 (you didn't specify
which version, so I took the latest) and built a configuration based on
your description. Tip for next time - provide a working configuration
to make reproducing your issue easier.
https://dev.icinga.org/attachments/download/2114/trivago_test.cfg
Further, I hacked up a little test runner, sending passive check results
to the core.
https://dev.icinga.org/attachments/download/2116/trigger_trivago_crit_recovery_nagios
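The test runner itself is only linked above; as a rough illustration (not the
actual script), passive service check results are typically injected into
Nagios via the documented PROCESS_SERVICE_CHECK_RESULT external command,
written as a line to the command file. The host and service names below are
placeholders:

```python
import time

def passive_check_result(host, service, state, output):
    """Format a Nagios PROCESS_SERVICE_CHECK_RESULT external command line.

    state: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
    The resulting line would be written to Nagios' command file
    (e.g. rw/nagios.cmd).
    """
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s" % (
        int(time.time()), host, service, state, output)

# Example: drive a service Critical, then recover it.
crit = passive_check_result("testhost", "testservice", 2, "CRITICAL - test")
ok = passive_check_result("testhost", "testservice", 0, "OK - recovered")
```

Writing such lines in sequence lets you replay state transitions like the
Critical-to-OK recovery discussed in this thread.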
There, I saw that I was wrong, and the docs were not clear about it at
all. There are certainly code regions in Nagios and Icinga 1.x that I have
never debugged or touched, even after 5.5 years of digging into that code.
So, after building an Icinga 2.x configuration based on your description -
again, please provide a working example - I figured that this must be a bug.
https://dev.icinga.org/attachments/download/2124/trivago.conf
And the test runner, with a few more tests, for Icinga 2.x:
https://dev.icinga.org/attachments/download/2125/trigger_trivago_crit_recovery
My first fix attempt failed; it was too late, and the week was already
turning into a 60h+ week like the one before.
I discussed this with Gunnar the other day, and so we came up with an
issue and agreed on fixing this bug in the next days - "when there's
time", since there are and were other open tasks to finish first on the
2.2 roadmap.
https://dev.icinga.org/versions/200
Looking into the problem more in depth, it was rather easy once we had
decided whether the user's state history and notification history were
important: store all notified users (since we have dedicated notification
objects in Icinga 2.x) during a notification cycle, check that list when a
recovery notification occurs, and reset it on a HARD OK.
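As a hedged sketch of that logic (plain Python with illustrative names;
Icinga 2 itself is C++, and the real fix lives in the issue linked below):

```python
# Sketch of the recovery-notification filter described above.
# Service, process_state, and the state constants are illustrative,
# not Icinga 2's actual code.

OK, WARNING, CRITICAL, UNKNOWN = range(4)

class Service:
    def __init__(self, notified_states):
        self.notified_states = set(notified_states)  # states the notification covers
        self.notified_users = set()                  # users notified this problem cycle

    def process_state(self, state, users, hard=True):
        """Return the users who get a notification for this state change."""
        sent = []
        if state == OK:
            # Send the recovery only to users who actually received
            # a problem notification earlier in this cycle.
            sent = sorted(self.notified_users & set(users))
            if hard:
                self.notified_users.clear()          # reset on HARD OK
        elif state in self.notified_states:
            self.notified_users.update(users)
            sent = sorted(set(users))
        return sent
```

With a notification covering only Critical, a Warning-to-OK transition sends
no recovery (nobody was notified), while Critical-to-OK notifies exactly the
users who saw the problem - which is the behavior the fix establishes.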
https://dev.icinga.org/issues/7579
Conclusion:
You can have nearly everything from Icinga developers & supporters. For
free, be it features, bugfixes or support.
But if you approach us making us responsible for your problems, that's
not gonna work out.
We've been doing Icinga for years now, and its success isn't built on judging
each other for failures. It's because we work _together_ and fix and
test stuff _together_, in a sometimes rough, but always polite atmosphere.
Your behaviour is bad karma, and hopefully you'll think about that.
You can test my fix in the latest snapshot packages. You owe me a
beer, or two, for fixing it on a Sunday evening.
Best regards,
Michael
--
DI (FH) Michael Friedrich
[email protected] || icinga open source monitoring
https://twitter.com/dnsmichi || lead core developer
[email protected] || https://www.icinga.org/team
irc.freenode.net/icinga || dnsmichi
_______________________________________________
icinga-users mailing list
[email protected]
https://lists.icinga.org/mailman/listinfo/icinga-users