Hi Jean,
First of all, I want to thank you and all the other Shinken developers.
I'm glad to see how Shinken grows.
I think a UI of its own can be a big plus for Shinken. A new
interface could combine the advantages of Shinken with a modern
interface offering the features that an operator/admin/manager really
needs. I agree with you that root problems, impacts and criticity are
very important, and that the focus of the new UI must be on simplicity
(like the KISS principle).
I think another important point is that the new UI should be expandable.
It should be easy to customise or add new features. Let me try to explain.
A few years ago, I used the CMS TYPO3 and I was amazed how expandable
this CMS was. The TYPO3 Core has a clear focus on the root problems but
it was and still is quite easy to add new features. The result was that
a lot of people wrote extensions and shared them with the community.
Such a dynamic community would be great for Shinken.
My personal "perfect" UI has a simple UI core, is focused on the
business and is modularly expandable.
I think that with a well-elaborated concept, the UI project could turn
out fine.
Andreas
On 06.05.2011 16:05, nap wrote:
Hi,
I would like us to take some time for a little break, like we did
last year after the 0.2 version, to look at the project vision
and, why not, change it.
**Beware**: once again, I wrote a lot, sorry :)
*What we did well last year*
Let's look at what we did this last year, since the 0.1 version. We
focused on putting the core in place, with a huge distributed feature.
I think we fulfilled our goal for this part. We got a full-scale
architecture, and we can manage all the classic network or
organizational problems (DMZ, distant LANs, or customers). And I'm
quite proud we can say:
* distributed architecture: *Done*.
We added new modules (a lot of retention ones, livestatus, ndo, merlin
(not finished :) ), pnp, etc). So, Ninja put aside, we handle the main
UIs of the Nagios world well. It's still a work in progress, but it
should not require a lot of work, mainly bug fixes and
small improvements. So we can say:
* export/presentation modules: *really good*.
One other thing that we added is configuration enhancement and
simplification (service generators or easy dependency definitions,
for example). It's cool for people who write their conf with vi: they
can write their conf in an efficient way now.
* configuration enhancement and simplification: *Done*.
We also added new quality methods, especially the test-driven one, so
we are sure we just remove bugs and nearly never add new ones in
previous features. It's a very comfortable way of hacking code.
Without it, we would not have as many features as we have, and maybe no
production installation at all :)
One other thing I'm glad we added is a new way of looking at
monitoring. I'm talking about root problems/impacts + criticity. It's
something very easy to use, because it needs just one parameter, from 0
to 5 for the criticity, but the implications are just great:
* notification filters that are far easier to configure (only prod,
nothing less)
* business rules that respect the root cause analysis feature, and
are easy to set up.
* export of this information in LiveStatus (which became the default
API) so UIs can use it to show only business-impacting problems.
So we can say :
* in core "focus on business feature & correlation" : *Done*.
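To make the notification filter idea concrete, here is a minimal sketch, not Shinken's actual code. It assumes each element carries a criticity from 0 to 5 as described above; the `Problem` class, the `should_notify` helper, the `min_criticity` threshold and the element names are all hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    """A monitored element currently in a problem state (hypothetical model)."""
    name: str
    criticity: int  # 0 (least important) to 5 (most important)


def should_notify(problem: Problem, min_criticity: int) -> bool:
    """Keep only problems at or above the contact's criticity threshold."""
    return problem.criticity >= min_criticity


# Made-up elements: one dev, one prod, one qualif.
problems = [
    Problem("dev-web01/HTTP", 1),
    Problem("prod-erp/Login", 5),
    Problem("qualif-db/Disk", 2),
]

# A contact who only wants to hear about production-level problems.
to_notify = [p.name for p in problems if should_notify(p, min_criticity=3)]
print(to_notify)
```

With a single integer per element, "only prod" becomes a one-line comparison instead of a list of per-host or per-service exceptions.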
So we can say that we have reached a very good product, far better than
I first thought a year and a half ago. *Big thanks and congrats everyone :)*
*What we also failed at*
But not everything was as good as all these points:
* My English skill is still very low :)
* Our wiki is very sparse in tutorials. Yes, we have the "official doc"
from Nagios with the new features, but it's a nightmare to read and
to start with such documentation.
* The UIs did not follow us much. Yes, they fixed some bugs, but I
think the main addition to monitoring from Shinken is not its
architecture, even if it's a great one, but the root problem +
criticity one, really. And this was not used by UIs, Thruk aside with
its Shinken-specific views.
I think these are our major problems right now for a Shinken domination
of the world... too much? OK, for a large acceptance of Shinken by
users, who see it as a "new Nagios" rather than a very enhanced one that
will help them in their day-to-day job.
As for my English skills, I started English 16 years ago, so I think it
just won't be possible. I'll try to read the whole Harry Potter series
again and watch films in English, it can help :)
For the wiki, I think it's mainly my fault. It's very, very hard to
**start** a documentation, but far easier to enhance it. I didn't
write in it for some weeks, and fortunately some people reminded me that
features are useless without documentation. And I think it's more
than that. It's not documentation we need, but tutorials about each
feature. That's what I'm trying to create on our new wiki main page,
with a lot of tutorials. It's the same thing with our web site: it's now
"easier" to see what Shinken offers to solve users' problems.
I hope the wiki problem won't be one anymore once the first 20 tutorials
are written and everyone helps to enhance them and write new ones.
I'll also open a forum, so users will have an easy way to ask for help,
far less frightening than posting to a "devel" list :p (I don't think
a user mailing list is useful, it serves the same purpose; we can start
with a forum and wait some time to look at the result).
The third point is far more problematic. Today's admins are not
the same as 10 years ago. Nowadays, we can talk about "speed
admins", because they no longer have the time to be expert in one
thing, but must be average in a lot of things (I'm personally a
linux/windows/SAN/vmware/network/monitoring admin, and that's quite a
short list). It will be even harder in the future, with the arrival of
"devops".
Nearly all the people on this mailing list know the difference between
a core and a UI. But a LOT of admins don't. It's not that they are dumb,
it's just that they do not have the time to look at such a "detail".
And this is a major problem for our (lovely) project. We have no
visibility. Of course our web site is cool :) but the main page that
people look at is... screenshots!
So we face a double problem:
* we lack visibility for a lot of users, because we do not have a UI.
Simple problem, but terrible impacts for us.
* the other UIs do not really follow us. We use standard APIs and add
new features that are easy to access through them (especially
LiveStatus), but it was not a success. Thruk was the most "following"
UI, and I would like to thank Sven for his support, really (especially
because my Perl code was a nightmare, and he was kind enough to correct
it). But even with this inclusion, it's still very hard to tell a Thruk
with a Nagios/Icinga backend from one with a Shinken backend. Yes, we
got two new views, but it's not enough to help the user focus on what
we think is important for today's and especially tomorrow's monitoring:
focus on business first.
*So? What do we do?*
The documentation and user help problem will get a solution very
soon, but we must look at the UI one. We said last year in our project
vision that we are not here to make a UI, and that if we can
"enhance/influence" current ones, it will be good enough.
I think we (mainly I) were 50% wrong. **Not** making a UI allowed
us to focus on core enhancement, stabilization and a production-ready
product. And now that we have this, it's time to look at how we can
help users get the most power from the Shinken core in the most
efficient way. I think adding plugins to current UIs is not enough. We
can't make users focus on business first if we have the same view as
Nagios 10 years ago. It's just not possible. We can't afford to have
hosts and services managed in different ways anymore; both are "end
user resources" after all, nothing more.
That's why I say that the root problem/criticity work was so important
last year: it will give admins a new way of "working" in their
day-to-day work. It should be simple to show links between elements, it
should be immediate to look at business impacts, it should be
immediate to look at the root problems of these impacts. We do not need
to see IT elements everywhere if they are not "important" (IT that
supports the business); it's quite enough to look at them once or twice
a day by default in such a UI.
I think it's just not possible to get such a new way with current UIs,
because it would need Shinken hooks everywhere, and no one will want
this, especially because some old-school users won't want this change:
they have their habits and long hair and will never use such a
monitoring UI. And that's fine, they already have a UI that suits them.
They even have plenty of them: nearly all UIs (NagVis and business
process put aside) propose the same way of thinking.
I think that now, with a stable core (the main needs are some retention
parameters and an enhanced merlin module, not something that will give
us work for one more year, more like one week :) ), it's time for us to
think about such a UI.
I won't hide it, it's important to have "our own" to promote the
project of course, but I'm --> **strongly** <-- against doing a UI
like all the others, putting our logo on it and saying "cool, we got
our own UI, great isn't it?". No. Doing so is not great. If we do one,
it should add a new dimension, a new way of seeing users' problems,
like we did in the core for distributed. The main idea was not to ask
"how can we make the current things scale in a good way", but "how
should it be done in a perfect world? OK. Now, is it possible to do it
with the current code? No? OK, let's do a new one -> Shinken".
*A UI? For whom? Which UI?*
I think the main thing to ask is whether current admins and tomorrow's
ones have their "perfect" UI. There are strong differences between
monitoring users. We can split them into 3 main groups, I think:
* operators: they are dedicated to monitoring, they should look at
ALL errors and solve them. Simple. Current UIs are good for them
(maybe criticity sorting can help them, but plugins and patches are
good enough for that)
* admins: they are more and more asked to focus on business, because
they have less and less time to give to their monitoring solution.
They should continuously watch the IT elements that impact
production; qualif and dev ones should be looked at once or twice a day,
not more.
* admins' bosses (N+1 for example): they want to look at business
impacts, and "easily" see what is impacting them (so they can rush to
the right admin and "help" him solve it :) ).
So in all cases, the *root problem/impacts + criticity is very, very
important*. It's the difference between looking at a console full of
red elements (like 500+) and a UI that shows that we lost the distant
ERP and, one click later, that it's due to the distant firewall
that cannot write logs because its hard drive is full. 10 minutes in
the first case to find "what to solve", 30 sec in the second.
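The "one click" from an impact to its root problem can be sketched as a walk along dependency links. This is only an illustration under assumptions, not how the Shinken core implements it: the `find_root_problems` function, the `depends_on` mapping and the element names (including modelling the full disk as its own element) are hypothetical.

```python
def find_root_problems(element, depends_on, in_problem):
    """Follow failing dependencies from `element` down to the root problems.

    A root problem is an element in a problem state none of whose own
    dependencies are in a problem state.
    """
    roots = set()
    seen = set()
    stack = [element]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        failing = [d for d in depends_on.get(current, []) if d in in_problem]
        if failing:
            # The problem comes from further down: keep walking.
            stack.extend(failing)
        elif current in in_problem:
            roots.add(current)
    return roots


# Made-up dependency links for the remote-site example.
depends_on = {
    "remote-erp": ["remote-firewall"],
    "remote-mail": ["remote-firewall"],
    "remote-firewall": ["remote-firewall/disk"],
}
in_problem = {"remote-erp", "remote-mail", "remote-firewall",
              "remote-firewall/disk"}

roots = find_root_problems("remote-erp", depends_on, in_problem)
print(roots)
```

Four red elements on a classic console collapse here into one thing to fix; the same walk from `remote-mail` ends at the same disk.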
As such a console user, I began to look at how to get more productive
with my monitoring console. And for now it's not possible, I just
lose a lot of time during large impacts.
I think we should first focus on what we add for pure monitoring that
helps admins right away. I vote for a UI that is:
* very simple: who cares about having 20 different views? I think a
very small set of very useful and well-thought-out ones is far better
than plenty of mediocre ones.
* strongly focused on business: it should be clear that IT is just here
to support end-user apps. If the admin wants a classic UI, he can take
one of the others, they will always be available. So the main view
should be the critical (as in criticity, not the service status)
impacted user apps. Then it should be very easy to show the root
problems of these impacts. This view will be useful for our two user
populations: admins with a LOT of elements, who should focus on
business apps first (I think in the future most admins will be in this
case), and admins' bosses, who focus on prod business only. He (she)
doesn't care about other "environments".
We can add another "classic" view that shows hosts/services in problem
for pure IT elements. And only ONE view for these 2 kinds of elements.
That's another important thing: hosts and services are here for
end-user apps. They are only resources, there is no need to separate them.
It should be easy to "tag" end-user apps (and so set the criticity).
It should also be easy to "select" realms. So if a guy has access to
some realms, it should be easy for him to select them (enable/disable).
It should be easy to see realm status, and in fact daemon status.
Of course, there will be questions about the configuration part; we can
put this off to a V2, after we have solved all of these points. A lot of
huge IT departments use their own configuration tools (from a CMDB,
etc.), so such a tool won't help them. So the "efficient visualization"
(focus on critical root problems) should be added first.
The main spirit should be "small is beautiful". There are other UIs with
a lot of features, users can still use them if they want :)
I think that for operators who must solve everything, the classic view
is enough; old-school admins will use it too, and new hype admins will
use the efficient one, like their bosses.
We should focus on what Shinken adds to monitoring, and I think the
distributed and root problem/criticity features are the key points.
There are also business rules, which can be added quite easily (but not
in a specific view, more like a hover layout that shows the tree if the
user wants it, no more :) ).
With this, we avoid the dangerous risk of "the Shinken UI does all you
want". No. For now it helps you to focus on business, nothing more.
Then we can look at user reactions, and gather a lot of development
power before going too far (we should NOT forget we have a core to
maintain and develop! :) ).
*So? Are you OK with it?*
So? Is such a UI OK for you? Is this new project vision good? If it's
OK, we will see how we can proceed with the UI design (I've got some
mockups waiting to be shown, and they really are different from current
(monitoring) UIs :) ) and start this new adventure :D
Jean
------------------------------------------------------------------------------
WhatsUp Gold - Download Free Network Management Software
The most intuitive, comprehensive, and cost-effective network
management toolset available today. Delivers lowest initial
acquisition cost and overall TCO of any competing solution.
http://p.sf.net/sfu/whatsupgold-sd
_______________________________________________
Shinken-devel mailing list
Shinken-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shinken-devel