Dear Fulko Hew,

On 5/11/2013 14:42, Fulko Hew wrote:
<snip>

I have instituted exactly this philosophy on the application developers within my company.
After a few decades of creating and supporting applications and devices,
I came to the conclusion that standalone applications should not be looked
at any differently than 'embedded' applications in dedicated devices (e.g. routers).
After all, even a router is 'just another application'.

This indeed largely coincides with our vision. We would like to collect granular (performance) information about individual servers/applications running on particular devices instead of limiting ourselves to gathering global information about the status of devices in their totality. This is due to the multi-service nature of our cloud-based back-end, where a single back-end device is able to host multiple highly heterogeneous services.

So I then abstracted out the 'common items' that all applications have
and bundled them into a MIB, and implemented a Perl and a Java library
that allow our apps (written in those languages only at the moment)
to provide and receive info (SNMP get and set) and send traps for
noteworthy items.  These libraries interact with Net-SNMP via AgentX.

Thanks for sharing this information!

The abstracted stuff fell into 5 categories:
- Dependencies - All apps 'need' stuff, like files/connections/other apps, etc.,
  as inputs and outputs, i.e. things 'this app' depends on to work; if
  a dependency isn't being fulfilled, the app will not work as needed.
  This table also holds simple usage and response stats on each 'dependency',
  along with traps for up/down, usage threshold alarms and response time alarms.
- Capacity/resource - Where 'dependencies' are a 'comms' kind of thing,
  the 'capacity' table allows the app developers to communicate info/status
  on 'consumable' things an app needs or consumes (buffer pools, memory,
  sockets, etc.).  It also defines traps for capacity alarms and for when the
  consumption goes back to normal.
- Response time - Even though the dependency table has response time info,
  it is very simple.  This table allows you to report _any_ kind of thing
  that you want to communicate the response time of, with thresholds and
  good/bad traps.
- SLA Criteria - Allows a designer to communicate the 'things' that should be
  measured, and their threshold targets, in order to calculate the application's
  availability for your service level agreement.
- Info - A place to communicate versioning info about application components,
  show and remotely control debug/logging levels, remotely restart an
  application, etc.

So these tables define the 'things' that should or could be communicated;
now it's up to the application designers to populate these tables, i.e. put
rows in the tables for each of their application's 'needs', and that will
be different for each application.

So if I understand correctly, you have defined a general MIB that contains entries for each of these types of information? And applications themselves decide which information (that relates to their performance) they want to expose in this MIB structure? How do applications actually insert their values into the MIB (i.e., how do they add a row to the appropriate MIB branch)? Does this happen via the AgentX protocol? If so, do all individual applications need to be extended with an AgentX component, or do you make use of a single AgentX module per device that takes care of MIB-related operations for all applications running on that device?
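
To check my understanding, below is a minimal Python sketch of how I picture one of these tables, taking the capacity table as an example. All names in it (CapacityRow, AppTables, the trap labels, the 'transcoder' application) are my own assumptions rather than anything from your MIB, and it performs no actual SNMP or AgentX communication; it only models an application adding a row and the alarm/back-to-normal traps you describe.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CapacityRow:
    """One row of a hypothetical 'capacity' table (all names are assumptions)."""
    name: str          # consumable resource, e.g. "buffer pool"
    limit: int         # total units available
    threshold: float   # fraction of limit at which an alarm is raised
    used: int = 0
    alarmed: bool = False


@dataclass
class AppTables:
    """Rows one application chooses to expose; a stand-in for the MIB tables."""
    app: str
    capacity: List[CapacityRow] = field(default_factory=list)
    traps: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_capacity(self, name: str, limit: int, threshold: float) -> CapacityRow:
        """Add a row for one consumable resource of this application."""
        row = CapacityRow(name, limit, threshold)
        self.capacity.append(row)
        return row

    def update(self, row: CapacityRow, used: int) -> None:
        """Record new usage and queue a trap on each threshold crossing."""
        row.used = used
        over = used >= row.threshold * row.limit
        if over and not row.alarmed:
            self.traps.append((self.app, row.name, "capacityAlarm"))
        elif not over and row.alarmed:
            self.traps.append((self.app, row.name, "capacityNormal"))
        row.alarmed = over


# Hypothetical application exposing a single capacity row.
tables = AppTables("transcoder")
pool = tables.add_capacity("buffer pool", limit=100, threshold=0.8)
for used in (50, 85, 90, 40):
    tables.update(pool, used)
print(tables.traps)
```

With these made-up values, the traps list ends up holding one alarm (raised at 85 units) and one back-to-normal notification (raised at 40 units); the update to 90 units produces nothing because the alarm is already active. Is this roughly the behaviour your libraries provide?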

And then applications also have application-specific 'other stuff' that they
may want to expose, that doesn't fit into the above generic tables.

And where exactly do you put this information and make it addressable via SNMP? Is this done via an application-specific MIB?

So as you browse a device, these tables let you find out what different
things may be running, how they are interconnected, how well they are
running (or not), and provide capacity planning info (and alarms on potential
capacity problems), so you don't actually have to do capacity planning
until you get 'close' to your 'gee maybe I should start paying attention'
usage level.

May I ask which kind of NMS you are using to gather, store and handle the SNMP-mediated information? And does an administrator need to manually initiate the browsing of a device, or is this performed automatically by the NMS? Remember that we would like to automate the performance data collection as much as possible due to the dynamic nature of our back-end infrastructure (i.e., devices might spawn and be destroyed at run-time, and we would like the performance monitoring to start/stop automatically when a device is spawned/destroyed).
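
To make concrete what we mean by starting and stopping the monitoring automatically, here is a small Python sketch of the reconciliation step we have in mind. It is only an illustration under our own assumptions: the function name and the device addresses are made up, and how the set of currently running devices would be discovered is left open.

```python
from typing import Iterable, List, Tuple


def reconcile(monitored: Iterable[str], discovered: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (to_start, to_stop) for the monitoring of back-end devices.

    monitored:  devices the NMS is currently polling
    discovered: devices that currently exist in the back-end
    """
    monitored_set, discovered_set = set(monitored), set(discovered)
    to_start = sorted(discovered_set - monitored_set)  # newly spawned devices
    to_stop = sorted(monitored_set - discovered_set)   # destroyed devices
    return to_start, to_stop


# Hypothetical example: one device was destroyed and one was spawned.
monitored = {"10.0.0.1", "10.0.0.2"}
discovered = {"10.0.0.2", "10.0.0.3"}
to_start, to_stop = reconcile(monitored, discovered)
print(to_start, to_stop)
```

Ideally the NMS would perform such a step periodically on its own, so that no administrator has to add or remove devices by hand.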

So I think this design (and implementation) covers the kind of stuff you
are thinking about (and more).

I agree! Your MIB seems fairly elaborate and is likely to exceed our (initial) needs.

Unfortunately, my implementation is
(at the moment) company proprietary; but I can still 'talk' about it.

I would like to take this opportunity to thank you once more for replying to our question and for sharing your knowledge and experience with us. If any of my follow-up questions are too detailed (given the proprietary nature of your implementation), please feel free to simply say so and to answer only those questions that you feel comfortable answering.

Fulko Hew

Best regards,
Maarten Wijnants

--

dr. Maarten Wijnants
Expertise centre for Digital Media (EDM) - Hasselt University
Wetenschapspark 2 - B3590 Diepenbeek
Tel: +32 (0)11 268411 - Mobile: +32 (0)497 341848
Fax: +32 (0)11 268499
E-mail:maarten.wijna...@uhasselt.be
Web:http://research.edm.uhasselt.be/~mwijnants/

_______________________________________________
Net-snmp-users mailing list
Net-snmp-users@lists.sourceforge.net
Please see the following page to unsubscribe or change other options:
https://lists.sourceforge.net/lists/listinfo/net-snmp-users
