> Now it seems to me that one of the basic characteristics of a
> distributed application is that it runs on many hosts, that, in fact,
> it may run on different hosts tomorrow than it did today, due to
> failover, load balancing, etc. So, my contention is that a distributed
> application does not run on a host, it runs on a network, OK?
> So now let's try to manage my distributed application using SNMP.
> Now this distributed database of MIBs actually has a primary key.
> What is it? Well, it's the IP address of the host. Without that you
> don't get anywhere.
> OK, so if a distributed application runs on a network and not a
> computer, then it has no IP address, nor do its components.

I agree entirely that SNMP is not the ideal solution, but given that a good
commercial console for SNMP management can be purchased today, one could
minimize the IP address issue as follows:

A clustered, resilient appserver has a resilient singleton,
ManagementSingleton.  This is the single focal point for all framework
events and container events; optionally, EJBs may also call an Event API (???).
The ManagementSingleton has business rules for what to do with each class of
event: [map to SNMP traps, emit log messages, map to email, map to pager
dials].  The SNMP manager may use RMON / SNMPv3 to access a single control
point that acts as a Bridge between SNMPv3 and the rest of the
resilient/clustered appserver.

I'd be happy with the ManagementSingleton + business rule engine +  Bridges
to [email, pagers, logs, SNMP traps].
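To make the idea concrete, here is a minimal in-JVM sketch of the event routing. All names other than ManagementSingleton are my own invention (EventClass, Bridge, addRule, publish); none of them come from a real appserver, CMI, or SNMP API. A real version would also have to be clustered and failed over, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: a single-JVM stand-in for the ManagementSingleton.
final class ManagementSingletonSketch {

    /** Illustrative event classes; a real system would define its own. */
    enum EventClass { FRAMEWORK_FAILURE, CONTAINER_EVENT, BUSINESS_LOGIC_FAILURE }

    /** A bridge forwards an event to one outside channel (SNMP trap, email, pager, log). */
    interface Bridge {
        void forward(EventClass eventClass, String detail);
    }

    private static final ManagementSingletonSketch INSTANCE = new ManagementSingletonSketch();

    // The "business rules": which bridges receive which class of event.
    private final Map<EventClass, List<Bridge>> rules = new EnumMap<>(EventClass.class);

    private ManagementSingletonSketch() { }

    static ManagementSingletonSketch getInstance() {
        return INSTANCE;
    }

    /** Registers a rule: events of this class are forwarded to this bridge. */
    synchronized void addRule(EventClass eventClass, Bridge bridge) {
        rules.computeIfAbsent(eventClass, k -> new ArrayList<>()).add(bridge);
    }

    /** Routes one event to every bridge registered for its class; returns how many received it. */
    synchronized int publish(EventClass eventClass, String detail) {
        List<Bridge> bridges = rules.getOrDefault(eventClass, Collections.<Bridge>emptyList());
        for (Bridge bridge : bridges) {
            bridge.forward(eventClass, detail);
        }
        return bridges.size();
    }
}
```

A container would call publish(...) on every failure, and each Bridge implementation would hide the details of its channel, so the SNMP console only ever talks to the one SNMP bridge.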

I'm glad to hear that there are many groups working on CMI-like APIs.

There are a few features that 24x7 operation needs that I'm concerned are
falling outside the scope of these groups' concerns.

Basic features are:

- Notification of critical framework or Business logic failures

- Hands-off initiation of service restoration based on failure modes,
  given editable business rules.

- Actions when thresholds are crossed:
        - If the number of handles to an object is greater than X,
          spin up another instance on another box.
        - If the method latency is growing and now exceeds X milliseconds,
          spin up another instance on another box.
        - If method latency to object Y exceeds X milliseconds, emit Event
          AlertMethodIsSlow.

- Emits an Event when a method exceeds its SLA (max acceptable method
  latency).

- Cancels a method dispatch that exceeds its SLA and redispatches
  that method to another instance on another box (per rules).

- Configurable Object Pings that call business methods.  Emit an
  event if a Rule is fired.  That may in turn result in configuring
  another service or shutting down this service.

- Good UI design that makes it easy to monitor hundreds of services
  on tens of boxes.

  I.e., a war room view of all boxes, with flashing red icons for
  boxes with trouble.  Drill down on a system, container, or object.

  The CMI has a workflow engine that emits traps, emails, and pages
  when failures occur.
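The "cancel and redispatch on SLA violation" item above could look roughly like the sketch below. It assumes each instance of a service can be represented as a java.util.concurrent.Callable; the class name SlaDispatcherSketch and its methods are hypothetical, and a real container would redispatch to a remote instance on another box rather than to a local thread.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: enforce a per-method SLA and fall over to the next instance.
final class SlaDispatcherSketch {
    private final ExecutorService pool = Executors.newCachedThreadPool();
    private final long slaMillis;   // max acceptable method latency
    private int slaViolations;      // a real system would emit an Event here instead

    SlaDispatcherSketch(long slaMillis) {
        this.slaMillis = slaMillis;
    }

    /** Tries each instance in order; cancels any call that exceeds the SLA and moves on. */
    <T> T dispatch(List<Callable<T>> instances) throws Exception {
        for (Callable<T> instance : instances) {
            Future<T> call = pool.submit(instance);
            try {
                return call.get(slaMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException tooSlow) {
                call.cancel(true);  // interrupt the slow dispatch
                slaViolations++;    // stand-in for emitting AlertMethodIsSlow
            }
        }
        throw new TimeoutException("no instance answered within " + slaMillis + " ms");
    }

    int slaViolations() {
        return slaViolations;
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

Note that redispatching is only safe when the method is idempotent or the cancelled call can be rolled back; that decision belongs in the editable rules, not in the dispatcher.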

The net of it is:

- The feature set is runtime oriented, not configuration-time oriented.

- The ManagementSingleton provides a central point for all events, failures
  and logging.

- The ManagementSingleton participates in delivering a resilient,
  reliable system, down to method-level SLA management.

- The ManagementSingleton has a business rule engine to map combinations of
  input events into new Events.

- The configuration can be the controlled element, driven by the business
  logic, so you can start up failed services on other boxes; when boxes fail,
  start their configuration on spare boxes or double up the appservers, etc.

- The system is hands-off for most failures, and response time is seconds,
  not the hours typical of human intervention and restoration.
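As one example of a rule that maps a combination of input events into a new Event, the sketch below fires when enough AlertMethodIsSlow events arrive inside a time window. The class name, the threshold and window parameters, and the derived event name "SpinUpInstance" are all assumptions made for illustration; a real rule engine would load such rules from editable configuration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

// Hypothetical rule: N slow-method alerts within a window produce one derived event.
final class SlowMethodRuleSketch {
    private final int threshold;       // how many alerts trigger the rule
    private final long windowMillis;   // how far back alerts still count
    private final Deque<Long> recent = new ArrayDeque<>();

    SlowMethodRuleSketch(int threshold, long windowMillis) {
        this.threshold = threshold;
        this.windowMillis = windowMillis;
    }

    /** Feeds one AlertMethodIsSlow event; returns the derived event name if the rule fires. */
    Optional<String> onAlertMethodIsSlow(long timestampMillis) {
        recent.addLast(timestampMillis);
        // Drop alerts that have aged out of the window.
        while (!recent.isEmpty() && timestampMillis - recent.peekFirst() > windowMillis) {
            recent.removeFirst();
        }
        if (recent.size() >= threshold) {
            recent.clear();  // start counting afresh after firing
            return Optional.of("SpinUpInstance");
        }
        return Optional.empty();
    }
}
```

Timestamps are passed in rather than read from the clock, so the rule can be replayed against logged events when tuning the thresholds.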

Is there any mix of soon-to-be-available products that will deliver the
above features?  I'd love to see it.

curt


Curt Smith
Z-Tel
email:  [EMAIL PROTECTED]
work:   404-237-1166  x182
FAX:    404-237-1167
