On Sun, May 15, 2005 at 01:48:54PM -0700, Jeremy Boynes wrote:
I was going to include background here but thought it was easier to place it directly on the wiki. It can be found at:
http://wiki.apache.org/geronimo/Architecture/ConfigurationManagement
Thanks, Jeremy, it's cool to get some insight into the "whys and wherefores" that the code can't always convey. I've got a couple of questions about the wiki page, probably based on my failing to make a required intuitive leap. You make the point that Geronimo's architecture is designed to support remote management, but I didn't get a good sense of why or how the old (i.e. uninverted) architecture fails to support remote management, especially given that most telecom OSS/BSS systems are built on the old architecture, yet they seem to allow for remote large-scale network node management.
I guess I don't understand how configuring applications is more or less amenable to large-scale network management than configuring servers (especially as a big NMS would call Geronimo itself an "application"). Later in the page it seems like you're trying to point out that Geronimo has more of a "pull" model in that it can pull components from a trusted external source. That would be cool in that it would allow a fleet of Geronimo servers to do on their own a big part of what a traditional NMS would do for other J2EE servers.
It would really help me (and probably others) "get it" if you could provide an example of how the traditional architecture falls down and the application-centric architecture succeeds.
I'll admit it's been a while, but the basic principle of an NMS as I remember it is that the network elements themselves are dumb: they do essentially nothing by themselves, and the configuration of each one is controlled by the NMS. The NOC has a picture of how the global network is operating and, using the NMS, can reconfigure the elements as needed to maintain the quality of service.
This is the architecture we are trying to recapture in Geronimo.
In a massive deployment, each server consists of only a pristine kernel and a local configuration manager (analogous to element firmware). When it comes online, it is recognized by the global application manager and dynamically configured to play the correct role in the global application environment (just as the NMS configures the element). As workload conditions change, the central manager can reconfigure each server as needed.
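To make that concrete, here is a rough sketch of the lifecycle in Java. None of these types are real Geronimo APIs; the names are invented purely for illustration:

import java.util.List;

// Hypothetical sketch only -- these are not real Geronimo APIs.
interface ConfigurationManager {
    void start(String configurationId) throws Exception;
}

class PristineServer {
    private final ConfigurationManager localManager;

    PristineServer(ConfigurationManager localManager) {
        this.localManager = localManager;
    }

    // Invoked (conceptually) by the global application manager once it
    // recognizes this server coming online and decides what role it
    // should play. The server knows nothing about the role until told.
    void assignRole(List<String> configurationIds) throws Exception {
        for (String id : configurationIds) {
            localManager.start(id); // load and start each assigned bundle
        }
    }
}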
That NMS-style model is different from the traditional application server; take Jetty as an example (I hope Greg does not mind me picking on him). There, each server maintains its own configuration, and the administration model is one of going to the server (physically, via ssh, JMX remoting, or a web console) and configuring it. The setup of each server is defined by its local configuration, not by an external controller.
This works well for single servers or small clusters, but it does not scale to massive deployments. However, the Geronimo model can be scaled down simply by running a very simple manager inside a standalone server. For example, the current setup uses the FileConfigurationList to automatically configure the kernel at startup so that it runs the last known running set of configurations.
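As a toy illustration of that idea (this is not the actual FileConfigurationList code, just the shape of it), the list of running configuration ids is persisted whenever it changes and read back at the next startup:

import java.io.*;
import java.util.ArrayList;
import java.util.List;

// Toy version of the "last known running set" idea; not the real
// Geronimo implementation.
class LastKnownConfigurationList {
    private final File stateFile;

    LastKnownConfigurationList(File stateFile) {
        this.stateFile = stateFile;
    }

    // Called whenever the running set changes: one configuration id per line.
    void save(List<String> runningConfigurationIds) throws IOException {
        PrintWriter out = new PrintWriter(new FileWriter(stateFile));
        try {
            for (String id : runningConfigurationIds) {
                out.println(id);
            }
        } finally {
            out.close();
        }
    }

    // Called at kernel startup to recover the last known running set.
    List<String> load() throws IOException {
        List<String> ids = new ArrayList<String>();
        if (!stateFile.exists()) {
            return ids; // pristine kernel: nothing was running before
        }
        BufferedReader in = new BufferedReader(new FileReader(stateFile));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.length() > 0) {
                    ids.add(line);
                }
            }
        } finally {
            in.close();
        }
        return ids;
    }
}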
A few final points. First, yes, an NMS would call Geronimo an application; but also, the Geronimo architecture itself does not distinguish between configuration bundles for "system" services (e.g. the J2EE stack) and those for "applications" (e.g. a config generated by webapp deployment). The ultimate vision is that the kernel itself becomes ubiquitous and it is the configuration bundles that get moved around by some NMS.
Secondly, you can get a long way there by layering this on top of an existing architecture. For example, using web services or JMX you can do a lot of reconfiguration of running servers. However, we tried to build this in at the very core.
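For example, plain JMX remoting already lets an external controller reach into a running server and change it. A minimal sketch; the service URL and ObjectName here are made up:

import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class RemoteReconfigure {
    public static void main(String[] args) throws Exception {
        // Hypothetical address of a managed server.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://prod-server-1:1099/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection server = connector.getMBeanServerConnection();
            // Hypothetical MBean exposed by the server being managed.
            ObjectName pool = new ObjectName("example:type=ThreadPool,name=web");
            server.setAttribute(pool, new Attribute("maxThreads", new Integer(200)));
        } finally {
            connector.close();
        }
    }
}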
Thirdly, there are commercial systems out there that work a little bit this way. For example (at least as I understand it), BEA WebLogic has this central management model at the cluster level, where from a central console you can reconfigure any member server; IBM also has this in WebSphere ND and goes further with XD. In both, AIUI, you interact with the controller and not the individual servers.
Fourthly, the "pull" stuff is really intended for small to medium size deployments (where there isn't a full NMS). The idea is that resources can be staged to a trusted server in the production environment and that the runtime servers will pull from there. It also helps in a development environment where there are a load of unmanaged development servers; developers can install the app they need to integrate/test with and not have to worry about where all the dependencies will come from.
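A rough sketch of the pull idea (the repository layout and names are invented; this is not the real resolver): a server that is missing a configuration bundle fetches it from the trusted staging server rather than requiring an administrator to push it.

import java.io.*;
import java.net.URL;

// Hypothetical sketch of the "pull" model. The URL layout and the .car
// extension are used here only as a stand-in for the real packaging.
class BundlePuller {
    private final URL trustedRepository; // e.g. http://staging.example.com/repo/

    BundlePuller(URL trustedRepository) {
        this.trustedRepository = trustedRepository;
    }

    // Fetch a bundle by id (assumed here to be a simple name) into the
    // local store, copying it byte for byte.
    File pull(String configurationId, File localStore) throws IOException {
        URL source = new URL(trustedRepository, configurationId + ".car");
        File target = new File(localStore, configurationId + ".car");
        InputStream in = source.openStream();
        try {
            OutputStream out = new FileOutputStream(target);
            try {
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            } finally {
                out.close();
            }
        } finally {
            in.close();
        }
        return target;
    }
}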
I hope that helps clarify things. We definitely need to do more of this and rely less on the raw code.
-- Jeremy
