> > I still think that it is a bad idea to have every node in a cluster have
> > the complete configuration information for every other node.  This simply
> > will not scale.
>
> You bring this up again later, but first let's talk about how much data each
> node has to manage and how often we expect changes to happen.
> 1) The actual size of the "default" configuration in the latest JBoss RH
>     release is around 141 KB.
> 2) The number of changes, except on startup, is less than one per hour on a
>      very active system. Normally a J2EE server should serve and not change
>      all day long (sorry if I sound cynical).

That is the default config, but we cannot (must not) assume that it will remain
like that as the functionality of the server grows.  Third parties may have
significantly larger configurations.  More specifically, those configurations
could be based on database access or more complex configuration generation
techniques.

Configuration management is a specialized application and may require
specialized resources.

Basically I am suggesting that the design not be centered around the notion
of "every node has every configuration", but instead assume that there is a
small sub-set of nodes in a cluster which we can talk to and get the
configuration from.

It will be much easier to keep a small number of nodes current with respect
to the latest config than to keep all nodes up to date.

> Now, what do we have to send? Whenever a change is made to a server
> (deploy a service, change a service, stop/destroy a service, deploy or
> undeploy an application) we only have to send what to do, which is pretty
> simple.

Either way this is the case.  We are basically talking about the same thing,
once we get past the location of the configuration.

> Therefore
> I don't expect to send much over the wire and also not much to be stored on
> each server. Don't forget that any classes you have to send over the wire
> have to be sent anyway.

True, you may have to load classes (collections of jars, whatever).  Let's
say that there is service0 version 1, which is running on all nodes.  You
want to install service0 version 2 on a select portion for testing, and
perhaps deploy it everywhere if it works.

If all nodes have the config, then when you install the config for service0
v2 all nodes in the cluster will have to refresh the config files as well as
the .jars for that service.  If the jars are large then you may have a bit of
free time to drink some coffee while all of your nodes sync up.

Having a smaller number of specialized nodes which deal with this will
scale better than having all nodes deal with it, including net i/o, disk
space, cpu...
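
To put rough numbers on that, here is a small Java sketch comparing how many
bytes move when every node refreshes versus when only the config servers and
the nodes actually running service0 v2 fetch it.  The node counts and jar size
are made up for illustration, not measurements, and SyncCost is a name I
invented for this example.

```java
// Back-of-the-envelope comparison; all numbers are hypothetical.
final class SyncCost {
    /** Bytes moved when every node in the cluster must refresh the artifact. */
    static long fullSync(int totalNodes, long artifactBytes) {
        return (long) totalNodes * artifactBytes;
    }

    /** Bytes moved when only the config servers and the nodes that run the service fetch it. */
    static long selectiveSync(int configServers, int targetNodes, long artifactBytes) {
        return (long) (configServers + targetNodes) * artifactBytes;
    }

    public static void main(String[] args) {
        long jarBytes = 50L * 1024 * 1024; // assumed 50 MB of jars for service0 v2
        System.out.println("all 100 nodes sync:       " + fullSync(100, jarBytes) + " bytes");
        System.out.println("2 servers + 5 test nodes: " + selectiveSync(2, 5, jarBytes) + " bytes");
    }
}
```

The difference grows with the size of the cluster, while the selective cost
only grows with the number of nodes that actually run the service.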

> > If you have 10 nodes, change one bit of config on node0, then node1-9 will
> > have to sync up their configuration.  There will probably have to be some
> > distributed tx going on here too so that we are sure that no new nodes
> > start up and use a stale config.
>
> You are saying that the problem of making the updates will not scale because
> when the number of servers grows the whole thing would grow exponentially
> and not linearly (as the number of servers does).
> BUT I said that every server should know every other server on the cluster
> but only one sends the update commands to the others. The actions a server
> has to be

Sure, we are talking about different things here.  Where does the config
live?  What is responsible for syncing that config?  Informing other nodes
about a new config is simple, and it does not matter who does it.

> able to perform:
> - after a change, notify all other servers of it

If you update a config running on 5% of the nodes in a group, why would
you inform the other 95% of the change, asking them to update and such?
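
Something like the following is what I have in mind: keep track of which node
runs which service and notify only those nodes.  This is a sketch in plain
Java; ChangeNotifier and its methods are names I made up, not existing JBoss
code, and a real version would send a message instead of returning a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: works out which nodes care about a config change. */
final class ChangeNotifier {
    // node name -> names of the services deployed on that node
    private final Map<String, Set<String>> servicesByNode = new LinkedHashMap<>();

    /** Records that the given node runs the given services. */
    void register(String node, String... services) {
        servicesByNode.computeIfAbsent(node, k -> new LinkedHashSet<>())
                      .addAll(Arrays.asList(services));
    }

    /** Returns only the nodes that run the changed service, in registration order. */
    List<String> nodesToNotify(String service) {
        List<String> targets = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : servicesByNode.entrySet()) {
            if (entry.getValue().contains(service)) {
                targets.add(entry.getKey());
            }
        }
        return targets;
    }
}
```

Nodes that do not run the service never hear about the change and never have
to copy anything.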

> > Obviously you would want to segment large collections of nodes into groups
> > or clusters, but you may need to have a large group, which leaves you to
> > artificially segment or look to an alternate solution.
>
> Right now I only discuss this on a cluster level. The other scenario is to
> have a farm of servers running different applications but the same server
> configurations.
> I am not quite sure if this can be automated because some clients want
> Oracle, others a PostgreSQL DB, etc. You don't want to supply them with all
> the available services every other server has, do you?

Only copy the files required to start/run the services on each node...
anything else is a waste of space (except for the config servers).

> > Another example would be DNS.  There is no way that every DNS server could
> > sync up with every other DNS server, unless their numbers were small and
> > on a fast network.
>
> Yeah, but normally you have two doing the same, the configuration is a
> nightmare, and most DNS servers cache resolved mappings. You can run into
> serious trouble if two of the top DNS servers go down or fail to serve, or
> when someone tampers with the DNS cache.

Right. I don't suggest a DNS-style configuration scheme, where a server passes
requests on to more authoritative servers.  It was an example of a distributed
database (which is mostly what the configuration is, a db of properties, xml
snippets and .jar files), which would not scale if each node in the network
were designed to be up to date with the latest changes from all nodes.

> > I think that Jini could really help take JBoss to the next level if applied
> > thoughtfully.  Perhaps simply for the lookup service (discovery & join),
> > event model and leasing system.  We would still use JMX for all local
> > control, but add some Jini services, which link JMX agents together, allow
> > them to notify each other when they go up, down, need config, have new
> > configs.  Leasing could be used to detect system locks and other fancy
> > stuff.
>
> Isn't this what I proposed initially ?

You know, I am not really sure how this started.  I think that Jini would be
a positive addition to the JBoss infrastructure.  If you think so too, cool.

I would still push back on a configuration management system which is
designed with the intention of syncing all nodes in the system when config
changes.

There is just no reason to do that.

Use Jini to lookup the ConfigServer impl, which will return the service stub
for one of the nodes running in the group, then use it to pull your config.
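
Roughly, the shape is as follows.  This is a toy sketch in plain Java and
deliberately does not use the real Jini classes: ConfigServer, ToyLookup and
MapConfigServer are names I made up, and ToyLookup is only an in-memory
stand-in for what discovery and join would give us.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical service interface; not part of JBoss or Jini. */
interface ConfigServer {
    String fetchConfig(String serviceName);
}

/** Toy in-memory stand-in for a lookup service. */
final class ToyLookup {
    private final List<ConfigServer> registered = new ArrayList<>();
    private int next = 0;

    /** A config server announces itself to the group. */
    void join(ConfigServer server) {
        registered.add(server);
    }

    /** Returns one of the registered config servers, rotating between them. */
    ConfigServer lookup() {
        if (registered.isEmpty()) {
            throw new IllegalStateException("no config server registered");
        }
        ConfigServer server = registered.get(next % registered.size());
        next++;
        return server;
    }
}

/** Simplest possible config server: serves configs from a map. */
final class MapConfigServer implements ConfigServer {
    private final Map<String, String> configs;

    MapConfigServer(Map<String, String> configs) {
        this.configs = new HashMap<>(configs);
    }

    @Override
    public String fetchConfig(String serviceName) {
        String config = configs.get(serviceName);
        if (config == null) {
            throw new NoSuchElementException("no config for " + serviceName);
        }
        return config;
    }
}
```

A node that needs its config would call lookup().fetchConfig("service0") and
never has to know how many config servers exist or where they run.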

Folks who are on a budget can have one or two config servers, or for a more
robust and fault tolerant system you can have 10 or 20.
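
The fault tolerance comes from the client side being able to move on to the
next config server when one does not answer.  A minimal sketch of that, with
made-up names (ConfigSource and FailoverConfigClient are not existing JBoss or
Jini types) and assuming the list of servers came from a lookup:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical remote handle to one config server. */
interface ConfigSource {
    String pull(String serviceName) throws Exception;
}

/** Tries each known config server in order until one answers. */
final class FailoverConfigClient {
    private final List<ConfigSource> sources;

    FailoverConfigClient(List<ConfigSource> sources) {
        this.sources = new ArrayList<>(sources);
    }

    String pull(String serviceName) {
        List<Exception> failures = new ArrayList<>();
        for (ConfigSource source : sources) {
            try {
                return source.pull(serviceName);
            } catch (Exception e) {
                failures.add(e); // remember the failure and try the next server
            }
        }
        IllegalStateException error = new IllegalStateException(
                "no config server answered for " + serviceName
                        + " (" + sources.size() + " tried)");
        for (Exception failure : failures) {
            error.addSuppressed(failure);
        }
        throw error;
    }
}
```

With one or two servers in the list you get the budget setup, with 10 or 20
you get the robust one, and the node code is the same either way.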

Configuration servers are specialized nodes.  They may require database
access, large file systems, large net pipes and more.

If the design takes this into account and follows the lookup service model,
then you could install a config server on each node, but you don't have to.

What I am hearing from you is that we have to have this running on each
node.  Perhaps I am getting mixed signals, but that is what I am reading.

Let's find something to agree upon and continue =)

Sounds like Jini is a common theme.  Let's assume the lookup service model,
which will allow for both to work (all, most, some or one).  Let us let
those who implement applications on JBoss decide how fault tolerant their
configuration management is.

--jason


_______________________________________________
Jboss-development mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/jboss-development
