Tomcat 4.x cluster

2002-04-24 Thread GOMEZ Henri

A must read from Filip ;)

http://www.theserverside.com/resources/article.jsp?l=Tomcat

--
Henri Gomez
PGP KEY : 697ECEDD
PGP Fingerprint : 9DF8 1EA8 ED53 2F39 DC9B 904A 364F 80E6





Re: [PROPOSAL Tomcat 4.x] Cluster

2001-05-07 Thread Kief Morris

Bip Thelin typed the following on 04:06 PM 5/6/2001 -0700
 We also need to answer the question of the request life cycle: the
 DistributedManager needs to know when a request begins and ends.
 At the beginning, it must lock the session to prevent other Catalina
 instances from using it in requests. This can probably just be done
 in Manager.findSession(). At the end, it must tell the ClusterStore to
 update the session to other members of the Cluster, and unlock it.

 I'm not really sure what you're saying here.

OK, ignore my use of the term ClusterStore - I was melding the
Store and Cluster concepts, but from your comments I see that
we may want to be able to use both in the same setup.

My point is that the Manager/Cluster needs to know when the session is in 
use by another instance of Catalina. A locking mechanism must be 
implemented by the Cluster (or whatever) to prevent a session from being 
used by multiple instances at once. This mechanism will require the
Manager/Cluster system to know when a request begins using
a session, and when it has finished.

 If we say that only one JVM at a time can manipulate a session, since
 a session belongs to only one machine at a time, then the only time a
 session needs to be replicated is when it's created, changed, or destroyed.

Yes. But putting the session into the Cluster at creation time is
unnecessary. It should be put in at the end of the request when
it is created (other Catalina instances can't use it before then
anyway), and updated at the end of each subsequent request.
So we need to have the end of a request call into the Manager
to indicate that the session can be sent to the Cluster and
unlocked for use.
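
The request life cycle described above can be sketched roughly as follows. This is a minimal single-JVM illustration under stated assumptions, not Catalina code: SessionLockTable, beginRequest, and endRequest are hypothetical names, and the in-memory maps stand in for whatever cluster-wide mechanism would really hold the locks and the replicated copies. It also assumes one request per session at a time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for cluster-wide lock and replication state.
// A real Cluster would share this across JVMs; here it is plain in-memory maps.
class SessionLockTable {
    // sessionId -> name of the instance currently using the session
    private final Map<String, String> lockOwner = new HashMap<>();
    // sessionId -> attributes as last published at the end of a request
    private final Map<String, Map<String, Object>> replicated = new HashMap<>();

    // Called when a request starts using a session (for example from findSession()).
    // Returns false if another instance currently holds the session.
    synchronized boolean beginRequest(String sessionId, String instance) {
        String owner = lockOwner.get(sessionId);
        if (owner != null && !owner.equals(instance)) {
            return false;
        }
        lockOwner.put(sessionId, instance);
        return true;
    }

    // Called when the request finishes: publish the session state, then unlock.
    synchronized void endRequest(String sessionId, String instance,
                                 Map<String, Object> attributes) {
        if (!instance.equals(lockOwner.get(sessionId))) {
            throw new IllegalStateException(instance + " does not hold " + sessionId);
        }
        // Copying the attributes stands in for "send the session to the Cluster".
        replicated.put(sessionId, new HashMap<>(attributes));
        lockOwner.remove(sessionId);
    }

    // Last published state, or null if the session was never published.
    synchronized Map<String, Object> replicatedState(String sessionId) {
        return replicated.get(sessionId);
    }
}
```

Note that a newly created session is not visible in replicatedState until the first endRequest call, which matches the point that other instances cannot use it before then anyway.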

 I'd rather see the replication implemented in a Manager (i.e.
 DistributedManager, or maybe renamed to MulticastDistributedManager),
 thus making it possible to run any Store with the DistributedManager
 (e.g. FileStore).

OK, I take your point that extending Store isn't the way to go. But
I don't think we should have a different Manager implementation for 
each available distribution mechanism - MulticastDistributedManager,
JMSDistributedManager, JavaSpacesDistributedManager, 
JCacheDistributedManager, etc. We should use the same pattern
that Manager/Store uses: a single DistributedManager should be
implemented which is independent of the actual session sharing
mechanism. It should be able to use any implementation of the
Cluster interface.
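
A rough sketch of that pattern, assuming a deliberately tiny interface: SessionCluster, InMemoryCluster, and SketchDistributedManager are hypothetical names made up for illustration and are not the Cluster or DistributedManager classes in the Catalina tree. The point is only that the manager is written once, against the interface, and each distribution mechanism supplies its own implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical transport abstraction. Multicast, JMS, JavaSpaces, etc.
// would each provide their own implementation.
interface SessionCluster {
    void publish(String sessionId, byte[] serializedSession);
    byte[] fetch(String sessionId); // null if the cluster does not know the session
}

// Trivial implementation for illustration: a single shared in-memory map.
class InMemoryCluster implements SessionCluster {
    private final Map<String, byte[]> shared = new HashMap<>();

    public synchronized void publish(String sessionId, byte[] serializedSession) {
        shared.put(sessionId, serializedSession.clone());
    }

    public synchronized byte[] fetch(String sessionId) {
        byte[] data = shared.get(sessionId);
        return data == null ? null : data.clone();
    }
}

// One manager implementation that depends only on the interface.
class SketchDistributedManager {
    private final SessionCluster cluster;
    private final Map<String, byte[]> local = new HashMap<>();

    SketchDistributedManager(SessionCluster cluster) {
        this.cluster = cluster;
    }

    // Keep a local copy and hand the session to whatever cluster is plugged in.
    void save(String sessionId, byte[] serializedSession) {
        local.put(sessionId, serializedSession.clone());
        cluster.publish(sessionId, serializedSession);
    }

    // Look locally first, then ask the cluster.
    byte[] find(String sessionId) {
        byte[] data = local.get(sessionId);
        if (data == null) {
            data = cluster.fetch(sessionId);
            if (data != null) {
                local.put(sessionId, data);
            }
        }
        return data;
    }
}
```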

I'd like to keep the possibility open to implement different distribution 
strategies. The strategy we're looking at now is for each instance of the 
application to hold copies of every session. An alternative Cluster strategy 
would keep the sessions in a central location such as a database: when 
a request comes in for a session not found in the current instance, the 
Cluster checks the database to see if it's there. This isn't too different
from simply using JDBCStore. A third way is to have just two instances
of the application holding a given session: when instance A creates
the session, the Cluster chooses instance B to hold a backup copy
in case A goes down: if a request comes in to C, B still has it available.

Not that we need to implement all of these, but the architecture we
build now should allow these possibilities and others. Other people
can try out different ideas, and users can choose the system best
suited to their needs.

I'm also not sure about the issues with using persistence and distribution
simultaneously. If we simply use PersistentManager with this distribution 
code, each instance will save its own copy of every session to persistent 
storage. This might be desirable in some cases - I can see using FileStore,
for instance. But if you use JDBCStore and the Multicast distribution, it's
wasteful - with a 4 server farm, we have 4 copies of each session in the
database. So how should this be addressed? Cluster ought to have some
mechanism which (optionally) ensures that each session is only
persisted once. This may mean having Cluster override Store functionality,
which is why I was thinking of combining the two. 
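
One possible shape for such a mechanism, sketched under the assumption that the cluster can tell each instance which member owns a given session. SketchStore, CountingStore, and PersistOnceStore are hypothetical names; SketchStore is not Catalina's Store interface.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal store interface, for illustration only.
interface SketchStore {
    void save(String sessionId, byte[] data);
}

// Stand-in for a shared database: just counts writes per session.
class CountingStore implements SketchStore {
    final Map<String, Integer> writes = new HashMap<>();

    public synchronized void save(String sessionId, byte[] data) {
        writes.merge(sessionId, 1, Integer::sum);
    }
}

// Wraps a store so that only the instance owning a session actually writes it.
class PersistOnceStore implements SketchStore {
    private final SketchStore delegate;
    private final String localInstance;
    // sessionId -> owning instance; assumed to be kept in sync by the cluster
    private final Map<String, String> ownerBySession;

    PersistOnceStore(SketchStore delegate, String localInstance,
                     Map<String, String> ownerBySession) {
        this.delegate = delegate;
        this.localInstance = localInstance;
        this.ownerBySession = ownerBySession;
    }

    public void save(String sessionId, byte[] data) {
        // Non-owners hold a replicated copy but skip the write.
        if (localInstance.equals(ownerBySession.get(sessionId))) {
            delegate.save(sessionId, data);
        }
    }
}
```

In a 4 server farm where every instance wraps the same database this way, a session ends up with one stored copy instead of four.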

Kief




Re: [PROPOSAL Tomcat 4.x] Cluster

2001-05-07 Thread Bip Thelin

Kief Morris wrote:
 
 [...]

 My point is that the Manager/Cluster needs to know when the session is in
 use by another instance of Catalina. A locking mechanism must be
 implemented by the Cluster (or whatever) to prevent a session from being
 used by multiple instances at once. This mechanism will require the
 Manager/Cluster system to know when a request begins using
 a session, and when it has finished.
 
 If we say that only one JVM at a time can manipulate a session, since
 a session belongs to only one machine at a time, then the only time a
 session needs to be replicated is when it's created, changed, or destroyed.
 
 Yes. But putting the session into the Cluster at creation time is
 unnecessary. It should be put in at the end of the request when
 it is created (other Catalina instances can't use it before then
 anyway), and updated at the end of each subsequent request.
 So we need to have the end of a request call into the Manager
 to indicate that the session can be sent to the Cluster and
 unlocked for use.

Do we really need to lock a session for each request and then replicate it?
Sorry, I might be confused: do you mean a request for a session, or a request
as in generating a new request object (an HTTP request)? If we assume that a
session is in use in only one JVM at a time (which I think we can assume), then
that session doesn't need to be locked; it just needs replication when it's
changed.

 I'd rather see the replication implemented in a Manager (i.e.
 DistributedManager, or maybe renamed to MulticastDistributedManager),
 thus making it possible to run any Store with the DistributedManager
 (e.g. FileStore).
 
 OK, I take your point that extending Store isn't the way to go. But
 I don't think we should have a different Manager implementation for
 each available distribution mechanism - MulticastDistributedManager,
 JMSDistributedManager, JavaSpacesDistributedManager,
 JCacheDistributedManager, etc. We should use the same pattern
 that Manager/Store uses: a single DistributedManager should be
 implemented which is independent of the actual session sharing
 mechanism. It should be able to use any implementation of the
 Cluster interface.

Yes, sorry, I was clear as mud in my last email. If you look at the
DistributedManager.java I checked in 14 hours ago (as of writing), it now uses
the new APIs I created, which are common to any distribution protocol you might
implement.

 I'd like to keep the possibility open to implement different distribution
 strategies. The strategy we're looking at now is for each instance of the
 application to hold copies of every session. An alternative Cluster strategy
 would keep the sessions in a central location such as a database: when
 a request comes in for a session not found in the current instance, the
 Cluster checks the database to see if it's there. This isn't too different
 from simply using JDBCStore. A third way is to have just two instances
 of the application holding a given session: when instance A creates
 the session, the Cluster chooses instance B to hold a backup copy
 in case A goes down: if a request comes in to C, B still has it available.

One simple way to go with the cluster is to group the instances. There is an
option now to specify the name/port/address of the cluster. What I was thinking
is that you could specify a cluster that this JVM belongs to, and then specify
a cluster it should replicate to.

 Not that we need to implement all of these, but the architecture we
 build now should allow these possibilities and others. Other people
 can try out different ideas, and users can choose the system best
 suited to their needs.

Yes, agreed.

 I'm also not sure about the issues with using persistence and distribution
 simultaneously. If we simply use PersistentManager with this distribution
 code, each instance will save its own copy of every session to persistent
 storage. This might be desirable in some cases - I can see using FileStore,
 for instance. But if you use JDBCStore and the Multicast distribution, it's
 wasteful - with a 4 server farm, we have 4 copies of each session in the
 database. So how should this be addressed? Cluster ought to have some
 mechanism which (optionally) ensures that each session is only
 persisted once. This may mean having Cluster override Store functionality,
 which is why I was thinking of combining the two.

Yes, that's a good point. At first I was thinking that each machine in a Cluster
would have its own unique key, so when you generate a session id, machine 1 would
get something like:

    A1KDSFNRKIFLKMFDSFDSA
    || -- the two letters that identify the machine

When every machine knows which machine owns the session, all machines that hold a
replicated copy know that it doesn't belong to them, so they shouldn't save it in
a Store. It's also useful for an eventual Tomcat dispatcher frontend to know which
machine the session originates from. However, some complications occur when you

Re: [PROPOSAL Tomcat 4.x] Cluster

2001-05-07 Thread Craig R. McClanahan



On Mon, 7 May 2001, Bip Thelin wrote:

 [SNIP]
 
 Do we really need to lock a session for each request and then
 replicate it? Sorry, I might be confused: do you mean a request for a
 session, or a request as in generating a new request object (an HTTP
 request)? If we assume that a session is in use in only one JVM at a
 time (which I think we can assume), then that session doesn't need to be
 locked; it just needs replication when it's changed.
 

The servlet spec *requires* that all requests for a given session, at any
point in time, be served by a single JVM.

Whether and when replication happens seems to me like a quality of service
issue for different implementations of the cluster -- I don't think a
single answer will suffice here.  I can conceive of everything from never
migrating an existing session (essentially what the current load
balancing support provides) to duplicating every single change live.

An interesting question is, how do you detect when a session has been
changed?  Obviously, you can detect setAttribute/removeAttribute, but
what about changes to the *internal* state of the attributes themselves
that the session does not know about?  (I understand, but haven't
verified, that some J2EE containers expect you to call setAttribute again,
on the same attribute, to tell the container that you've modified
something).
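
The detection problem can be illustrated with a small sketch. DirtyTrackingSession is a hypothetical class, not a container API; it flags itself dirty only when setAttribute or removeAttribute is called, so a change made inside an attribute object goes unnoticed until the application calls setAttribute again.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical session wrapper that becomes dirty only through attribute calls.
class DirtyTrackingSession {
    private final Map<String, Object> attributes = new HashMap<>();
    private boolean dirty;

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty = true; // visible to the session, so it can be detected
    }

    void removeAttribute(String name) {
        if (attributes.containsKey(name)) {
            attributes.remove(name);
            dirty = true;
        }
    }

    // Returns the live object; mutating it does not go through the session.
    Object getAttribute(String name) {
        return attributes.get(name);
    }

    boolean isDirty() {
        return dirty;
    }

    // Called after the session has been replicated.
    void markClean() {
        dirty = false;
    }
}
```

If a List stored under "cart" is modified in place, isDirty() stays false; calling setAttribute("cart", cart) again with the same object is what flips the flag.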

Craig




Re: [PROPOSAL Tomcat 4.x] Cluster

2001-05-07 Thread Kief Morris

Craig R. McClanahan typed the following on 11:18 AM 5/7/2001 -0700
An interesting question is, how do you detect when a session has been
changed?  Obviously, you can detect setAttribute/removeAttribute, but
what about changes to the *internal* state of the attributes themselves
that the session does not know about?

I think we have to consider the session to be dirty at the end of
any request in which it was accessed.

Kief




Re: [PROPOSAL Tomcat 4.x] Cluster

2001-05-07 Thread Craig R. McClanahan



On Mon, 7 May 2001, Kief Morris wrote:

 Craig R. McClanahan typed the following on 11:18 AM 5/7/2001 -0700
 An interesting question is, how do you detect when a session has been
 changed?  Obviously, you can detect setAttribute/removeAttribute, but
 what about changes to the *internal* state of the attributes themselves
 that the session does not know about?
 
 I think we have to consider the session to be dirty at the end of
 any request in which it was accessed.
 

That's certainly feasible, but I'd bet we find it's too conservative a
view given the potential impact on performance
(i.e. needless replications).
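
The trade-off between the two views can be made concrete with a small sketch. DirtyPolicy and its two constants are hypothetical names: ON_ACCESS treats the session as dirty after any request that touched it, while ON_ATTRIBUTE_CHANGE replicates only when setAttribute/removeAttribute was called (and therefore misses internal changes to attribute objects).

```java
// Hypothetical policies for deciding whether to replicate at the end of a request.
enum DirtyPolicy {
    // Dirty whenever the request touched the session at all.
    ON_ACCESS {
        boolean shouldReplicate(boolean accessed, boolean attributeChanged) {
            return accessed;
        }
    },
    // Dirty only if setAttribute/removeAttribute was called during the request.
    ON_ATTRIBUTE_CHANGE {
        boolean shouldReplicate(boolean accessed, boolean attributeChanged) {
            return attributeChanged;
        }
    };

    abstract boolean shouldReplicate(boolean accessed, boolean attributeChanged);

    // Counts replications for a trace of requests.
    // Each entry is {accessed, attributeChanged} for one request.
    int replications(boolean[][] requests) {
        int count = 0;
        for (boolean[] request : requests) {
            if (shouldReplicate(request[0], request[1])) {
                count++;
            }
        }
        return count;
    }
}
```

For a trace of three read-only requests and one that sets an attribute, the first policy replicates on every request and the second only once; which is acceptable depends on how much the application mutates attribute objects in place.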

 Kief
 
 

Craig