Tomcat 4.x cluster
A must read from Filip ;) http://www.theserverside.com/resources/article.jsp?l=Tomcat - Henri Gomez
Re: [PROPOSAL Tomcat 4.x] Cluster
Bip Thelin typed the following on 04:06 PM 5/6/2001 -0700:

>> We also need to answer the question of the request life cycle: the DistributedManager needs to know when a request begins and ends. At the beginning, it must lock the session to prevent other Catalina instances from using it in requests. This can probably just be done in Manager.findSession(). At the end, it must tell the ClusterStore to update the session to other members of the Cluster, and unlock it.
>
> I'm not really sure what you're saying here.

OK, ignore my use of the term ClusterStore - I was melding the Store and Cluster concepts, but from your comments I see that we may want to be able to use both in the same setup.

My point is that the Manager/Cluster needs to know when the session is in use by another instance of Catalina. A locking mechanism must be implemented by the Cluster (or whatever) to prevent a session from being used by multiple instances at once. This mechanism will require the Manager/Cluster system to know when a request begins using a session, and when it has finished.

> If we say that only one JVM at a time can manipulate a session, since a session only belongs to one machine at a time, the only time a session needs to be replicated is when it's created/changed/destroyed.

Yes. But putting the session into the Cluster at creation time is unnecessary. It should be put in at the end of the request in which it is created (other Catalina instances can't use it before then anyway), and updated at the end of each subsequent request. So we need to have the end of a request call into the Manager to indicate that the session can be sent to the Cluster and unlocked for use.

> I'd rather see the replication be implemented in a Manager (e.g. DistributedManager, or maybe change the name to MulticastDistributedManager), thus making it possible to run any Store with the DistributedManager (e.g. FileStore).

OK, I take your point that extending Store isn't the way to go. But I don't think we should have a different Manager implementation for each available distribution mechanism - MulticastDistributedManager, JMSDistributedManager, JavaSpacesDistributedManager, JCacheDistributedManager, etc. We should use the same pattern that Manager/Store uses: a single DistributedManager should be implemented which is independent of the actual session sharing mechanism. It should be able to use any implementation of the Cluster interface.

I'd like to keep the possibility open to implement different distribution strategies. The strategy we're looking at now is for each instance of the application to hold copies of every session.

An alternative Cluster strategy would keep the sessions in a central location such as a database: when a request comes in for a session not found in the current instance, the Cluster checks the database to see if it's there. This isn't too different from simply using JDBCStore.

A third way is to have just two instances of the application holding a given session: when instance A creates the session, the Cluster chooses instance B to hold a backup copy in case A goes down. If a request then comes in to C, B still has it available.

Not that we need to implement all of these, but the architecture we build now should allow these possibilities and others. Other people can try out different ideas, and users can choose the system best suited to their needs.

I'm also not sure about the issues with using persistence and distribution simultaneously. If we simply use PersistentManager with this distribution code, each instance will save its own copy of every session to persistent storage. This might be desirable in some cases - I can see using FileStore, for instance. But if you use JDBCStore and the Multicast distribution, it's wasteful - with a 4 server farm, we have 4 copies of each session in the database.

So how should this be addressed? Cluster ought to have some mechanism which (optionally) ensures that each session is only persisted once. This may mean having Cluster override Store functionality, which is why I was thinking of combining the two.

Kief
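
The request life cycle and the Manager/Cluster split described in this message could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Cluster` interface shape, the `beginRequest`/`endRequest` method names, and the `InMemoryCluster` stand-in are all hypothetical and are not the Catalina classes as checked in. A session is locked when a request starts using it, and is published to the cluster and unlocked only when the request ends, so a newly created session never reaches the cluster before the end of its first request.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical distribution mechanism. A multicast, JMS or JavaSpaces
// transport would each be one implementation of this interface.
interface Cluster {
    boolean lock(String sessionId, String owner);   // false if another instance holds it
    void unlock(String sessionId, String owner);
    void publish(String sessionId, Map<String, Object> attributes);
    Map<String, Object> fetch(String sessionId);    // null if the cluster has no copy
}

// Toy single-process stand-in for a real transport (assumption, for illustration).
class InMemoryCluster implements Cluster {
    private final Map<String, String> locks = new HashMap<>();
    private final Map<String, Map<String, Object>> sessions = new HashMap<>();

    public synchronized boolean lock(String sessionId, String owner) {
        String holder = locks.get(sessionId);
        if (holder != null && !holder.equals(owner)) {
            return false;
        }
        locks.put(sessionId, owner);
        return true;
    }

    public synchronized void unlock(String sessionId, String owner) {
        if (owner.equals(locks.get(sessionId))) {
            locks.remove(sessionId);
        }
    }

    public synchronized void publish(String sessionId, Map<String, Object> attributes) {
        sessions.put(sessionId, new HashMap<>(attributes));
    }

    public synchronized Map<String, Object> fetch(String sessionId) {
        Map<String, Object> copy = sessions.get(sessionId);
        return copy == null ? null : new HashMap<>(copy);
    }
}

// One manager, independent of the sharing mechanism behind Cluster.
class DistributedManager {
    private final String instanceName;
    private final Cluster cluster;
    private final Map<String, Map<String, Object>> local = new HashMap<>();

    DistributedManager(String instanceName, Cluster cluster) {
        this.instanceName = instanceName;
        this.cluster = cluster;
    }

    // Start of a request: take the cluster-wide lock, then load the latest copy.
    Map<String, Object> beginRequest(String sessionId) {
        if (!cluster.lock(sessionId, instanceName)) {
            throw new IllegalStateException("session " + sessionId + " is in use by another instance");
        }
        Map<String, Object> shared = cluster.fetch(sessionId);
        if (shared != null) {
            local.put(sessionId, shared);
        }
        return local.computeIfAbsent(sessionId, k -> new HashMap<>());
    }

    // End of a request: send the session to the cluster, then release the lock.
    void endRequest(String sessionId) {
        Map<String, Object> attributes = local.get(sessionId);
        if (attributes != null) {
            cluster.publish(sessionId, attributes);
        }
        cluster.unlock(sessionId, instanceName);
    }
}
```

With two managers sharing one `InMemoryCluster`, a second instance calling `beginRequest` on a session that the first instance still holds is refused until the first calls `endRequest`. The sketch says nothing about the persistence question raised at the end of the message.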
Re: [PROPOSAL Tomcat 4.x] Cluster
Kief Morris wrote:

> [...]
> My point is that the Manager/Cluster needs to know when the session is in use by another instance of Catalina. A locking mechanism must be implemented by the Cluster (or whatever) to prevent a session from being used by multiple instances at once. This mechanism will require the Manager/Cluster system to know when a request begins using a session, and when it has finished.
>
>> If we say that only one JVM at a time can manipulate a session, since a session only belongs to one machine at a time, the only time a session needs to be replicated is when it's created/changed/destroyed.
>
> Yes. But putting the session into the Cluster at creation time is unnecessary. It should be put in at the end of the request in which it is created (other Catalina instances can't use it before then anyway), and updated at the end of each subsequent request. So we need to have the end of a request call into the Manager to indicate that the session can be sent to the Cluster and unlocked for use.

Do we really need to lock a session for each request and then replicate it? Sorry, I might be confused: do you mean a request for a session, or a request as in generating a new request object (HTTP request)? If we assume that a session is only in use in one JVM at a time (which I think we can assume), then that session doesn't need to be locked; it just needs replication when it's changed.

>> I'd rather see the replication be implemented in a Manager (e.g. DistributedManager, or maybe change the name to MulticastDistributedManager), thus making it possible to run any Store with the DistributedManager (e.g. FileStore).
>
> OK, I take your point that extending Store isn't the way to go. But I don't think we should have a different Manager implementation for each available distribution mechanism - MulticastDistributedManager, JMSDistributedManager, JavaSpacesDistributedManager, JCacheDistributedManager, etc. We should use the same pattern that Manager/Store uses: a single DistributedManager should be implemented which is independent of the actual session sharing mechanism. It should be able to use any implementation of the Cluster interface.

Yes, sorry, I was clear as mud in my last email. If you look at the DistributedManager.java I checked in 14h ago (as of writing), it now uses the new APIs I created, which are common to any distribution protocol you might implement.

> I'd like to keep the possibility open to implement different distribution strategies. The strategy we're looking at now is for each instance of the application to hold copies of every session. An alternative Cluster strategy would keep the sessions in a central location such as a database: when a request comes in for a session not found in the current instance, the Cluster checks the database to see if it's there. This isn't too different from simply using JDBCStore. A third way is to have just two instances of the application holding a given session: when instance A creates the session, the Cluster chooses instance B to hold a backup copy in case A goes down. If a request then comes in to C, B still has it available.

One simple way to go with the cluster is to group instances. There is an option now to specify the name/port/address of the cluster. What I was thinking is that you could specify a cluster that this.jvm belongs to, and then specify a cluster it should replicate to.

> Not that we need to implement all of these, but the architecture we build now should allow these possibilities and others. Other people can try out different ideas, and users can choose the system best suited to their needs.

Yes, agree.

> I'm also not sure about the issues with using persistence and distribution simultaneously. If we simply use PersistentManager with this distribution code, each instance will save its own copy of every session to persistent storage. This might be desirable in some cases - I can see using FileStore, for instance. But if you use JDBCStore and the Multicast distribution, it's wasteful - with a 4 server farm, we have 4 copies of each session in the database. So how should this be addressed? Cluster ought to have some mechanism which (optionally) ensures that each session is only persisted once. This may mean having Cluster override Store functionality, which is why I was thinking of combining the two.

Yes, that's a good point. At first I was thinking that each machine in a Cluster has its own unique key, so when you generate a session id, machine 1 would get something like:

    A1KDSFNRKIFLKMFDSFDSA

where:

    ||
    -- are the two letters that identify the machine

So when you know which machine owns the session, all machines that have the session replicated know that it doesn't belong to them, so they shouldn't save it in a Store. It's also useful for an eventual Tomcat dispatcher frontend to know which machine the session originates from. However some complications occur when you
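
The machine-key idea in this message can be illustrated with a short sketch. It assumes, as the example id suggests, that the key is the first two characters of the session id; the class and method names (`SessionIds`, `ownerOf`, `shouldPersist`) are hypothetical and do not come from the Tomcat code base. The sketch covers only the happy path and does not attempt to handle the owning machine going down.

```java
import java.security.SecureRandom;

// Hypothetical helper: session ids carry a two-character machine key prefix.
class SessionIds {
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    private static final SecureRandom RANDOM = new SecureRandom();
    static final int KEY_LENGTH = 2;

    // Build an id such as "A1" + random characters for the machine keyed "A1".
    static String generate(String machineKey, int randomLength) {
        if (machineKey.length() != KEY_LENGTH) {
            throw new IllegalArgumentException("machine key must be " + KEY_LENGTH + " characters");
        }
        StringBuilder id = new StringBuilder(machineKey);
        for (int i = 0; i < randomLength; i++) {
            id.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return id.toString();
    }

    // The machine that created the session, read back from the id prefix.
    static String ownerOf(String sessionId) {
        return sessionId.substring(0, KEY_LENGTH);
    }

    // Only the owning machine writes the session to its Store;
    // machines holding a replicated copy skip persistence.
    static boolean shouldPersist(String sessionId, String localMachineKey) {
        return ownerOf(sessionId).equals(localMachineKey);
    }
}
```

For the id quoted in the message, `SessionIds.ownerOf("A1KDSFNRKIFLKMFDSFDSA")` yields `A1`, so only the machine keyed `A1` would save that session, and a dispatcher frontend could use the same prefix to route requests.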
Re: [PROPOSAL Tomcat 4.x] Cluster
On Mon, 7 May 2001, Bip Thelin wrote:

> [SNIP]
> Do we really need to lock a session for each request and then replicate it? Sorry, I might be confused: do you mean a request for a session, or a request as in generating a new request object (HTTP request)? If we assume that a session is only in use in one JVM at a time (which I think we can assume), then that session doesn't need to be locked; it just needs replication when it's changed.

The servlet spec *requires* that all requests for a given session, at any point in time, be served by a single JVM.

Whether and when replication happens seems to me like a quality of service issue for different implementations of the cluster -- I don't think a single answer will suffice here. I can conceive of everything from never migrating an existing session (essentially what the current load balancing support provides) to duplicating every single change live.

An interesting question is, how do you detect when a session has been changed? Obviously, you can detect setAttribute/removeAttribute, but what about changes to the *internal* state of the attributes themselves that the session does not know about? (I understand, but haven't verified, that some J2EE containers expect you to call setAttribute again, on the same attribute, to tell the container that you've modified something).

Craig
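
The detection problem raised here can be shown with a small sketch. `TrackedSession` is a hypothetical stand-in, not the servlet `HttpSession` API or any container's implementation: it flags itself dirty on `setAttribute`/`removeAttribute` but cannot see a change made inside an attribute object it already holds.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical session that only notices changes made through its own methods.
class TrackedSession {
    private final Map<String, Object> attributes = new HashMap<>();
    private boolean dirty;

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty = true;                       // visible change: mark for replication
    }

    void removeAttribute(String name) {
        if (attributes.remove(name) != null) {
            dirty = true;                   // visible change: mark for replication
        }
    }

    Object getAttribute(String name) {
        return attributes.get(name);        // reads do not mark the session dirty
    }

    boolean isDirty() {
        return dirty;
    }

    // Called by the (hypothetical) replication layer after sending the session.
    void markReplicated() {
        dirty = false;
    }
}
```

If a list stored under `"cart"` is fetched with `getAttribute` and an item is added to it, `isDirty()` stays `false`, because the session never sees the mutation. Calling `setAttribute("cart", cart)` again with the same object sets the flag, which is the re-set convention mentioned in the message.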
Re: [PROPOSAL Tomcat 4.x] Cluster
Craig R. McClanahan typed the following on 11:18 AM 5/7/2001 -0700:

> An interesting question is, how do you detect when a session has been changed? Obviously, you can detect setAttribute/removeAttribute, but what about changes to the *internal* state of the attributes themselves that the session does not know about?

I think we have to consider the session to be dirty at the end of any request in which it was accessed.

Kief
Re: [PROPOSAL Tomcat 4.x] Cluster
On Mon, 7 May 2001, Kief Morris wrote:

> Craig R. McClanahan typed the following on 11:18 AM 5/7/2001 -0700:
>
>> An interesting question is, how do you detect when a session has been changed? Obviously, you can detect setAttribute/removeAttribute, but what about changes to the *internal* state of the attributes themselves that the session does not know about?
>
> I think we have to consider the session to be dirty at the end of any request in which it was accessed.

That's certainly feasible, but I'd bet we find it's too conservative a view given the potential impact on performance (i.e. needless replications).

> Kief

Craig
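
The trade-off between the two positions can be made concrete with a sketch that counts replications under each rule. All names (`DirtyPolicy`, `PolicySession`, `endRequest`) are hypothetical, and "accessed" is modelled here as any `getAttribute` or `setAttribute` call during the request, which is an assumption rather than something the thread specifies.

```java
import java.util.HashMap;
import java.util.Map;

// Two candidate rules for deciding whether a session is dirty at end of request.
enum DirtyPolicy { ON_ACCESS, ON_SET_ATTRIBUTE }

class PolicySession {
    private final Map<String, Object> attributes = new HashMap<>();
    private final DirtyPolicy policy;
    private boolean accessed;
    private boolean modified;
    private int replications;

    PolicySession(DirtyPolicy policy) {
        this.policy = policy;
    }

    Object getAttribute(String name) {
        accessed = true;
        return attributes.get(name);
    }

    void setAttribute(String name, Object value) {
        accessed = true;
        modified = true;
        attributes.put(name, value);
    }

    // Called once per request; returns true if the session would be replicated.
    boolean endRequest() {
        boolean replicate = (policy == DirtyPolicy.ON_ACCESS) ? accessed : modified;
        if (replicate) {
            replications++;
        }
        accessed = false;
        modified = false;
        return replicate;
    }

    int replicationCount() {
        return replications;
    }
}
```

For one request that sets an attribute followed by two requests that only read it, the `ON_ACCESS` rule replicates three times and the `ON_SET_ATTRIBUTE` rule once. The access-based rule also catches internal changes to attribute objects, which the narrower rule misses, at the cost of the extra replications for read-only requests.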