Tomcat Developers:
This is a forward of a message that I sent to Bip and Craig a few days ago
regarding distributed session management (aka clustering). I haven't gotten
any feedback yet, so I thought I'd throw this out to the whole dev
list.
The current implementation is broken. The following message explains
why and describes some possible solutions to this problem.
This feature (i.e. distributed session management) is an absolute
requirement for any deployment that needs to scale beyond a single Tomcat
instance and does not want the overhead of depending on JDBC for storing
sessions.
I've also attached, at the bottom of this message, two preliminary RMI
interfaces that describe (see scenario 1 below) how I think this session
server and its clients (e.g. Tomcat instances) should converse.
SessionServer - to be implemented by the remote session manager/server
SessionClient - to be implemented by clients (e.g. Tomcat) of the remote
session manager/server.
I'm interested in making a contribution in this area and am anxious to
receive some feedback from the dev-list members on this subject.
Regards,
Tom Drake
Email: [EMAIL PROTECTED]
----- Original Message -----
From: Tom Drake
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Saturday, November 10, 2001 10:48 PM
Subject: Tomcat: Distributed Session Management revisited
Bip:
I've looked closely at the existing catalina distributed session
management code, and determined that the following problems
exist. Since I'm new to the code, it's highly likely that I've missed
something. Please correct any errors in my analysis. I'm interested in
contributing to this and would greatly appreciate any input or feedback
you can provide.
Problems with current solution:
- Session updates are not communicated to the other nodes in the cluster
- If a new Tomcat instance is added to the cluster, it has no means of
  discovering any pre-existing sessions (being managed by the other nodes)
  unless JDBCStore is used to persist sessions.
- If a Tomcat instance is brought down and brought back up later, it
  will read stale session data from its file system store.
- The fact that DistributedManager derives from PersistentManager
  appears to create these problems:
  - If a Tomcat instance is brought down, it will save all of its active
    sessions via a 'Store'. If the instance is brought up again later, it
    will read stale session data from an out-of-date store. If brought
    down again, it could (potentially) write these out-of-date entries
    over the top of the store.
  - It creates extra overhead with no benefit. Session persistence
    is not the goal of 'clustering' in the first place. The goal
    of clustering is to provide (part of) a seamless load-balancing
    and/or fail-over solution. The idea is that I can have 'n'
    Tomcats running to spread the client load. I should be able
    to bring up new instances and bring down running instances
    on the fly, without 'losing' any client sessions.
----------------------------------------------------------------
I've come up with the following ideas for solving these problems:
1) Separate Session manager
2) Cooperative Session management (extending the existing multicast
solution).
Here are some more detailed descriptions of these solutions:
1) Implement a separate Session manager, whose job is to:
- provide a means for new Tomcat instances to register themselves.
  Upon registration, this server may send a list of 'active' session IDs
  back to the new Tomcat instance.
- keep an up-to-date copy of all sessions in a given 'cluster'.
- destroy sessions that are no longer valid (due to logout or timeout).
- communicate new, updated, and obsoleted session IDs to all
  'registered' cluster members, so that they may maintain their own
  hash of 'known' session IDs.
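The manager duties listed above could be sketched roughly as follows. This is only an in-memory illustration under stated assumptions: plain Java calls stand in for RMI, a Map<String, Object> stands in for a real session, timeout-based expiry is left out, and the names ClusterMember, InMemorySessionManager, store, get, and destroy are hypothetical, not existing Catalina classes or methods.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback: stands in for a registered Tomcat instance.
interface ClusterMember {
    void sessionChanged(String sessionId);
}

// Hypothetical in-memory manager; a real one would be reached over RMI.
class InMemorySessionManager {
    private final List<ClusterMember> members = new CopyOnWriteArrayList<>();
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Register a member and return the IDs of all currently active sessions.
    Set<String> register(ClusterMember member) {
        members.add(member);
        return new HashSet<>(sessions.keySet());
    }

    // Keep an up-to-date copy of a new or updated session, then tell the others.
    void store(ClusterMember source, String sessionId, Map<String, Object> attributes) {
        sessions.put(sessionId, new HashMap<>(attributes));
        notifyOthers(source, sessionId);
    }

    // Return a copy of the stored session, or null if the ID is unknown.
    Map<String, Object> get(String sessionId) {
        Map<String, Object> found = sessions.get(sessionId);
        return found == null ? null : new HashMap<>(found);
    }

    // Destroy a session (e.g. on logout) and tell the others it is obsolete.
    void destroy(ClusterMember source, String sessionId) {
        if (sessions.remove(sessionId) != null) {
            notifyOthers(source, sessionId);
        }
    }

    // Only the session ID is sent, and never back to the member that caused the change.
    private void notifyOthers(ClusterMember source, String sessionId) {
        for (ClusterMember member : members) {
            if (member != source) {
                member.sessionChanged(sessionId);
            }
        }
    }
}

class SessionManagerDemo {
    public static void main(String[] args) {
        InMemorySessionManager manager = new InMemorySessionManager();
        ClusterMember a = id -> System.out.println("A told about " + id);
        ClusterMember b = id -> System.out.println("B told about " + id);
        manager.register(a);
        manager.register(b);
        Map<String, Object> attributes = new HashMap<>();
        attributes.put("user", "alice");
        manager.store(a, "s1", attributes);  // only B is notified
        manager.destroy(b, "s1");            // only A is notified
    }
}
```

Skipping the originating member in notifyOthers is a design choice in this sketch: that member already holds the current copy, so notifying it would only cause a needless cache eviction.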
Tomcat instances would need to do the following:
- on startup, register itself with the remote session manager.
  It may receive a list of active session IDs from the remote session
  manager as a response, so it must store these IDs in its own internal
  hash (of known session IDs).
- to create a new session, it can simply do so in its own memory space
  (assuming that the MessageDigest is guaranteed to produce a
  session identifier that's unique across the 'cluster'). Upon completion
  of the client HTTP/AJP/Warp request, serialize the new session object
  and send it off to the remote manager.
- to retrieve a session, it would need to check its own internal Session
  'cache' first and, if not found, query the remote session manager. The
  remote session manager would send the requested Session object back
  to the caller or, if not found, return an error.
- to update a session*, it would need to send a serialized copy of the
  Session to the remote session manager. The manager would store the
  updated session data, then notify all the 'other' registered Tomcat
  instances that the given Session has changed (it only needs to send
  the Session ID).
  * Update notification depends on implementation of
    HttpSessionAttributeListener and HttpSessionBindingEvent.
- to delete / invalidate a Session, it must send a request to the remote
  Session Manager. The remote manager would then need to destroy
  the Session and send a message to all the 'other' Tomcat instances
  containing the Session ID that is no longer valid (so that the other
  Tomcats can destroy any related data).
NB 1: Tomcat can 'cache' Session objects, eliminating the need to
retrieve them from the remote session manager. If the cached session is
updated by a different Tomcat instance, that instance will communicate
the updated session to the remote session manager, which will in turn
notify all other Tomcat instances of the update. The 'other' Tomcats can
simply remove the Session from their internal cache when an update
notification is received. If an instance needs to access a Session
that's not in its cache, it would have to retrieve the session from
the remote session manager at that time.
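The caching behaviour in NB 1 might look something like the sketch below. It assumes a plain Java interface in place of the RMI connection and a Map<String, Object> in place of a real session. RemoteSessionStore, LocalSessionCache, fetch, and remoteFetchCount are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the remote session manager (RMI omitted).
interface RemoteSessionStore {
    Map<String, Object> fetch(String sessionId);  // returns null if the ID is unknown
}

// Hypothetical per-Tomcat cache that sits in front of the remote manager.
class LocalSessionCache {
    private final Map<String, Map<String, Object>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger remoteFetches = new AtomicInteger();
    private final RemoteSessionStore remote;

    LocalSessionCache(RemoteSessionStore remote) {
        this.remote = remote;
    }

    // Check the local cache first; on a miss, ask the remote manager and cache the result.
    Map<String, Object> getSession(String sessionId) {
        Map<String, Object> session = cache.get(sessionId);
        if (session == null) {
            remoteFetches.incrementAndGet();
            session = remote.fetch(sessionId);
            if (session != null) {
                cache.put(sessionId, session);
            }
        }
        return session;
    }

    // Called when the manager reports that another instance updated or destroyed the session.
    void invalidateSession(String sessionId) {
        cache.remove(sessionId);
    }

    // Number of round trips to the remote manager so far (for illustration only).
    int remoteFetchCount() {
        return remoteFetches.get();
    }
}

class LocalSessionCacheDemo {
    public static void main(String[] args) {
        Map<String, Map<String, Object>> backing = new HashMap<>();
        Map<String, Object> attributes = new HashMap<>();
        attributes.put("user", "alice");
        backing.put("s1", attributes);

        LocalSessionCache cache = new LocalSessionCache(backing::get);
        cache.getSession("s1");        // miss: fetched from the remote store
        cache.getSession("s1");        // hit: served locally
        cache.invalidateSession("s1"); // update notification arrives
        cache.getSession("s1");        // miss again: refetched
        System.out.println("remote fetches: " + cache.remoteFetchCount());
    }
}
```

Note that a cached copy stays stale until the notification arrives, so the scheme is only as consistent as the notification delivery is timely.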
NB 2: This Session manager could be made to run as a stand-alone process
or as part of one of the Tomcat instances. The key is that only one of
them can be 'managing' the sessions for a given 'cluster'. If this code is
present inside Tomcat, then the Tomcats in the cluster could 'negotiate' to
decide who is to be manager. If the manager is brought down, it could notify
the other cluster members of its impending demise so that they can
elect a new manager, which will then get a copy of all 'active' sessions
and take over the manager functionality.
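The negotiation in NB 2 is left open above, so the following is only one possible rule, chosen here for illustration: the first member to join manages the cluster, and when the manager leaves gracefully, the remaining member with the lowest ID takes over and receives a copy of the active sessions. ClusterNode, Cluster, and the lowest-ID rule are all assumptions of this sketch, and crash detection is not covered.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical cluster member; managedSessions is only populated while it is manager.
class ClusterNode {
    final String id;
    final Map<String, Object> managedSessions = new HashMap<>();

    ClusterNode(String id) {
        this.id = id;
    }
}

// Hypothetical membership view shared by the cluster (transport omitted).
class Cluster {
    private final TreeMap<String, ClusterNode> live = new TreeMap<>();
    private ClusterNode manager;

    // The first member to join becomes manager; later joins do not displace it.
    void join(ClusterNode node) {
        live.put(node.id, node);
        if (manager == null) {
            manager = node;
        }
    }

    // Graceful shutdown: a departing manager hands its sessions to the elected successor.
    void leave(ClusterNode node) {
        live.remove(node.id);
        if (node != manager) {
            return;
        }
        // Assumed election rule: the live member with the lowest ID wins.
        manager = live.isEmpty() ? null : live.firstEntry().getValue();
        if (manager != null) {
            manager.managedSessions.putAll(node.managedSessions);
        }
        node.managedSessions.clear();
    }

    ClusterNode manager() {
        return manager;
    }
}

class ElectionDemo {
    public static void main(String[] args) {
        Cluster cluster = new Cluster();
        ClusterNode b = new ClusterNode("tomcat-b");
        ClusterNode a = new ClusterNode("tomcat-a");
        cluster.join(b);
        cluster.join(a);
        b.managedSessions.put("s1", "session data");
        cluster.leave(b);  // tomcat-a takes over and now holds s1
        System.out.println("manager: " + cluster.manager().id);
    }
}
```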
Personally, I like the idea of a separate standalone Session Manager
because of its simplicity. Furthermore, I'm in favor of an RMI-based
solution, partly because of its simplicity, but also because of its
power. RMI callbacks provide a very natural means of implementing this
type of collaborative client/server application.
The potential drawback to this approach is that the single remote server
could wind up being a bottleneck. However, it would generate much less
network traffic than the current solution. And in the type of environment
the 'cluster' is really targeted to run in (e.g. multiple boxes), network
traffic is more likely to be a major cause of latency than CPU load in
the Session Manager.
2) Cooperative Session management (extending the existing multicast
solution).
The existing solution would need to be modified to:
- broadcast updated/deleted Sessions to the other cluster members.
- provide a means for bringing a new Tomcat instance on-line (such that
  it can obtain all the stored Sessions).
The big drawback to the current approach, I feel, is that it generates
a lot of UDP traffic. Complete Sessions are broadcast to all members of
the cluster, and this would need to include all updated Session objects
as well. That is lots of traffic that may never be needed. Adding all
updated sessions to the UDP traffic increases the likelihood that UDP
packets will get lost, which in turn reduces confidence that all Tomcats
will have the 'same' Session data.
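To make the traffic argument concrete, here is a small sketch that compares the bytes needed to ship a complete session against the bytes needed to ship only its ID. It assumes session attributes are sent with standard Java serialization. The PayloadSize class and its method names are hypothetical, and the actual wire format used by the existing multicast code is not shown here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;

// Hypothetical helper for comparing the two notification styles.
class PayloadSize {
    // Bytes needed to ship a complete session (attributes via Java serialization).
    static int fullSessionBytes(HashMap<String, Serializable> attributes) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(attributes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    // Bytes needed to ship only the session ID.
    static int idOnlyBytes(String sessionId) {
        return sessionId.getBytes(StandardCharsets.UTF_8).length;
    }
}

class PayloadSizeDemo {
    public static void main(String[] args) {
        HashMap<String, Serializable> attributes = new HashMap<>();
        attributes.put("cart", new byte[4096]);  // stand-in for a few KB of session state
        String sessionId = "0123456789ABCDEF0123456789ABCDEF";
        System.out.println("full session: " + PayloadSize.fullSessionBytes(attributes) + " bytes");
        System.out.println("ID only:      " + PayloadSize.idOnlyBytes(sessionId) + " bytes");
    }
}
```

The gap grows with the amount of state an application keeps in the session, which is why sending only the Session ID on update (as in scenario 1) scales better than broadcasting every updated Session.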
Regards,
Tom Drake
President, software/etc inc.
Cell: 408-505-6864
Email: [EMAIL PROTECTED]
----------------------------------------------------------------
org.apache.sessionMgr.SessionServer.java:
----------------------------------------------------------------
package org.apache.sessionMgr;

import java.rmi.*;
import javax.servlet.http.HttpSession;

/**
 * The interface implemented by a Distributed Session Management server.
 * Clients (e.g. Tomcat instances) may register themselves with such
 * a server. Having done so, they must send any new or updated Sessions
 * to the server. Clients may retrieve Session objects from the server
 * (by sessionId). Clients must also pay attention to
 * <tt>invalidateSession</tt> messages received from the server and
 * update their internal Session caches accordingly.
 *
 * @author Tom Drake
 * @version 1.0
 */
public interface SessionServer extends Remote {

    /**
     * Register the given client (e.g. Tomcat) with a SessionServer instance.
     * After the client has registered itself with a SessionServer,
     * the SessionServer will start sending the client
     * <tt>invalidateSession(String sessionId)</tt> messages, as sessions
     * are updated or deleted.
     */
    public void register(SessionClient client) throws RemoteException;

    /**
     * Disassociate the client (e.g. Tomcat instance) from the Distributed
     * Session Management Server. The client will no longer receive messages
     * from the server, and the client may no longer send messages to the
     * server unless it re-registers itself with the server.
     */
    public void deRegister(SessionClient client) throws RemoteException;

    /**
     * Store the given sessionId / session association. No other clients
     * are notified of this; however, they may retrieve this session via
     * the <tt>getSession</tt> method.
     */
    public void addSession(String sessionId, HttpSession session)
        throws RemoteException;

    /**
     * Retrieve the Session that corresponds with the given sessionId from
     * the Session Server.
     *
     * @return a valid session object, or null if the sessionId was not
     *         defined or the session was not valid.
     */
    public HttpSession getSession(String sessionId) throws RemoteException;

    /**
     * Update the last access time of the corresponding Session. This method
     * is called by clients when a Session is accessed, but not otherwise
     * updated. Other clients are <b>not</b> notified in this case!
     */
    public void updateSessionAccessTime(String sessionId)
        throws RemoteException;

    /**
     * Replace the session associated with the given sessionId with the given
     * session object, and notify other clients that their copies, if any, are
     * now invalid.
     */
    public void updateSession(String sessionId, HttpSession session)
        throws RemoteException;

    /**
     * Inform the SessionServer to destroy the corresponding Session, and
     * notify other clients of its destruction.
     */
    public void destroySession(String sessionId) throws RemoteException;
}
----------------------------------------------------------------
org.apache.sessionMgr.SessionClient.java:
----------------------------------------------------------------
package org.apache.sessionMgr;

import java.rmi.*;

/**
 * The SessionClient interface is implemented by clients (e.g. Tomcat
 * instances) of the SessionServer. These methods are invoked on the client
 * by the server to notify the client that either the given session was
 * invalidated, or that the server will be going down shortly.
 *
 * @author Tom Drake
 * @version 1.0
 */
public interface SessionClient extends Remote {

    /**
     * Notify this client (e.g. Tomcat) that the Session data that
     * corresponds to the given Session Id is no longer valid. This could
     * mean that the Session has been updated, or that it has exceeded its
     * maximum allowable age. In any case, the client should respond to this
     * message by removing any corresponding Session objects from its own
     * internal cache.
     */
    public void invalidateSession(String sessionId) throws RemoteException;

    /**
     * Notify this client that the SessionServer will be shutting down.
     * Any further attempts to contact this SessionServer will not be
     * successful (unless and until it is restarted).
     */
    public void serverShutdownNotification() throws RemoteException;
}