RE: Tomcat: Distributed Session Management revisited
In JCS (Java Caching System), part of Turbine Stratum, there is a rough distributed session layer. JCS might also be useful for storing session data even if it is not distributed.

http://jakarta.apache.org/turbine/stratum/JavaCachingSystem.html

JavaGroups may be good for a JCS distribution plugin as well. I'll take a look.

Aaron

-Original Message-
From: Filip Hanik [mailto:[EMAIL PROTECTED]]
Sent: Friday, February 15, 2002 9:12 PM
To: Tomcat Developers List
Subject: Re: Tomcat: Distributed Session Management revisited

Hi,

Let me introduce myself. My name is Filip Hanik, and I just rejoined this mailing list since my time has freed up a little lately. I was looking through the source code and the archives and was wondering what the status of session replication in Tomcat is. When looking through the source code I saw the package org.apache.catalina.cluster.

Anyway, let me get to the point. I'm one of the developers on an open source project called JavaGroups (www.javagroups.com). JavaGroups is a reliable group communication protocol which is well suited as a messaging protocol between your cluster nodes. Built into JavaGroups you already have:

* Reliable multicasting
* Group membership changes (i.e. you will be notified when nodes in the cluster crash or start up)
* Highly configurable protocol stack
* Works out of the box

This means that if you already have a cluster implementation, it can use JavaGroups as its communication protocol.

Since I'm not a developer on Tomcat (Catalina) I of course cannot dictate the decisions being made on this subject, but if there is a possibility for me to help out with the clustering implementation I would be more than happy to do so.

Have a great weekend.

Filip

~ Namaste - I bow to the divine in you ~
Filip Hanik
Software Architect
[EMAIL PROTECTED]
www.filip.net

--
To unsubscribe, e-mail: mailto:tomcat-dev- [EMAIL PROTECTED]
For additional commands, e-mail: mailto:tomcat-dev- [EMAIL PROTECTED]
Re: Tomcat: Distributed Session Management revisited
Mika, Costin:

While pondering things yesterday, the thought occurred to me as well that perhaps this whole notion of distributed management has a wider application (beyond simply HttpSessions).

Having said that, I disagree that Tomcat should expose an 'object replication service' API to web-application writers. Such a thing is clearly NOT a standard. Tomcat developers, on the other hand, may over time find other uses for such a mechanism. If we were to create such a generic object replication service, it should be packaged separately from Tomcat.

This, however, is secondary to my primary goal. So, I will be providing abstractions where it seems to make the most sense. But I definitely want to keep things as simple as possible, and not 'over-design' this thing. I'd like to create a working solution that solves the real business problems (discussed ad nauseam on this list).

Tom

- Original Message -
From: Mika Goeckel [EMAIL PROTECTED]
To: Tomcat Developers List [EMAIL PROTECTED]
Sent: Thursday, November 15, 2001 3:24 AM
Subject: Re: Tomcat: Distributed Session Management revisited

| Costin,
|
| that point of view is really interesting. What about separating the
| distribution part from the integration part of an integrated solution?
| That would give users the option to use the transparent session
| replication or to use explicit object replication services.
| The former would ease their lives with the drawback of missing transaction
| support, the latter would give the app-developer all control over it.
|
| M.
|
| - Original Message -
| From: [EMAIL PROTECTED]
| To: Tomcat Developers List [EMAIL PROTECTED]
| Sent: Wednesday, November 14, 2001 6:27 PM
| Subject: Re: Tomcat: Distributed Session Management revisited
|
| To clarify: creating a Distributed Session Manager is a good idea, and
| something that would be great for users.
|
| My problem is with designing it at container-level, as an implementation
| of the servlet session API.
|
| Having all objects in a session distributed and no control or feedback is
| not good.
|
| You could have a DSMServlet that would have some init parameters in
| web.xml. There you can specify what classes/interfaces you want
| 'distributed', or even what attributes ( by name ).
|
| Then you can use the existing listeners and notifications to detect when
| those objects are saved in the session and do the distribution.
|
| You could also define a simple API allowing the user to control this - for
| example an update() method to be called after the user changes an object.
|
| What's different here is that the behavior of servlet sessions doesn't
| change - most objects can still be stored without an overhead. The user
| gets to choose what objects to persist/distribute and he has control over
| the process. And this will work in _any_ container, so the user can assume
| the objects he marks as persistent/distributable will be this way on any
| container ( you can't force people to switch to a different container to
| use your webapp ).
|
| You can also define specialized interfaces to be implemented by the
| objects that will be persisted/distributed.
|
| All of this can be done with only standard 2.3 support. A container may
| provide additional hooks ( valves, interceptors, etc ) of course.
|
| Costin
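Costin's opt-in idea quoted above can be sketched in a few lines. This is only an illustration under stated assumptions: `Replicator` and `SelectiveDistributor` are hypothetical names, the transport is left abstract, and the sketch is written against plain Java so it runs without a servlet container. In a real webapp the `attributeChanged` call would presumably be driven from the Servlet 2.3 `HttpSessionAttributeListener` callbacks, with the list of distributed names read from init parameters.

```java
import java.io.Serializable;
import java.util.HashSet;
import java.util.Set;

// Hypothetical transport; a real one might sit on multicast, TCP or JavaGroups.
interface Replicator {
    void send(String sessionId, String name, Serializable value);
}

// Hypothetical helper: distributes only the attribute names the webapp opted in to.
class SelectiveDistributor {
    private final Set<String> distributedNames;
    private final Replicator replicator;

    SelectiveDistributor(Set<String> distributedNames, Replicator replicator) {
        this.distributedNames = new HashSet<>(distributedNames);
        this.replicator = replicator;
    }

    // Intended to be called when an attribute is added to or replaced in a session.
    // Returns true if the value was handed to the replicator.
    boolean attributeChanged(String sessionId, String name, Object value) {
        if (!distributedNames.contains(name) || !(value instanceof Serializable)) {
            return false; // not opted in: stays an ordinary local session attribute
        }
        replicator.send(sessionId, name, (Serializable) value);
        return true;
    }

    // Explicit hook for objects that were mutated in place after being stored.
    boolean update(String sessionId, String name, Object value) {
        return attributeChanged(sessionId, name, value);
    }
}
```

The point of the design is that attributes outside the opt-in set behave exactly as in any other container, so the cost of distribution is paid only where the application asked for it.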
RE: Tomcat: Distributed Session Management revisited
Costin,

that point of view is really interesting. What about separating the distribution part from the integration part of an integrated solution? That would give users the option to use the transparent session replication or to use explicit object replication services. The former would ease their lives with the drawback of missing transaction support, the latter would give the app-developer all control over it.

More important is separating the implementation of the 'distribution' part as a stand-alone module. I'm absolutely -1 on putting this in the same space with the main tomcat4 - it's already a huge mess and adding this much complexity is going to make things worse.

It may be a good candidate for jakarta-commons, since it could be used, for example, in Avalon. I'm not even sure that 'Distributed Stuff Management' is Servlet Engine specific; it fits more in EJB repository land.
Re: Tomcat: Distributed Session Management revisited
Understood. I'm looking at a locking mechanism (with a configurable timeout). I'll be sending some diagrams and interface specs out in the next day or two.

Tom

- Original Message -
From: Craig R. McClanahan [EMAIL PROTECTED]
To: Tomcat Developers List [EMAIL PROTECTED]; Tom Drake [EMAIL PROTECTED]
Sent: Tuesday, November 13, 2001 11:41 AM
Subject: Re: Tomcat: Distributed Session Management revisited

| On Tue, 13 Nov 2001, Tom Drake wrote:
|
| I want a distributed session store, where all sessions are known (or
| are knowable) by all members of the cluster, with a built-in
| fail-over mechanism?
|
| As you guys discuss this, don't forget a very important requirement in the
| servlet specification with regards to distributable applications:
|
|   [Servlet Spec 2.3, Section 7.7.2] Within an application
|   marked as distributable, all requests that are part of a
|   session must be handled by one virtual machine at a time.
|
| In effect, this means that a session can be migrated to a different server
| only between requests. On a failure of the server currently handling
| the session, you could migrate it to a different server, but this
| operation must be atomic.
|
| Craig
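A locking mechanism with a configurable timeout, as mentioned above, could look roughly like the following. This is a sketch, not Tom's design: `SessionLocks` is a hypothetical name, it uses `java.util.concurrent` from later Java versions, and it only models the lock inside one JVM. A cluster-wide lock would need the same acquire/release contract implemented over the network.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper: one lock per session id, acquired with a configurable timeout.
class SessionLocks {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    SessionLocks(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Returns true if the calling thread now holds the session,
    // false if the wait timed out or the thread was interrupted.
    boolean acquire(String sessionId) {
        ReentrantLock lock = locks.computeIfAbsent(sessionId, id -> new ReentrantLock());
        try {
            return lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    // Releases the session, typically at the end of the request.
    void release(String sessionId) {
        ReentrantLock lock = locks.get(sessionId);
        if (lock != null && lock.isHeldByCurrentThread()) {
            lock.unlock();
        }
    }
}
```

Holding the lock for the duration of a request and releasing it afterwards matches the reading of the spec quoted above: a session may change hands only between requests, and a caller that cannot get the lock within the timeout can fail fast instead of blocking forever.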
Re: Tomcat: Distributed Session Management revisited
- Original Message -
From: [EMAIL PROTECTED]
To: Tomcat Developers List [EMAIL PROTECTED]
Sent: Wednesday, November 14, 2001 12:26 AM
Subject: Re: Tomcat: Distributed Session Management revisited

On Tue, 13 Nov 2001, Mika Goeckel wrote:

I completely agree that the API lacks proactive support for things in the background that may fail. But given the fact that we support a reference implementation which has managed to provide really professional services to users (other ref implementations are just for demonstration, nobody would use them in production) and there are (commercial) solutions that provide session fail-over within the limitations of this API, we **must** try to provide a

Well, the cool thing about open source is that we _don't_ need to implement all the bloat that commercial solutions have.

Costin, I don't disagree with your opinion. We don't need to, because we work on a voluntary basis. But don't you think that having the option to provide better or at least equally professional solutions is a good motivation?

solution. The API does not specify how often the container may try to provide that service or what means it utilizes to do that. Nothing is 100% and I think it is better to live with the uncertainty we discuss here than with the more likely problem that an instance fails and there is no potential replacement.

I think it's better to live with the certainty that everything can ( and will ) fail and tomcat can't change this. The alternative is to give users the impression the data they put in a session will be safe - and they may rely on that instead of using a transaction and a real API. Databases, EJB, etc are complex - but there's a reason for that. Well, we could argue about how much complexity is actually needed, but one thing is certain ( I hope ) - get/setAttribute is not enough; if you want data integrity you must use a different API ( in particular transactions ).

Byte-comparison is not the worst solution. If we think about differential updates, byte comparison is a good candidate for that, and moreover one that promises good performance.

Byte compare every 5 seconds every object in session ? Let's say you just displayed the confirmation and charged the credit card, but the machine crashed just before you sent the order ( or the reverse - you sent it but didn't charge the credit card ). This should happen in below 5 seconds.

Yep, but a single stand-alone instance is not invulnerable to that scenario. In fact a thoroughly designed cluster gives better chances.

If the user wants to place things in a session that she does not need to be replicated, she has the option to declare them transient and write a getter that checks if the attribute is present, otherwise reconstructs it (in the case of a picture, reloads it from disk). The user has the choice to design for performance or ease. We only need to document the options.

So the user should change all his objects to implement some arbitrary pattern just to fit this into our solution ? What if the object is not user defined ( like most are ) ? Well, we have to create wrappers for each object you store in a session. Try to explain this on tomcat-user ( or tomcat-dev ) ...

If the only alternative is to use a professional EJB server, he would need to change them as well. I don't say he has to mark these values transient, it's only an option. And transient is not an arbitrary option, it's core java since JLS1.1 (1998).

Sessions have always been somewhat fragile, but as the container is free to use transactions when the session is passed to another instance, at least that can be made secure enough. So the guarantee to the user of the container would not be made weaker. If the transaction fails, the session stays with the JVM where it originally was. The fail-over functionality would not be possible, but the situation for the app-developer would stay stable.

I think that the documentation must clearly communicate to app-developers the risks and shortfalls of a distributed application and then let them choose by themselves what best meets their requirements.

Mika
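The transient-plus-getter option Mika describes can be illustrated as follows. This is a minimal sketch with made-up names (`ProductView`, `loadImage`, `roundTrip`); `loadImage` merely stands in for reloading a picture from disk, and `roundTrip` simulates what happens when the attribute is serialized to another node.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

// Hypothetical session attribute: the path is replicated, the bulky image is not.
class ProductView implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String imagePath;      // small, travels with the session
    private transient byte[] imageBytes; // large, skipped by serialization

    ProductView(String imagePath) {
        this.imagePath = imagePath;
    }

    boolean isImageLoaded() {
        return imageBytes != null;
    }

    // Getter that rebuilds the transient value if it is missing,
    // for example on a node that received a replicated copy.
    byte[] getImage() {
        if (imageBytes == null) {
            imageBytes = loadImage(imagePath);
        }
        return imageBytes;
    }

    // Stand-in for reading the picture from disk (assumption for this sketch).
    private static byte[] loadImage(String path) {
        return ("pixels of " + path).getBytes(StandardCharsets.UTF_8);
    }

    // Simulates replication by serializing and deserializing the object.
    static ProductView roundTrip(ProductView original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (ProductView) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After `roundTrip`, the copy arrives without the image bytes and reconstructs them on first access, which is the trade-off under discussion: less data on the wire in exchange for a pattern the application author has to adopt.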
Re: Tomcat: Distributed Session Management revisited
[EMAIL PROTECTED] wrote:

On Tue, 13 Nov 2001, Paul Speed wrote:

I think the idea is that you'd byte compare on commit, which ideally would happen at request boundaries. So in this case a single request becomes a transaction... which indeed opens up its own issues, but no bigger than the ones that were always there.

Not good enough - when the request is completed the user already has the page confirming his order ( and maybe the card was already charged :-).

Yes, but all we have are requests in a session. In fact, that's the same case that fails in _every_ scenario that doesn't involve full EJB-like transaction support. As soon as you access one single piece of data that isn't covered by the transaction support, you lose some amount of failover recovery.

And what's worse, far too many people will not realize that, and read the marketing stuff ( 'we support failover, session replication, etc' ) and believe it is a magic solution.

That being said, there may still be a place for a session-based distribution mechanism that can support load balancing, hot-swapping of tomcats, and basic failover. It should definitely be an opt-in sort of thing though, i.e. web apps that meet the restrictions can opt to set up tomcat to provide this feature.

I agree it would be nice to have a tool that can store objects with fail-over, distribution, etc and to use it as a _complement_ to the session ( maybe using the session id, expiration, etc ). I don't think this tool can be used using only the current servlet session API or that it should be used as a servlet session manager.

distributed session environment. I think that's a given. Personally, I'm still trying to figure out if there are a large enough number of webapps that could be supported to make it worth the effort. (Heavy emphasis on effort.)

I'm more worried about the number of webapps that would be written with the assumption that the session will be magically safe, instead of using transactions/database/EJB/ or whatever storage API.

Costin
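For readers wondering what "byte compare on commit" might mean in practice, here is one possible reading, offered as a sketch only. `ByteCompareTracker` and `commit` are hypothetical names. The idea is to serialize each attribute at the request boundary and treat it as changed when the bytes differ from what was sent last time. Removed attributes are not handled, and none of the transactional concerns raised in this thread are addressed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical change detector based on comparing serialized bytes.
class ByteCompareTracker {
    // Serialized form of each attribute as of the previous commit.
    private final Map<String, byte[]> lastSent = new HashMap<>();

    static byte[] serialize(Serializable value) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Meant to run once per request, after the response is complete.
    // Returns only the attributes whose serialized bytes differ from the last commit.
    Map<String, byte[]> commit(Map<String, ? extends Serializable> attributes) {
        Map<String, byte[]> changed = new HashMap<>();
        for (Map.Entry<String, ? extends Serializable> entry : attributes.entrySet()) {
            byte[] now = serialize(entry.getValue());
            if (!Arrays.equals(now, lastSent.get(entry.getKey()))) {
                lastSent.put(entry.getKey(), now);
                changed.put(entry.getKey(), now);
            }
        }
        return changed;
    }
}
```

This catches objects that were mutated in place without a new setAttribute call, at the price of serializing every attribute on every request, which is part of the overhead the thread is arguing about.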
Re: Tomcat: Distributed Session Management revisited
Pier, Tom,

cool, the discussion is starting to become interesting. :-)

comments below:

- Original Message -
From: Pier Fumagalli [EMAIL PROTECTED]
To: Tomcat Developers List [EMAIL PROTECTED]
Sent: Tuesday, November 13, 2001 3:04 AM
Subject: Re: Tomcat: Distributed Session Management revisited

On 13/11/2001 12:54 am, Tom Drake [EMAIL PROTECTED] wrote:

Mika:

Thanks for the reply. Here's some more thoughts on this subject.

The primary problem that I see with the collaborative method (e.g. extending the multicast solution) is that all sessions will have to be sent to all cluster nodes. The number of session updates that have to travel 'on the wire' is in relation to the number of nodes in the cluster.

Linear growth, that's the best we can aim for...

Furthermore, when a new tomcat is brought on-line, it must somehow retrieve a copy of all active sessions from somewhere. There is nothing in place for this currently. Using multicast is problematic. If a multicast request is made then all other nodes would respond with all sessions. So, some other approach would need to be taken which would result in two protocols being used to make this feature work. This seems too complicated.

Not that complicated. Most of the work on elective processes has been done already in the scope of other projects, so, we would only need to adapt it to our scope...

I agree with Pier, in my view that's separating the application layer (content) from the transportation control layer (where, how).

---
Consider this scenario:

A user establishes a session on node 1 (of a 10 node cluster). Tomcat would create a new session and transmit it to the multicast port, which would then transmit 10 copies of this session (1 to each cluster node). Now suppose that the next request from this user is sent to node 2, which causes an update to the session to occur. Again 11 copies of the Session are transferred. [...]

NOTE: remember this is UDP traffic. The more packets that fly around, the greater the likelihood of dropping packets. Dropped packets in this case means that some tomcat instances may have stale (or no) data for a given session.

Indeed... Quite huge...

Yes, multicast udp should only be used to autoconfigure the cluster (who's there, who's taking sessions etc.), which should also be configurable for non-multicast environments. In that case lists of addresses are used to select who's the next to take over. In fact, if all nodes are holding information about the peers, we don't need to have long lists. An upcoming node would need only one already configured node to ask about the cluster's spread via TCP. Its join could be communicated via daisy-chain (one message per member is linear).

--
With a centralized session manager the following traffic would occur instead:

  node 1 sends new session to server manager
  node 2 requests the given (session id) session from the server manager
  manager sends a copy of the session to node 2
  node 2 updates the session and sends it back to the manager
  manager invokes the 'invalidateSession(sessionId)' method on each of the nodes
  (note: invalidateSession only contains the value of 'SessionId' + any additional RMI overhead. This is far smaller than a complete Session object)

The number of session copies sent as the result of an update is 2. This number does not depend or vary based on the number of nodes.

Now, let's add to the story. Let's say that Tomcat is smart enough to cache Session objects in its memory space. Once Tomcat gets its hands on a 'Session' it keeps it until it becomes 'too old' or an 'invalidateSession(sessionId)' message is received from the remote Session Manager. This could cut down the number of transfers of Session data from 2 to somewhere between 1 and 2.

Yes, but in this case, we don't have redundancy of sessions... So, if the Tomcat which has the session dies, the whole session dies with him...

-
On Redundant Session Managers.

There are a couple of ways to achieve this. One way is to place two Session Managers in the network. One of them is the 'active' one, the other one could simply register itself as a client of the 'active' server. As a client, it can obtain copies of all new and changed sessions from the active server. If for some reason the active server needs to be brought down, it will send a message to all of its clients (including the 'dormant' session manager) indicating that it's shutting down. The clients could, on receipt of this message, connect to the 'next' session server (in their pre-configured list of servers). The clients could simply carry on with the new server.

Indeed...

If the active server simply goes off the air for some mysterious reason. The clients would get
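The centralized exchange described in this message (fetch, update, send back, invalidate everywhere else) can be modelled in memory to check the "2 copies per update" arithmetic. This is a toy sketch under loud assumptions: `CentralManager` and `Node` are hypothetical names, session data is reduced to a String, and direct method calls stand in for what the thread imagines as RMI.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical central session manager; counts full session transfers only.
class CentralManager {
    private final Map<String, String> store = new HashMap<>();
    private final List<Node> nodes = new ArrayList<>();
    int fullCopiesSent = 0; // invalidation messages are deliberately not counted

    void register(Node node) {
        nodes.add(node);
    }

    // A node asks for the current copy of a session: one full transfer.
    String fetch(String sessionId) {
        fullCopiesSent++;
        return store.get(sessionId);
    }

    // A node sends back a new or updated session: one full transfer,
    // followed by small invalidate messages that carry only the id.
    void save(Node from, String sessionId, String data) {
        fullCopiesSent++;
        store.put(sessionId, data);
        for (Node node : nodes) {
            if (node != from) {
                node.invalidateSession(sessionId);
            }
        }
    }
}

// Hypothetical Tomcat node that caches sessions until told they are stale.
class Node {
    private final CentralManager manager;
    private final Map<String, String> cache = new HashMap<>();

    Node(CentralManager manager) {
        this.manager = manager;
        manager.register(this);
    }

    void invalidateSession(String sessionId) {
        cache.remove(sessionId);
    }

    String read(String sessionId) {
        return cache.computeIfAbsent(sessionId, manager::fetch);
    }

    void write(String sessionId, String data) {
        cache.put(sessionId, data);
        manager.save(this, sessionId, data);
    }
}
```

With ten nodes registered, a read followed by a write on a node that has no cached copy costs two full transfers regardless of cluster size, and a node that already holds a fresh copy pays only for the write, which is the "between 1 and 2" case. The sketch says nothing about the redundancy objection raised in the reply.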
Re: Tomcat: Distributed Session Management revisited
Subject: Re: Tomcat: Distributed Session Management revisited

| On 13/11/2001 12:54 am, Tom Drake [EMAIL PROTECTED] wrote:
|
| Mika:
|
| Thanks for the reply. Here's some more thoughts on this subject.
|
| The primary problem that I see with the collaborative method
| (e.g. extending the multicast solution) is
| that all sessions will have to be sent to all cluster nodes. The
| number of session updates that have to travel 'on the wire' is in
| relation to the number of nodes in the cluster.
|
| Linear growth, that's the best we can aim for...
|
| Furthermore, when a new tomcat is brought on-line, it must
| somehow retrieve a copy of all active sessions from somewhere.
| There is nothing in place for this currently. Using multicast
| is problematic. If a multicast request is made then all other nodes
| would respond with all sessions. So, some other approach would
| need to be taken which would result in two protocols being used
| to make this feature work. This seems too complicated.
|
| Not that complicated. Most of the work on elective processes has been done
| already in the scope of other projects, so, we would only need to adapt it
| to our scope...
|
| ---
| Consider this scenario:
|
| A user establishes a session on node 1 (of a 10 node cluster).
| Tomcat would create a new session and transmit it to the
| multicast port, which would then transmit 10 copies of this
| session (1 to each cluster node).
| Now suppose that the next request from this user is sent to
| node 2, which causes an update to the session to occur. Again
| 11 copies of the Session are transferred.
| [...]
| NOTE: remember this is UDP traffic. The more packets that
| fly around, the greater the likelihood of dropping packets.
| Dropped packets in this case means that some tomcat
| instances may have stale (or no) data for a given session.
|
| Indeed... Quite huge...
|
| --
| With a centralized session manager the following traffic would
| occur instead:
|
|   node 1 sends new session to server manager
|   node 2 requests the given (session id) session from the server manager
|   manager sends a copy of the session to node 2
|   node 2 updates the session and sends it back to the manager
|   manager invokes the 'invalidateSession(sessionId)' method on each of the nodes
|   (note: invalidateSession only contains the value of 'SessionId' + any
|   additional RMI overhead. This is far smaller than a complete Session object)
|
| The number of session copies sent as the result of an update is 2.
| This number does not depend or vary based on the number of nodes.
|
| Now, let's add to the story. Let's say that Tomcat is smart enough to cache
| Session objects in its memory space. Once Tomcat gets its hands on a
| 'Session' it keeps it until it becomes 'too old' or an
| 'invalidateSession(sessionId)' message is received from the remote Session
| Manager. This could cut down the number of transfers of Session data from 2
| to somewhere between 1 and 2.
|
| Yes, but in this case, we don't have redundancy of sessions... So, if the
| Tomcat which has the session dies, the whole session dies with him...
|
| -
| On Redundant Session Managers.
|
| There are a couple of ways to achieve this. One way is to place two Session
| Managers in the network. One of them is the 'active' one, the other one could
| simply register itself as a client of the 'active' server. As a client, it can
| obtain copies of all new and changed sessions from the active server. If for
| some reason the active server needs to be brought down, it will send a message
| to all of its clients (including the 'dormant' session manager) indicating
| that it's shutting down. The clients could, on receipt of this message,
| connect to the 'next' session server (in their pre-configured list of
| servers). The clients could simply carry on with the new server.
|
| Indeed...
|
| If the active server simply goes off the air for some mysterious reason. The
| clients would get a RemoteException the next time they tried to talk to the
| server. This would be their clue to 'cut-over' to the other server (as
| described above).
|
| But how would they know where the sessions ended up
|
| Last point. Sending Session deltas instead of the entire Session:
|
| This should be doable. The main thing that we care about are Session
| attributes which are changed by the application. It's up to the
| web-application to replace these values into the Session if their contents
| change. This is enough for us to be able to track which attributes have
| actually changed.
|
| This can actually be done if we consider every operation on a session
| (adding/replacing/removing an attribute) an atomic operation
|
| Let's see if I can complicate things a little bit :) (Love doing that).
|
| Let's imagine to have
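The "session deltas" point quoted above relies on the application calling setAttribute again whenever a value changes. Under that assumption, tracking what changed is straightforward. The following is a sketch with hypothetical names (`DeltaTrackingSession`, `SessionDelta`, `drainDelta`), not an existing Tomcat class.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical value object: what has to travel instead of the whole session.
class SessionDelta {
    final Map<String, Object> changed;
    final Set<String> removed;

    SessionDelta(Map<String, Object> changed, Set<String> removed) {
        this.changed = changed;
        this.removed = removed;
    }

    boolean isEmpty() {
        return changed.isEmpty() && removed.isEmpty();
    }
}

// Hypothetical session that records add/replace/remove operations since the last drain.
class DeltaTrackingSession {
    private final Map<String, Object> attributes = new HashMap<>();
    private final Map<String, Object> changed = new HashMap<>();
    private final Set<String> removed = new HashSet<>();

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        changed.put(name, value);
        removed.remove(name); // a re-added attribute is no longer a removal
    }

    void removeAttribute(String name) {
        if (attributes.remove(name) != null) {
            changed.remove(name);
            removed.add(name);
        }
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }

    // Returns everything recorded since the previous call and starts a fresh delta.
    SessionDelta drainDelta() {
        SessionDelta delta = new SessionDelta(new HashMap<>(changed), new HashSet<>(removed));
        changed.clear();
        removed.clear();
        return delta;
    }
}
```

Calling `drainDelta()` at the end of each request yields only the attributes touched during that request. Objects mutated in place without a new setAttribute call are invisible to this scheme, which is exactly the limitation the quoted text accepts.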
Re: Tomcat: Distributed Session Management revisited
Pier, Mika:

I agree, I think the juices are flowing. See below.

Tom

- Original Message -
From: Mika Goeckel [EMAIL PROTECTED]
To: Tomcat Developers List [EMAIL PROTECTED]
Sent: Tuesday, November 13, 2001 1:37 AM
Subject: Re: Tomcat: Distributed Session Management revisited

| Pier, Tom,
|
| cool, the discussion is starting to become interesting. :-)
|
| comments below:
|
| - Original Message -
| From: Pier Fumagalli [EMAIL PROTECTED]
| To: Tomcat Developers List [EMAIL PROTECTED]
| Sent: Tuesday, November 13, 2001 3:04 AM
| Subject: Re: Tomcat: Distributed Session Management revisited
|
| On 13/11/2001 12:54 am, Tom Drake [EMAIL PROTECTED] wrote:
|
| Mika:
|
| Thanks for the reply. Here's some more thoughts on this subject.
|
| The primary problem that I see with the collaborative method
| (e.g. extending the multicast solution) is
| that all sessions will have to be sent to all cluster nodes. The
| number of session updates that have to travel 'on the wire' is in
| relation to the number of nodes in the cluster.
|
| Linear growth, that's the best we can aim for...
|
| Furthermore, when a new tomcat is brought on-line, it must
| somehow retrieve a copy of all active sessions from somewhere.
| There is nothing in place for this currently. Using multicast
| is problematic. If a multicast request is made then all other nodes
| would respond with all sessions. So, some other approach would
| need to be taken which would result in two protocols being used
| to make this feature work. This seems too complicated.
|
| Not that complicated. Most of the work on elective processes has been
| done already in the scope of other projects, so, we would only need to
| adapt it to our scope...
|
| I agree with Pier, in my view that's separating the application layer
| (content) from the transportation control layer (where, how).
|

Point taken, however, I strongly believe in keeping things simple. I'd not want to introduce extra communication channels unless there is a REALLY good reason to do so.

| ---
| Consider this scenario:
|
| A user establishes a session on node 1 (of a 10 node cluster).
| Tomcat would create a new session and transmit it to the
| multicast port, which would then transmit 10 copies of this
| session (1 to each cluster node).
| Now suppose that the next request from this user is sent to
| node 2, which causes an update to the session to occur. Again
| 11 copies of the Session are transferred.
| [...]
| NOTE: remember this is UDP traffic. The more packets that
| fly around, the greater the likelihood of dropping packets.
| Dropped packets in this case means that some tomcat
| instances may have stale (or no) data for a given session.
|
| Indeed... Quite huge...
|
| Yes, multicast udp should only be used to autoconfigure the cluster (who's
| there, who's taking sessions etc.), which should also be configurable for
| non-multicast environments. In that case lists of addresses are used to
| select who's the next to take over. In fact, if all nodes are holding
| information about the peers, we don't need to have long lists. An upcoming
| node would need only one already configured node to ask about the cluster's
| spread via TCP. Its join could be communicated via daisy-chain (one message
| per member is linear).
|

This is certainly a reasonable approach. However, we could use the SNMP approach to auto-discovery instead of multicast. This would simplify configuration (e.g. no need for a multicast port) and achieve the goal in a fairly standard way. Having said that, I'm not lobbying for this approach, simply tossing it into the ether.

One last comment about the 'anything-but simple network management protocol'. Limited SNMP MIB implementation support within tomcat / session server may be helpful in those environments that depend heavily on SNMP management (read: telcos). Again, I really don't want to take a left turn here, it just made me think about SNMP.

| --
| With a centralized session manager the following traffic would
| occur instead:
|
|   node 1 sends new session to server manager
|   node 2 requests the given (session id) session from the server manager
|   manager sends a copy of the session to node 2
|   node 2 updates the session and sends it back to the manager
|   manager invokes the 'invalidateSession(sessionId)' method on each of the nodes
|   (note: invalidateSession only contains the value of 'SessionId' + any
|   additional RMI overhead. This is far smaller than a complete Session object)
|
| The number of session copies sent as the result of an update is 2.
| This number does not depend or vary based on the number of nodes.
|
| Now, let's add to the story. Let's say that Tomcat is smart enough to
| cache Session objects in its memory space. Once Tomcat gets its hands
Re: Tomcat: Distributed Session Management revisited
SNMP, ah ja. I've got no knowledge at all 'bout that, so fight with some other lobbyists :-) SessionManager/ServletContainer dualism: If we don't create a separate SessionManager residing in it's own JVM, but make it an integral capability of TC, we have the following benefits: - we save one copy: When a new session is created and we have a separate network of SMs, it needs to be copied to at least two SMs, if we have it in TC, it only needs to be copied to one other TC. (If we aim single redundance) - if one TC is the owner and the other the escrow, the owner never needs to ask if the session is uptodate or invalid, and it can't get stale. The replication of changes can be done after the request, so no time burden within the request itself. If the escrow want's to use the session, it only needs to inform the owner and they change roles (or if possible the escrow passes the request back to the owner) Frequently all servers ping their known escrows and owners to ensure they're still present. - deserialisation should not be a problem, because in that ClassLoader Context, the user-session objects are known. (correct me if I'm wrong here) AutoConf what about JNDI to register cluster nodes? It is around anyway. In that case an upcoming TC would just search the JNDI service for registered nodes with his own ClusterName, and register itself with it. Getting back a NamingEnumeration, it could decide itself, which of the others to link with. Mika - Original Message - From: Tom Drake [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 4:47 PM Subject: Re: Tomcat: Distributed Session Management revisited Pier, Mikal: I agree, I think the juices are flowing. 
See below Tom - Original Message - From: Mika Goeckel [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 1:37 AM Subject: Re: Tomcat: Distributed Session Management revisited | Pier, Tom, | | cool, the discussion is starting to become interesting. :-) | | comments below: | | - Original Message - | From: Pier Fumagalli [EMAIL PROTECTED] | To: Tomcat Developers List [EMAIL PROTECTED] | Sent: Tuesday, November 13, 2001 3:04 AM | Subject: Re: Tomcat: Distributed Session Management revisited | | | On 13/11/2001 12:54 am, Tom Drake [EMAIL PROTECTED] wrote: | | Mika: | | Thanks for the reply. Here's some more thoughts on this subject. | | The primary problem that I see with the collaborative method | (e.g. extending the multicast solution) is | that all sessions will have to be sent to all cluster nodes. The | number session updates that have to travel 'on the wire' is in | relation to the number of nodes in the cluster. | | Linear growth, that's the best we can aim for... | | Further more, when a new tomcat is brought on-line, it must | somehow retrieve a copy of all active sessions from somewhere. | There is nothing in place for this currently. Using multicast | is problematic. If a multicast request is made then all other nodes | would respond with all sessions. So, some other approach would | need to be taken which would result in two protocols being used | to make this feature work. This seems too complicated. | | Not that complicated. Most of the work on elective processes has been | done | already in the scope of other projects, so, we would only need to adapt it | to our scope... | | I agree with Pier, in my view that's separating the application layer | (content) from the transportation control layer (where, how). | Point taken, however, I strongly believe in keeping things simple. I'd not want to introduce extra communication channels unless there there is a REALLY good reason to do so. 
| | --- | Consider this scenario: | | A user establishes a session on node 1 (of a 10 node cluster), | Tomcat would create a new session and transmit it to the | multicast port, which would then transmit 10 copies of this | session (1 to each cluster node). | Now suppose that the next request from this user is sent to | node 2, which causes an update to the session to occur. Again | 11 copies of the Session are transferred. | [...] | NOTE: remember this is UDP traffic. The more packets that | fly around, the greater the likely-hood of dropping packets. | Dropped packets in this case means that some tomcat | instances may have stale (or no) data for a given session. | | Indeed... Quite huge... | | Yes, multicast udp should only be used to autoconfigure the cluster (who's | there, who's taking sessions etc.), which should also be configurable for | non-multicast-environments. In that case lists of adresses are used to | select who's the next to take over. In fact, if all node's are holding | information about the peers, we don't need to have long lists
Re: Tomcat: Distributed Session Management revisited
On 13/11/2001 04:38 pm, Mika Goeckel [EMAIL PROTECTED] wrote: SNMP, ah ja. I've got no knowledge at all 'bout that, so fight with some other lobbyists :-) Same here... SessionManager/ServletContainer dualism: If we don't create a separate SessionManager residing in it's own JVM, but make it an integral capability of TC, we have the following benefits: - we save one copy: When a new session is created and we have a separate network of SMs, it needs to be copied to at least two SMs, if we have it in TC, it only needs to be copied to one other TC. (If we aim single redundance) Indeed it would save bandwidth... - if one TC is the owner and the other the escrow, the owner never needs to ask if the session is uptodate or invalid, and it can't get stale. The replication of changes can be done after the request, so no time burden within the request itself. If the escrow want's to use the session, it only needs to inform the owner and they change roles (or if possible the escrow passes the request back to the owner) Frequently all servers ping their known escrows and owners to ensure they're still present. The only problem I could see with that is synchronization of accesses from different points, but I believe that is a solvable problem... - deserialisation should not be a problem, because in that ClassLoader Context, the user-session objects are known. (correct me if I'm wrong here) Nope, you're right on that. AutoConf what about JNDI to register cluster nodes? It is around anyway. In that case an upcoming TC would just search the JNDI service for registered nodes with his own ClusterName, and register itself with it. Getting back a NamingEnumeration, it could decide itself, which of the others to link with. One thing that can be done with my approach of multicasting is automatic load balancing... 
To any request asking who can hold this session, each manager can return a load index, and the decision on where the session should be stored (primary and replica) should be based on that. That can be done using JNDI too, but I don't want to end up in a situation where a single host holds 80% of the sessions while the others are idle... If the managers could update their JNDI registrations with a load factor every X seconds, that would be acceptable...

Pier
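The load-index idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (`LoadIndexPlacement`, `choose`) are hypothetical, the load index is assumed to be a plain number where lower means less loaded, and how the answers are collected (multicast reply or JNDI lookup) is left out entirely.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: pick a primary and a replica holder from reported load indexes. */
final class LoadIndexPlacement {

    /** Result of a placement decision; replica is null when only one node answered. */
    static final class Placement {
        final String primary;
        final String replica;

        Placement(String primary, String replica) {
            this.primary = primary;
            this.replica = replica;
        }
    }

    /**
     * Chooses the least-loaded node as primary and the next least-loaded as replica.
     * Ties are broken by node name so the result is deterministic.
     */
    static Placement choose(Map<String, Double> loadByNode) {
        if (loadByNode.isEmpty()) {
            throw new IllegalArgumentException("no node answered the placement request");
        }
        List<String> names = new ArrayList<>(loadByNode.keySet());
        names.sort(Comparator
                .comparingDouble((String n) -> loadByNode.get(n))
                .thenComparing(Comparator.naturalOrder()));
        String replica = names.size() > 1 ? names.get(1) : null;
        return new Placement(names.get(0), replica);
    }
}
```

With three nodes reporting 0.8, 0.2 and 0.5, the node at 0.2 would become primary and the node at 0.5 the replica. Refreshing the map every X seconds, as suggested above, would keep a single host from accumulating most of the sessions.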
Re: Tomcat: Distributed Session Management revisited
Hi,

Interesting discussion, it's good to see some on this distribution issue; the devil's always in the detail! See comments.

> | But how would they know where the sessions ended up
>
> All session managers keep a copy of all sessions. So, it doesn't matter which server a client talks to.

This idea of all SMs keeping copies of all sessions seems to go against the point of having multiple SMs. If increasing SMs is a way to provide redundancy, then this approach removes scalability. If increasing SMs is to provide scalability, then having all sessions in all managers won't work: if you hit the bottleneck, you can't just add a new SM to solve the load problem.

Why do SMs need to have copies of all sessions? Why not just keep the sessions that are created in an SM there? The session ID can contain the SM service id, which can be adopted by a service that takes over from any failed one.

> If we become more granular, such that individual session attribute changes are communicated independently (rather than a complete copy of the session) our need for locking may diminish. However, I'm still not convinced about the need for locking to begin with.

Agreed, I am not convinced locking is necessary either. I have a distributed session implementation that uses session proxies that talk to services and pass only deltas of changes to the session. Session services notify proxies that hold the session of changes, which get refreshed lazily when required. SMs are remote over RMI and the proxy is local to the container.

Rgds
Antony

--
Antony Bowesman
Teamware Group
[EMAIL PROTECTED]
phone: +358 9 5128 2562
fax : +358 9 5128 2705
intra / extra / Internet solutions at www.teamware.com
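The suggestion that the session ID can carry the SM service id might look something like the sketch below. Everything concrete here is an assumption made for illustration: the `<random>.<serviceId>` layout, the class name `SessionIdRouting`, and the in-memory routing table are not taken from any existing implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: route a session to its manager via a service id embedded in the session ID. */
final class SessionIdRouting {

    // serviceId -> address of the session manager currently answering for that id
    private final Map<String, String> addressByServiceId = new HashMap<>();

    /** Builds an ID of the assumed form "<randomPart>.<serviceId>". */
    static String newSessionId(String randomPart, String serviceId) {
        return randomPart + "." + serviceId;
    }

    /** Extracts the service id, i.e. everything after the last dot. */
    static String serviceIdOf(String sessionId) {
        int dot = sessionId.lastIndexOf('.');
        if (dot < 0 || dot == sessionId.length() - 1) {
            throw new IllegalArgumentException("no service id in session ID: " + sessionId);
        }
        return sessionId.substring(dot + 1);
    }

    /** Records which manager address serves a given service id. */
    void register(String serviceId, String address) {
        addressByServiceId.put(serviceId, address);
    }

    /** A surviving manager adopts the service id of a failed one; existing session IDs stay valid. */
    void adopt(String failedServiceId, String survivorAddress) {
        addressByServiceId.put(failedServiceId, survivorAddress);
    }

    /** Returns the address currently responsible for the session, or null if unknown. */
    String locate(String sessionId) {
        return addressByServiceId.get(serviceIdOf(sessionId));
    }
}
```

The point of the design is that the client-visible session ID never changes on fail-over; only the mapping from service id to manager does.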
RE: Tomcat: Distributed Session Management revisited
> Pier:
>
> Great discussion points. I really appreciate your thoughtful feedback.
>
> My comment about Tomcat caching session data does not preclude it from being stored in the remote session server. Indeed, this would be required. My thought was this: in a multi-node network, if multiple contiguous requests (for the same session) are handled by the same tomcat node, then that tomcat node should not be forced to retrieve a copy of the session from the session server for each request. It only needs to go back to the session server for the session if it doesn't have a 'valid' copy. Remember that if another tomcat instance causes the session to be updated, then the server will tell all the clients to invalidate that session. This caching also works when intervening requests are handled by more than one node, provided they do not actually update the session attributes.
>
> Notice also, in my concept, there are no delays built into the architecture (other than the natural delays caused by sending data over the network). The session server can simply respond to callers on-demand.

I discussed this subject some time ago, including adding the session stuff to the Connector; maybe webapp/warp could be tuned for this purpose.

It's clear that persistence and replication of session data are needed for HA systems, and many solve it by using the EJB backend as repository, WebSphere for example. In some cases it also appears very expensive to add this persistence automatically (maybe something to be added to server.xml; some webapps could even live without session data, in a soft restart mode).

There is also the problem of finding a fast and portable network protocol (multicast may not run on all systems, broadcast UDP needs recovery code).

What's the status of mod_backhand (http://www.backhand.org/) and Apache?
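The caching behaviour quoted above (keep a local copy, go back to the session server only when the copy is no longer 'valid') can be sketched as below. This is a minimal illustration, not Tomcat code: `CachedSessionStore` is a hypothetical name, the session is modelled as a plain attribute map, and the remote session server is replaced by a `Function` stand-in so the example stays self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch of a per-node session cache that is dropped on invalidation. */
final class CachedSessionStore {

    private final Map<String, Map<String, Object>> cache = new ConcurrentHashMap<>();
    // Stand-in for the remote call that fetches a session from the session server.
    private final Function<String, Map<String, Object>> sessionServer;
    private final AtomicInteger remoteFetches = new AtomicInteger();

    CachedSessionStore(Function<String, Map<String, Object>> sessionServer) {
        this.sessionServer = sessionServer;
    }

    /** Returns the cached copy if one exists; otherwise fetches it from the server once. */
    Map<String, Object> get(String sessionId) {
        return cache.computeIfAbsent(sessionId, id -> {
            remoteFetches.incrementAndGet();
            return sessionServer.apply(id);
        });
    }

    /** Called when the server reports that another node updated the session. */
    void invalidateSession(String sessionId) {
        cache.remove(sessionId);
    }

    /** Number of round trips made to the session server so far. */
    int remoteFetches() {
        return remoteFetches.get();
    }
}
```

Contiguous requests for the same session on one node then cost a single fetch; the next fetch happens only after an `invalidateSession` call for that ID.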
Re: Tomcat: Distributed Session Management revisited
- Original Message - From: Pier Fumagalli [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 8:56 AM Subject: Re: Tomcat: Distributed Session Management revisited | On 13/11/2001 04:38 pm, Mika Goeckel [EMAIL PROTECTED] wrote: | | SNMP, ah ja. I've got no knowledge at all 'bout that, so fight with some | other lobbyists :-) | | Same here... Didn't mean to take a left turn. Sorry I mentioned it. | | SessionManager/ServletContainer dualism: | If we don't create a separate SessionManager residing in it's own JVM, but | make it an integral capability of TC, we have the following benefits: | - we save one copy: | When a new session is created and we have a separate network of SMs, it | needs to be copied to at least two SMs, if we have it in TC, it only needs | to be copied to one other TC. (If we aim single redundance) | | Indeed it would save bandwidth... I want a distributed session store, where all sessions are known (or are knowable) by all members of the cluster, with a built-in fail-over mechanism? I want to be able to scale my web server by simply adding more standalone Tomcats and possibly session managers. I want to be able to use a brand-x HTTP load-balancer that redirects web-traffic on a request-by-request basis to the tomcat webserver that it thinks can best handle the request. I also want to be able to bring down individual Tomcats without destroying any user sessions. Apache's 'smart' approach (that remembers which JSESSIONID's are hosted by which Tomcat Servers) doesn't let me bring down individual Tomcat servers without losing sessions. This means that Tomcat servers are simply not 'hot-swappable' in this configuration. ASFAICT, minimal redundance is all that is required. There's simply no need to keep a gratuitous number of session copies around. | | - if one TC is the owner and the other the escrow, the owner never needs to | ask if the session is uptodate or invalid, and it can't get stale. 
The | replication of changes can be done after the request, so no time burden | within the request itself. | If the escrow want's to use the session, it only needs to inform the owner | and they change roles (or if possible the escrow passes the request back to | the owner) | Frequently all servers ping their known escrows and owners to ensure they're | still present. | | The only problem I could see with that is synchronization of accesses from | different points, but I believe that is a solvable problem... | | - deserialisation should not be a problem, because in that ClassLoader | Context, the user-session objects are known. (correct me if I'm wrong here) | | Nope, you're right on that. This is only an issue in the event that the Session Manager is a seperate entity. | | AutoConf what about JNDI to register cluster nodes? It is around anyway. | In that case an upcoming TC would just search the JNDI service for | registered nodes with his own ClusterName, and register itself with it. | Getting back a NamingEnumeration, it could decide itself, which of the | others to link with. | | One thing that can be done with my approach of multicasting is automatic | load balancing... To any request of who can hold this session, each | manager can return a load index, and the decision on where the session | should be stored primarily and in replica should be based on that. Using | JNDI that can be done, but I don't want to end up in a situation where a | single host holds 80% of the sessions while the others are free... If the | managers could update their JNDI registrations with a load factor every X | seconds, that would be acceptable... | One thing to remember here is that the number of 'clients' in our discussion is always fixed - it is the number of Tomcat web servers in the 'cluster'. The load of the session managers is a direct function of the load on it's clients. 
Hopefully, the load balancer on the front end (either Apache round-robin, or some firmware solution) is doing a 'reasonable' job of spreading the load across web servers / tomcats. Therefore, as long as the number of Tomcats served by each Session manager is approximately the same, we can deduce that the load placed on the session managers will ALSO be reasonably well balanced. If my deduction is correct, then there should be no need for posting load factors, and continual switching back and forth between session managers. Lets create some more examples: 1) 10 Tomcat webservers (1-10). Servers 1 and 2 happen to be identifed as 'Session Managers' as well as web servers. Servers 3-10 are just plain web servers, not session managers. In this scenario, Tomcat servers 1 and 2 are burdened by satisfying session requests (queries updates) from the other 9 servers, as well as handling their own web-traffic. They must also initiate communication to the other 9 servers whenever a session is invalidated (due to update, maxAge
Re: Tomcat: Distributed Session Management revisited
- Original Message - From: GOMEZ Henri [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 9:15 AM Subject: RE: Tomcat: Distributed Session Management revisited ... stuff deleted ... | Notice also, in my concept, there are no delays built into the | architecture | (other than the natural delays caused by sending data over the | network). | The session server can simply respond to callers on-demand. | | I've discussed some time ago about this subject and adding the session | stuff in the Connector, maybe webapp/warp could be tuned for this | purpose. Agreed, this distributed session management feature, however it winds up being implemented, would certainly NOT be the default behavior provided by Tomcat and would require mucking around with server.xml in order to make this work. | | It's clear that the persistance and replication of session data is | needed for HA systems, and many solves it by using the EJB backend | as repository, WebSphere for example. In some case it appears also HA is only part of the picture. What I'm wanting to achieve is seamless scaling by adding more servers. Also, while EJB could be used for this purpose, it adds a dependancy that I don't wish to add. | very expensive to try to add automatically this persistance | (maybe something to be added to server.xml , some webapps could | live without session data even, in a soft restart mode). Again this could depend on the 'SessionManager' object configured for the given context/web-app in server.xml. | | There is also the problem of finding a fast and portable network | protocol (multicast may not run on all system, Broadcast UDP need recovery | code). | RMI is certainly as portable a protocol as you can hope to find (at least in our Java context). Is it fast enough? I would argue that it is fast enough that J2EE is built entirely upon it. It must therefore be fast enough for us. It also has the advantage of simplicity. 
I'm a big fan of simplicity, even if I can't sew.
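For readers unfamiliar with RMI, the 'invalidateSession(sessionId)' call discussed in this thread could be declared roughly as follows. This is a sketch under assumptions: the interface and class names are hypothetical, and skipping the node that made the update is a choice made here for illustration, not something the thread specifies. Only `java.rmi.Remote` and `java.rmi.RemoteException` are real RMI types; exporting and registry lookup are omitted.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback each Tomcat node would expose to the session manager. */
interface SessionInvalidationListener extends Remote {
    void invalidateSession(String sessionId) throws RemoteException;
}

/** Hypothetical manager-side fan-out: only the session id travels, not the session itself. */
final class InvalidationBroadcaster {

    private final List<SessionInvalidationListener> nodes = new ArrayList<>();

    void register(SessionInvalidationListener node) {
        nodes.add(node);
    }

    /**
     * Notifies every registered node except the one that made the update
     * (an assumption of this sketch) and returns how many nodes were reached.
     */
    int sessionUpdated(String sessionId, SessionInvalidationListener updater) {
        int reached = 0;
        for (SessionInvalidationListener node : nodes) {
            if (node == updater) {
                continue;
            }
            try {
                node.invalidateSession(sessionId);
                reached++;
            } catch (RemoteException e) {
                // A real manager would need a policy here (retry, drop the node);
                // this sketch simply skips the unreachable node.
            }
        }
        return reached;
    }
}
```

The message count per update grows with the number of nodes, but each message carries only the session id plus RMI overhead.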
Re: Tomcat: Distributed Session Management revisited
On Tue, 13 Nov 2001, Tom Drake wrote:

> I want a distributed session store, where all sessions are known (or are knowable) by all members of the cluster, with a built-in fail-over mechanism?

As you guys discuss this, don't forget a very important requirement in the servlet specification with regards to distributable applications:

    [Servlet Spec 2.3, Section 7.7.2]
    Within an application marked as distributable, all requests that are
    part of a session must be handled by one virtual machine at a time.

In effect, this means that a session can be migrated to a different server only between requests. On a failure of the server currently handling the session, you could migrate it to a different server, but this operation must be atomic.

Craig
Re: Tomcat: Distributed Session Management revisited
On 13/11/2001 06:53 pm, Tom Drake [EMAIL PROTECTED] wrote: | SNMP, ah ja. I've got no knowledge at all 'bout that, so fight with some | other lobbyists :-) | | Same here... Didn't mean to take a left turn. Sorry I mentioned it. Oh, I mean, I don't mind... The only thing is that I have no clue on how SMTP works :) :) :) I want a distributed session store, where all sessions are known (or are knowable) by all members of the cluster, with a built-in fail-over mechanism? I want to be able to scale my web server by simply adding more standalone Tomcats and possibly session managers. I want to be able to use a brand-x HTTP load-balancer that redirects web-traffic on a request-by-request basis to the tomcat webserver that it thinks can best handle the request. I also want to be able to bring down individual Tomcats without destroying any user sessions. Apache's 'smart' approach (that remembers which JSESSIONID's are hosted by which Tomcat Servers) doesn't let me bring down individual Tomcat servers without losing sessions. This means that Tomcat servers are simply not 'hot-swappable' in this configuration. ASFAICT, minimal redundance is all that is required. There's simply no need to keep a gratuitous number of session copies around. Well, for fault tolerance, we require at minimum TWO copies of a session hosted on two different JVMs, because, in case one of those two fails, the other one needs to take over. So, yes, minimal redundancy can be achieved. The fact of having a smart load balancer in the front doesn't preclude or modify the behavior of the backend. Thinking about it, the modules up front could actually perform their load balancing depending on the configuration of the backend, if those information could be shared amongst both the web server and the servlet container... This is only an issue in the event that the Session Manager is a seperate entity. 
But at that point, if we want to use a session manager separated from the servlet engine, we could simply provide the session manager with the same WAR file given to the servlet engine, and forget about it... One thing to remember here is that the number of 'clients' in our discussion is always fixed - it is the number of Tomcat web servers in the 'cluster'. The load of the session managers is a direct function of the load on it's clients. Hopefully, the load balancer on the front end (either Apache round-robin, or some firmware solution) is doing a 'reasonable' job of spreading the load across web servers / tomcats. Therefore, as long as the number of Tomcats served by each Session manager is approximately the same, we can deduce that the load placed on the session managers will ALSO be reasonably well balanced. If my deduction is correct, then there should be no need for posting load factors, and continual switching back and forth between session managers. You assume that the number of servlet containers is fixed and well known, and that the round robin algorithm in front is kinda smart... What if one of the container crashes and I want to deploy another one on the fly? What if I want to scale without having to reboot my whole pool? IMO, on the fly addition/removal of servlet containers and/or session managers is a must. Lets create some more examples: 1) 10 Tomcat webservers (1-10). Servers 1 and 2 happen to be identifed as 'Session Managers' as well as web servers. Servers 3-10 are just plain web servers, not session managers. In this scenario, Tomcat servers 1 and 2 are burdened by satisfying session requests (queries updates) from the other 9 servers, as well as handling their own web-traffic. They must also initiate communication to the other 9 servers whenever a session is invalidated (due to update, maxAge, or on demand). They must also communicate all session deltas to the 'other' session manager. 
2) 10 Tomcat webservers, all 10 are identified as 'Session Managers' In this scenario each Tomcat must communicate session deltas to each of the other 9 servers. All servers must perform significant extra work in order to keep their Session store up-to-date. 3) 10 Tomcat webservers, 2 separate Session Managers. Tomcats 1-5 point to SM1, Tomcats 6-10 point to SM 2. In this scenario, each Tomcat only communicates with 1 session manager. Each session manager communicates session deltas with the other SM, and with only the Tomcat servers that it connect themselves to it (5 in this example) on an as-needed basis (e.g. when the Tomcat instance asks for the session data). Each SM must also send tell all it's clients when sessions are invalidated. Fail-over could be handled in a similar manner in all scenarios. Addition of a new SessionManager (or SessionManager capable Tomcat) could be handled in a similar manner in all scenarios. Hmm... I don't agree with those three scenarios... I would love to see a configuration where
Re: Tomcat: Distributed Session Management revisited
On 13/11/2001 07:41 pm, Craig R. McClanahan [EMAIL PROTECTED] wrote:

> On Tue, 13 Nov 2001, Tom Drake wrote:
>
>> I want a distributed session store, where all sessions are known (or are knowable) by all members of the cluster, with a built-in fail-over mechanism?
>
> As you guys discuss this, don't forget a very important requirement in the servlet specification with regards to distributable applications:
>
>     [Servlet Spec 2.3, Section 7.7.2]
>     Within an application marked as distributable, all requests that are
>     part of a session must be handled by one virtual machine at a time.
>
> In effect, this means that a session can be migrated to a different server only between requests. On a failure of the server currently handling the session, you could migrate it to a different server, but this operation must be atomic.

So, basically, we have to design a lock/unlock mechanism (that complicates stuff). It would be easier to achieve without that requirement... (god knows why Danny added it).

Pier
Re: Tomcat: Distributed Session Management revisited
On 13/11/2001 07:59 pm, Mika Goeckel [EMAIL PROTECTED] wrote: Scenario: 4) 10 Tomcat webservers acting as SessionManagers for Sessions initially created by themselves or being assigned responsibility afterwards for a specific session. In this scenario every Tomcat backs up sessions to a configured number of others (for simple redundancy only with one other tc) The session is primarily held on the TC which created it or most recently used it. The other instance(s) hold a passive copy for fail-savety. a) A dumb load.balancing facility passes a request to a completely different TC which needs control over the connected session. From a regularly updated list or by asking it's peers it get's the (actual) owner of that session and requests to take over ownership. The old owner drop's ownership after the session has been taken over. Alternatively the old owner becomes the backup and the old backup drops. b) A clever load-balancing facility remembers where the last request of this session went and directs the next request to the same TC. The TC has all what it needs to fulfill the request and communicates the changes of the session afterwards. In Scenario a) you have one session copy before the request and one change communication afterwards. in b) only change communication. That's precisely what I'm thinking about... In your scenario, the TC needs at least the certitude that the session is uptodate, so it goes asking the SessionManager. If yes, it's case 4 b) If no, it's case 4a) but in both cases the overhead of asking the SessionManager. If the primary owner goes down, the secondary becomes primary owner and has to find a new secondary. If that happens, 4a) applies because the load-balancer needs to choose another instance and the likelyhood that it finds the backup TC by chance is 1/(number of nodes - 1). Assumption: We leave the load-balancing to the load balancer. Indeed... 
Without that assumption we could be even more clever: A TC gets a request and hasn't got the session. But from the list or by asking it's peers it get's knowledge who is the primary owner. It asks mod_webapp to go and send the request to that TC. In that case we end up at 4b) in nearly every case. No session copying during the request, only changes afterwards. If, in theory, the front module doing load balancing was able to see on the spot who has the session (who's primary, who's backup) and adequately redirect the request on the fly (it's possible to do it), then we achieve sessions stickyness at the same time, and fail over in case things go wrong... After having written that I notice that we are very close in our thinking. Your approach is somewhat clearer by the sake of having one more copy of a session, instead if the SessionManager part of a TC stores it's copies in the TCs session store (memory presumably). Then we have exactly the same. I'm playing around with the TC src at the moment. I've had a look on JNDIRealm to find out how a Context is usually created. I'm missing a static service utility with a simple getInitialContext method and more than that Log4J and are there really JUnit tests around? I've not seen a single one. Can't help you on that... But, if we customize the lookup tables abstracting it from JNDI, we could write also some C code for the web-server modules that could participate in our session pooling group, and direct requests where they should be, two pigeons with a single shot :) Pier -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
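The owner/backup bookkeeping of scenario 4 can be sketched as a small state holder. The names (`SessionPlacement`, `requestArrivedAt`, `nodeFailed`) are hypothetical, single redundancy is assumed (one owner, one backup), and the actual copying of session data between nodes is not modelled.

```java
/** Hypothetical sketch of scenario 4: one owner and one passive backup per session. */
final class SessionPlacement {

    private String owner;
    private String backup;

    SessionPlacement(String owner, String backup) {
        this.owner = owner;
        this.backup = backup;
    }

    /**
     * Case 4b: the request reaches the current owner, so nothing changes.
     * Case 4a: another node takes over ownership; the old owner becomes the
     * backup and the old backup drops its copy. If the request lands on the
     * backup itself, owner and backup simply swap roles.
     */
    void requestArrivedAt(String node) {
        if (node.equals(owner)) {
            return;
        }
        backup = owner;
        owner = node;
    }

    /** On failure of the owner the backup is promoted and a new backup is recruited. */
    void nodeFailed(String node, String replacementBackup) {
        if (node.equals(owner)) {
            owner = backup;
            backup = replacementBackup;
        } else if (node.equals(backup)) {
            backup = replacementBackup;
        }
    }

    String owner() {
        return owner;
    }

    String backup() {
        return backup;
    }
}
```

With a load balancer that remembers where the last request went, almost every call ends up in the no-op branch, which is the "only change communication" case described above.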
Re: Tomcat: Distributed Session Management revisited
> Can't help you on that... But, if we customize the lookup tables abstracting it from JNDI, we could write also some C code for the web-server modules that could participate in our session pooling group, and direct requests where they should be, two pigeons with a single shot :)

Something in the response like "and by the way, these are my replica holders"? Or a dedicated communication protocol? The former is easier, but what if you have more than one frontend? So they would need to communicate as well.

M.
Re: Tomcat: Distributed Session Management revisited
On Tue, 13 Nov 2001, Mika Goeckel wrote:

> Date: Tue, 13 Nov 2001 21:19:35 +0100
> From: Mika Goeckel [EMAIL PROTECTED]
> Reply-To: Tomcat Developers List [EMAIL PROTECTED]
> To: Tomcat Developers List [EMAIL PROTECTED]
> Subject: Re: Tomcat: Distributed Session Management revisited
>
> Hi Craig,
>
> am I understanding right, that handling in this context means the part of execution when the servlet's service routine is called? Would the container be allowed to fetch a session after the request has reached it but before the servlet's code is called?

It is not legal for the following scenario to occur:

* Two simultaneous requests for the same session.
* Your container processes these requests in different JVMs.

Details of when the restriction starts are basically dependent on the container's implementation -- but it's the result that must be obeyed.

The reason for the restriction is pretty obvious when you think about this series of events (in chronological order):

* Request 1 sent to server A
* Request 2 sent to server B
* Request 1 grabs session and calls session.setAttribute("foo", "bar").
* Request 2 grabs session and calls session.getAttribute("foo").

On a server that properly implements the restriction, request 2 will always see the "foo" attribute, just as would occur in a non-distributed environment (which, by definition, would be processing both requests in the same JVM on different threads). Thus, from the application developer's perspective, you don't have to worry about the possibility that session attributes might be getting accessed or modified on multiple JVMs at the same time.

It also means that the application can implement thread-safety locking with "synchronized" and have it work correctly on a single JVM or multiple JVM container. This isn't possible if the same session attribute can be accessed from multiple JVMs simultaneously.

> Is it theological to ask if a proxy session object that would call the methods of a session object in another JVM would violate that requirement?
> From the application developer's point of view he would not see a difference...

It would be possible to do this for the HttpSession methods themselves (the container would know what's going on), but what do you do about session attributes?

    HttpSession session = request.getSession();
    MyObject mo = (MyObject) session.getAttribute("foo");
    mo.setName("bar");

This cannot be done transparently unless the MyObject class is actually an RMI or CORBA reference, and even then the app would have to deal with the possibility of exceptions caused by the container's activities, not its own.

The whole idea is that the programming model for the application developer doesn't change in a distributable application. The fact that it makes life tougher on the container developer is what makes this particular functionality quite interesting to implement :-).

> Mika
> :wq

Craig
Re: Tomcat: Distributed Session Management revisited
On Tue, 13 Nov 2001, Pier Fumagalli wrote:

> So, basically, we have to design a lock/unlock mechanism (that complicates stuff). It would be easier to achieve without that requirement... (god knows why Danny added it).

See the answer I just sent for more details -- not enforcing this restriction would make the application writer's job basically impossible. (For example, how do you do synchronized locks across multiple JVMs to avoid simultaneous updates to a session attribute?)

Basically, we're going to want some sort of sticky routing like what JK already does, but with the added ability to migrate sessions in between requests for load balancing and/or high availability.

> Pier

Craig
Re: Tomcat: Distributed Session Management revisited
Cool, sounds like having a primary owner and front-end redirection to it solves that without lock distribution. Means that an owner can't allow another TC to take over a session whilst processing a request, but as he knows when he's finished, that's easy. M. - Original Message - From: Craig R. McClanahan [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 9:31 PM Subject: Re: Tomcat: Distributed Session Management revisited
Re: Tomcat: Distributed Session Management revisited
One question - wouldn't be better if the 'distributed session management' would be first designed and discussed _outside_ of the ServletSession context ? In other words, a SessionManager that would store and provide fail-over, etc for serializable objects. It can have a rich interface, including support for transaction, control sync/async storage, etc. There are many limits in what you can do with a servlet session - all you have is a get/set with Objects ( serializable or not ). No way to guess if the user cares about the object or is just a cache ( in which case it can be a blob that the user just doesn't want to read again from net/disk ). No way to tell the user that his object was stored sucessfully or lost ( you need transactions and exceptions for that ). Designing the SessionManager as a standalone component would make a lot of sense - you can then integrate it with the servlet sessions, or the user could use its richer API. IMHO no sane user should store something in a servlet session and assume the operation will be sucessful - and the session manager can only give a dangerous ilusion that this is possible. Nothing can be guaranteed to allways succeed ( no database or network application can do that ) - and with an API that doesn't provide any feedback there's little you can do. Writing a servlet application assuming the session data will be safe is at least not portable ( assuming it is possible, which I doubt ). At least with a standalone SessionManager you can include the lib in your war and have it working on any container. ( by standalone I mean a user-space util, independent of the servlet container, but with hooks - so it could be hooked into a container as a default session manager or used as an addon ) Costin 1) 10 Tomcat webservers (1-10). Servers 1 and 2 happen to be identifed as 'Session Managers' as well as web servers. Servers 3-10 are just plain web servers, not session managers. 
In this scenario, Tomcat servers 1 and 2 are burdened by satisfying session requests (queries updates) from the other 9 servers, as well as handling their own web-traffic. They must also initiate communication to the other 9 servers whenever a session is invalidated (due to update, maxAge, or on demand). They must also communicate all session deltas to the 'other' session manager. The best scenario, but tell me if I misunderstand something after you've reviewed case 4. 2) 10 Tomcat webservers, all 10 are identified as 'Session Managers' In this scenario each Tomcat must communicate session deltas to each of the other 9 servers. All servers must perform significant extra work in order to keep their Session store up-to-date. 3) 10 Tomcat webservers, 2 separate Session Managers. Tomcats 1-5 point to SM1, Tomcats 6-10 point to SM 2. In this scenario, each Tomcat only communicates with 1 session manager. Each session manager communicates session deltas with the other SM, and with only the Tomcat servers that it connect themselves to it (5 in this example) on an as-needed basis (e.g. when the Tomcat instance asks for the session data). Each SM must also send tell all it's clients when sessions are invalidated. Fail-over could be handled in a similar manner in all scenarios. Addition of a new SessionManager (or SessionManager capable Tomcat) could be handled in a similar manner in all scenarios. In principle, I completely agree. All I want to express is, that I think we don't need SessionManagers as assigned responsibility and would save traffic. Scenario: 4) 10 Tomcat webservers acting as SessionManagers for Sessions initially created by themselves or being assigned responsibility afterwards for a specific session. In this scenario every Tomcat backs up sessions to a configured number of others (for simple redundancy only with one other tc) The session is primarily held on the TC which created it or most recently used it. 
The other instance(s) hold a passive copy for fail-savety. a) A dumb load.balancing facility passes a request to a completely different TC which needs control over the connected session. From a regularly updated list or by asking it's peers it get's the (actual) owner of that session and requests to take over ownership. The old owner drop's ownership after the session has been taken over. Alternatively the old owner becomes the backup and the old backup drops. b) A clever load-balancing facility remembers where the last request of this session went and directs the next request to the same TC. The TC has all what it needs to fulfill the request and communicates the changes of the session afterwards. In Scenario a) you have one session copy before the request and one change communication afterwards. in b) only change communication. In your scenario, the TC needs at least the certitude that the session is
Re: Tomcat: Distributed Session Management revisited
See below. - Original Message - From: Pier Fumagalli [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 12:23 PM Subject: Re: Tomcat: Distributed Session Management revisited | ASFAICT, minimal redundance is all that is required. There's simply | no need to keep a gratuitous number of session copies around. | | Well, for fault tolerance, we require at minimum TWO copies of a session | hosted on two different JVMs, because, in case one of those two fails, the | other one needs to take over. So, yes, minimal redundancy can be achieved. | The fact of having a smart load balancer in the front doesn't preclude or | modify the behavior of the backend. Agreed on your first point. As to the second. I'm not so sure. There are alot of assumptions that can be made because we have essentially a fixed number of possible clients (yes, you should be able to add more servers on the fly, and shut them down independantly as well). In concrete terms, and IMHO, as long as the Tomcat clients are fairly evenly distributed across the session servers, we will be as 'load-balanced' as we will ever need to be. | | Thinking about it, the modules up front could actually perform their load | balancing depending on the configuration of the backend, if those | information could be shared amongst both the web server and the servlet | container... True. | | This is only an issue in the event that the Session Manager is a seperate | entity. | | But at that point, if we want to use a session manager separated from the | servlet engine, we could simply provide the session manager with the same | WAR file given to the servlet engine, and forget about it... True. But solving this serialization problem is pretty easy. But I suppose we're delving into details here that aren't really too important just yet. 
| | One thing to remember here is that the number of 'clients' | in our discussion is always fixed - it is the number of Tomcat | web servers in the 'cluster'. The load of the session managers | is a direct function of the load on it's clients. Hopefully, the load | balancer on the front end (either Apache round-robin, or some | firmware solution) is doing a 'reasonable' job of spreading the | load across web servers / tomcats. Therefore, as long as the | number of Tomcats served by each Session manager is | approximately the same, we can deduce that the load placed | on the session managers will ALSO be reasonably well balanced. | If my deduction is correct, then there should be no need for | posting load factors, and continual switching back and forth | between session managers. | | You assume that the number of servlet containers is fixed and well known, | and that the round robin algorithm in front is kinda smart... What if one of | the container crashes and I want to deploy another one on the fly? What if I | want to scale without having to reboot my whole pool? IMO, on the fly | addition/removal of servlet containers and/or session managers is a must. | I'll take your point that the relationships between servlet containers and their session managers may need to change dynamically (as servlet containers are added or removed, or as session managers are added or removed). I definately agree that this is a requirement. This being the case, I think we can handle this simply by adding some code that lets the session managers figure out how to spread the load by asking each other how many clients they are supporting. If any are supporting too many or too few, some client server relationships could be shuffled around until an acceptable tolerance is reached. (You may have already said this in your email - above - just in different words). I suppose the 'policy' used for making such a decision (e.g. 
how many and which clients should be supported by any given session manager) could be as simple as I have described or be based on a more sophisticated algorithm that takes into account measured / available bandwidth, memory, cpu speed, etc Perhaps the 'load managing' bits should be abstracted into it's own (set of) interface(s), with a very simple concrete implementation, initially. | Lets create some more examples: | | 1) 10 Tomcat webservers (1-10). Servers 1 and 2 happen to be |identifed as 'Session Managers' as well as web servers. |Servers 3-10 are just plain web servers, not session managers. | |In this scenario, Tomcat servers 1 and 2 are burdened by satisfying |session requests (queries updates) from the other 9 servers, as well |as handling their own web-traffic. They must also initiate communication |to the other 9 servers whenever a session is invalidated (due to update, |maxAge, or on demand). They must also communicate all session deltas |to the 'other' session manager. | | 2) 10 Tomcat webservers, all 10 are identified as 'Session Managers' | |In this scenario each Tomcat must communicate session deltas
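As a rough illustration of the "count clients and shuffle until within tolerance" idea, with the policy pulled out behind its own interface, something like the following could serve as the very simple first implementation. All names here (SessionManagerBalancer, LoadPolicy, rebalance, the SM1/tc1 labels) are hypothetical, and the session managers are modelled as plain in-memory collections rather than networked peers.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

final class SessionManagerBalancer {
    // Hypothetical policy hook: decides whether the current spread is good enough.
    // A richer implementation could look at bandwidth, memory or CPU instead.
    interface LoadPolicy {
        boolean acceptable(int mostLoaded, int leastLoaded);
    }

    // Moves one client at a time from the busiest manager to the idlest one.
    // Stops when the policy is satisfied or the counts differ by at most one,
    // so the loop always terminates.
    static void rebalance(Map<String, Deque<String>> clientsByManager, LoadPolicy policy) {
        if (clientsByManager.size() < 2) {
            return;
        }
        while (true) {
            Deque<String> most = null;
            Deque<String> least = null;
            for (Deque<String> clients : clientsByManager.values()) {
                if (most == null || clients.size() > most.size()) most = clients;
                if (least == null || clients.size() < least.size()) least = clients;
            }
            int gap = most.size() - least.size();
            if (gap <= 1 || policy.acceptable(most.size(), least.size())) {
                return;
            }
            least.addLast(most.removeLast());
        }
    }

    public static void main(String[] args) {
        Map<String, Deque<String>> cluster = new LinkedHashMap<>();
        cluster.put("SM1", new ArrayDeque<>(Arrays.asList(
                "tc1", "tc2", "tc3", "tc4", "tc5", "tc6", "tc7", "tc8")));
        cluster.put("SM2", new ArrayDeque<>(Arrays.asList("tc9", "tc10")));
        // Simple count-based policy: a difference of up to two clients is fine.
        rebalance(cluster, (most, least) -> most - least <= 2);
        cluster.forEach((sm, clients) -> System.out.println(sm + " -> " + clients));
    }
}
```

Keeping the decision in LoadPolicy means the shuffling loop stays the same if the count-based rule is later replaced by one based on measured load.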
Re: Tomcat: Distributed Session Management revisited
- Original Message -
From: Craig R. McClanahan [EMAIL PROTECTED]
To: Tomcat Developers List [EMAIL PROTECTED]; Tom Drake [EMAIL PROTECTED]
Sent: Tuesday, November 13, 2001 11:41 AM
Subject: Re: Tomcat: Distributed Session Management revisited

| As you guys discuss this, don't forget a very important requirement in the
| servlet specification with regards to distributable applications:
|
|   [Servlet Spec 2.3, Section 7.7.2] Within an application
|   marked as distributable, all requests that are part of a
|   session must be handled by one virtual machine at a time.
|
| In effect, this means that a session can be migrated to a different server
| only between requests. On a failure of the server currently handling
| the session, you could migrate it to a different server, but this
| operation must be atomic.
|

This may be a stupid question, but how can we know when a given servlet container is 'done' with the session?

The problems of creating a network-wide 'semaphore' for each session are many and varied. We'd need to have support for time-outs. This may have some serious performance implications as well.
Re: Tomcat: Distributed Session Management revisited
- Original Message - From: Craig R. McClanahan [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 12:31 PM Subject: Re: Tomcat: Distributed Session Management revisited | | | On Tue, 13 Nov 2001, Mika Goeckel wrote: | | Date: Tue, 13 Nov 2001 21:19:35 +0100 | From: Mika Goeckel [EMAIL PROTECTED] | Reply-To: Tomcat Developers List [EMAIL PROTECTED] | To: Tomcat Developers List [EMAIL PROTECTED] | Subject: Re: Tomcat: Distributed Session Management revisited | | Hi Craig, | | am I understanding right, that handling in this context means the part of | execution when the servlet's service routine is called? Would the container | be allowed to fetch a session after the request has reached it but before | the servlet's code is called? | | | It is not legal that the following scenario occur: | * Two simultaneous requests for the same session. | * Your container processes these requests in different JVMs. | | Details of when the restriction starts are basically dependent on the | container's implementation -- but it's the result that must be obeyed. | | The reason for the restriction is pretty obvious when you think about | this series of events (in chronological order): | * Request 1 sent to server A | * Request 2 sent to server B | * Request 1 grabs session and calls session.setAttribute(foo, bar). | * Request 2 grabs session and calls session.getAttribute(foo). | | On a server that properly implements the restriction, request 2 will | always see the foo attribute, just as would occur in a non-distributed | environment (which, by definition, would be processing both requests in | the same JVM on different threads). Thus, from the application | developer's perspective, you don't have to worry about the possibility | that session attributes might be getting accessed or modified on multiple | JVMs at the same time. 
| | It also means that the application can implement thread-safety locking | with synchronized and have it work correctly on a single JVM or multiple | JVM container. This isn't possible if the same session attribute can be | accessed from multiple JVMs simultaneously. | | Is it theological to ask if a proxy session object that would call the | methods of a session object in another JVM would violate that requirement? | From the application developers point of view he would not see a | difference... | | | It would be possible to do this for the HttpSession methods | themselves (the container would know what's going on), but what do you do | about session attributes? | | HttpSession session = request.getSession(); | MyObject mo = (MyObject) session.getAttribute(foo); | mo.setName(bar); I believe that, in this case, it is incumbent upon the application to call session.setAttribute(foo, mo); | This cannot be done transparently unless MyObject class is actually an RMI | or Corba reference, and even then the app would have to deal with the | possibility of exceptions caused by the container's activities, not it's | own. | | The whole idea is that the programming model for the application developer | doesn't change in a distributable application. The fact that it makes | life tougher on the container developer is what makes this particular | functionality quite interesting to implement :-). | | Mika | :wq | | | Craig | | | -- | To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] | For additional commands, e-mail: mailto:[EMAIL PROTECTED] | | | -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
Re: Tomcat: Distributed Session Management revisited
- Original Message - From: [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Cc: Tom Drake [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 1:05 PM Subject: Re: Tomcat: Distributed Session Management revisited | | One question - wouldn't be better if the 'distributed session management' | would be first designed and discussed _outside_ of the ServletSession | context ? I agree. | | In other words, a SessionManager that would store and provide | fail-over, etc for serializable objects. It can have a rich interface, | including support for transaction, control sync/async storage, etc. | | There are many limits in what you can do with a servlet session - all you | have is a get/set with Objects ( serializable or not ). No way to guess if | the user cares about the object or is just a cache ( in which case it can | be a blob that the user just doesn't want to read again from net/disk ). | No way to tell the user that his object was stored sucessfully or lost ( | you need transactions and exceptions for that ). | | Designing the SessionManager as a standalone component would make a lot of | sense - you can then integrate it with the servlet sessions, or the user | could use its richer API. True, but I'm not sure that exposing a Session Manager api to the application programmer is a good idea. It's not part of the Servlet API, for one thing. For another, many people have a difficult enough time just understanding the existing api. (witness recent discussions on the tomcat-user list re: non-serializable objects in sessions). | IMHO no sane user should store something in a servlet session and assume | the operation will be sucessful - and the session manager can only give a | dangerous ilusion that this is possible. Nothing can be guaranteed to | allways succeed ( no database or network application can do that ) - and | with an API that doesn't provide any feedback there's little you can do. 
In any servlet container, one must be able to count on this functionality. In a distributed environment, it is encumbant on the application programmer to ensure that any objects placed in the servlet container are serializable (It says this in the servlet spec - I don't remember the section number - possibly 7.7.3) In fact, JSP's depend on this behavior. | | Writing a servlet application assuming the session data will be safe | is at least not portable ( assuming it is possible, which I doubt ). At | least with a standalone SessionManager you can include the lib in your war | and have it working on any container. | | ( by standalone I mean a user-space util, independent of the servlet | container, but with hooks - so it could be hooked into a container as a | default session manager or used as an addon ) | Indeed, this would be an advantage of such a session manager. | | Costin | | | | | 1) 10 Tomcat webservers (1-10). Servers 1 and 2 happen to be | identifed as 'Session Managers' as well as web servers. | Servers 3-10 are just plain web servers, not session managers. | | In this scenario, Tomcat servers 1 and 2 are burdened by satisfying | session requests (queries updates) from the other 9 servers, as well | as handling their own web-traffic. They must also initiate | communication | to the other 9 servers whenever a session is invalidated (due to | update, | maxAge, or on demand). They must also communicate all session deltas | to the 'other' session manager. | | The best scenario, but tell me if I misunderstand something after you've | reviewed case 4. | | | 2) 10 Tomcat webservers, all 10 are identified as 'Session Managers' | | In this scenario each Tomcat must communicate session deltas to each | of the other 9 servers. All servers must perform significant extra | work | in order to keep their Session store up-to-date. | | 3) 10 Tomcat webservers, 2 separate Session Managers. |Tomcats 1-5 point to SM1, Tomcats 6-10 point to SM 2. 
| | In this scenario, each Tomcat only communicates with 1 | session manager. Each session manager communicates | session deltas with the other SM, and with only the Tomcat | servers that it connect themselves to it (5 in this example) | on an as-needed basis (e.g. when the Tomcat instance asks | for the session data). Each SM must also send tell all it's clients | when sessions are invalidated. | | Fail-over could be handled in a similar manner in all scenarios. | Addition of a new SessionManager (or SessionManager capable Tomcat) | could be handled in a similar manner in all scenarios. | | In principle, I completely agree. All I want to express is, that I think we | don't need SessionManagers as assigned responsibility and would save | traffic. | | Scenario: | | 4) 10 Tomcat webservers acting as SessionManagers for Sessions initially | created by themselves or being assigned responsibility afterwards
Re: Tomcat: Distributed Session Management revisited
On Tue, 13 Nov 2001, Tom Drake wrote:

> Date: Tue, 13 Nov 2001 13:21:20 -0800
> From: Tom Drake [EMAIL PROTECTED]
> Reply-To: Tomcat Developers List [EMAIL PROTECTED], Tom Drake [EMAIL PROTECTED]
> To: Tomcat Developers List [EMAIL PROTECTED]
> Subject: Re: Tomcat: Distributed Session Management revisited
>
> - Original Message -
> From: Craig R. McClanahan [EMAIL PROTECTED]
> To: Tomcat Developers List [EMAIL PROTECTED]; Tom Drake [EMAIL PROTECTED]
> Sent: Tuesday, November 13, 2001 11:41 AM
> Subject: Re: Tomcat: Distributed Session Management revisited
>
> | As you guys discuss this, don't forget a very important requirement in the
> | servlet specification with regards to distributable applications:
> |
> |   [Servlet Spec 2.3, Section 7.7.2] Within an application
> |   marked as distributable, all requests that are part of a
> |   session must be handled by one virtual machine at a time.
> |
> | In effect, this means that a session can be migrated to a different server
> | only between requests. On a failure of the server currently handling
> | the session, you could migrate it to a different server, but this
> | operation must be atomic.
> |
>
> This may be a stupid question, but how can we know when a given servlet container is 'done' with the session?

When the last Servlet.service() method returns (for a servlet 2.2 container), or additionally when the last Filter.doFilter() method returns (for a servlet 2.3 container). That's the point at which the session is no longer in the application's control.

> The problems of creating a network-wide 'semaphore' for each session are many and varied. We'd need to have support for time-outs. This may have some serious performance implications as well.

Some useful lessons should be available in the way that the mod_backhand Apache module approaches these sorts of issues. It was presented at the last two ApacheCons, or you can find references to it via search engines.

Craig
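One way to picture "the session is done when the last service() call returns" is a per-session count of in-flight requests kept by the node that currently owns the session. The sketch below is only that, a sketch: SessionLease and its method names are hypothetical, it is not how Catalina tracks sessions, and the actual hand-off to another node is left out.

```java
final class SessionLease {
    private int inFlight;      // requests currently inside service()/doFilter()
    private boolean released;  // true once ownership has been handed off

    // Called by the container before the first filter or servlet runs.
    // Returns false if this node no longer owns the session.
    synchronized boolean beginRequest() {
        if (released) {
            return false;
        }
        inFlight++;
        return true;
    }

    // Called by the container (in a finally block) after service() returns.
    synchronized void endRequest() {
        if (inFlight == 0) {
            throw new IllegalStateException("endRequest without beginRequest");
        }
        inFlight--;
    }

    // Migration is only allowed between requests, i.e. when nothing is in flight.
    synchronized boolean tryRelease() {
        if (inFlight > 0) {
            return false;
        }
        released = true;
        return true;
    }

    public static void main(String[] args) {
        SessionLease lease = new SessionLease();
        lease.beginRequest();
        System.out.println("release during request: " + lease.tryRelease());
        lease.endRequest();
        System.out.println("release between requests: " + lease.tryRelease());
    }
}
```

Because the count lives on the owning node, no network-wide semaphore is needed just to answer the "is it done?" question; other nodes only have to ask the owner to release.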
Re: Tomcat: Distributed Session Management revisited
See intermixed. On Tue, 13 Nov 2001, Tom Drake wrote: Date: Tue, 13 Nov 2001 13:27:23 -0800 From: Tom Drake [EMAIL PROTECTED] Reply-To: Tomcat Developers List [EMAIL PROTECTED], Tom Drake [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Subject: Re: Tomcat: Distributed Session Management revisited - Original Message - From: Craig R. McClanahan [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 12:31 PM Subject: Re: Tomcat: Distributed Session Management revisited | | | On Tue, 13 Nov 2001, Mika Goeckel wrote: | | Date: Tue, 13 Nov 2001 21:19:35 +0100 | From: Mika Goeckel [EMAIL PROTECTED] | Reply-To: Tomcat Developers List [EMAIL PROTECTED] | To: Tomcat Developers List [EMAIL PROTECTED] | Subject: Re: Tomcat: Distributed Session Management revisited | | Hi Craig, | | am I understanding right, that handling in this context means the part of | execution when the servlet's service routine is called? Would the container | be allowed to fetch a session after the request has reached it but before | the servlet's code is called? | | | It is not legal that the following scenario occur: | * Two simultaneous requests for the same session. | * Your container processes these requests in different JVMs. | | Details of when the restriction starts are basically dependent on the | container's implementation -- but it's the result that must be obeyed. | | The reason for the restriction is pretty obvious when you think about | this series of events (in chronological order): | * Request 1 sent to server A | * Request 2 sent to server B | * Request 1 grabs session and calls session.setAttribute(foo, bar). | * Request 2 grabs session and calls session.getAttribute(foo). | | On a server that properly implements the restriction, request 2 will | always see the foo attribute, just as would occur in a non-distributed | environment (which, by definition, would be processing both requests in | the same JVM on different threads). 
Thus, from the application | developer's perspective, you don't have to worry about the possibility | that session attributes might be getting accessed or modified on multiple | JVMs at the same time. | | It also means that the application can implement thread-safety locking | with synchronized and have it work correctly on a single JVM or multiple | JVM container. This isn't possible if the same session attribute can be | accessed from multiple JVMs simultaneously. | | Is it theological to ask if a proxy session object that would call the | methods of a session object in another JVM would violate that requirement? | From the application developers point of view he would not see a | difference... | | | It would be possible to do this for the HttpSession methods | themselves (the container would know what's going on), but what do you do | about session attributes? | | HttpSession session = request.getSession(); | MyObject mo = (MyObject) session.getAttribute(foo); | mo.setName(bar); I believe that, in this case, it is incumbent upon the application to call session.setAttribute(foo, mo); This violates the principle that the application programming model should not change, but it's certainly feasible to say if you want load balancing to work on this container, you have to call HttpSession.setAttribute() whenever you modify an attribute's properties. By itself, though, this doesn't provide any support for locking against simultaneous updates (i.e. what you do in synchronized blocks in a single VM). It's a little funny funny ... by the time we invent API to solve all these problems, you've just defined a pretty fair chunk of the functionality of EJBs ... | This cannot be done transparently unless MyObject class is actually an RMI | or Corba reference, and even then the app would have to deal with the | possibility of exceptions caused by the container's activities, not it's | own. 
| | The whole idea is that the programming model for the application developer | doesn't change in a distributable application. The fact that it makes | life tougher on the container developer is what makes this particular | functionality quite interesting to implement :-). | | Mika | :wq | | | Craig Craig -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
Re: Tomcat: Distributed Session Management revisited
On Tue, 13 Nov 2001, Tom Drake wrote: | One question - wouldn't be better if the 'distributed session management' | would be first designed and discussed _outside_ of the ServletSession | context ? I agree. | Designing the SessionManager as a standalone component would make a lot of | sense - you can then integrate it with the servlet sessions, or the user | could use its richer API. True, but I'm not sure that exposing a Session Manager api to the application programmer is a good idea. It's not part of the Servlet API, for one thing. For another, many people have a difficult enough time just understanding the existing api. (witness recent discussions on the tomcat-user list re: non-serializable objects in sessions). Well, I have a difficult enugh time understanding how someone could stretch the existing session API ( getAttribute/setAttribute, no exception, arbitrary Objects ) to be used for anything but simple storage of data with no guarantee. Fiting a (mini) fault-tolerant, object oritened database in those 2 methods is quite challenging, and I doubt too many people will be able to use this. In addition, the servlet API is not intended as an API for data storage - there are other APIs that are designed for that. The session support is nice and convenient - but stretching it beyond what it can express can create many problems and unexpected behavior. | IMHO no sane user should store something in a servlet session and assume | the operation will be sucessful - and the session manager can only give a | dangerous ilusion that this is possible. Nothing can be guaranteed to | allways succeed ( no database or network application can do that ) - and | with an API that doesn't provide any feedback there's little you can do. In any servlet container, one must be able to count on this functionality. 
In a distributed environment, it is incumbent on the application programmer to ensure that any objects placed in the servlet container are serializable. (It says this in the servlet spec - I don't remember the section number - possibly 7.7.3.)

What I'm saying is that there is no way to guarantee that a certain operation will succeed. Writing to a file or socket can throw an exception, while setAttribute() can't, so there is no way to tell the user that his operation failed.

And the fact that an object is Serializable doesn't mean the user wants it copied over the network every time he changes an attribute - it may be just a big photo he can retrieve from disk if it's not in memory. You can't guess which objects are just cached in memory and which are important - so you have to save everything.

Again - the problem is the attempt to fit an API that was not designed for safe data storage to the wrong problem.

This is deep into object-oriented database problem space. Serializable was designed to allow the object to be saved/restored, but it is also not the right API for an OODB: it can't detect changes in a field ( and call persist ), can't differentiate what/when the user wants persisted ( there is no explicit method ), etc.

Costin
Re: Tomcat: Distributed Session Management revisited
- Original Message -
From: Craig R. McClanahan [EMAIL PROTECTED]
To: Tomcat Developers List [EMAIL PROTECTED]; Tom Drake [EMAIL PROTECTED]
Sent: Tuesday, November 13, 2001 1:25 PM
Subject: Re: Tomcat: Distributed Session Management revisited

... stuff deleted ...

| | It would be possible to do this for the HttpSession methods
| | themselves (the container would know what's going on), but what do you do
| | about session attributes?
| |
| | HttpSession session = request.getSession();
| | MyObject mo = (MyObject) session.getAttribute("foo");
| | mo.setName("bar");
| | I believe that, in this case, it is incumbent upon the application to call
| | session.setAttribute("foo", mo);
| |
| This violates the principle that the application programming model should
| not change, but it's certainly feasible to say "if you want load balancing
| to work on this container, you have to call HttpSession.setAttribute()
| whenever you modify an attribute's properties".
|
| By itself, though, this doesn't provide any support for locking against
| simultaneous updates (i.e. what you do in synchronized blocks in a
| single VM).
|
| It's a little funny funny ... by the time we invent API to solve all these
| problems, you've just defined a pretty fair chunk of the functionality of
| EJBs ...
|

Locking issues aside, this presents a fair problem for the servlet container: how to know whether any session attribute was modified, as in your example. Perhaps we don't need to. Perhaps our mechanism could be configured to be either pessimistic or optimistic about whether a web-app always tells the session when an attribute has changed.

If configured to be pessimistic, TC could simply assume that an attribute may have changed, and therefore must always send the entire session object to another server. Of course, this would result in more network traffic and possibly reduced performance.

If configured to be optimistic, TC could assume that unless session.setAttribute or session.removeAttribute were called, no changes were made to the session, and therefore no data transfer would be required at the end of the request (aside from a 'releaseLock' message). Otherwise, only the 'changed' or 'deleted' attributes need to be sent to another server. This would probably result in a significant reduction in network traffic, and improved performance.

This way, we could say that any web-application will work with the expected semantics, but if you want to improve your performance, make sure that your web-app calls setAttribute any time an attribute value changes, and make a one-line change to server.xml that changes Tomcat's behavior (to be 'optimistic').
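The pessimistic/optimistic idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: `TrackedSession` is a hypothetical stand-in for a container-side session object, not a Tomcat class, and the method names `attributesToSend`, `removalsToSend` and `endRequest` are invented for the example.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical container-side session; not part of Tomcat or the Servlet API.
class TrackedSession {
    private final Map<String, Serializable> attributes = new HashMap<>();
    private final Set<String> changed = new HashSet<>();
    private final Set<String> removed = new HashSet<>();
    private final boolean optimistic;

    TrackedSession(boolean optimistic) {
        this.optimistic = optimistic;
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }

    void setAttribute(String name, Serializable value) {
        attributes.put(name, value);
        removed.remove(name);
        changed.add(name);   // remember that this attribute must be shipped
    }

    void removeAttribute(String name) {
        if (attributes.remove(name) != null) {
            changed.remove(name);
            removed.add(name);   // remember that the removal must be shipped
        }
    }

    // Pessimistic: always the whole session. Optimistic: only attributes
    // for which setAttribute was called during this request.
    Map<String, Serializable> attributesToSend() {
        if (!optimistic) {
            return new HashMap<>(attributes);
        }
        Map<String, Serializable> delta = new HashMap<>();
        for (String name : changed) {
            delta.put(name, attributes.get(name));
        }
        return delta;
    }

    Set<String> removalsToSend() {
        return new HashSet<>(removed);
    }

    // Called by the (hypothetical) container after replication at end of request.
    void endRequest() {
        changed.clear();
        removed.clear();
    }
}
```

In optimistic mode, a web-app that mutates an attribute without calling setAttribute again produces an empty delta, which is exactly the trade-off described in the message.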
Re: Tomcat: Distributed Session Management revisited
See my comments below. - Original Message - From: [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED]; Tom Drake [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 2:08 PM Subject: Re: Tomcat: Distributed Session Management revisited | On Tue, 13 Nov 2001, Tom Drake wrote: | | | One question - wouldn't be better if the 'distributed session management' | | would be first designed and discussed _outside_ of the ServletSession | | context ? | | I agree. | | | Designing the SessionManager as a standalone component would make a lot of | | sense - you can then integrate it with the servlet sessions, or the user | | could use its richer API. | | True, but I'm not sure that exposing a Session Manager api to the | application | programmer is a good idea. It's not part of the Servlet API, for one thing. | For another, many people have a difficult enough time just understanding | the existing api. (witness recent discussions on the tomcat-user list re: | non-serializable objects in sessions). | | Well, I have a difficult enugh time understanding how someone could | stretch the existing session API ( getAttribute/setAttribute, no | exception, arbitrary Objects ) to be used for anything but simple storage | of data with no guarantee. That's how I read the 2.3 serlvet spec. | | Fiting a (mini) fault-tolerant, object oritened database in those 2 | methods is quite challenging, and I doubt too many people will be able to | use this. | | In addition, the servlet API is not intended as an API for data storage - | there are other APIs that are designed for that. The session support is | nice and convenient - but stretching it beyond what it can express can | create many problems and unexpected behavior. | But it is an api for storing things temporarily - for the lifetime of a session. JSP's have built-in support for session scoped application beans - which can be arbitrarily complex objects. I think you have to be able to depend on the integrity of HttpSession attributes. 
SRV 7.7.2 opens up with : Within an application marked as distributable, all requests that are part of a session must handled by one virtual machine at a time. The container must be able to handle all objects placed into instances of the HttpSession class using the setAttribute or putValue methods appropriately. The following restrictions are imposed to meet these conditions: . The container must accept objects that implement the Serializable interface . The container may choose to support storage of other designated objects in the HttpSession, such as references to Enterprise JavaBean components and transactions. . Migration of sessions will be handled by container-specific facilities. | | | IMHO no sane user should store something in a servlet session and assume | | the operation will be sucessful - and the session manager can only give a | | dangerous ilusion that this is possible. Nothing can be guaranteed to | | allways succeed ( no database or network application can do that ) - and | | with an API that doesn't provide any feedback there's little you can do. | | In any servlet container, one must be able to count on this functionality. | In a distributed environment, it is encumbant on the application | programmer to ensure that any objects placed in the servlet container | are serializable (It says this in the servlet spec - I don't remember the | section number - possibly 7.7.3) | | What I'm saying is that there is no way to guarantee that a certain | operation will succeed, writing to a file or socket can throw an | exception - while setAttribute() can't, so there is no way to tell the | user that his operation failed. True, but, in a distributed environment, if network operations fail, there's typically not too much that we can be expected to do, aside from returning a message to the client. 
|
| And the fact that an object is Serializable doesn't mean the user wants it
| copied over network every time he changes an attribute - it may be just a
| big photo he can retrieve from disk if it's not in memory. You can't guess
| which objects are just cached in memory and which are important - so you
| have to save everything.

However, if the user has stored a Serializable object in her session in a distributed environment, the documented semantics (in the servlet spec, section 7.7.2) indicate that if a session attribute is serializable, you can expect that the container may move the session to other nodes in the network. You only have to save the serialized objects - interestingly, you don't actually have to use writeObject(), but you have to retain the closure that it provides.

|
| Again - the problem is the attempt to fit an API that was not designed
| for safe data storage into the wrong problem.
|
| This is deep into object-oriented database problem space - Serializable
| was designed to allow the object to be saved/restored, but it is also not
| the right API for an OODB - it can't detect
Re: Tomcat: Distributed Session Management revisited
Tom Drake wrote: - Original Message - From: Craig R. McClanahan [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED]; Tom Drake [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 1:25 PM Subject: Re: Tomcat: Distributed Session Management revisited ... stuff deleted ... | | It would be possible to do this for the HttpSession methods | | themselves (the container would know what's going on), but what do you do | | about session attributes? | | | | HttpSession session = request.getSession(); | | MyObject mo = (MyObject) session.getAttribute(foo); | | mo.setName(bar); | | I believe that, in this case, it is incumbent upon the application to call | | session.setAttribute(foo, mo); | | | This violates the principle that the application programming model should | not change, but it's certainly feasible to say if you want load balancing | to work on this container, you have to call HttpSession.setAttribute() | whenever you modify an attribute's properties. | | By itself, though, this doesn't provide any support for locking against | simultaneous updates (i.e. what you do in synchronized blocks in a | single VM). | | It's a little funny funny ... by the time we invent API to solve all these | problems, you've just defined a pretty fair chunk of the functionality of | EJBs ... | Locking issues aside, this presents a fair problem for the servlet container. How to know if any session attribute was modified per your example. I'm not saying this is necessarily a good idea, but you can byte compare the resulting session serialization to see if the session objects have changed. All you have to do is keep a local copy of the original session during the request. Not very pretty, but is a solution that wasn't discussed. I tend to agree with Costin that the session API isn't well suited for failover/distribution. I don't think it's impossible though. Plus, if you decide to develop an API separate from the session... it really starts to look like EJB. 
-Paul
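A rough sketch of the byte-compare idea from the message above: serialize the attribute map when the request starts, serialize it again at the end, and compare the two byte arrays. `SessionSnapshot` and its methods are hypothetical names chosen for this example, and a plain `HashMap` stands in for the real session. Note that the comparison can report a change where none happened (for instance if the map's internal capacity changed), which costs bandwidth but is safe.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;

// Hypothetical helper; a plain HashMap stands in for the session attributes.
class SessionSnapshot {

    // Serialize the whole attribute map into a byte array.
    static byte[] serialize(HashMap<String, Serializable> attributes) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(attributes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // True if the current serialized form differs from the earlier snapshot.
    static boolean changedSince(byte[] before, HashMap<String, Serializable> attributes) {
        return !Arrays.equals(before, serialize(attributes));
    }
}
```

A container could call `serialize` before dispatching the request and `changedSince` afterwards, replicating only when it returns true. This catches in-place mutations such as `mo.setName("bar")` without requiring the application to call setAttribute again, at the cost of serializing the session twice per request.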
Re: Tomcat: Distributed Session Management revisited
- Original Message - From: Paul Speed [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 11:30 PM Subject: Re: Tomcat: Distributed Session Management revisited Tom Drake wrote: - Original Message - From: Craig R. McClanahan [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED]; Tom Drake [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 1:25 PM Subject: Re: Tomcat: Distributed Session Management revisited ... stuff deleted ... | | It would be possible to do this for the HttpSession methods | | themselves (the container would know what's going on), but what do you do | | about session attributes? | | | | HttpSession session = request.getSession(); | | MyObject mo = (MyObject) session.getAttribute(foo); | | mo.setName(bar); | | I believe that, in this case, it is incumbent upon the application to call | | session.setAttribute(foo, mo); | | | This violates the principle that the application programming model should | not change, but it's certainly feasible to say if you want load balancing | to work on this container, you have to call HttpSession.setAttribute() | whenever you modify an attribute's properties. | | By itself, though, this doesn't provide any support for locking against | simultaneous updates (i.e. what you do in synchronized blocks in a | single VM). | | It's a little funny funny ... by the time we invent API to solve all these | problems, you've just defined a pretty fair chunk of the functionality of | EJBs ... | Locking issues aside, this presents a fair problem for the servlet container. How to know if any session attribute was modified per your example. I'm not saying this is necessarily a good idea, but you can byte compare the resulting session serialization to see if the session objects have changed. All you have to do is keep a local copy of the original session during the request. Not very pretty, but is a solution that wasn't discussed. 
I tend to agree with Costin that the session API isn't well suited for failover/distribution. I don't think it's impossible though. Plus, if you decide to develop an API separate from the session... it really starts to look like EJB.

I completely agree that the API lacks proactive support for things in the background that may fail. But given the fact that we support a reference implementation which has managed to provide really professional services to users (other ref implementations are just for demonstration, nobody would use them in production), and that there are (commercial) solutions that provide session fail-over within the limitations of this API, we **must** try to provide a solution. The API does not specify how often the container may try to provide that service or what means it utilizes to do that. Nothing is 100%, and I think it is better to live with the uncertainty we discuss here than with the more likely problem that an instance fails and there is no potential replacement.

Byte-comparison is not the worst solution. If we think about differential updates, byte comparison is a good candidate for that, and moreover one that promises good performance.

If the user wants to place things in a session that she does not need to be replicated, she has the option to declare them transient and write a getter that checks if the attribute is present, otherwise reconstructs it (in the case of a picture, reloads it from disk). The user has the choice to design for performance or ease. We only need to document the options.

Mik :wq
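The transient-plus-getter pattern described above might look like the following sketch. `CachedPhoto` is a hypothetical class, `loadFromDisk` is a placeholder that fabricates bytes from the path so the example runs without a real file, and `replicate` merely simulates shipping the object to another node by a serialize/deserialize round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

// Hypothetical session attribute: only the path is replicated, not the image.
class CachedPhoto implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String path;       // small, travels with the session
    private transient byte[] data;   // large, skipped by serialization

    CachedPhoto(String path) {
        this.path = path;
    }

    boolean isLoaded() {
        return data != null;
    }

    // Getter that reconstructs the transient state when it is missing.
    byte[] getData() {
        if (data == null) {
            data = loadFromDisk(path);
        }
        return data;
    }

    // Placeholder loader (assumption): real code would read the file at 'path'.
    private static byte[] loadFromDisk(String path) {
        return ("bytes of " + path).getBytes(StandardCharsets.UTF_8);
    }

    // Simulates replication to another node: serialize, then deserialize.
    static CachedPhoto replicate(CachedPhoto photo) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(photo);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (CachedPhoto) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After `replicate`, the copy arrives with `data` unset and reloads it lazily on the first `getData()` call, so the large payload never crosses the network.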
Re: Tomcat: Distributed Session Management revisited
On Tue, 13 Nov 2001, Mika Goeckel wrote:

I completely agree that the API lacks proactive support for things in the background that may fail. But given the fact that we support a reference implementation which has managed to provide really professional services to users (other ref implementations are just for demonstration, nobody would use them in production), and that there are (commercial) solutions that provide session fail-over within the limitations of this API, we **must** try to provide a

Well, the cool thing about open source is that we _don't_ need to implement all the bloat that commercial solutions have.

solution. The API does not specify how often the container may try to provide that service or what means it utilizes to do that. Nothing is 100%, and I think it is better to live with the uncertainty we discuss here than with the more likely problem that an instance fails and there is no potential replacement.

I think it's better to live with the certainty that everything can ( and will ) fail and tomcat can't change this. The alternative is to give users the impression that the data they put in a session will be safe - and they may rely on that instead of using a transaction and a real API.

Databases, EJB, etc are complex - but there's a reason for that. Well, we could argue about how much complexity is actually needed, but one thing is certain ( I hope ) - get/setAttribute is not enough; if you want data integrity you must use a different API ( in particular transactions ).

Byte-comparison is not the worst solution. If we think about differential updates, byte comparison is a good candidate for that, and moreover one that promises good performance.

Byte compare every 5 seconds every object in session ? Let's say you just displayed the confirmation and charged the credit card, but the machine crashed just before you sent the order ( or the reverse - you sent it but didn't charge the credit card ). All this can happen in under 5 seconds.

If the user wants to place things in a session that she does not need to be replicated, she has the option to declare them transient and write a getter that checks if the attribute is present, otherwise reconstructs it (in the case of a picture, reloads it from disk). The user has the choice to design for performance or ease. We only need to document the options.

So the user should change all his objects to implement some arbitrary pattern just to fit this into our solution ? What if the object is not user defined ( like most are ) ? Well, we have to create wrappers for each object you store in a session. Try to explain this on tomcat-user ( or tomcat-dev ) ...

Costin
Re: Tomcat: Distributed Session Management revisited
Mika Goeckel wrote: [ snip ] I'm not saying this is necessarily a good idea, but you can byte compare the resulting session serialization to see if the session objects have changed. All you have to do is keep a local copy of the original session during the request. Not very pretty, but is a solution that wasn't discussed. I tend to agree with Costin that the session API isn't well suited for failover/distribution. I don't think it's impossible though. Plus, if you decide to develop an API separate from the session... it really starts to look like EJB. I completely agree, that the API lacks proactive support for things in the background that may fail. But given the fact, that we support a reference implementation which has managed to provide really professional services to users (other ref implementations are just for demonstration, nobody would use them in production) and there are (commercial) solutions, that provide session fail-over in the limitations of this API, we **must** try to provide a solution. The API does not specify, how often the container may try to provide that service or what means it utilizes to do that. Nothing is 100% and I think it is better to live with the uncertaincy we discuss here than with the more likely problem that an instance fails and there is no potential replacement. For what it's worth, I completely agree. Failover is _never_ something that the app developer can completely ignore... no matter how much functionality the container provides. Developing distributed applications takes a little thought at the very least. And failover is just a simple distributed model. I've been reading all of this with great interest and waiting to see where it settles. I have alot of experience with various forms of distributed applications and it's interesting to see where they are similar. I'm really tempted to explore how a jini solution might be architected... just from the curiousity side more than anything. 
(I've been looking for a good excuse to dive deeper into jini.)

Byte-comparison is not the worst solution. If we think about differential updates, byte comparison is a good candidate for that, and moreover one that promises good performance.

Interesting. I hadn't thought about differential updates using serialized streams. They tend to be kind of random but it might work. Also, I have written some classes before that can decode the binary streams as meta-data, and now I'm thinking there might even be a clever diff that can be done by actually interpreting the data during the diff. I'll have to think about that some more.

If the user wants to place things in a session that she does not need to be replicated, she has the option to declare them transient and write a getter that checks if the attribute is present, otherwise reconstructs it (in the case of a picture, reloads it from disk). The user has the choice to design for performance or ease. We only need to document the options.

Agreed.

-Paul
Re: Tomcat: Distributed Session Management revisited
[EMAIL PROTECTED] wrote: On Tue, 13 Nov 2001, Mika Goeckel wrote: I completely agree, that the API lacks proactive support for things in the background that may fail. But given the fact, that we support a reference implementation which has managed to provide really professional services to users (other ref implementations are just for demonstration, nobody would use them in production) and there are (commercial) solutions, that provide session fail-over in the limitations of this API, we **must** try to provide a Well, the cool thing about open source is that we _don't_ need to implement all the bloat that commercial solution have. :) solution. The API does not specify, how often the container may try to provide that service or what means it utilizes to do that. Nothing is 100% and I think it is better to live with the uncertaincy we discuss here than with the more likely problem that an instance fails and there is no potential replacement. I think it's better to live with the certaincy that everything can ( and will ) fail and tomcat can't change this. The alternative is to give users the impression the data he puts in a session will be safe - and he may rely on that instead of using a transaction and a real API. Databases, EJB, etc are complex - but there's a reason to that. Well, we could argue about how much compexity is actually needed, but one thing is certain ( I hope ) - get/setAttribute is not enough, if you want data integrity you must use a different API ( in particular transactions ). Byte-comparison is not the worst solution. If we think about differential updates, byte comparison is a good candiate for that and surplus one that promises good performance. Byte compare every 5 seconds every object in session ? Let's say you just displayed the confirmation and charged the credit card, but the machine crashed just before you sent the order. ( or reverse - you sent it but didn't charged the credit card ). This should happen in below 5 seconds. 
I think the idea is that you'd byte compare on commit, which ideally would happen at request boundaries. So in this case a single request becomes a transaction... which indeed opens up its own issues, but no bigger than the ones that were always there.

The main issue is that the app has no control over this transaction. The case where things get strange is if the JVM dies in the middle of processing a single request. The request may have already committed real data to the DB, app server, whatever... and yet the session state up to the point of failure would be lost. Even five second polling wouldn't fix that case.

In fact, that's the same case that fails in _every_ scenario that doesn't involve full EJB-like transaction support. As soon as you access one single piece of data that isn't covered by the transaction support, you lose some amount of failover recovery.

Nothing short of full transaction support will ever cover the case of the dying JVM... and in some rare cases I think that will even fail.

That being said, there may still be a place for a session-based distribution mechanism that can support load balancing, hot-swapping of tomcats, and basic failover. It should definitely be an opt-in sort of thing though, i.e. web apps that meet the restrictions can opt to set up tomcat to provide this feature.

If the user wants to place things in a session that she does not need to be replicated, she has the option to declare them transient and write a getter that checks if the attribute is present, otherwise reconstructs it (in the case of a picture, reloads it from disk). The user has the choice to design for performance or ease. We only need to document the options.

So the user should change all his objects to implement some arbitrary pattern just to fit this into our solution ? What if the object is not user defined ( like most are ) ? Well, we have to create wrappers for each objects you store in a session. Try to explain this on tomcat-user ( or tomcat-dev ) ...

I agree... in these cases, the webapp could not be used with a distributed session environment. I think that's a given. Personally, I'm still trying to figure out if there are a large enough number of webapps that could be supported to make it worth the effort. (Heavy emphasis on effort.)

-Paul
Re: Tomcat: Distributed Session Management revisited
See below - Original Message - From: Paul Speed [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Sent: Tuesday, November 13, 2001 3:48 PM Subject: Re: Tomcat: Distributed Session Management revisited ... stuff deleted ... | | The main issue is that the app has no control over this transaction. | The case where things get strange is if the JVM dies in the middle | of processing a single request. The request may have already | committed real data to the DB, app server, whatever... and yet the | session state up to the point of failure would be lost. Even five | second polling wouldn't fix that case. | | In fact, that's the same case that fails in _every_ scenario that | doesn't involve full EJB-like transaction support. As soon as you | access one single piece of data that isn't covered by the | transaction support, you lose some amount of failover recovery. | | Nothing short of full transaction support will ever cover the case | of the dying JVM... and in some rare cases I think that will even fail. Yes, let's remember the 80/20 rule. The solution to this problem will come at a very high cost (if at all). Yet, this problem (servlet container that crashes in the middle of a transaction) is a corner case. Also, remember that if someone really needs this level of protection, they are far better off implementing their business and database logic inside a (bank of) J2EE server(s), and only depending on HttpSession objects for holding transitory information (like user-inputs from a mulit-page form, or the contents of a shopping cart prior to pressing the 'Commit Order' button. ) There are plenty of high-volume web-applications that need to be HA and scalable over 'n' servers but where J2EE is overkill. This is the problem domain that I am focusing on. | | That being said, there may still be a place for a session-based | distribution mechanism that can support load balancing, hot-swapping | of tomcats, and basic failover. 
It should definitely be an opt-in
| sort of thing though, ie: web apps that meet the restrictions can
| opt to setup tomcat to provide this feature.

There is great value in this.

Tom
Re: Tomcat: Distributed Session Management revisited
On Tue, 13 Nov 2001, Paul Speed wrote: I think the idea is that you'd byte compare on commit which ideally would happen at request boundaries. So in this case a single request becomes a transaction... which indeed opens up its own issues, but no bigger than the ones that were always there. Not good enough - when the request is completed the user already has the page confirming his order ( and maybe the card was already charged :-). In fact, that's the same case that fails in _every_ scenario that doesn't involve full EJB-like transaction support. As soon as you access one single piece of data that isn't covered by the transaction support, you lose some amount of failover recovery. And what's worse, far too many people will not realize that, and read the marketing stuff ( 'we support failover, session replication, etc') and believe it is a magic solution. That being said, there may still be a place for a session-based distribution mechanism that can support load balancing, hot-swapping of tomcats, and basic failover. It should definitely be an opt-in sort of thing though, ie: web apps that meet the restrictions can opt to setup tomcat to provide this feature. I agree it would be nice to have a tool that can store objects with fail-over, distribution, etc and using it as a _complement_ to the session ( maybe using the session id, expiration, etc ). I don't think this tool can be used using only the current servlet session API or that it should be used as a servlet session manager. distributed session environment. I think that's a given. Personally, I'm still trying to figure out if there are a large enough number of webapps that could be supported to make it worth the effort. (Heavy emphasis on effort.) I'm more worried about the number of webapps that would be written with the assumption that the session will be magically safe, instead of using transactions/database/EJB/ or whatever storage API. 
Costin
Re: Tomcat: Distributed Session Management revisited
Hi, I'm looking at the same area at the moment. and try to get my head around it maybe we can help each other... further comments below. - Original Message - From: Tom Drake [EMAIL PROTECTED] To: Tomcat Dev List [EMAIL PROTECTED] Sent: Monday, November 12, 2001 11:19 PM Subject: Fw: Tomcat: Distributed Session Management revisited Tomcat Developers: This is a forward of a message that I sent to Bip and Craig a few days ago, regarding distributed session managment (aka Clustering). I haven't gotten any feedback just yet, so I thought I'd throw this out to the whole dev list. The current implementation is broken. The following message explains why and describes some possible solutions to this problem. This feature (e.g. distributed session management) is an absolute requirement for any deployment that needs to scale beyond a single Tomcat instance, and does not want the overhead of depending on JDBC for storing sessions. I've also attached, at the bottom of this message, Two 'preliminary' RMI interfaces that describe (see scenario 1 below) how I think this session server and it's clients (e.g. tomcat instances) should converse. SessionServer - to be implemented by the remote session manager/server SessionClient - to be implemented by clients (e.g. Tomcat) of the remote session manager/server. I'm interested in making a contribution in this area and am anxious to receive some feedback from the dev-list members on this subject. Regards, Tom Drake Email: [EMAIL PROTECTED] - Original Message - From: Tom Drake To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Saturday, November 10, 2001 10:48 PM Subject: Tomcat: Distributed Session Management revisited Bip: I've looked closely at the existing catalina distributed session management code, and determined that the following problems exist. Since, I'm new to the code, it's highly likely that I've missed something. Please correct any errors in my analysis, and provide any input / feedback. 
I'm interested in contributing to this and would greatly appreciate any input you can provide. Problems with current solution: - Session updates are not communicated to the other nodes in the cluster No, session updates are frequently communicated to all other cluster members through the DistributedManager.run() method [processPersistenceChecks();]. ... a second look came up with that only idle sessions and overflow sessions are replicated... Anyway, that's a paradigm-thing... how accurate does a session need to be? After every change or just every couple of seconds. Should be configurable. ... I would vote for the cooperative approach, but I'd like to add some thoughts: Besides the primary session manager, there needs to be a backup session manager that captures the changes of sessions as well and is the crown prince of the primary session manager. This is to prevent sessions to be replicated to all other cluster instances (memory overhead) but to stay on the save side if the primary goes down (yep, both could crash, but what in live is 100%?). In that case the crown prince would take over and another cluster instance can take over the crown prince's role. Which server the primary manager is, should be either configurable or random. The cluster instances should be configurable. Multicast should only be used if the cluster instances are not configured to find out what other instances are there. The configuration should only specify the initial state, further instances should be addable at any time without the need to bring the cluster down. Another thought is, do sessions need to be replicated in whole, or could there be a way to replicate only the changes (network overhead). I know guys that store loads of things in sessions. We had a case where a whole search result (one complex object per row) was stored there, possibly up to a couple of megs... 
RMI would be my first approach as well, but I would try to keep the communication details separated from the functional logic implementing the cluster. This would enable us later on to choose other means like JavaSpaces or JMS. RMI is the first choice if the cluster is nearby, but what about a cluster over a WAN if the requirements allow slow/deferred replication? RMI could not do that job reliably.

Cheers, Mika
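Mika's point about keeping the communication details apart from the cluster logic can be illustrated with a small Java sketch. All names here (ReplicationTransport, InMemoryTransport, SessionReplicator) are hypothetical and not part of Catalina; the only idea shown is that the replication code talks to an interface, so an RMI, JMS or JavaSpaces transport could be substituted later.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical transport boundary: the only place that knows how bytes travel. */
interface ReplicationTransport {
    void publish(String sessionId, byte[] serializedSession);
}

/** Stand-in transport for local testing; an RMI or JMS version would implement the same interface. */
class InMemoryTransport implements ReplicationTransport {
    final List<String> published = new ArrayList<>();

    @Override
    public void publish(String sessionId, byte[] serializedSession) {
        // Record the session id and payload size instead of sending anything.
        published.add(sessionId + ":" + serializedSession.length);
    }
}

/** Cluster logic depends on the interface only, so the wire protocol can be swapped. */
class SessionReplicator {
    private final ReplicationTransport transport;

    SessionReplicator(ReplicationTransport transport) {
        this.transport = transport;
    }

    void sessionChanged(String sessionId, byte[] serializedSession) {
        transport.publish(sessionId, serializedSession);
    }
}

class TransportDemo {
    public static void main(String[] args) {
        InMemoryTransport transport = new InMemoryTransport();
        new SessionReplicator(transport).sessionChanged("abc123", new byte[] {1, 2, 3});
        System.out.println(transport.published);
    }
}
```

A slow or deferred WAN transport would then be one more implementation of the same interface, with no change to SessionReplicator.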
Re: Tomcat: Distributed Session Management revisited
On 12/11/2001 10:19 pm, Tom Drake [EMAIL PROTECTED] wrote:

Tomcat Developers: This is a forward of a message that I sent to Bip and Craig a few days ago, regarding distributed session management (aka Clustering). I haven't gotten any feedback just yet, so I thought I'd throw this out to the whole dev list.

This is always the wiser choice...

The current implementation is broken. The following message explains why and describes some possible solutions to this problem. This feature (e.g. distributed session management) is an absolute requirement for any deployment that needs to scale beyond a single Tomcat instance, and does not want the overhead of depending on JDBC for storing sessions. I've also attached, at the bottom of this message, two 'preliminary' RMI interfaces that describe (see scenario 1 below) how I think this session server and its clients (e.g. tomcat instances) should converse.

SessionServer - to be implemented by the remote session manager/server
SessionClient - to be implemented by clients (e.g. Tomcat) of the remote session manager/server.

I'm interested in making a contribution in this area and am anxious to receive some feedback from the dev-list members on this subject.

I looked down at the implementation, and so far I have two comments: the first one is very stupid (AKA, package names :), but I'm concerned about one thing: If we have a cluster of SessionClient(s), then sessions are stored all on the SessionServer, and that introduces a single point of failure. I mean, what happens if the SessionServer goes down? AFAICS, all clients will lose their own sessions (unless sessions are persisted to disk as in JDBC, and the server doesn't come back up in a reasonable time)...

So, I'm wondering... Could it be possible to also cluster the server? Like in NETBIOS networking, where each client is also a server, and if the current server (the one called Primary Master) goes down, another one takes over automagically...
Should go back and dig a little bit back on CIFS and how they do it at this point... Pier
Re: Tomcat: Distributed Session Management revisited
Mika:

Thanks for the reply. Here's some more thoughts on this subject.

The primary problem that I see with the collaborative method (e.g. extending the multicast solution) is that all sessions will have to be sent to all cluster nodes. The number of session updates that have to travel 'on the wire' grows in relation to the number of nodes in the cluster.

Furthermore, when a new tomcat is brought on-line, it must somehow retrieve a copy of all active sessions from somewhere. There is nothing in place for this currently. Using multicast is problematic. If a multicast request is made then all other nodes would respond with all sessions. So, some other approach would need to be taken which would result in two protocols being used to make this feature work. This seems too complicated.

---

Consider this scenario: A user establishes a session on node 1 (of a 10 node cluster). Tomcat would create a new session and transmit it to the multicast port, which would then transmit 10 copies of this session (1 to each cluster node). Now suppose that the next request from this user is sent to node 2, which causes an update to the session to occur. Again 11 copies of the Session are transferred.

The number of session copies sent as the result of an update is computed as follows: # of nodes + 1

node 1 sends new session to multicast port (kernel-level manager)
multicast port sends new session to node1
multicast port sends new session to node2
multicast port sends new session to node3
...
multicast port sends new session to node10
node 2 sends updated session to multicast port
multicast port sends new session to node1
multicast port sends new session to node2
multicast port sends new session to node3
...
multicast port sends new session to node10
node 3: 11 more session copies are sent...

NOTE: remember this is UDP traffic. The more packets that fly around, the greater the likelihood of dropping packets.
Dropped packets in this case means that some tomcat instances may have stale (or no) data for a given session.

--

With a centralized session manager the following traffic would occur instead:

node1 sends new session to server manager
node 2 requests the given (session id) session from the server manager
manager sends a copy of the session to node 2
node 2 updates the session and sends it back to the manager.
manager invokes the 'invalidateSession(sessionId)' method on each of the nodes. (note: invalidateSession only contains the value of 'SessionId' + any additional RMI overhead. This is far smaller than a complete Session object)

The number of session copies sent as the result of an update is 2. This number does not depend or vary based on the number of nodes.

Now, let's add to the story. Let's say that Tomcat is smart enough to cache Session objects in its memory space. Once Tomcat gets its hands on a 'Session' it keeps it until it becomes 'too old' or an 'invalidateSession(sessionId)' message is received from the remote Session Manager. This could cut down the number of transfers of Session data from 2 to somewhere between 1 and 2.

-

On Redundant Session Managers.

There are a couple of ways to achieve this. One way is to place two Session Managers in the network. One of them is the 'active' one, the other one could simply register itself as a client of the 'active' server. As a client, it can obtain copies of all new and changed sessions from the active server. If for some reason the active server needs to be brought down, it will send a message to all of its clients (including the 'dormant' session manager) indicating that it's shutting down. The clients could, on receipt of this message, connect to the 'next' session server (in their pre-configured list of servers). The clients could simply carry on with the new server.

If the active server simply goes off the air for some mysterious reason.
The clients would get a RemoteException the next time they tried to talk to the server. This would be their clue to 'cut-over' to the other server (as described above).

Last point. Sending Session deltas instead of the entire Session: This should be doable. The main thing that we care about is the Session attributes which are changed by the application. It's up to the web-application to put these values back into the Session if their contents change. This is enough for us to be able to track which attributes have actually changed.

Tom

- Original Message - From: Mika Goeckel [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED]; Tom Drake [EMAIL PROTECTED] Sent: Monday, November 12, 2001 3:14 PM Subject: Re: Tomcat: Distributed Session Management revisited
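Tom's last point, that the container can tell which attributes changed because the application has to put them back into the session, can be sketched in Java as follows. DeltaTrackingSession and its methods are hypothetical names invented for this illustration; this is not Catalina's session class, only a minimal model of the bookkeeping.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical session wrapper that remembers what changed since the last replication. */
class DeltaTrackingSession {
    private final Map<String, Object> attributes = new HashMap<>();
    private final Set<String> dirty = new HashSet<>();   // names set or replaced
    private final Set<String> removed = new HashSet<>(); // names removed

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty.add(name);
        removed.remove(name);
    }

    void removeAttribute(String name) {
        attributes.remove(name);
        dirty.remove(name);
        removed.add(name);
    }

    /** Returns only the attributes set or replaced since the last call, then resets tracking. */
    Map<String, Object> drainChanges() {
        Map<String, Object> delta = new HashMap<>();
        for (String name : dirty) {
            delta.put(name, attributes.get(name));
        }
        dirty.clear();
        return delta;
    }

    /** Returns the names removed since the last call, then resets tracking. */
    Set<String> drainRemovals() {
        Set<String> names = new HashSet<>(removed);
        removed.clear();
        return names;
    }
}

class DeltaDemo {
    public static void main(String[] args) {
        DeltaTrackingSession session = new DeltaTrackingSession();
        session.setAttribute("cart", "3 items");
        session.setAttribute("user", "tom");
        session.drainChanges(); // pretend the full session was replicated here
        session.setAttribute("cart", "4 items");
        System.out.println(session.drainChanges()); // only the changed attribute remains
    }
}
```

Note the limitation already implied in the message: an object mutated in place without a new setAttribute call would not be seen by this kind of tracking.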
Re: Tomcat: Distributed Session Management revisited
Pier:

As far as package names are concerned, I'll gladly defer to the Tomcat gods for guidance.

See my response to Mika for some thoughts on 'redundant' session managers. This is certainly doable, and from a fail-over perspective, probably a requirement. Tomcat could be configured with a list of session manager RMI URLs, the first one on the list to be considered the 'active' server. It could then fall back to the 'next' one in the list if needed.

My thought is that Tomcat should maintain its own copy of the Sessions (at least for some prescribed time). It can use its own copy of the Session if it hasn't been told (by the remote session manager) that a session is invalid (see SessionClient.invalidateSession(sessionId)). This possibly reduces the number of round-trips, and lets the server continue to operate in the event of a complete failure of the remote session manager.

Tom

- Original Message - From: Pier Fumagalli [EMAIL PROTECTED] To: Tomcat Developers List [EMAIL PROTECTED] Sent: Monday, November 12, 2001 4:28 PM Subject: Re: Tomcat: Distributed Session Management revisited
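The fall-back behaviour Tom describes, an ordered list of session managers where a RemoteException is the cue to move to the next one, might look roughly like this in Java. SessionLookup, its getSession method, and FailoverSessionClient are hypothetical names for this sketch; they are not the SessionServer/SessionClient interfaces attached to the original message, which are not shown in this thread.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical remote interface standing in for a session manager. */
interface SessionLookup extends Remote {
    Object getSession(String sessionId) throws RemoteException;
}

/** Tries the 'active' server first and cuts over to the next one on RemoteException. */
class FailoverSessionClient {
    private final List<SessionLookup> servers; // configured order, first is 'active'
    private int active = 0;

    FailoverSessionClient(List<SessionLookup> servers) {
        this.servers = servers;
    }

    Object getSession(String sessionId) throws RemoteException {
        RemoteException last = null;
        // Give every configured server one chance before giving up.
        for (int attempts = 0; attempts < servers.size(); attempts++) {
            try {
                return servers.get(active).getSession(sessionId);
            } catch (RemoteException e) {
                last = e;
                active = (active + 1) % servers.size(); // cut over to the next server
            }
        }
        throw new RemoteException("no session manager reachable", last);
    }

    int activeIndex() {
        return active;
    }
}

class FailoverDemo {
    public static void main(String[] args) throws RemoteException {
        SessionLookup down = id -> { throw new RemoteException("down"); };
        SessionLookup up = id -> "session-" + id;
        FailoverSessionClient client = new FailoverSessionClient(Arrays.asList(down, up));
        System.out.println(client.getSession("42"));
        System.out.println("active server index: " + client.activeIndex());
    }
}
```

In a real deployment the list entries would be stubs obtained from the configured RMI URLs; plain lambdas are used here only so the sketch runs without a registry.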
Re: Tomcat: Distributed Session Management revisited
On 12/11/2001 11:14 pm, Mika Goeckel [EMAIL PROTECTED] wrote:

I would vote for the cooperative approach, but I'd like to add some thoughts: Besides the primary session manager, there needs to be a backup session manager that captures the changes of sessions as well and is the crown prince of the primary session manager. This is to prevent sessions from being replicated to all other cluster instances (memory overhead) but to stay on the safe side if the primary goes down (yep, both could crash, but what in life is 100%?). In that case the crown prince would take over and another cluster instance can take over the crown prince's role. Which server is the primary manager should be either configurable or random. The cluster instances should be configurable. Multicast should only be used to discover other instances when the cluster instances are not configured. The configuration should only specify the initial state; further instances should be addable at any time without the need to bring the cluster down.

That's exactly how CIFS works in terms of browsing lists... Every time a node goes up or down, an election is performed and the one who wins takes over the primary place. The problem, though, is that in our case we also need to replicate data across several managers, and not only the information exchanged at election...

Another thought: do sessions need to be replicated in whole, or could there be a way to replicate only the changes (network overhead)? I know guys that store loads of things in sessions. We had a case where a whole search result (one complex object per row) was stored there, possibly up to a couple of megs...

That's definitely a problem, because if you replicate that session data over to N session managers, the growth is linear (N*(size+overhead)).

RMI would be my first approach as well, but I would try to keep the communication details separated from the functional logic implementing the cluster.
This would enable us later on to choose other means like JavaSpaces or JMS. RMI is the first choice if the cluster is nearby, but what about a cluster over a WAN if the requirements allow slow/deferred replication? RMI could not do that job reliably.

Indeed... But you have to consider that the state of the session needs to be lockable and transactional... Interesting... Think think think... The session doesn't have a commit (freak, too bad), so we are unable to know whether a particular servlet engine is getting that session for reading or for reading/writing (and that complicates things, because basically, we have to consider that all accesses to sessions are read-writes, and in a distributed session environment, we need to lock data around).

Let's assume (simplest case ever) that we have two servlet containers (ServA and ServB), and one session manager (single point of failure, let's call it Sess). ServA receives a request, gets its session from Sess and locks it, and then the servlet, whoops, gets into an infinite loop... The user, not seeing anything coming back at him, hits stop and reload; this time his request goes to ServB... Now, ServB tries to access the same exact session, but whoops, the session is locked on Sess by ServA... What should we do? I believe we need to introduce a concept of timeout, in which a particular server is allowed to lock the session for as long as he likes...

That to some extent is the root of the problem (concurrent accesses by different servlet containers of the same session). Once that is solved in an acceptable manner (I don't see any other thing but locking sessions), then the rest is only deciding down at network level how we could replicate and access/lock sessions...

Pier
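One way to read the timeout idea in the ServA/ServB scenario is as a lease: a container may hold the lock on a session, but only until the lease expires, after which another container can take it. The Java sketch below assumes that reading; SessionLockTable and its methods are hypothetical, and the caller passes the current time explicitly so the behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lock table kept by the session manager ('Sess' in the scenario). */
class SessionLockTable {
    private static final class Lease {
        final String owner;
        final long expiresAtMillis;

        Lease(String owner, long expiresAtMillis) {
            this.owner = owner;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Lease> leases = new HashMap<>();
    private final long leaseMillis;

    SessionLockTable(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /** Grants the lock if it is free, already held by this owner, or the old lease has expired. */
    synchronized boolean tryLock(String sessionId, String owner, long nowMillis) {
        Lease current = leases.get(sessionId);
        boolean heldByOther = current != null
                && !current.owner.equals(owner)
                && nowMillis < current.expiresAtMillis;
        if (heldByOther) {
            return false;
        }
        leases.put(sessionId, new Lease(owner, nowMillis + leaseMillis));
        return true;
    }

    /** Releases the lock, but only if the caller is still the current owner. */
    synchronized void unlock(String sessionId, String owner) {
        Lease current = leases.get(sessionId);
        if (current != null && current.owner.equals(owner)) {
            leases.remove(sessionId);
        }
    }
}

class LockDemo {
    public static void main(String[] args) {
        SessionLockTable locks = new SessionLockTable(1000);
        System.out.println(locks.tryLock("s1", "ServA", 0));    // ServA takes the lock
        System.out.println(locks.tryLock("s1", "ServB", 500));  // ServA's lease is still valid
        System.out.println(locks.tryLock("s1", "ServB", 1000)); // lease expired, ServB takes over
    }
}
```

What should happen to ServA's eventual write after it has lost the lease is left open here, as it is in the discussion.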
Re: Tomcat: Distributed Session Management revisited
On 13/11/2001 12:54 am, Tom Drake [EMAIL PROTECTED] wrote:

Mika: Thanks for the reply. Here's some more thoughts on this subject. The primary problem that I see with the collaborative method (e.g. extending the multicast solution) is that all sessions will have to be sent to all cluster nodes. The number of session updates that have to travel 'on the wire' grows in relation to the number of nodes in the cluster.

Linear growth, that's the best we can aim for...

Furthermore, when a new tomcat is brought on-line, it must somehow retrieve a copy of all active sessions from somewhere. There is nothing in place for this currently. Using multicast is problematic. If a multicast request is made then all other nodes would respond with all sessions. So, some other approach would need to be taken which would result in two protocols being used to make this feature work. This seems too complicated.

Not that complicated. Most of the work on elective processes has been done already in the scope of other projects, so we would only need to adapt it to our scope...

---

Consider this scenario: A user establishes a session on node 1 (of a 10 node cluster). Tomcat would create a new session and transmit it to the multicast port, which would then transmit 10 copies of this session (1 to each cluster node). Now suppose that the next request from this user is sent to node 2, which causes an update to the session to occur. Again 11 copies of the Session are transferred. [...]

NOTE: remember this is UDP traffic. The more packets that fly around, the greater the likelihood of dropping packets. Dropped packets in this case means that some tomcat instances may have stale (or no) data for a given session.

Indeed... Quite huge...
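The two traffic figures from Tom's messages can be put side by side with a few lines of Java: "# of nodes + 1" full session copies per update for the multicast approach, against 2 copies per update for a central session manager. ReplicationTraffic is a hypothetical helper that only restates those figures; the small invalidateSession messages of the centralized scheme are deliberately not counted, since they carry a session id rather than a full session.

```java
/** Restates the per-update copy counts given in the thread; not a measurement. */
class ReplicationTraffic {
    /** Multicast: one copy to the multicast port plus one copy to every node. */
    static int multicastCopiesPerUpdate(int nodes) {
        return nodes + 1;
    }

    /** Centralized: one copy fetched from the manager, one copy sent back. */
    static int centralizedCopiesPerUpdate(int nodes) {
        return 2; // independent of cluster size, per Tom's description
    }

    public static void main(String[] args) {
        for (int nodes : new int[] {2, 10, 50}) {
            System.out.println(nodes + " nodes: multicast="
                    + multicastCopiesPerUpdate(nodes)
                    + " centralized=" + centralizedCopiesPerUpdate(nodes));
        }
    }
}
```

For the 10 node cluster in the scenario this gives 11 copies against 2, which is the gap being discussed.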
--

With a centralized session manager the following traffic would occur instead:

node1 sends new session to server manager
node 2 requests the given (session id) session from the server manager
manager sends a copy of the session to node 2
node 2 updates the session and sends it back to the manager.
manager invokes the 'invalidateSession(sessionId)' method on each of the nodes. (note: invalidateSession only contains the value of 'SessionId' + any additional RMI overhead. This is far smaller than a complete Session object)

The number of session copies sent as the result of an update is 2. This number does not depend or vary based on the number of nodes. Now, let's add to the story. Let's say that Tomcat is smart enough to cache Session objects in its memory space. Once Tomcat gets its hands on a 'Session' it keeps it until it becomes 'too old' or an 'invalidateSession(sessionId)' message is received from the remote Session Manager. This could cut down the number of transfers of Session data from 2 to somewhere between 1 and 2.

Yes, but in this case, we don't have redundancy of sessions... So, if the Tomcat which has the session dies, the whole session dies with him...

-

On Redundant Session Managers. There are a couple of ways to achieve this. One way is to place two Session Managers in the network. One of them is the 'active' one, the other one could simply register itself as a client of the 'active' server. As a client, it can obtain copies of all new and changed sessions from the active server. If for some reason the active server needs to be brought down, it will send a message to all of its clients (including the 'dormant' session manager) indicating that it's shutting down. The clients could, on receipt of this message, connect to the 'next' session server (in their pre-configured list of servers). The clients could simply carry on with the new server.

Indeed...

If the active server simply goes off the air for some mysterious reason.
The clients would get a RemoteException the next time they tried to talk to the server. This would be their clue to 'cut-over' to the other server (as described above).

But how would they know where the sessions ended up?

Last point. Sending Session deltas instead of the entire Session: This should be doable. The main thing that we care about is the Session attributes which are changed by the application. It's up to the web-application to put these values back into the Session if their contents change. This is enough for us to be able to track which attributes have actually changed.

This can actually be done if we consider every operation on a session (adding/replacing/removing an attribute) an atomic operation.

Let's see if I can complicate things a little bit :) (Love doing that). Let's imagine we have a pool of session managers (SA, SB, SC...) and a pool of servlet containers (T1, T2, T3...). The first thing we want to do is bring up our session managers. Once we start them, SA, SB, SC and SD are available to accept sessions. Then we start our