Re: RE: RE: [JBoss-dev] distributed HttpSessions - advice...
Look, there might be more than one way to skin this cat. I am of the opinion that the EJB solution can be completely simple and transparent. The semantic that Julian requested was a findByPrimaryKey, which maps trivially to the EJB framework. It is up to us to show that the container can be dedicated *using the distributed hashtable* as the persistence (load and store would be get and set on the whole object, the object being the HTTP session). I want to prove that EJB is a distributed SYSTEM construct: the moment we start seeing transactions accessing these stores (sessions), we would be able to include these trivially. So give me a chance to prove that point. Something tells me that the EJB bashing is a by-product of ignorance and strict spec interpretation.

| That said, all you really need is a distributed hashtable. There's already one implemented as the Distributed State Service. No coding, this is already done. Can't get simpler than that. Implement an HttpSession as a serializable object. Get the session from the DSS at the beginning of an HTTP request, put the session back into the DSS at the end of the request. Simple as that. Well, at least for the 1st iteration :)

Slow down, cowboy, you have been going fast these last days. You still need the finder semantic that Julian is asking for. Not that it is complex, but it still needs to be there, and if you use EJB it is free: Julian can work on it right now while we use the DSS to replicate the state with the code you have already written. The fancy in-VM cache we were talking about with Sacha can be your first-iteration distributed hashtable.

| 2. Also, sub-partitioning should really be handled at the HAPartition level and should be totally transparent to every other clustering feature/service. Configuring sub-partition sizes and whether automatic sub-partitioning should be turned on or not should be done solely in the HAPartition MBean.

Don't worry about this at first. The automatic topology partitioning is fancy, but I don't foresee needing it immediately.

| 3. I think the complexity starts to become a factor when you start thinking about concurrent access to HttpSessions. IMHO, it is quite common to have 2 different HTTP requests accessing the same HTTP session concurrently. I think you're going to need some sort of distributed locking for HttpSessions, otherwise people are going to come across some really funky errors when they start using this stuff. Of course, you don't need distributed locking if you require the load-balancer to give you sticky sessions.

Yes, and this can be abstracted UNDER the EJB hood.

marcf

| Bill

Bill, let this be. If I am wrong I will eat my words, but something tells me this is what EJBs were really given to us for. Heck, I remember Vlada Matena pitching sessions and entities at the same level (user constructs) to SAP, i.e. not even these guys saw the system potential of the EJB construct. Tweaking the persistence and putting in a custom interceptor stack (no security, says Julian... OK) would really prove the point in spades, I think. Let me try it.

marcf

__
View this jboss-dev thread in the online forums:
http://jboss.org/forums/thread.jsp?forum=66thread=6221

___
Jboss-development mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/jboss-development
RE: RE: [JBoss-dev] distributed HttpSessions - advice...
1. I don't think you can use EJBs to implement HttpSessions, see my other email. That said, all you really need is a distributed hashtable. There's already one implemented as the Distributed State Service. No coding, this is already done. Can't get simpler than that. Implement an HttpSession as a serializable object. Get the session from the DSS at the beginning of an HTTP request, put the session back into the DSS at the end of the request. Simple as that. Well, at least for the 1st iteration :)

2. Also, sub-partitioning should really be handled at the HAPartition level and should be totally transparent to every other clustering feature/service. Configuring sub-partition sizes and whether automatic sub-partitioning should be turned on or not should be done solely in the HAPartition MBean.

3. I think the complexity starts to become a factor when you start thinking about concurrent access to HttpSessions. IMHO, it is quite common to have 2 different HTTP requests accessing the same HTTP session concurrently. I think you're going to need some sort of distributed locking for HttpSessions, otherwise people are going to come across some really funky errors when they start using this stuff. Of course, you don't need distributed locking if you require the load-balancer to give you sticky sessions.

Bill

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of marc fleury
Sent: Monday, December 24, 2001 3:26 PM
To: Sacha Labourey; [EMAIL PROTECTED]
Subject: RE: RE: [JBoss-dev] distributed HttpSessions - advice...

|OK. Will try to find some time to work on this from the 28th. In one hour, I
|enter a black hole and will only come out in 90 hours... Impressive, huh?

Thank you, Dr. Spock, thanks for listening... I also always knew you worked like a madman and that your weeks were really 90 hours long.

Julian, you don't have to wait for the 28th to get started: just code with the entity bean. It won't be colocated at first. Create an HTTPSession, do lookup/load/store, and let us bring the time on it down. It is our duty.

marcf
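The request cycle in point 1 (get the session at the start of the request, put it back at the end) can be sketched as follows. This is only an illustration under stated assumptions: `DistributedState`, `InVmState`, `SessionData` and `RequestCycle` are made-up names, and the single-VM map merely stands in for the Distributed State Service, whose real API is not shown in this thread.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class RequestCycle {
    private final DistributedState state;

    RequestCycle(DistributedState state) { this.state = state; }

    // Get the whole session at the start of the request, put it back at the end.
    void handle(String sessionId, String name, Serializable value) {
        SessionData session = (SessionData) state.get(sessionId);
        if (session == null) session = new SessionData();
        try {
            session.attributes.put(name, value); // stands in for the request's work
        } finally {
            state.set(sessionId, session);
        }
    }

    public static void main(String[] args) {
        DistributedState dss = new InVmState();
        RequestCycle cycle = new RequestCycle(dss);
        cycle.handle("abc", "cart", "book");
        cycle.handle("abc", "user", "julian");
        SessionData s = (SessionData) dss.get("abc");
        System.out.println(s.attributes);
    }
}

// Hypothetical stand-in for a distributed hashtable; NOT the real DSS API.
interface DistributedState {
    Serializable get(String key);
    void set(String key, Serializable value);
}

// Single-VM stub so the sketch runs; a real one would replicate across nodes.
class InVmState implements DistributedState {
    private final Map<String, Serializable> table = new ConcurrentHashMap<>();
    public Serializable get(String key) { return table.get(key); }
    public void set(String key, Serializable value) { table.put(key, value); }
}

// The session is one serializable object, loaded and stored as a unit.
class SessionData implements Serializable {
    private static final long serialVersionUID = 1L;
    final HashMap<String, Serializable> attributes = new HashMap<>();
}
```

Because each request writes the whole session back, two requests for the same session handled on different nodes at the same time would overwrite each other's changes, which is the concern raised in point 3.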
Re: RE: [JBoss-dev] distributed HttpSessions - advice...
| When you work with a SFSB, the client keeps a reference on the bean. This reference (stub) contains the list of target servers that own a copy of the state (generally a subset of the whole cluster members). Fine.

Hehe, "Why I love EJBs", by Marc Fleury.

What you are describing, and what Julian wanted, is the capacity to do a findByPrimaryKey(id) on the session id. When I say that the SFSB will work, I mean just the part that replicates state in VM across nodes. The persistence engine here should be a JavaGroups-based replicator; you have most of that logic built.

| Now, we have a location problem: how could the getStateForSession be performed with minimal impact (i.e. minimize network round-trip delays)? Either we use a dummy solution that will first find a bean with a given session id and then get its state, or we use a customized way. The first solution is a LEGO solution, i.e. we use what is already present and build a new concept, but it may not be really smart in terms of performance (imagine the work needed for each HTTP call!)
|
| For the customized way, we may have some kind of service (we can implement it as a SLSB, or a JMX service) that Jetty contacts to get the state of a given sessionId. As Jetty could reuse this service, only a single call to this service is needed for each HTTP call: getStateForSession. Any optimization is then done in this service. Two implementation ideas (at least):
|
| 1) Each node contains a kind of lookup table (SessionId, Targets) which holds, for a given sessionId, the list of nodes that own a copy of the state. The service then contacts, sequentially, each of these targets until a working one is found (well, most of the time, the first will be OK). In the background, the SessionState service (already implemented for SFSB) is used.

Does the following make sense? You come in with an HTTP request, you have a session. You look up the entity EJB with the primary key of the session. If you have configured the bean to be replicated on the node where you are, then the first ejbLoad would dump the state on the target node; this is transparent and simple. The entity semantic is what we need to locate an entity HttpSession that exists across the nodes. We can download the ejbLoad state and cache the state on that node; the refresh should be done with the SFSB way of propagating change to the cluster. It would be a custom persistence engine for EJB in JBoss. I think it also shows the power of EJB in JBoss and how customizable they are. We never really showed the power of externalizable interceptors and pluggable persistence per bean; this is the time and place to do it.

marcf
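Implementation idea 1 in the quoted text (a per-node lookup table from session id to targets, tried sequentially until one works) might look roughly like this. It is a sketch under stated assumptions: `StateNode`, `SessionLocator` and `register` are hypothetical names, state is reduced to a `String`, and nothing here is the real SessionState service.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

class LocatorDemo {
    public static void main(String[] args) {
        SessionLocator locator = new SessionLocator();
        StateNode dead = id -> { throw new Exception("node down"); };
        StateNode live = id -> "state-of-" + id;
        locator.register("s1", List.of(dead, live));
        System.out.println(locator.getStateForSession("s1").orElse("not found"));
    }
}

// Hypothetical handle to a node that may hold a copy of the state.
interface StateNode {
    String fetch(String sessionId) throws Exception; // throws if unreachable
}

class SessionLocator {
    // The (SessionId, Targets) lookup table kept on each node.
    private final Map<String, List<StateNode>> targets = new ConcurrentHashMap<>();

    void register(String sessionId, List<StateNode> owners) {
        targets.put(sessionId, owners);
    }

    // Contact each target in order until a working one returns the state.
    Optional<String> getStateForSession(String sessionId) {
        List<StateNode> owners = targets.getOrDefault(sessionId, List.of());
        for (StateNode node : owners) {
            try {
                String state = node.fetch(sessionId);
                if (state != null) return Optional.of(state);
            } catch (Exception unreachable) {
                // this target is not working; try the next one
            }
        }
        return Optional.empty();
    }
}
```

The sketch leaves open how the table itself is populated and kept current on every node, which is the location problem the message raises.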
Re: RE: [JBoss-dev] distributed HttpSessions - advice...
It would also show the colocation feature that you and Bill have built into the clustering of JBoss. If the bean is configured on the target Jetty node, then we colocate and take care of state synchronization under the covers. If we are not on the same node, then the logic from Jetty IS THE SAME: it looks up an entity, loads stuff from it, stores stuff to it, and it never knows whether the bean is local or not. If it is local, then it is an all-optimized stack; if it is not, then it is calls to another VM.

Even in the last case, the state IS NOT STORED TO DB. In fact the FIRST case I would show is a persistence engine that is just in VM, therefore IT DOES NOT DO ANYTHING WITH THE DB. They are sessions and they are not supposed to survive crashes. If we want persistence, then we have the bean replicated across VMs with your excellent SFSB replicator, slightly tuned to be a very simple CMP engine (one that just supports findByPrimaryKey), and we are done. I say we try to leverage the framework you have put in place, Sacha. I think we will be pleasantly surprised.

So:

1- Julian. Work with the entity, even if for now it is not optimized.

    httpSessionHome = jndi.lookup(HttpSessions);
    session = httpSessionHome.findByPrimaryKey(myID);
    create if not present. get/put stuff to it.

You don't know if that is colocated or not.

2- Sacha.

a- Create an in-VM CMP that is just a hashmap of HTTPSessions. NO DATABASE CALLS, JUST IN VM. The net result will be Jetty making calls to the one node serving HTTP sessions (it doesn't know which one, it just uses the code above). It should be reasonably fast, as you only need the network stack to serialize from one VM to the other and there is no DB work involved AT ALL. Can you create that CMP engine? The only finder should be the find by primary key.

b- When that is working and he is working with it, you can get really fancy and say you have a replicated EJB HTTPSESSIONS on your own node. We populate the cache as the load balancers send requests your way (they can be really dumb), and we use ejbLoad to bring the state here and register with JavaGroups for multicast notifications of state change. The colocation will be done automatically with the stuff that you and Bill put in the proxies, etc.

Does this make sense? Can you indulge me and try this?
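As a rough illustration of steps 1 and 2a, the sketch below mimics the finder semantic (find by primary key, create if not present) on top of a plain in-VM hashmap, with no database involved. It deliberately avoids the EJB API so that it runs on its own: `InVmSessionHome`, `HttpSessionState`, `SessionNotFoundException` and `JettySide` are hypothetical names, not JBoss classes, and a real CMP engine would plug into the container instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class JettySide {
    // What the caller does: find by primary key, create if not present.
    static HttpSessionState findOrCreate(InVmSessionHome home, String id) {
        try {
            return home.findByPrimaryKey(id);
        } catch (SessionNotFoundException e) {
            return home.create(id);
        }
    }

    public static void main(String[] args) {
        InVmSessionHome home = new InVmSessionHome();
        HttpSessionState s = findOrCreate(home, "myID");
        s.attributes.put("user", "julian");
        System.out.println(findOrCreate(home, "myID").attributes.get("user"));
    }
}

class SessionNotFoundException extends Exception {
    SessionNotFoundException(String id) { super("no session " + id); }
}

class HttpSessionState {
    final String id;
    final Map<String, Object> attributes = new ConcurrentHashMap<>();
    HttpSessionState(String id) { this.id = id; }
}

// Plays the role of the entity home; the only finder is by primary key.
class InVmSessionHome {
    // The whole "persistence engine": a hashmap, no database calls.
    private final Map<String, HttpSessionState> store = new ConcurrentHashMap<>();

    HttpSessionState findByPrimaryKey(String id) throws SessionNotFoundException {
        HttpSessionState s = store.get(id);
        if (s == null) throw new SessionNotFoundException(id);
        return s;
    }

    HttpSessionState create(String id) {
        HttpSessionState fresh = new HttpSessionState(id);
        HttpSessionState existing = store.putIfAbsent(id, fresh);
        return existing != null ? existing : fresh;
    }
}
```

The caller only ever sees `findByPrimaryKey` and `create`, so whether the store sits in the same VM or behind a remote call could change without touching the calling code.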
RE: RE: [JBoss-dev] distributed HttpSessions - advice...
| 2- Sacha.
|
| a- Create an in-VM CMP that is just a hashmap of HTTPSessions. NO DATABASE CALLS, JUST IN VM. The net result will be Jetty making calls to the one node serving HTTP sessions (it doesn't know which one, it just uses the code above). It should be reasonably fast, as you only need the network stack to serialize from one VM to the other and there is no DB work involved AT ALL. Can you create that CMP engine? The only finder should be the find by primary key.
|
| b- When that is working and he is working with it, you can get really fancy and say you have a replicated EJB HTTPSESSIONS on your own node. We populate the cache as the load balancers send requests your way (they can be really dumb), and we use ejbLoad to bring the state here and register with JavaGroups for multicast notifications of state change. The colocation will be done automatically with the stuff that you and Bill put in the proxies, etc.
|
| Does this make sense? Can you indulge me and try this?

Marc,

Yes, I can try this (and anyway, I will have to: you have been proposing this since September... I can't make this last longer... ;) )

While very elegant, the problem I have with this solution concerns scalability.

With the current SFSB code, if you have 10 nodes, state is not owned by all 10 nodes. Instead, we create sub-partitions. For example, we would have 5 sub-partitions, each containing 2 nodes, and each node would contain its own state plus the backup of the other node in its sub-partition. Thus, as the number of nodes grows, the load stays linear and we do not end up with each node having to store in-VM the state of 10 nodes! This implies that when you work with a particular SFSB, its stub *knows* which nodes are the members of the sub-partition and will not ask node 7 for its state if only nodes 1 and 2 own it.

In the case you describe, the findByPrimaryKey call can be run on any node. Thus the location (read: location, not colocation) problem I described in my previous post: how to make the findByPrimaryKey call smart enough to address the call to a good node of the cluster for a particular sessionId.

As for caching, optimisations, ... these would be possible only when working with a well-parameterized hard load-balancer that always directs calls to the same node: the one that owns a copy of the state. In this case, no network call is necessary to get the state. But if the calls are made on a bad node, each time an invocation comes in, the persistence engine will need to contact a good node to get the state.

Otherwise, we need to implement a distributed-cache engine. Currently, this distributed-cache (hit-miss) engine is only available *inside* a sub-partition.

You follow me?

Cheers,

Sacha
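The scalability argument is simple arithmetic, worked through below. The figure of 1,000 sessions originating on each node is an assumption chosen for illustration only, and `ReplicationLoad` is a made-up name; the 10 nodes and the sub-partitions of 2 come from the message above.

```java
class ReplicationLoad {
    // Sessions held in memory by one node when every node originates `perNode`
    // sessions and each session is copied to all members of a replication
    // group of `groupSize` nodes.
    static long sessionsHeldPerNode(long perNode, int groupSize) {
        return perNode * groupSize;
    }

    public static void main(String[] args) {
        long perNode = 1_000; // assumed figure, for illustration only
        int nodes = 10;

        // Full replication: the group is the whole cluster.
        System.out.println("full replication:    "
                + sessionsHeldPerNode(perNode, nodes));

        // 5 sub-partitions of 2: own state plus the partner's backup.
        System.out.println("sub-partitions of 2: "
                + sessionsHeldPerNode(perNode, 2));
    }
}
```

With full replication the per-node footprint grows with the cluster size, while with fixed-size sub-partitions it stays constant as nodes are added.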
RE: RE: [JBoss-dev] distributed HttpSessions - advice...
... we could imagine having two layers...

1) the persistence layer: we keep it as for SFSB with sub-partitions, ...

2) the cache layer: here we implement a new, very simple, distributed cache.

The first time a findByPrimaryKey occurs, the cache will not contain the data, and a costly find will determine who owns the state for the particular sessionId. The result is cached in the distributed cache.

For the next invocations, if the call is performed on the same node (thanks to the nice load-balancers), the cache will use the local value. If the data is modified on another node, the cache is invalidated through multicast.

So we have two layers: the first to obtain scalability under high load, the second to obtain performance, because we don't know who really owns a state.

Little Father Christmas...

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Sacha Labourey
Sent: Monday, 24 December 2001 17:35
To: marc fleury; [EMAIL PROTECTED]
Subject: RE: RE: [JBoss-dev] distributed HttpSessions - advice...
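The two-layer idea (a per-node cache in front of a persistence layer, with invalidation when another node modifies the data) can be sketched as below. All names are hypothetical, session state is reduced to a `String`, and `InvalidationBus` is a plain in-process loop standing in for multicast; it is not JavaGroups code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TwoLayerDemo {
    public static void main(String[] args) {
        PersistenceLayer p = new MapPersistence();
        InvalidationBus bus = new InvalidationBus();
        NodeCache n1 = new NodeCache(p, bus);
        NodeCache n2 = new NodeCache(p, bus);
        n1.update("s", "v1");
        n2.find("s");          // miss: costly find, then cached on n2
        n2.find("s");          // hit: served from n2's local cache
        n1.update("s", "v2");  // invalidates n2's copy
        System.out.println(n2.find("s") + " misses=" + n2.misses);
    }
}

// Layer 1 (hypothetical): whatever owns the authoritative state.
interface PersistenceLayer {
    String load(String sessionId);              // the "costly find"
    void store(String sessionId, String state);
}

class MapPersistence implements PersistenceLayer {
    private final Map<String, String> data = new ConcurrentHashMap<>();
    public String load(String id) { return data.get(id); }
    public void store(String id, String state) { data.put(id, state); }
}

// Stand-in for multicast: tells every other node cache to drop its copy.
class InvalidationBus {
    private final List<NodeCache> members = new ArrayList<>();
    void join(NodeCache c) { members.add(c); }
    void invalidate(String sessionId, NodeCache origin) {
        for (NodeCache c : members) {
            if (c != origin) c.evict(sessionId);
        }
    }
}

// Layer 2: per-node cache in front of the persistence layer.
class NodeCache {
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final PersistenceLayer backing;
    private final InvalidationBus bus;
    int misses = 0; // counted only to make the demo observable

    NodeCache(PersistenceLayer backing, InvalidationBus bus) {
        this.backing = backing;
        this.bus = bus;
        bus.join(this);
    }

    String find(String sessionId) {
        String s = local.get(sessionId);
        if (s == null) {
            misses++;
            s = backing.load(sessionId);
            if (s != null) local.put(sessionId, s);
        }
        return s;
    }

    void update(String sessionId, String state) {
        backing.store(sessionId, state);
        local.put(sessionId, state);
        bus.invalidate(sessionId, this);
    }

    void evict(String sessionId) { local.remove(sessionId); }
}
```

Repeated calls landing on the same node are served locally, and a write elsewhere costs that node one extra lookup on its next call.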
RE: RE: [JBoss-dev] distributed HttpSessions - advice...
Sacha,

I think most usages will use 2-3 nodes of JBoss, simply replicated with the same beans everywhere. No complex stuff, no sub-partitions, etc. Only when we grow into super clusters of 10 machines do we start gaining from a more advanced design. There is a very simple design, the one here, that we can do in no time. It also has no database impact (we don't work with the database). And I seriously think we can go far with such a simple design: there is nothing Julian needs to know on the entity side except the findSession (findByPrimaryKey), and that's it. Clustered nodes.

|Marc,
|
|Yes, I can try this (and anyway, I will have to: you propose this since
|September... I can't make this last longer... ;) )

Yes, I propose this over and over, but it is so simple a design that we are doing nothing that we don't do already. We need a hashmap for a CMP engine (how trivial is that), and DONE. When we move to SFSB state replication we can start worrying about sub-partitions, and then you will have to worry about load balancers being smart and all, when we can instead say: look, configure your cluster to have 5 nodes fully replicated, in fact do the admin from the central web server, and you are done... This is simple and we probably won't run into the upper limit of this system, where the sub-partitions start making a difference. I really think so, and it is so thin that it would be silly not to see it work or crash.

|While very elegant, the problem I have with this solution concerns
|scalability.

Very high-end scalability.

|With the current SFSB code, if you have 10 nodes, state is not owned by
|all 10 nodes. Instead, we create sub-partitions. For example, we would have
|5 sub-partitions, each containing 2 nodes, and each node would contain its
|own state + the backup of the other node in its same sub-partition. Thus,
|while the number of nodes grows, the load is totally linear and we do not end
|up with each node having to store in-VM the state of 10 nodes!

But you only have the beans on pinned nodes, if I understand your description correctly (I could be wrong). Here we need the beans on all nodes in the Jetty cluster, hence the simple solution: just replicate the configuration across your nodes and have mirror images running. Sub-partitioning is fancy, but only in particular cases do I see it being useful (I could be wrong). I don't need 5x2, I need 10.

|This implies
|that when you work with a particular SFSB, its stub *knows* which are the
|members of the sub-partition and will not ask for its state to node 7, if
|only node 1 and 2 own its state.

Here the persistence engine would need to be rewritten to ejbLoad locally, and then you follow the bean. If you don't want to replicate too many copies, then the architecture would be the pyramid:

    main servers for HTTPSESSION: 3 servers running mirror images
    edge servers for JETTY: 10 servers pinging the mirrors
    web: millions and millions and billions

You have 3=10 scaling, which in my experience of SAP takes care of even alien load. This is a standard SAP R/3 configuration... don't come with scalability issues. Here we have serialization between the Jetty and JBoss sessions, as they are on different nodes. It can still be a very fancy install, with no need to sub-partition or anything; it is an admin issue and the code is simple, not trivial but simple (SFSB like in VM on the t

Another solution (still simple to code) would be main server = Jetty server, and we have 5 servers. The Jetty stuff has the colocated bean, and your fancy colocation proxy logic, *which you already have*, determines to call the local bean. It is super fast as there is no network, and the pyramid I described above is really a circle of mirror instances with FULL STACK.

Both would work really nicely, and I don't know that we need anything more complex than this simple EJB structure. You have done the work on EJB clustering; time to see it pay off. This is it.

|In the case you describe, the findByPrimaryKey call can be run on any node.
|Thus, the location (read location not colocation) problem I've described
|in my previous post: how to make the findByPrimaryKey call nice enough to
|address the call to a good node of the cluster for a particular sessionId.

We are talking about the clustered colocated case. The findByPrimaryKey is running on the home of the colocated bean. If the cache has that bean, then you return immediately; if it doesn't, then ejbLoad knows to pull the initial state for that session and register the listeners for that session. If the session needs exclusive access, then we just provide the locking mechanisms, and that is configurable.

|As for caching, optimisations, ... it would be possible only when working
|with a well-parameterized hard load-balancer that would always direct calls
|to the same node: the one that owns a copy of the state.

What I am thinking is that load balancers are effective when stateless. If we use mirror clusters, completely symmetric, you don't know or need topology information on the cluster.
RE: RE: [JBoss-dev] distributed HttpSessions - advice...
|we could imagine having two layers...
|
|1) the persistence layer: we keep it as for SFSB with sub-partitions, ...
|
|2) the cache layer: here we implement a new, very simple,
|distributed cache.

OK, but I still think that making the cluster fully symmetrical will take us a good distance and will keep the coding and *administration* so simple that it would be silly not to do that in the first iteration.

|The first time a findByPrimaryKey occurs, the cache will not contain the
|data and a costly find will determine who owns the state for the
|particular sessionId. The result is cached in the distributed cache.

Yes, that cache is the EJB cache for the entity, but the cache implementation is really a distributed in-VM CMP.

|For the next invocations, if the call is performed on the same node (thanks
|to the nice load-balancers), the cache will use the local value.

I see, you would have the load balancers be smart enough to know that a given session is living on a node. Even if we don't assume this for now, it would work. I am pretty sure that we can go the distance with 5 mirror nodes in a circle or in the pyramid: the circle has no network overhead, while the pyramid minimizes state synchro without limiting front-end machines. My gut feeling is that the circle is the way to go; it is simple to code and simple to administer (fully symmetric), plus the pyramid is what SAP did.

|If the data
|is modified on another node, the cache is invalidated through multicast.

For example, that is simple. Simple locking would also be useful; it must be optional.

|Little Father Christmas...

Yeah, well, you're the little Father Christmas... Listen, just the CMP hashmap that does NOTHING at all in the DB would be a great plus for Jetty. We are talking pyramid, with a CMP engine and the rest on top. Then in-VM cluster and notifications, and we have a circle... Worth testing, kid: it can go far, for cheap, and it will be simple and robust. Simple, robust, cheap: I like it, that's our domain, mass technology... Go, my para, go.

marcf
RE: RE: [JBoss-dev] distributed HttpSessions - advice...
| you have 3=10 scaling which in my experience of SAP takes care of even alien load.

Alien load, he he he... :)

| I think so and it would be pretty, correct me if I am wrong but the initial design is so simple that we could be done with it in a week. Only exclusive access to the HttpSession would require more work but even then it could be simple.

OK. Will try to find some time to work on this from the 28th. In one hour, I enter a black hole and will only come out in 90 hours... Impressive, huh?

Dr. Spock
RE: RE: [JBoss-dev] distributed HttpSessions - advice...
|OK. Will try to find some time to work on this from the 28th. In one hour, I
|enter a black hole and will only come out in 90 hours... Impressive, huh?

Thank you, Dr. Spock, thanks for listening... I also always knew you worked like a madman and that your weeks were really 90 hours long.

Julian, you don't have to wait for the 28th to get started: just code with the entity bean. It won't be colocated at first. Create an HTTPSession, do lookup/load/store, and let us bring the time on it down. It is our duty.

marcf