-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kyle,

On 9/7/12 12:19 PM, kharp...@oreillyauto.com wrote:
> Chris:
>> Assembling the sessions into a Collection is likely to be very
>> fast, since it's just copying references around: the size of the
>> individual sessions should not matter. Of course, pushing all
>> those bytes to the other servers...
>
>> Perhaps Tomcat does something like serialize the session to a
>> big binary structure and then sends that (which sounds insane --
>> streaming binary data is how that should be done -- but I haven't
>> checked the code to be sure).
>
> It appears that tomcat is serializing all the data into a singular
> structure, rather than a collection of references. :(
> Watching VisualVM plot heap usage during replication, it nearly
> doubles (in my test env, this was the only thing running so that
> makes sense).

That certainly sounds like Tomcat is chewing through a lot of heap
space. Without understanding the implementation, I can't comment on
whether or not that is really necessary, but I have to imagine that
streaming serialization (or at least buffered, where one session is
serialized at a time and then streamed) would be superior.

> If you're sure Tomcat is only making references, then I'd propose
> there is a problem with the JVM dereferencing the collection
> elements and double-counting the memory used.

It's very unlikely that the JVM is making that kind of mistake. Also,
I haven't looked at a single line of Tomcat's session-distribution
code, so I'm not in a position to make accurate assertions about its
implementation.

> Either way, it's enough to make the JVM report a doubling of heap
> usage and a raise to the heap allocation. As soon as replication
> is done, heap use goes back to normal. I've attached a screenshot
> to the zip file.

Sounds like your analysis is reasonable. I'll look at your data and
make further comments.

In the meantime, I've heard over the years one particular thing from
people I feel know what they are talking about: don't use HttpSession.
It's not that the servlet spec's session management isn't useful; it's
that it is very hard to make it scale properly when using most
container-managed clustering implementations. First, it almost always
uses Java Serialization, which is not terribly efficient. Second, it is
very coarse-grained: you replicate the entire session or the whole
thing fails (unless you use some vendor-specific tricks like Rainer's
suggestion of suppressing certain attributes). If you have a lot of
data in the session that doesn't *need* to be replicated (like caches,
etc.) then you might be wasting a lot of time.

So, what to use instead? Memcached comes up often as a similar
solution to the problem of clustered temporary data storage.

Why memcached instead of HttpSession clustering? I think about half of
the answer is that when you have to manually push your own data to a
remote cache (instead of allowing the container to handle the
details), you tend to get very frugal with the amount of data you end
up pushing.

We don't use distributed sessions on our product even though (almost)
everything in the session is serializable. Even so, I've thought of
writing a wrapper for certain objects (like caches) that we store in
the session and that never need to be replicated (at least not in
their complete forms). I haven't done it because, well, we don't need
it. Sticky sessions with failover is good enough for us, and we don't
have to saturate our network with session-replication chatter.

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.17 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iEYEARECAAYFAlBKN9oACgkQ9CaO5/Lv0PB3LACgsrVWsuWWkb0ckfIPeiNUMoq4
8fcAoIb0FQU/2EsET1AmIHGkX20si4lG
=xKjd
-----END PGP SIGNATURE-----
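
P.S. To make the wrapper idea above a bit more concrete, here is a
minimal sketch, not something we actually run. It assumes the
container replicates session attributes with plain Java serialization,
so a transient field is simply skipped when the session is copied. The
class name NonReplicatedCache and the roundTrip helper are made up for
illustration; neither is part of Tomcat or the servlet API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical wrapper: it is Serializable, so it can sit in a
 * replicated session, but the cached data itself is transient and
 * therefore never leaves the local JVM.
 */
class NonReplicatedCache<K, V> implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: skipped by Java serialization, so a deserialized
    // copy starts out with this field set to null.
    private transient Map<K, V> entries;

    // Lazily (re)creates the map, both on first use and after the
    // wrapper has been deserialized on another node.
    private synchronized Map<K, V> entries() {
        if (entries == null) {
            entries = new ConcurrentHashMap<>();
        }
        return entries;
    }

    public V get(K key) { return entries().get(key); }

    public void put(K key, V value) { entries().put(key, value); }

    public int size() { return entries().size(); }

    /** Serializes and deserializes obj, roughly what replication does. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        NonReplicatedCache<String, String> local = new NonReplicatedCache<>();
        local.put("user:42", "some expensive-to-compute value");

        NonReplicatedCache<String, String> replica = roundTrip(local);

        System.out.println(local.size());   // prints 1
        System.out.println(replica.size()); // prints 0
    }
}
```

You would store the wrapper with session.setAttribute(...) as usual.
After a failover, the other node sees an empty cache and repopulates
it on demand, so this only makes sense for data that is cheap enough
to rebuild.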