Re: Who wants my Cassandra session manager for Tomcat?

Pid * Thu, 05 Apr 2012 00:47:07 -0700

On 4 Apr 2012, at 17:19, Morten Jorgensen
<morten.jorgen...@openjawtech.com> wrote:


> Thanks again for your comments. More replies below:
>>>> That's interesting.  Can you share some details about how it works?
>>> Sure. It is quite simple. Cassandra is effectively a multi-level
>>> distributed hash-map, so it lends itself very well to storing session
>>> attributes.
>>>
>>> The session manager maintains two column families (like tables), one to
>>> hold session meta-data such as the last access timestamp, etc. and one
>>> column family to hold session attributes. Storing or reading a session
>>> attribute is simply a matter of writing it using the session ID as the
>>> row ID, and the session attribute name as the column name, and the
>>> session attribute value as the column value.
>>>
>>> Session attributes are read and written independently, so the entire web
>>> session does not have to be loaded into memory - only the session
>>> attributes that are actually required to service a request are read.
>>> This greatly reduces the memory footprint of the web applications that I
>>> am developing for my employer.
>> I'd be concerned about how chatty that was.
>>
>> Devil's advocate question: why store data in the session if it's not needed?
> Good question. For large web applications, and particularly web-based UIs
> with multiple user screens, you would have certain data in your session for
> the various screens/pages.

Not if I could avoid it, I wouldn't. I might have user data or refs
that I need for each page, but everything else goes in request scope.


> Not all pages need _all_ data in your session, and since the session manager
> loads session attributes only when the web app code asks for it, only the
> data that is required for the current page is loaded from Cassandra.
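
To make the trade-off concrete, here is a minimal sketch of the per-attribute
layout described above, with an in-memory map standing in for Cassandra.
AttributeStore and LazySession are hypothetical names, not classes from
Morten's manager or from Tomcat. The read counter shows both sides: only the
requested attribute is fetched, but every access is a separate round trip.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the attribute column family:
 *  row key = session ID, column name = attribute name. */
class AttributeStore {
    private final Map<String, Map<String, Object>> rows = new HashMap<>();
    private int reads = 0; // simulated round trips to the store

    void write(String sessionId, String name, Object value) {
        rows.computeIfAbsent(sessionId, id -> new HashMap<>()).put(name, value);
    }

    Object read(String sessionId, String name) {
        reads++;
        Map<String, Object> row = rows.get(sessionId);
        return row == null ? null : row.get(name);
    }

    int reads() {
        return reads;
    }
}

/** Session facade that keeps no attribute state of its own;
 *  every get and set goes straight to the store. */
class LazySession {
    private final String id;
    private final AttributeStore store;

    LazySession(String id, AttributeStore store) {
        this.id = id;
        this.store = store;
    }

    Object getAttribute(String name) {
        return store.read(id, name);
    }

    void setAttribute(String name, Object value) {
        store.write(id, name, value);
    }
}
```

Reading the same attribute twice in one request costs two reads here, which
is the chattiness concern in a nutshell.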
>>
>>> For improved performance I have added a write-through and a write-back
>>> cache, implemented as servlet filters. The cache is flushed or written
>>> back once the current request has finished processing. I am sure there
>>> is room for improvement here, as multiple concurrent requests for the
>>> same session should be served using the same cache instance.
>> But... (more devil's advocating, sorry) while this should address the
>> chattiness* problem, doesn't it mean that your solution is invasive and
>> can't be really deployed without modifying an app?
> The session manager works without this cache, but is slow. The cache is
> configured as a filter in web.xml. The code of a web app won't have to be
> changed, but you need to update your web.xml to use the session manager
> effectively.

Adding a Filter means modifying the app. </pedant>


>> * is that even a word?
> Yes it is, according to dictionary.com
>>
>>> The Manager does not maintain any references to Session instances at
>>> all, allowing them to be garbage collected at any time. This makes
>>> things very simple, as Cassandra holds all session state, and the
>>> session managers in my Tomcat nodes only act as a cache in front of
>>> Cassandra.
>>>
>>> The nature of Cassandra and Tomcat's implementation of web sessions
>>> go together extremely well. I am surprised that nothing like this exists
>>> already. It is a square hole, square peg sort of scenario.
>> I'm not entirely sure I agree.
>>
>> Cassandra trades off consistency for availability and partition
>> tolerance, whereas I'd suggest a session management solution would want
>> to trade partition tolerance for consistency and availability.
>>
>> I'm also not sure that the comparison between column store and session
>> attribute map stands up beyond the initial/apparent similarity between
>> data type.
>>
>> Cassandra is write-optimised and hits disk (on at least two nodes for
>> HA) for every write AFAIK.
> Cassandra allows you to choose your consistency level. I use a quorum write,
> which writes to (N/2)+1 Cassandra nodes, where the Cassandra ring contains N
> nodes.
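
For reference, the quorum arithmetic is small enough to write down. This is
only a sketch; QuorumMath is a made-up helper, not a Cassandra API. As far as
I know, Cassandra counts QUORUM against the keyspace's replication factor
rather than the total number of nodes in the ring, so the parameter is named
replicas here.

```java
/** Hypothetical helper illustrating the (N/2)+1 quorum rule. */
final class QuorumMath {
    private QuorumMath() {
    }

    /** Replicas that must acknowledge a quorum operation: floor(n / 2) + 1. */
    static int quorum(int replicas) {
        if (replicas < 1) {
            throw new IllegalArgumentException("replicas must be >= 1");
        }
        return replicas / 2 + 1; // integer division floors for positive values
    }

    /** Replicas that can be unavailable while quorum operations still succeed. */
    static int tolerated(int replicas) {
        return replicas - quorum(replicas);
    }
}
```

With three replicas a quorum write waits for two acknowledgements and
survives one replica being down; going from three to four replicas raises the
quorum to three without tolerating any more failures.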

But as you say, you've discovered why this is slow for a webapp, and
you have to add a cache to each request to fix it.

I'd suggest you'd be better off just loading data into the request
scope directly, rather than indirectly.
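
A rough sketch of the per-request write-back behaviour under discussion,
assuming a store that can load one attribute and save a batch of changes.
BackingStore, InMemoryStore and RequestCache are hypothetical names used for
illustration only; the real filter and manager may work quite differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical backing store; in this thread it would be Cassandra. */
interface BackingStore {
    Object load(String name);

    void save(Map<String, Object> changes);
}

/** In-memory stand-in that counts round trips. */
class InMemoryStore implements BackingStore {
    final Map<String, Object> data = new HashMap<>();
    int loads = 0;
    int saves = 0;

    @Override
    public Object load(String name) {
        loads++;
        return data.get(name);
    }

    @Override
    public void save(Map<String, Object> changes) {
        saves++;
        data.putAll(changes);
    }
}

/** Per-request write-back cache: each attribute is loaded at most once,
 *  and all writes go to the store in a single batch on flush(). */
class RequestCache {
    private final BackingStore store;
    private final Map<String, Object> loaded = new HashMap<>();
    private final Map<String, Object> dirty = new HashMap<>();

    RequestCache(BackingStore store) {
        this.store = store;
    }

    Object get(String name) {
        if (dirty.containsKey(name)) {
            return dirty.get(name); // pending write wins over stored value
        }
        if (!loaded.containsKey(name)) {
            loaded.put(name, store.load(name)); // caches null results too
        }
        return loaded.get(name);
    }

    void put(String name, Object value) {
        dirty.put(name, value); // nothing is sent until flush()
    }

    /** Call once the request has finished processing. */
    void flush() {
        if (dirty.isEmpty()) {
            return;
        }
        store.save(new HashMap<>(dirty));
        loaded.putAll(dirty);
        dirty.clear();
    }
}
```

Note that this sketch does nothing about two concurrent requests for the same
session, each with its own cache instance; the last flush simply wins.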


> I think this makes sense for web session data, and my current implementation
> has this consistency-level hard-coded. I think it would probably make sense to
> allow this to be configured.
>>
>>> I also have an implementation of the Map interface that stores the
>>> values of each entry as a session attribute. The way many developers
>>> write web applications is to have a "session bean" (a session attribute)
>>> that contains a Map that maintains the actual session attributes. This
>>> is OK if the entire session is persisted as a whole, but it won't
>>> perform very well with the Cassandra session manager (or the Delta
>>> Session Manager from what I understand). A developer can replace their
>>> session bean's HashMap with the SessionMap utility, and the session
>>> attributes will be treated as proper session attributes by the session
>>> manager.
>> Is there not a way to do this internally and therefore transparently to
>> the developer?  Otherwise you're introducing more dependencies and
>> creating more of a framework than a pluggable manager.
> I don't think there is a clean way of doing this without overriding the
> default Map implementations of the JVM. But, I think storing session data as
> individual session attributes rather than large object hierarchies is good
> (but not common) programming practice. It allows the session
> container/manager to manage read/write operations of the session attributes
> separately. This practice should benefit not only my Cassandra session
> manager but also the existing Delta manager.


>>>> 1. Be relatively self-contained -- i.e. not require much in the way of
>>>> changes to existing classes
>>> There are no changes to existing classes. My session manager implements
>>> the existing org.apache.catalina.Manager interface.
>> Instead of the filter, could you use a Valve?
> For the cache? The main reason why I use a filter is to be able to tie a cache
> object to a thread-local variable for the period for which the request is
> being processed. As soon as the response is streamed to the client the cache
> is released. If Tomcat already contains some internal reference to the
> current request then I won't need to use a filter in this manner.

It must do, right?

A Valve is similar to a Filter but has access to the internal
representations of the request/session so would mean you don't need to
interfere with the app.
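
Whichever component wraps the request, the thread-local binding itself looks
roughly like this. RequestScopedCache is a hypothetical helper that uses only
java.lang.ThreadLocal; in a Filter or a Valve the Runnable would be replaced
by the call that passes the request down the chain.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: binds a fresh cache to the current thread for the
 *  duration of one request and always unbinds it afterwards. */
final class RequestScopedCache {
    private static final ThreadLocal<Map<String, Object>> CURRENT = new ThreadLocal<>();

    private RequestScopedCache() {
    }

    /** Runs one request with a cache bound to the calling thread. */
    static void runWithCache(Runnable request) {
        CURRENT.set(new HashMap<>());
        try {
            request.run();
        } finally {
            // Worker threads are pooled, so a cache left bound here would
            // leak into the next request served by the same thread.
            CURRENT.remove();
        }
    }

    /** Returns the cache for the current request, or null outside a request. */
    static Map<String, Object> current() {
        return CURRENT.get();
    }
}
```

The try/finally is the part that matters: the cache is released even if the
request throws.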

> I am not a fan of thread-local variables,
> so I'd very much like to remove the dependency on having this filter in place.
>
> Morten
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: dev-h...@tomcat.apache.org
>

