The typical approach to scaling out the web tier of Java apps has been to put a load balancer with sticky sessions in front and replicate sessions between the backing nodes. Unfortunately this doesn't scale well, because at some point the coordination across nodes adds too much overhead. I think that when most people hit that point they drop session replication and rely on sticky sessions alone, which means either keeping a server running forever or just letting customers lose their sessions (I see this happen all the time on the united.com website).

A better approach is to move to a stateless web tier. Most of the new Java web frameworks take this approach (Play Framework and many of the new Scala web frameworks). That makes it so sticky sessions and session replication aren't needed. If there is cross-request state required then there are a few options for where to put it:
- On the client if it's UI state (Ajax / HTML5 are moving us in this direction)
- Into an external RDBMS or NoSQL DB if it needs to be durable
- Into an external cache system (like Memcached) if it doesn't need to be durable
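
To make the external-store options a bit more concrete, here is a minimal Java sketch of the idea. Everything in it is made up for illustration: SessionStore, InMemorySessionStore, WebNode and the "cart" key are hypothetical names, and the in-memory map only stands in for a real external system such as Memcached or a database. The point is just that the web nodes keep no per-user state, so any node can handle any request.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction over an external store (cache, NoSQL DB, RDBMS).
interface SessionStore {
    Optional<String> get(String sessionId, String key);
    void put(String sessionId, String key, String value);
}

// In-memory stand-in so the sketch runs on its own. In a real deployment
// this would be a client for a system living outside the web nodes.
class InMemorySessionStore implements SessionStore {
    private final Map<String, Map<String, String>> data = new ConcurrentHashMap<>();

    @Override
    public Optional<String> get(String sessionId, String key) {
        Map<String, String> session = data.get(sessionId);
        return session == null ? Optional.empty() : Optional.ofNullable(session.get(key));
    }

    @Override
    public void put(String sessionId, String key, String value) {
        data.computeIfAbsent(sessionId, id -> new ConcurrentHashMap<>()).put(key, value);
    }
}

// A web node holds no per-user state, only a reference to the shared store.
class WebNode {
    private final String name;
    private final SessionStore store;

    WebNode(String name, SessionStore store) {
        this.name = name;
        this.store = store;
    }

    // Appends an item to the cart kept in the store and reports which node did it.
    // Note: this read-then-write is not atomic; a real store would need its own
    // concurrency control (e.g. compare-and-set) for concurrent requests.
    String addToCart(String sessionId, String item) {
        String cart = store.get(sessionId, "cart")
                .map(existing -> existing + "," + item)
                .orElse(item);
        store.put(sessionId, "cart", cart);
        return name + " -> cart=" + cart;
    }
}

public class StatelessTierDemo {
    public static void main(String[] args) {
        SessionStore shared = new InMemorySessionStore();
        WebNode nodeA = new WebNode("node-a", shared);
        WebNode nodeB = new WebNode("node-b", shared);

        // The session ID would normally travel in a cookie; no stickiness needed.
        String sessionId = UUID.randomUUID().toString();
        System.out.println(nodeA.addToCart(sessionId, "book"));
        System.out.println(nodeB.addToCart(sessionId, "pen"));
    }
}
```

The second request lands on a different node and still sees the cart, because the state lives in the store rather than in either node's memory.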

With a stateless web tier you can simply add (or remove) web nodes when needed. This is the model we are promoting for apps running on Heroku.

BTW: I recently published a blog post on how to transparently move web app session state to an external MongoDB system:
http://www.jamesward.com/2011/11/30/using-mongodb-for-a-java-web-apps-httpsession

In that example I used Jetty since it has a pluggable session manager. But you can also do the same thing with Tomcat.

-James


On 01/26/2012 06:03 AM, Carl Jokl wrote:
This is hopefully a quick question just to rationalise my
preconceptions about being able to scale up enterprise applications to
high numbers of cores and possibly clusters of computers.

My understanding is that this is something which has been successfully
achieved using JavaEE and application servers. I also understand that
the application server is implemented to allow itself to run in a
clustered way, i.e. scale over more than one machine. In order to take
advantage of this I understand that applications running in this
environment are required to function within a number of constraints to
enable the application server to scale or cluster the code.

I appreciate also that just throwing a system on an application server
will not automatically cause it to scale well and a lot of care must
be taken to eliminate or mitigate synchronisation bottlenecks and
manage the amount of shared state / data that the application servers
have to keep synchronised between them.

At this point this is still all theoretical knowledge as I have never
had the opportunity to work with a system that operates in this way.

Does Spring support this kind of clustering or is it really only EJB
that is designed for this?

Also I understand that other platforms, particularly C++ can also be
used for highly scalable systems. My understanding here though is that
it is usually a matter of building everything from scratch as C++
isn't really geared towards a general purpose application server
model. I would also assume a similar situation with Objective-C.

Even for Java, trying to build a highly scalable server system that
can be run on clusters of servers is a major undertaking if trying to
do it from scratch rather than using an existing application server.

Also, due to the number of companies that have run and do run highly
scalable systems on application servers, individuals with expertise in
this area, documentation, and real example applications using this
technology are more readily available than for other technologies.

Trying to implement this kind of system from scratch even in Java (or
another JVM language) and in particular using lower level languages
like C++ is such an undertaking that it would be considered foolhardy
unless backed by a major company with deep pockets and lots of time as
well as by a team of very highly skilled people. For example Amazon is
able to use C++ in their highly scaled architecture but Amazon is a
very big company with a lot of resources behind it.

I just want to get some feedback as to whether my current perceptions
are fair or whether I am wrong about any of these assumptions. It
might be a bit of a moot point as I am not sure if a discussion about
this would come up for me but if it does I would like to feel prepared
with well informed and fair views on the matter.


