> Couple of queries,
> 1. Each client getting its own queue will be costly from a
> management/implementation perspective, like 120K queues maintained and
> each one taking care of its registered listener(s)
> 2. Was there any noticeable delay in processing with this architecture?
> 3. Will this work for protocols like SMPP, which send a message and
> expect a response with an ID? (Maybe it could work, it's just that
> the client needs to be a bit smarter.)
>
> thanks
> ashish
1.) The way Terracotta handles clustering of JVMs helps mitigate this cost.
The Terracotta server manages the objects that have been "shared" by the
various JVMs in the cluster. Once an object is shared, it can actually be
flushed out of the JVMs and stored only on the Terracotta server, and the
Terracotta server itself can page those objects out to disk. So, even if a
particular MINA system has 20,000 client connections with a queue for each, a
lot of those queues can be paged out of memory as long as they are inactive,
and the ones that are needed are paged back into the JVM as necessary.
(Terracotta does various things to minimize traffic, such as sending diffs of
objects, and uses caching strategies to try to keep frequently hit objects in
the JVM.) So, this keeps memory usage due to the queues to a minimum.
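To make that concrete, here is a rough sketch (class and method names are my
own invention, not our actual code) of the kind of shared structure involved:
a map of per-client queues that would be declared as a Terracotta root in
tc-config.xml, so the whole graph is clustered and idle queues can be flushed
out of the local heap.

    // Hypothetical sketch. The QUEUES map would be declared as a root in
    // tc-config.xml, so Terracotta shares it across JVMs and can page idle
    // entries out to the Terracotta server (and from there to disk).
    import java.util.Map;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.LinkedBlockingQueue;

    public class ClientQueueRegistry {
        // Terracotta root: one queue per connected client id
        private static final Map<String, BlockingQueue<Object>> QUEUES =
            new ConcurrentHashMap<String, BlockingQueue<Object>>();

        public static BlockingQueue<Object> queueFor(String clientId) {
            synchronized (QUEUES) {
                BlockingQueue<Object> q = QUEUES.get(clientId);
                if (q == null) {
                    q = new LinkedBlockingQueue<Object>();
                    QUEUES.put(clientId, q);
                }
                return q;
            }
        }
    }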
Second, Terracotta works with the standard synchronization primitives you're
used to, and I tried to take advantage of that to minimize the work a MINA
server has to do to listen to the queues. I wrapped each queue with a sort of
observer/observable pattern, and in front of this I added a sort of
"select"-like interface in the MINA servers. So, when the MINA server is
inactive, it sits there in the typical multithreaded manner waiting for work.
When one of the front end servers pushes a job onto a queue, the MINA server
is notified, which ends up pushing that queue onto a list of ready queues for
that MINA server and then triggering a notify.
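A minimal sketch of that notification scheme, with invented names and a plain
Object standing in for our job type, might look something like the following;
the readyQueues list and its monitor would be part of the shared object graph.

    import java.util.LinkedList;
    import java.util.List;
    import java.util.Queue;

    public class QueueSelector {
        // Shared (via Terracotta) list of queues that currently have work.
        private final List<Queue<Object>> readyQueues =
            new LinkedList<Queue<Object>>();

        // Called by a front end server right after it pushes a job onto a
        // client's queue.
        public void signalReady(Queue<Object> clientQueue) {
            synchronized (readyQueues) {
                if (!readyQueues.contains(clientQueue)) {
                    readyQueues.add(clientQueue);
                }
                readyQueues.notifyAll(); // wakes the waiting MINA server
            }
        }

        // Called by the MINA server thread; blocks until some queue is ready.
        public Queue<Object> selectReady() throws InterruptedException {
            synchronized (readyQueues) {
                while (readyQueues.isEmpty()) {
                    readyQueues.wait();
                }
                return readyQueues.remove(0);
            }
        }
    }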
So, sure, it is somewhat costly to maintain queues and such, and you could
perhaps do it more efficiently depending on your exact requirements. For our
case, we wanted to avoid the possibility of out-of-memory errors on our
various cluster members, as well as provide for the option of queued jobs that
the MINA servers process as they are able. With some of the care we took, it
doesn't really seem to put too much load on any particular system, and
hopefully in cases where the systems are overloaded by a burst of activity,
they will be able to work through it, just at a reduced response time.
2.) With a ton of clients connected, sending a message destined for a single
client through the front end server (so, front end server --> Terracotta
server --> MINA server, and back through) did not add much in the way of
delay. Per-message latency at the level of tens or hundreds of milliseconds
isn't exactly critical to our application, though, so we haven't pushed our
measurements to that granularity. We have also done some testing where 10,000
to 20,000 clients are connected and we send broadcast operations intended to
go to every client, mixed in with some targeted operations as well. This can
be kind of slow. Terracotta distributes lock ownership and acquisition, and we
have some locks that become highly contended when the system gets busy with
messages passing between multiple front end servers and MINA backend servers.
We see this as a good opportunity for performance tuning when it becomes
necessary. Terracotta itself offers pretty high levels of throughput. Their
open source offering doesn't support it, but they have a paid-for enterprise
version that supports striping across multiple Terracotta servers, which
brings throughput much higher.
3.) Our system is set up so that operations come full circle: the response
ends up back at the front end server and is returned to the caller. In short,
the process goes like this. The FRONT END SERVER creates a JOB, and the JOB is
then dispersed to the queues of whichever clients it is destined for (one
queue for a single-client operation, all queues for a broadcast, or anything
in between). The FRONT END SERVER then does a wait() on the JOB. Meanwhile,
the BACKEND SERVERS have been notified of the JOBS put on their queues. They
communicate however they need to with their clients, and then put each
client's response back into a client-->result mapping inside the JOB. Once the
number of results matches the number of clients, a notify() occurs, and the
FRONT END SERVER proceeds, pulling the results for the clients out of the JOB
and passing them back to whoever made the call to the FRONT END SERVER. It is
very much like a typical multithreaded application, since the Terracotta JVM
clustering shares object graphs and synchronization primitives across the
cluster.
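As a rough sketch of that JOB object (simplified, names invented, not our
actual class), the result aggregation and wait()/notify() handshake looks
something like the following. Because the object graph and its monitors are
shared through Terracotta, this coordinates across JVMs just like it would
inside a single process.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    public class Job {
        private final Set<String> targetClients;  // clients that must answer
        private final Map<String, Object> results =
            new HashMap<String, Object>();

        public Job(Set<String> targetClients) {
            this.targetClients = targetClients;
        }

        // Called by a backend (MINA) server when a client's response arrives.
        public synchronized void addResult(String clientId, Object result) {
            results.put(clientId, result);
            if (results.size() == targetClients.size()) {
                notifyAll(); // every targeted client has now answered
            }
        }

        public synchronized boolean isComplete() {
            return results.size() >= targetClients.size();
        }

        // Called by the front end server after dispersing the job to queues.
        public synchronized Map<String, Object> awaitResults()
                throws InterruptedException {
            while (results.size() < targetClients.size()) {
                wait();
            }
            return new HashMap<String, Object>(results);
        }
    }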
As an alternative to the synchronous approach above, we also allow a JOB to be
submitted with a job ID returned immediately. This asynchronous approach lets
a client come back later and query the status of the job in the system. It
might be used when someone wants to submit an operation that will be broadcast
to all clients, or to some large number of them, and will take a while to
process.
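Building on the Job sketch above, the asynchronous path is roughly just a
matter of tracking jobs by ID instead of blocking on them (again, these names
are invented for illustration).

    import java.util.Map;
    import java.util.UUID;
    import java.util.concurrent.ConcurrentHashMap;

    public class JobTracker {
        private final Map<String, Job> jobsById =
            new ConcurrentHashMap<String, Job>();

        // Submit without blocking: the job is dispersed to the client queues
        // elsewhere; the caller just gets an ID back immediately.
        public String submit(Job job) {
            String id = UUID.randomUUID().toString();
            jobsById.put(id, job);
            return id;
        }

        // Later, the caller polls to see whether every client has answered.
        public boolean isComplete(String id) {
            Job job = jobsById.get(id);
            return job != null && job.isComplete();
        }
    }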
I'm not sure how well it would work to try to create a generic approach on top
of MINA that would fit the needs of different protocols. Most of the
clustering happened a level above MINA, more in our application logic. It
worked well for us, though, since we had a custom protocol to implement anyway
and weren't just picking up something like FTP and hoping for it to
automatically scale and be highly available with MINA.
Chris Popp