It's hard to answer without more information, especially regarding data
retention requirements (hint: things can be made a lot faster if you don't
have to keep persisting them to disk).

General principles though:

   - work in memory as much as possible, disks are SLOW
   - disk-based databases are also slow, though caching helps here.  If you
   MUST use a db, then consider optimised solutions.  VoltDB has about the
   best performance going for very fast writes, MonetDB is similarly
   impressive for read performance with complex queries.  A NoSQL solution
   (Redis, Cassandra, etc.) may also be the best fit, depending on your use
   case.
   - architect things so you can scale by adding more machines
   - favour a stateless design, or enforce session affinity through your
   load balancer
   - JSON is good, but also consider protocol buffers or MessagePack as a
   way of countering serialisation overhead.  Avoid XML like the plague
   - cache aggressively wherever it makes sense to do so.  If you'll have
   thousands of requests for the same resource then use Varnish, it works well
   even with a sub-1s time to live.
   - don't cache in the Java heap; garbage collection algorithms aren't
   particularly good with that usage pattern.  memcached is a much nicer
   choice.  Better still, use Varnish if you're able to cache at the protocol
   layer.
   - take a look at the actor paradigm, it's a very effective way to deal
   with clustering and passing messages between machines.  Akka 2.0 is shaping
   up to be very powerful in this area.
   - don't lose track of the need to balance performance vs time to market.
   You can always find a way to make things faster given an infinite time
   budget, but that never happens in the real world.
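
To make the actor point above a bit more concrete, here is a minimal sketch
of the idea in plain Java.  It is NOT Akka's API: CounterActor, tellAdd and
askTotal are hypothetical names made up for illustration, and a
single-threaded executor stands in for the mailbox.  The point is only that
state is owned by one thread and everyone else communicates by sending
messages, so no locks are needed around the state itself.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Small demo: four sender threads talk to one actor. */
final class ActorSketch {
    public static void main(String[] args) throws InterruptedException {
        try (CounterActor actor = new CounterActor()) {
            Thread[] senders = new Thread[4];
            for (int i = 0; i < senders.length; i++) {
                senders[i] = new Thread(() -> {
                    for (int n = 0; n < 1000; n++) {
                        actor.tellAdd(1);
                    }
                });
                senders[i].start();
            }
            for (Thread sender : senders) {
                sender.join();
            }
            // All 4000 messages were enqueued before this request,
            // so the reply reflects every one of them.
            System.out.println(actor.askTotal().join()); // prints 4000
        }
    }
}

/** Hypothetical minimal actor: private state, touched only by its mailbox thread. */
final class CounterActor implements AutoCloseable {
    // The single-threaded executor doubles as the mailbox:
    // messages queue up in FIFO order and are processed one at a time.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    // Only ever read or written on the mailbox thread.
    private long count = 0;

    /** Fire-and-forget message: add delta to the counter. */
    void tellAdd(long delta) {
        mailbox.execute(() -> count += delta);
    }

    /** Request/reply message: completes once earlier messages have been processed. */
    CompletableFuture<Long> askTotal() {
        return CompletableFuture.supplyAsync(() -> count, mailbox);
    }

    /** Stops accepting new messages; already queued ones still run. */
    @Override
    public void close() {
        mailbox.shutdown();
    }
}
```

A real actor library adds the parts that matter for clustering (supervision,
location transparency, remoting), so treat this purely as a picture of the
message-passing model rather than something to build on.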

As for case studies...

   - Facebook lean heavily on Cassandra and Hadoop to do much of their
   heavy lifting.  They've also made massive investments in speeding up and
   compiling PHP, which suggests that it probably wasn't the best initial
   choice of language they could have made for their front-end.

   - Twitter, famously, got a significant speed boost by dropping a lot of
   Ruby code from their performance-critical systems and replacing it with
   Scala.  They also implemented their own graph database.

And, yes, LMAX and the disruptor pattern are nothing short of amazing.
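
For a rough feel of what sits at the heart of that pattern, here is a
simplified sketch of a pre-allocated ring buffer with one producer and one
consumer, coordinated only by sequence counters rather than locks.  This is
an illustration under my own assumptions, not the LMAX Disruptor library's
API: SpscRing, offer and poll are hypothetical names, and the real thing adds
batching, cache-line padding, wait strategies and multiple consumers.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Small demo: one producer thread, the main thread as the single consumer. */
final class RingSketch {
    public static void main(String[] args) throws InterruptedException {
        SpscRing ring = new SpscRing(1024);
        int n = 100_000;

        Thread producer = new Thread(() -> {
            for (long v = 1; v <= n; v++) {
                while (!ring.offer(v)) {
                    Thread.yield(); // ring is full, let the consumer catch up
                }
            }
        });
        producer.start();

        long sum = 0;
        int received = 0;
        while (received < n) {
            Long v = ring.poll();
            if (v == null) {
                Thread.yield(); // ring is empty, let the producer catch up
                continue;
            }
            sum += v;
            received++;
        }
        producer.join();
        System.out.println(sum); // prints 5000050000
    }
}

/** Hypothetical single-producer/single-consumer ring buffer of long values. */
final class SpscRing {
    private final long[] slots; // allocated once, reused forever
    private final int mask;
    // Next sequence to write; updated only by the producer thread.
    private final AtomicLong published = new AtomicLong(0);
    // Next sequence to read; updated only by the consumer thread.
    private final AtomicLong consumed = new AtomicLong(0);

    SpscRing(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new long[capacity];
        mask = capacity - 1;
    }

    /** Producer side: returns false when the ring is full. */
    boolean offer(long value) {
        long seq = published.get();
        if (seq - consumed.get() == slots.length) {
            return false;
        }
        slots[(int) (seq & mask)] = value;
        published.set(seq + 1); // volatile write makes the slot visible
        return true;
    }

    /** Consumer side: returns null when the ring is empty. */
    Long poll() {
        long seq = consumed.get();
        if (seq == published.get()) {
            return null;
        }
        long value = slots[(int) (seq & mask)];
        consumed.set(seq + 1); // volatile write frees the slot for reuse
        return value;
    }
}
```

Each counter has exactly one writer, which is what lets the two threads
coordinate without locks or compare-and-swap loops.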


On 6 January 2012 11:28, Rakesh <[email protected]> wrote:

> Hi guys,
>
> I was wondering if you guys could educate me or at least point me to
> some useful resources.
>
> Let's say I was tasked with architecting a web application where I was
> expecting huge volumes of transactions, circa millions of transactions
> in a small hour or so window at peak times.
>
> I could do a traditional n-tier architecture with the web at one end,
> business/service layer in the middle and a big database at the other
> end. Perhaps even do JMS between components (with Active MQ).
>
> Would that be up to the job? What if it wasn't? What are my choices?
> From what I know, there are 2 options:
>
> 1. optimise for the single threaded model - something like what LMax
> has done (Martin Fowler has a post on his blog) and try and remove the
> DB from the loop. I (think) this also includes software transactional
> memory-type architectures?
>
> 2. explicitly move to a multi-threaded model.
>
> Is that roughly the options? What do Facebook and Twitter do to manage
> the huge load?
>
> All feedback welcome.
>
> Cheers
>
> R
>
> --
> You received this message because you are subscribed to the Google Groups
> "The Java Posse" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/javaposse?hl=en.
>
>


-- 
Kevin Wright
mail: [email protected]
gtalk / msn : [email protected]
quora: http://www.quora.com/Kevin-Wright
google+: http://gplus.to/thecoda
<[email protected]>
twitter: @thecoda
vibe / skype: kev.lee.wright
steam: kev_lee_wright

"My point today is that, if we wish to count lines of code, we should not
regard them as "lines produced" but as "lines spent": the current
conventional wisdom is so foolish as to book that count on the wrong side
of the ledger" ~ Dijkstra
