Re: [The Java Posse] Maximising performance

Kevin Wright Fri, 06 Jan 2012 06:15:43 -0800

Yes, and no.

STM is a fairly nice solution, you can also get some good clean
transaction-esque behaviour by using actors, so I wouldn't rule out that
line of behaviour entirely.
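
To make the actor point a bit more concrete, here is a minimal sketch of the
idea in plain Java. It is not the Akka API: LedgerActor, transferAToB and
balances are names invented for this illustration, and a single-thread
executor stands in for the actor's mailbox. Because every message is handled
one at a time on the same thread, a transfer is either fully applied or
rejected, with no locks involved.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example: a tiny hand-rolled "actor" guarding two balances.
// This only mimics the mailbox idea; it is not how Akka is used.
final class LedgerActor {
    // The mailbox: messages are queued and processed one at a time, in order.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    // State is only ever touched by the mailbox thread, so no locks are needed.
    private long balanceA;
    private long balanceB;

    LedgerActor(long a, long b) {
        balanceA = a;
        balanceB = b;
    }

    // "Transfer" message: both updates happen inside one message, so no other
    // message can observe a half-applied transfer.
    CompletableFuture<Boolean> transferAToB(long amount) {
        return CompletableFuture.supplyAsync(() -> {
            if (amount <= 0 || balanceA < amount) {
                return false; // rejected, nothing changes
            }
            balanceA -= amount;
            balanceB += amount;
            return true;
        }, mailbox);
    }

    // "Query" message: returns a consistent snapshot {A, B}.
    CompletableFuture<long[]> balances() {
        return CompletableFuture.supplyAsync(
                () -> new long[] {balanceA, balanceB}, mailbox);
    }

    void shutdown() {
        mailbox.shutdown();
    }

    public static void main(String[] args) throws Exception {
        LedgerActor ledger = new LedgerActor(100, 0);
        // Eight client threads send 150 one-unit transfers concurrently.
        ExecutorService clients = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 150; i++) {
            clients.execute(() -> ledger.transferAToB(1));
        }
        clients.shutdown();
        clients.awaitTermination(10, TimeUnit.SECONDS);
        // The query is queued behind every transfer, so it sees the final state.
        long[] snapshot = ledger.balances().get();
        System.out.println("A=" + snapshot[0] + " B=" + snapshot[1]);
        ledger.shutdown();
    }
}
```

Only 100 of the 150 transfers can succeed, so the run ends with A=0 and
B=100 however the client threads interleave. A real actor library adds
supervision, remoting and so on; this only shows where the
transaction-like guarantee comes from.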


As for single threaded... I guess it depends entirely on your use case.
There's definitely an argument to be made in favour of the asynchronous
approach as exemplified by node.js, but I wouldn't feel comfortable trying
to force a design into that model if it was clearly not going to be a good
fit.  The real issue here is figuring out exactly what kind of control you
need over access to shared resources, and whether you want to optimise for
throughput or latency.
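
For anyone who hasn't seen the node.js style up close, a rough sketch of the
model in Java follows. EventLoop, post and lookup are hypothetical names made
up for this example, and lookup merely pretends to be non-blocking I/O by
queueing its callback. The point is that one thread drains a queue of
callbacks and handlers never block, they just schedule follow-up work.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical sketch of a node.js-style loop: one thread, a queue of
// callbacks, and handlers that schedule follow-up work instead of blocking.
final class EventLoop {
    private final Queue<Runnable> tasks = new ArrayDeque<>();

    // Queue a callback to run on a later turn of the loop.
    void post(Runnable task) {
        tasks.add(task);
    }

    // Runs until the queue is drained; every callback runs on the calling thread.
    void run() {
        Runnable next;
        while ((next = tasks.poll()) != null) {
            next.run();
        }
    }

    // Stand-in for non-blocking I/O: the result arrives later via a callback.
    void lookup(String key, Consumer<String> callback) {
        post(() -> callback.accept("value-of-" + key));
    }

    // Two "requests" are accepted before either "response" is delivered.
    static List<String> demo() {
        EventLoop loop = new EventLoop();
        List<String> log = new ArrayList<>();
        for (String key : new String[] {"a", "b"}) {
            loop.post(() -> {
                log.add("request " + key);
                loop.lookup(key, value -> log.add("response " + value));
            });
        }
        loop.run();
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The log shows both requests being accepted before either response is
handled, which is the interleaving you get without any threads or locks. It
also shows the catch: one slow callback stalls everything queued behind it,
which is why the model is a poor fit for some designs.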



On 6 January 2012 14:03, Kirk Pepperdine <[email protected]> wrote:

> All nice points.
>
> I would add.. avoid transactions like the plague.
> Single threaded will be easier to scale out.. (think http)
> +1 on keep things in memory.. in fact I've customers that no longer put
> disks in their machines.. which surprisingly increases reliability.. (which
> really shouldn't be surprising).
>
> Regards,
> Kirk
>
> On 2012-01-06, at 1:25 PM, Kevin Wright wrote:
>
> It's hard to answer without more information, especially regarding data
> retention requirements (hint: things can be made a lot faster if you don't
> have to keep persisting them to disk)
>
> General principles though:
>
>    - work in memory as much as possible, disks are SLOW
>    - disk-based databases are also slow, though caching helps here.  If
>    you MUST use a db, then consider optimised solutions.  VoltDB has about the
>    best performance going for very fast writes, MonetDB is similarly
>    impressive for read performance with complex queries.  A NoSQL solution
>    (Redis, Cassandra, etc.) may also be the best fit, depending on your use
>    case.
>    - architect things so you can scale by adding more machines
>    - favour a stateless design, or enforce session affinity through your
>    load balancer
>    - JSON is good, but also consider Protocol Buffers or MessagePack as
>    a way of countering serialisation overhead.  Avoid XML like the plague
>    - cache aggressively wherever it makes sense to do so.  If you'll have
>    thousands of requests for the same resource then use Varnish, it works well
>    even with a sub-1s time to live.
>    - Don't cache in the Java heap, garbage collection algorithms aren't
>    particularly good with such a usage pattern.  memcached is a much nicer
>    choice.  Better still, use Varnish if you're able to cache at the protocol
>    layer.
>    - take a look at the actor paradigm, it's a very effective way to deal
>    with clustering and passing messages between machines.  Akka 2.0 is shaping
>    up to be very powerful in this area.
>    - don't lose track of the need to balance performance vs time to
>    market.  You can always find a way to make things faster given an infinite
>    time budget, but that never happens in the real world.
>
> As for case studies...
>
>    - Facebook lean heavily on Cassandra and Hadoop to do much of their
>    heavy lifting, they've also made massive investments in speeding up and
>    compiling PHP, which suggests that it probably wasn't the best initial
>    choice of language they could have made for their front-end.
>
>    - Twitter, famously, got a significant speed boost by dropping a lot
>    of Ruby code from their performance-critical systems, replacing it with
>    Scala instead.  They also implemented their own graph database.
>
> and, yes, LMAX and the disruptor pattern are nothing short of amazing.
>
>
> On 6 January 2012 11:28, Rakesh <[email protected]> wrote:
>
>> Hi guys,
>>
>> I was wondering if you guys could educate me or at least point me to
>> some useful resources.
>>
>> Let's say I was tasked with architecting a web application where I was
>> expecting huge volumes of transactions, circa millions of transactions
>> in a small hour or so window at peak times.
>>
>> I could do a traditional n-tier architecture with the web at one end,
>> business/service layer in the middle and a big database at the other
>> end. Perhaps even do JMS between components (with Active MQ).
>>
>> Would that be up to the job? What if it wasn't? What are my choices?
>> From what I know, there are 2 options:
>>
>> 1. optimise for the single threaded model - something like what LMAX
>> has done (Martin Fowler has a post on his blog) and try and remove the
>> DB from the loop. I (think) this also includes software transactional
>> memory-type architectures?
>>
>> 2. explicitly move to a multi-threaded model.
>>
>> Are those roughly the options? What do Facebook and Twitter do to manage
>> the huge load?
>>
>> All feedback welcome.
>>
>> Cheers
>>
>> R
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "The Java Posse" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/javaposse?hl=en.
>>
>>
>
>
> --
> Kevin Wright
> mail: [email protected]
> gtalk / msn : [email protected]
> quora: http://www.quora.com/Kevin-Wright
> google+: http://gplus.to/thecoda
> <[email protected]>
> twitter: @thecoda
> vibe / skype: kev.lee.wright
> steam: kev_lee_wright
>
> "My point today is that, if we wish to count lines of code, we should not
> regard them as "lines produced" but as "lines spent": the current
> conventional wisdom is so foolish as to book that count on the wrong side
> of the ledger" ~ Dijkstra
>



