Re: performance

Jason Rennie Tue, 11 Mar 2008 15:22:50 -0700

On Tue, Mar 11, 2008 at 5:18 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:


> Yes.  Each task is launching a JVM.


Guess that would explain the slowness :)  Is HDFS tuned similarly?  We're
thinking of possibly distributing our data using HDFS but storing a
sufficiently small amount of data per node so that the Linux kernel could
buffer it all into memory.  Is there much overhead in grabbing data from
HDFS if that data is stored locally?

> Map reduce is not generally useful for real-time applications.  It is VERY
> useful for large scale data reductions done in advance of real-time
> operations.
>
> The basic issue is that the major performance contribution of map-reduce
> architectures is large scale sequential access of data stores.  That is
> pretty much in contradiction with real-time response.
>

Gotcha.  We'll consider switching to a batch-style approach, which it sounds
like Hadoop would be perfect for.

Thanks,

Jason
