Re: Reduce Performance

Thorsten Schuett Fri, 24 Aug 2007 00:59:45 -0700

On Thursday 23 August 2007, Doug Cutting wrote:
> Thorsten Schuett wrote:
> > During the copy phase of reduce, the cpu load was very low and vmstat
> > showed constant reads from the disk at ~15MB/s and bursty writes. At the
> > same time, data was sent over the loopback device at ~15MB/s. I don't see
> > what else could limit the performance here. The disk can certainly
> > provide the data at higher speeds.
>
> It can if the reads are sequential, but might not if they're random.
> That said, there could well be a Hadoop bottleneck here, but I still
> doubt that it is the loopback device, which is surely capable of greater
> than 15MB/s, no?
To me it looks as if the copy operation is what limits my reduce
performance. But we can probably agree that it is not a good idea to copy
files around when running on a single node, especially when using HTTP for
copying.
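
The loopback question can be checked in isolation with a small throughput
test. The following is only a minimal sketch and not Hadoop code: the function
name measure_loopback_throughput, the 64 MB transfer size and the 64 KB chunk
size are all assumptions chosen for illustration. It measures raw TCP over
127.0.0.1 and nothing else, so it says nothing about disk reads or the HTTP
layer used by the copy phase.

```python
import socket
import threading
import time


def measure_loopback_throughput(total_bytes=64 * 1024 * 1024, chunk_size=64 * 1024):
    """Send total_bytes over TCP on 127.0.0.1 and return the rate in MB/s."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    received = [0]  # mutable holder so the receiver thread can update it

    def receiver():
        conn, _ = server.accept()
        with conn:
            while True:
                data = conn.recv(chunk_size)
                if not data:  # sender closed the connection
                    break
                received[0] += len(data)

    thread = threading.Thread(target=receiver)
    thread.start()

    payload = b"\0" * chunk_size
    start = time.perf_counter()
    with socket.create_connection(("127.0.0.1", port)) as client:
        sent = 0
        while sent < total_bytes:
            n = min(chunk_size, total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    thread.join()  # wait until the receiver has drained everything
    elapsed = time.perf_counter() - start
    server.close()

    if received[0] != total_bytes:
        raise RuntimeError("receiver saw fewer bytes than were sent")
    return received[0] / elapsed / 1e6


if __name__ == "__main__":
    print(f"loopback throughput: {measure_loopback_throughput():.0f} MB/s")
```

If the number reported on the node in question is far above 15MB/s, the
loopback device itself is unlikely to be the limit, and the disk access
pattern or the copy path would be the next things to look at.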


Thorsten
