Bwolen,
First of all, Hadoop is not optimized for small clusters or for small bursts
of writes/reads. There are some fixed costs (such as storing a copy of the
data locally and the extra copying that involves) that don't pay off on a
small cluster.
You could try using different physical disks (not just separate partitions)
for the Maps' tmp directory and for the Datanode's data directory.
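For example, something along these lines (the /disk1 and /disk2 paths are
just placeholders for two separate physical disks):

    # Placeholders: assume /disk1 and /disk2 are separate physical disks.
    # In conf/hadoop-site.xml, point dfs.data.dir (Datanode block storage)
    # at the first and mapred.local.dir (Map tmp/spill space) at the second.
    mkdir -p /disk1/dfs/data /disk2/mapred/local
    df /disk1/dfs/data /disk2/mapred/local   # confirm these are different devices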
To compare a single-node write with Hadoop, you should run 'bin/hadoop dfs
-copyFromLocal - test' and pipe your dd command's output into it. Maybe you
will see 25% of the 75 MB/s you saw with the native write; that is not
unexpected.
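Concretely, the comparison could look something like this (block size and
count are just placeholders; 'test' is the HDFS destination path):

    # local baseline: raw sequential write to a single disk
    dd if=/dev/zero of=/tmp/dd.out bs=1M count=1024
    # same amount of data written into HDFS from the same machine
    dd if=/dev/zero bs=1M count=1024 | bin/hadoop dfs -copyFromLocal - test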
Not sure if you want to know all the details of why that is. In your test
you also have many other one-time costs of starting and stopping jobs, etc.
I don't mean to say Hadoop can't do better; its performance is steadily
improving. But your expectations for a toy application might be off.
If you want to figure out where the problem could be, you could start
with the 'copyFromLocal' example above. There you need to figure out what
the Datanode process and the Hadoop shell are doing at various times (maybe
with stack traces).
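One quick way to get stack traces without a full profiler (the pid below is
a placeholder):

    jps -l             # list running Java processes; note the Datanode pid
    kill -QUIT 12345   # replace 12345 with that pid; the JVM writes a thread
                       # dump to the Datanode's .out file under logs/
    # the same trick works on the client-side 'bin/hadoop dfs' process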
Raghu.
Bwolen Yang wrote:
> Please try Hadoop 0.13.0. I don't know whether it will address your
> concerns, but it should be faster and is much closer to what developers
> are currently working on.
ok. It would also be good to see how the DFS upgrade goes between versions.
(looks like it got released today. cool.)
> For such a small cluster you'd probably be better off running the
> jobtracker and namenode on the same node and gain another slave.
When the namenode and jobtracker were running on the same machine, I
noticed failures due to losing contact with the jobtracker. This is why I
split them onto separate machines.
With regard to the performance details, they are really independent of
how many slaves I have. The test is mainly trying to see how Hadoop
compares to a single node or to scp, and what tuning parameters would
make things run faster.
Any suggestions on java profiling tools?
bwolen