I'm adding support in Hadoop for Ceph (http://ceph.sourceforge.net/),
a distributed filesystem developed at UC Santa Cruz
(http://ssrc.cse.ucsc.edu/). Ceph runs entirely in userspace and is written
in C++. My current implementation is a subclass of FileSystem that
uses a bit of JNI glue to invoke the C++ Ceph client code.
I'm having trouble with a small test: RandomWriter, 4 TaskTracker
nodes, 5 maps per node, 10 MB per map, for a total of 200 MB over 20
Map tasks. I tried it on Hadoop with DFS, and it took about 30
seconds. Then I ran the same test using Ceph: I changed
fs.default.name to "ceph:///", set fs.ceph.impl to
org.apache.hadoop.fs.ceph.CephFileSystem, and left all other
configuration settings untouched. It ran horrifically slowly.
I ran the JobTracker and each TaskTracker in a separate terminal to
watch the output. One of the TaskTracker nodes gave me this:
07/06/01 00:16:49 INFO mapred.TaskRunner: task_0001_r_000000_0 Need 400 map output(s)
07/06/01 00:16:49 INFO mapred.TaskRunner: task_0001_r_000000_0 Need 400 map output location(s)
Then the JobTracker spawned 400 Map tasks:
07/06/01 00:23:11 INFO mapred.JobTracker: Adding task 'task_0001_m_000397_0' to tip tip_0001_m_000397, for tracker 'tracker_issdm-11.cse.ucsc.edu:50050'
07/06/01 00:23:12 INFO mapred.JobInProgress: Task 'task_0001_m_000396_0' has completed tip_0001_m_000396 successfully.
07/06/01 00:23:12 INFO mapred.TaskInProgress: Task 'task_0001_m_000396_0' has completed.
07/06/01 00:23:12 INFO mapred.JobInProgress: Choosing normal task tip_0001_m_000398
07/06/01 00:23:12 INFO mapred.JobTracker: Adding task 'task_0001_m_000398_0' to tip tip_0001_m_000398, for tracker 'tracker_issdm-8.cse.ucsc.edu:50050'
07/06/01 00:23:13 INFO mapred.JobInProgress: Task 'task_0001_m_000397_0' has completed tip_0001_m_000397 successfully.
07/06/01 00:23:13 INFO mapred.TaskInProgress: Task 'task_0001_m_000397_0' has completed.
07/06/01 00:23:13 INFO mapred.JobInProgress: Choosing normal task tip_0001_m_000399
07/06/01 00:23:13 INFO mapred.JobTracker: Adding task 'task_0001_m_000399_0' to tip tip_0001_m_000399, for tracker 'tracker_issdm-11.cse.ucsc.edu:50050'
I'm ending up with far too many Map tasks (400 instead of the 20 I
expected), and as a result the job takes far too long to run.
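To make the mismatch concrete, here is a minimal Java sketch that computes the expected number of Map tasks from the test parameters and pulls the map task indices out of a JobTracker log excerpt. The class and method names (MapTaskCount, expectedMaps, mapIndices) are hypothetical and not part of Hadoop. It also assumes that map indices are zero-based, so the highest index plus one is a lower bound on the number of Map tasks created.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Hadoop.
final class MapTaskCount {
    // Matches map task attempt IDs such as task_0001_m_000397_0 and
    // captures the map index (000397). Reduce tasks (_r_) and tip IDs
    // (tip_...) do not match.
    private static final Pattern MAP_TASK =
        Pattern.compile("task_\\d+_m_(\\d+)_\\d+");

    // Expected number of Map tasks: one batch of mapsPerNode on each node.
    static int expectedMaps(int taskTrackers, int mapsPerNode) {
        return taskTrackers * mapsPerNode;
    }

    // Collects the distinct map indices that appear in a log excerpt.
    static Set<Integer> mapIndices(String log) {
        Set<Integer> indices = new TreeSet<>();
        Matcher m = MAP_TASK.matcher(log);
        while (m.find()) {
            indices.add(Integer.parseInt(m.group(1)));
        }
        return indices;
    }

    public static void main(String[] args) {
        // A few lines copied from the JobTracker output above.
        String log = String.join("\n",
            "Adding task 'task_0001_m_000397_0' to tip tip_0001_m_000397",
            "Task 'task_0001_m_000396_0' has completed.",
            "Adding task 'task_0001_m_000398_0' to tip tip_0001_m_000398",
            "Adding task 'task_0001_m_000399_0' to tip tip_0001_m_000399");

        int expected = expectedMaps(4, 5); // 4 TaskTrackers, 5 maps per node
        int highest = 0;
        for (int index : mapIndices(log)) {
            highest = Math.max(highest, index);
        }

        System.out.println("expected maps: " + expected);
        // Assumption: indices start at 0, so highest + 1 is a lower bound.
        System.out.println("maps seen (at least): " + (highest + 1));
    }
}
```

With 4 TaskTrackers and 5 maps per node the expected count is 20, while the highest map index in the excerpt is 399, which matches the 400 map outputs the reduce task says it needs.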
I strongly suspect this is a problem with my implementation, but I'm
not sure where to start looking. What sort of problem on the
FileSystem side could cause MapReduce to spawn so many extra tasks?
How can I pin down the cause?
Thanks,
~ Esteban