Hadoop folks might be interested to hear that we've used Hadoop to render some maps of the Internet address space.
Aggregated maps are at <http://www.isi.edu/ant/address>; we've rendered these both with and without Hadoop. The more interesting map, the one that required Hadoop, is at <http://www.isi.edu/ant/address/whole_internet>. This map is to scale, so pixels and IP addresses are one-to-one, and the printed result (at 600 dpi) is more than 9' tall. Rendering it took 19 hours on our 52-core Hadoop cluster. (Printing then took another 36 hours on our single printer :-( ) We were using Hadoop Streaming.

Our use of Hadoop was not trouble-free: we had trouble with both reduce tasks hanging and mappers running out of memory (details below). If others have seen these kinds of problems, please let us know.

Our full reduce job completed only 502 of the 503 reduces, so there are a few holes in the picture that shouldn't be there. When I checked on the status I saw two instances of the reducer running, both reported as hung at ~87% completion. But looking at the logs, I see things like this:

    2007-09-29 16:59:34,628 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=36181401/0/0 in:3173=36181401/11400 [rec/s] out:0=0/11400 [rec/s]
    2007-09-29 16:59:34,629 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=36181501/0/0 in:3173=36181501/11400 [rec/s] out:0=0/11400 [rec/s]
    2007-09-29 16:59:34,629 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=36181601/0/0 in:3173=36181601/11400 [rec/s] out:0=0/11400 [rec/s]
    2007-09-29 16:59:36,768 INFO org.apache.hadoop.streaming.PipeMapRed: mapRedFinished
    2007-09-29 16:59:36,784 INFO org.apache.hadoop.streaming.PipeMapRed: MROutputThread done
    2007-09-29 16:59:36,858 INFO org.apache.hadoop.streaming.PipeMapRed: mapRedFinished
    2007-09-29 16:59:37,193 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
    java.io.IOException: subprocess exited successfully
    R/W/S=36181627/0/0 in:3172=36181627/11403 [rec/s] out:0=0/11403 [rec/s]
    minRecWrittenToEnableSkip_=9223372036854775807
    LOGNAME=null
    HOST=null
    USER=hadoop
    HADOOP_USER=null
    last Hadoop input: |null|
    last tool output: |null|
    Date: Sat Sep 29 16:59:36 PDT 2007
    Broken pipe
            at org.apache.hadoop.streaming.PipeReducer.reduce(PipeReducer.java:105)
            at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:324)
            at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1800)

I'm confused by the conflicting messages: the in:/out: counters show the task at least read a lot of records, and "streaming.PipeMapRed: mapRedFinished" should be a positive message, right? Then I get "Error running child" (bad), "subprocess exited successfully" (good), and "Broken pipe" (bad). Which is it? And if a task gets stuck, why doesn't Hadoop just time it out and restart it? Originally I had two reduces hung, but manually killing one caused Hadoop to restart it, and it then completed. When I re-ran the whole job, I ended up with more stuck reduces (~10 of the 503).

Our mapper problem is that our custom inputreader was causing the map tasks to run out of memory. We don't think we leak memory, but we're still trying to debug it. We'll post details later.

   -John
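
P.S. To make the "pixels and IP addresses are one-to-one" statement concrete, here is a minimal sketch of one way to give every 32-bit IPv4 address its own pixel in a 65536x65536 image (which works out to just over 9' on a side at 600 dpi). The simple row-major layout below is only an illustrative assumption, not the layout the poster actually uses.

    # Illustration only: assign each 32-bit IPv4 address a unique (x, y)
    # pixel in a 65536 x 65536 image, so pixels and addresses map one-to-one.
    # The row-major layout is an assumption for this sketch, not the layout
    # used in the actual poster.
    import socket
    import struct

    WIDTH = 65536  # 2**16 columns; 65536 rows cover all 2**32 addresses

    def ip_to_pixel(dotted_quad):
        """Return (x, y) pixel coordinates for an IPv4 address string."""
        addr, = struct.unpack("!I", socket.inet_aton(dotted_quad))
        return addr % WIDTH, addr // WIDTH

    # Example: ip_to_pixel("128.9.0.1") -> (1, 32777)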
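
And for list readers who haven't used Hadoop Streaming: the "reducer" is just an external program that reads sorted key/value lines on stdin and writes result lines on stdout, which is why the failures above show up in the PipeMapRed/PipeReducer logs that wrap that pipe. Below is a minimal sketch of such a reducer; the tab-separated "key<TAB>count" record format is an assumption for illustration, not our actual rendering code.

    #!/usr/bin/env python
    # Minimal sketch of a Hadoop Streaming reducer: stdin carries
    # "key<TAB>count" lines already sorted by key; emit one
    # "key<TAB>total" line per key on stdout.
    import sys

    def main():
        current_key, total = None, 0
        for line in sys.stdin:
            key, _, value = line.rstrip("\n").partition("\t")
            if key != current_key:
                if current_key is not None:
                    print("%s\t%d" % (current_key, total))
                current_key, total = key, 0
            total += int(value)
        if current_key is not None:
            print("%s\t%d" % (current_key, total))

    if __name__ == "__main__":
        main()

A script like this is wired into the job with the streaming jar's -reducer option (and its counterpart with -mapper).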
