I am trying to debug some tasks that fail on a particular input (the tasks hang until they time out and die). By examining the $TMP/mapred/local/taskTracker/jobcache/ directory for the offending task and looking inside split.dta, I see the following input split location:

hdfs://namenode-rd.imageshack.us:9000/user/hive/warehouse/img833_input/00034.tab417:10:32 10.101.4.7

Everything before the "417:10:32" part is just a path to a file in HDFS. How do I use "417:10:32 10.101.4.7" to find the address of the particular block, and how can I dump that block into a file using the Hadoop shell?

I assume there is a direct mapping between these values and the block IDs that I see when I browse to that file in the HDFS web UI (e.g., 8691049584976946484: 10.103.5.5:50010 10.101.1.5:50010 10.101.4.7:50010).

Thanks,
--Leo
