________________________________
From: Björn-Elmar Macek [mailto:ma...@cs.uni-kassel.de]
Sent: Tuesday, May 22, 2012 3:12 PM
To: hdfs-user@hadoop.apache.org
Subject: Hadoop Debugging in LocalMode (Breakpoints not reached)
Hi there,

I am currently trying to get rid of bugs in my Hadoop program by debugging it. Everything went fine until some point yesterday. I don't know what exactly happened, but my program no longer stops at breakpoints within the Reducer, nor within the RawComparator for the values, which I use for sorting my inputs in the ReducerIterator (see the classes set on the conf below):

conf.setOutputValueGroupingComparator(TwitterValueGroupingComparator.class);
conf.setReducerClass(RetweetReducer.class);

The log looks like this:

Warning: $HADOOP_HOME is deprecated.
Listening for transport dt_socket at address: 5002
12/05/21 19:24:20 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/05/21 19:24:20 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/05/21 19:24:20 WARN snappy.LoadSnappy: Snappy native library not loaded
12/05/21 19:24:20 INFO mapred.FileInputFormat: Total input paths to process : 2
12/05/21 19:24:20 WARN conf.Configuration: file:/tmp/hadoop-ema/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: fs.default.name; Ignoring.
12/05/21 19:24:20 WARN conf.Configuration: file:/tmp/hadoop-ema/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: mapred.job.tracker; Ignoring.
12/05/21 19:24:20 INFO mapred.JobClient: Running job: job_local_0001
12/05/21 19:24:20 INFO util.ProcessTree: setsid exited with exit code 0
12/05/21 19:24:21 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1c4ff2c
12/05/21 19:24:21 INFO mapred.MapTask: numReduceTasks: 1
12/05/21 19:24:21 INFO mapred.MapTask: io.sort.mb = 100
12/05/21 19:24:22 INFO mapred.JobClient: map 0% reduce 0%
12/05/21 19:24:22 INFO mapred.MapTask: data buffer = 79691776/99614720
12/05/21 19:24:22 INFO mapred.MapTask: record buffer = 262144/327680
12/05/21 19:24:22 INFO mapred.MapTask: Starting flush of map output
12/05/21 19:24:22 INFO mapred.MapTask: Finished spill 0
12/05/21 19:24:22 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/05/21 19:24:23 INFO mapred.LocalJobRunner: file:/home/ema/INPUT-H/tweets_ext:0+968
12/05/21 19:24:23 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/05/21 19:24:23 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1e8c585
12/05/21 19:24:23 INFO mapred.MapTask: numReduceTasks: 1
12/05/21 19:24:23 INFO mapred.MapTask: io.sort.mb = 100
12/05/21 19:24:24 INFO mapred.MapTask: data buffer = 79691776/99614720
12/05/21 19:24:24 INFO mapred.MapTask: record buffer = 262144/327680
12/05/21 19:24:24 INFO mapred.MapTask: Starting flush of map output
12/05/21 19:24:24 INFO mapred.Task: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
12/05/21 19:24:24 INFO mapred.JobClient: map 100% reduce 0%
12/05/21 19:24:26 INFO mapred.LocalJobRunner: file:/home/ema/INPUT-H/tweets~:0+0
12/05/21 19:24:26 INFO mapred.Task: Task 'attempt_local_0001_m_000001_0' done.
12/05/21 19:24:26 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@191e4c
12/05/21 19:24:26 INFO mapred.ReduceTask: ShuffleRamManager: MemoryLimit=709551680, MaxSingleShuffleLimit=177387920
12/05/21 19:24:27 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Need another 2 map output(s) where 0 is already in progress
12/05/21 19:24:27 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Thread started: Thread for merging on-disk files
12/05/21 19:24:27 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Thread waiting: Thread for merging on-disk files
12/05/21 19:24:27 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
12/05/21 19:24:27 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Thread started: Thread for merging in memory files
12/05/21 19:24:27 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Thread started: Thread for polling Map Completion Events
12/05/21 19:24:32 INFO mapred.LocalJobRunner: reduce > copy >
12/05/21 19:24:35 INFO mapred.LocalJobRunner: reduce > copy >
12/05/21 19:24:42 INFO mapred.LocalJobRunner: reduce > copy >
12/05/21 19:24:48 INFO mapred.LocalJobRunner: reduce > copy >
12/05/21 19:24:51 INFO mapred.LocalJobRunner: reduce > copy >
12/05/21 19:24:57 INFO mapred.LocalJobRunner: reduce > copy >
... etc ...

Is there something I have missed?

Thanks in advance for your help!

Best regards,
Björn-Elmar