You can turn speculative execution on, which may help with a few slow-progressing tasks. The relevant job configuration options are mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution.
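
For example, the two options could be set from the Hive session before running the query. This is only a sketch: it assumes the Hive CLI's SET command passes these properties through to the job configuration, and that "true" is the value that enables them.

```sql
-- Assumption: run in the same Hive session, before the benchmark query.
-- Allow duplicate (speculative) attempts of slow map tasks.
SET mapred.map.tasks.speculative.execution=true;
-- Allow duplicate (speculative) attempts of slow reduce tasks.
SET mapred.reduce.tasks.speculative.execution=true;
```

Running SET with just the property name (no value) should print the current setting, which is a quick way to check that the change took effect.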

-----Original Message-----
From: bharath vissapragada [mailto:[email protected]]
Sent: Tuesday, September 27, 2011 1:22 AM
To: [email protected]
Subject: Benchmarking problems

Hey,

I need some help regarding Hive. I am trying to benchmark Hive with the TPC-H SF 100 dataset. For a simple SPJ query I ran (select count(*) from supplier, customer where s_nationkey = c_nationkey), 12 of my 13 reduce tasks completed in less than 2 hrs and 1 ran for 6 hours.

Following are my cluster details: 10 nodes (1 master + 9 TTs+DNs), 3.5 GB RAM per TT, 2 maps and 2 reducers max per TT, 600 MB per task, 200 MB io.sort.mb.

I saw that no swapping occurred while running the reduce task. Following is the tail of the log on the machine where the reduce ran for 6 hrs:

2011-09-26 22:48:48,285 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarding 47881000000 rows
2011-09-26 22:48:48,607 INFO ExecReducer: ExecReducer: processed 1280835 rows: used memory = 4840896
2011-09-26 22:48:48,608 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 finished. closing...
2011-09-26 22:48:48,608 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarded 47881693522 rows
2011-09-26 22:48:48,608 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: SKEWJOINFOLLOWUPJOBS:0
2011-09-26 22:48:48,608 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 finished. closing...
2011-09-26 22:48:48,608 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarded 47881693522 rows
2011-09-26 22:48:48,608 INFO org.apache.hadoop.hive.ql.exec.GroupByOperator: 6 finished. closing...
2011-09-26 22:48:48,608 INFO org.apache.hadoop.hive.ql.exec.GroupByOperator: 6 forwarded 0 rows
2011-09-26 22:48:48,608 WARN org.apache.hadoop.hive.ql.exec.GroupByOperator: Begin Hash Table flush at close: size = 1
2011-09-26 22:48:48,608 INFO org.apache.hadoop.hive.ql.exec.GroupByOperator: 6 forwarding 1 rows
2011-09-26 22:48:48,608 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS hdfs://master:54310/tmp/hive-hadoop/hive_2011-09-26_16-36-07_678_4030630084749797567/_tmp.-mr-10002/000004_0
2011-09-26 22:48:48,609 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://master:54310/tmp/hive-hadoop/hive_2011-09-26_16-36-07_678_4030630084749797567/_tmp.-mr-10002/_tmp.000004_0
2011-09-26 22:48:48,609 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS hdfs://master:54310/tmp/hive-hadoop/hive_2011-09-26_16-36-07_678_4030630084749797567/_tmp.-mr-10002/000004_0
2011-09-26 22:48:48,739 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 7 finished. closing...
2011-09-26 22:48:48,740 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 7 forwarded 0 rows
2011-09-26 22:48:48,847 INFO org.apache.hadoop.hive.ql.exec.GroupByOperator: 6 Close done
2011-09-26 22:48:48,847 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 Close done
2011-09-26 22:48:48,847 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 Close done
2011-09-26 22:48:48,851 INFO org.apache.hadoop.mapred.TaskRunner: Task:attempt_201109261629_0001_r_000004_0 is done. And is in the process of commiting
2011-09-26 22:48:48,854 INFO org.apache.hadoop.mapred.TaskRunner: Task 'attempt_201109261629_0001_r_000004_0' done.

One thing I noticed is that the row-forwarding stats are almost the same across all the tasks. However, this task ran for 6 hrs, whereas all the others ran for just 1 to 2 hrs.

Any help?

Thanks
--
Regards,
Bharath .V
w:http://researchweb.iiit.ac.in/~bharath.v
