I turned it off because it was launching two copies of every task,
and they were hogging my TTs.

I am just curious about one thing: are the reducers in a JOIN CPU
intensive, or do they consume a lot of memory?

From monitoring the TT during the reduce phase, it was pretty clear
that there was no swapping. However, I am not sure about the CPU
usage.
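
A rough back-of-the-envelope check of the reducer workload may help with the CPU-versus-memory question. This is only a sketch: it assumes the standard TPC-H row counts (10,000 suppliers and 150,000 customers per unit of scale factor, 25 nations) and a uniform spread of rows over the nation keys, none of which is stated in this thread. The function name `join_estimate` is made up for illustration.

```python
# Back-of-the-envelope size of the supplier x customer join on nationkey.
# Assumptions (not from the thread): TPC-H row counts per scale factor and
# a uniform spread of rows over the 25 nation keys.
SCALE_FACTOR = 100
NATIONS = 25
SUPPLIER_ROWS = 10_000 * SCALE_FACTOR    # 1,000,000 rows at SF 100
CUSTOMER_ROWS = 150_000 * SCALE_FACTOR   # 15,000,000 rows at SF 100


def join_estimate(keys_on_reducer):
    """Return (input rows, joined rows) for a reducer holding that many nation keys."""
    suppliers_per_key = SUPPLIER_ROWS // NATIONS
    customers_per_key = CUSTOMER_ROWS // NATIONS
    rows_in = keys_on_reducer * (suppliers_per_key + customers_per_key)
    # Every supplier of a nation pairs with every customer of that nation.
    rows_out = keys_on_reducer * suppliers_per_key * customers_per_key
    return rows_in, rows_out


rows_in, rows_out = join_estimate(2)
print(rows_in, rows_out)
```

For a reducer holding two nation keys this gives 1,280,000 input rows and 48,000,000,000 joined rows, which is close to the 1280835 processed and 47881693522 forwarded rows in the log below. Tens of billions of output rows from such a small input, with a reported used memory of 4840896 (about 4.8 MB if that figure is in bytes), would suggest the reducers are CPU-bound rather than memory-bound, though it does not by itself explain why one task took three times longer.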

Has anyone had the same experience, or found a workaround for this problem?

On Tue, Sep 27, 2011 at 11:19 PM, Aggarwal, Vaibhav <[email protected]> wrote:
> You can turn speculative execution ON, which might help with a few
> slow-progressing tasks.
> mapred.map.tasks.speculative.execution and
> mapred.reduce.tasks.speculative.execution are the job conf options.
>
>
> -----Original Message-----
> From: bharath vissapragada [mailto:[email protected]]
> Sent: Tuesday, September 27, 2011 1:22 AM
> To: [email protected]
> Subject: Benchmarking problems
>
> Hey,
>
> I need some help regarding Hive. I am trying to benchmark Hive with the TPC-H
> SF 100 dataset. For a simple SPJ query (select count(*) from supplier, customer
> where s_nationkey = c_nationkey), 12 of my 13 reduce tasks completed in less
> than 2 hours and 1 ran for 6 hours. My cluster details:
>
> 10 nodes (1 master + 9 TTs+DNs), 3.5 GB RAM per TT, max 2 maps and 2 reducers
> per TT, 600 MB per task, 200 MB io.sort.mb.
>
> I saw that no swapping occurred while running the reduce task.
> Following is the tail of the log on the machine where the reduce ran for 6 hours:
>
> 2011-09-26 22:48:48,285 INFO
> org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarding
> 47881000000 rows
> 2011-09-26 22:48:48,607 INFO ExecReducer: ExecReducer: processed
> 1280835 rows: used memory = 4840896
> 2011-09-26 22:48:48,608 INFO
> org.apache.hadoop.hive.ql.exec.JoinOperator: 4 finished. closing...
> 2011-09-26 22:48:48,608 INFO
> org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarded 47881693522 rows
> 2011-09-26 22:48:48,608 INFO
> org.apache.hadoop.hive.ql.exec.JoinOperator: SKEWJOINFOLLOWUPJOBS:0
> 2011-09-26 22:48:48,608 INFO
> org.apache.hadoop.hive.ql.exec.SelectOperator: 5 finished. closing...
> 2011-09-26 22:48:48,608 INFO
> org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarded 47881693522 rows
> 2011-09-26 22:48:48,608 INFO
> org.apache.hadoop.hive.ql.exec.GroupByOperator: 6 finished. closing...
> 2011-09-26 22:48:48,608 INFO
> org.apache.hadoop.hive.ql.exec.GroupByOperator: 6 forwarded 0 rows
> 2011-09-26 22:48:48,608 WARN
> org.apache.hadoop.hive.ql.exec.GroupByOperator: Begin Hash Table flush at 
> close: size = 1
> 2011-09-26 22:48:48,608 INFO
> org.apache.hadoop.hive.ql.exec.GroupByOperator: 6 forwarding 1 rows
> 2011-09-26 22:48:48,608 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS
> hdfs://master:54310/tmp/hive-hadoop/hive_2011-09-26_16-36-07_678_4030630084749797567/_tmp.-mr-10002/000004_0
> 2011-09-26 22:48:48,609 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file:
> FS 
> hdfs://master:54310/tmp/hive-hadoop/hive_2011-09-26_16-36-07_678_4030630084749797567/_tmp.-mr-10002/_tmp.000004_0
> 2011-09-26 22:48:48,609 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS
> hdfs://master:54310/tmp/hive-hadoop/hive_2011-09-26_16-36-07_678_4030630084749797567/_tmp.-mr-10002/000004_0
> 2011-09-26 22:48:48,739 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: 7 finished.
> closing...
> 2011-09-26 22:48:48,740 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: 7 forwarded 0 rows
> 2011-09-26 22:48:48,847 INFO
> org.apache.hadoop.hive.ql.exec.GroupByOperator: 6 Close done
> 2011-09-26 22:48:48,847 INFO
> org.apache.hadoop.hive.ql.exec.SelectOperator: 5 Close done
> 2011-09-26 22:48:48,847 INFO
> org.apache.hadoop.hive.ql.exec.JoinOperator: 4 Close done
> 2011-09-26 22:48:48,851 INFO org.apache.hadoop.mapred.TaskRunner:
> Task:attempt_201109261629_0001_r_000004_0 is done. And is in the process of 
> commiting
> 2011-09-26 22:48:48,854 INFO org.apache.hadoop.mapred.TaskRunner: Task 
> 'attempt_201109261629_0001_r_000004_0' done.
>
>
> One thing I noticed is that the row-forwarding stats are almost the same
> across all the tasks. However, this task ran for 6 hours whereas all the
> others ran for just 1 to 2 hours.
> Any help?
>
> Thanks
>
>
> --
> Regards,
> Bharath .V
> w:http://researchweb.iiit.ac.in/~bharath.v
>



-- 
Regards,
Bharath .V
w:http://researchweb.iiit.ac.in/~bharath.v
