Moreover, the IP 10.224.174.71 belongs to a different node, not the one
executing the reduce task. Why did that happen?
On Fri, Oct 20, 2017 at 3:37 PM, Daniel Bruce wrote:
> OK, more updates. Today I ran the query on YARN and also turned on
> DEBUG logging. Here's what I found in the task log for the dangling
> task.
OK, more updates. Today I ran the query on YARN and also turned on
DEBUG logging. Here's what I found in the task log for the dangling
task.
I noticed that after Hive creates the RowContainer, there are a lot of
IPC/RPC-related log lines (still printing as I write this), about 3
seconds apart.
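If I read it right, that 3-second cadence matches the MapReduce task
reporter's default progress/ping interval (3000 ms), so the JVM is alive
and pinging the AM, but the operator pipeline itself isn't moving. To
confirm the lines really are evenly spaced pings rather than actual
work, I used a rough helper script of my own (not part of Hadoop; it
assumes the standard task-log timestamp format):

import re
import sys
from datetime import datetime

# Print the gap between consecutive timestamped lines in a Hadoop task
# log. A long run of ~3.0s gaps on IPC/RPC lines suggests heartbeat
# pings, not real work.
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")

prev = None
for line in open(sys.argv[1]):
    m = TS.match(line)
    if not m:
        continue
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
    if prev is not None:
        gap = (ts - prev).total_seconds()
        if gap >= 2.5:  # only show the suspicious multi-second gaps
            print("%6.1fs  %s" % (gap, line.rstrip()[:100]))
    prev = ts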
Hi Gopal,
Thanks for your input! In my case I'm using MapReduce, not Tez. I
figured I'd better be more specific and give you more details.
For this job there are 298 maps and 74 reduces. All the maps completed
quickly, within 1 minute, and 73 of the 74 reduces completed in about 2
minutes.
Now the last reduce task just hangs there.
> I didn't see data skew for that reducer; it has a similar amount of
> REDUCE_INPUT_RECORDS to the other reducers.
…
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 8000 rows for
> join key [4092813312923569]
The ratio of REDUCE_INPUT_RECORDS to REDUCE_INPUT_GROUPS is what
indicates skew: two reducers can see a similar number of records, but if
most of one reducer's records fall under a single join key, the join has
to buffer all of them in memory, which is what the CommonJoinOperator
line above is showing.
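As a rough illustration (the counter values here are made up, not taken
from your job), the per-reducer average of rows per join key is the
thing to compare:

# Skew shows up in rows-per-key, not in total input records.
counters = {
    # attempt id:           (REDUCE_INPUT_RECORDS, REDUCE_INPUT_GROUPS)
    "attempt_..._r_000000": (1200000, 400000),  # ~3 rows per join key
    "attempt_..._r_000073": (1150000, 145),     # ~7900 rows per key
}
for attempt, (records, groups) in counters.items():
    print("%s: %.0f rows per join key" % (attempt, records / groups))

Both reducers read roughly the same number of records, but the second
one is feeding almost everything into a handful of join keys, which is
exactly the pattern that CommonJoinOperator line is reporting.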
Hi Feng,
I've seen exactly the same problem with one of my queries. There is one
reducer hanging forever. I didn't see data skew for that reducer; it has
a similar amount of REDUCE_INPUT_RECORDS to the other reducers. But that
number has stopped changing and the task is just hanging.
Does anybody else know what might be causing this?
The log is:
2017-04-10 01:34:22,375 INFO [main] org.apache.hadoop.mapred.FileInputFormat:
Total input paths to process : 1
2017-04-10 01:36:32,551 INFO [main] ExecReducer: ExecReducer: processing
200 rows: used memory = 101789096
2017-04-10 01:37:03,284 INFO [main]
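To double-check that the counter is really stuck, rather than the log
just going quiet, I poll the task's counters through the MapReduce
ApplicationMaster REST API. A rough sketch (the host and the
application/job/task ids are placeholders; the JSON shape follows the
Hadoop MR AM web-services docs, so adjust for your version):

import json
import time
import urllib.request

# Placeholders: substitute your RM proxy host and real app/job/task ids.
AM = "http://rm-host:8088/proxy/application_1491780000000_0001"
URL = (AM + "/ws/v1/mapreduce/jobs/job_1491780000000_0001"
          + "/tasks/task_1491780000000_0001_r_000073/counters")

def reduce_input_records():
    with urllib.request.urlopen(URL) as resp:
        doc = json.load(resp)
    for group in doc["jobTaskCounters"]["taskCounterGroup"]:
        for c in group["counter"]:
            if c["name"] == "REDUCE_INPUT_RECORDS":
                return c["value"]
    return None

prev = None
for _ in range(20):  # watch for ~10 minutes
    cur = reduce_input_records()
    print("REDUCE_INPUT_RECORDS =", cur,
          "(stalled)" if cur == prev else "")
    prev = cur
    time.sleep(30)

If the value never moves while the task stays RUNNING, the reducer is
blocked inside an operator (for example, buffering rows for one hot join
key) rather than slowly consuming input.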