Hi Feng,

I've seen exactly the same problem with one of my queries. One reducer
hangs forever. I didn't see data skew for that reducer: it has a similar
REDUCE_INPUT_RECORDS count to the other reducers, but the counter stopped
changing and the task just hangs.

Does anybody else know what's happening there?

Daniel
From "Feng Yuan" <tomson8...@126.com>
Subject In a reduce task I have a join operation, and I found that
"org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1"
takes very long
Date Mon, 10 Apr 2017 06:51:26 GMT

The log is:
2017-04-10 01:34:22,375 INFO [main] org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1
2017-04-10 01:36:32,551 INFO [main] ExecReducer: ExecReducer: processing 2000000 rows: used memory = 101789096
2017-04-10 01:37:03,284 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 1000 rows for join key [4092813312923569]
2017-04-10 01:37:03,286 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 2000 rows for join key [4092813312923569]
2017-04-10 01:37:03,291 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 4000 rows for join key [4092813312923569]
2017-04-10 01:37:03,301 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 8000 rows for join key [4092813312923569]
2017-04-10 01:37:03,379 INFO [main] org.apache.hadoop.hive.ql.exec.persistence.RowContainer: RowContainer created temp file
/data9/hadoop/local/usercache/xx/appcache/application_1482905245692_7777364/container_1482905245692_7777364_01_000330/tmp/hive-rowcontainer5366426093735775537/RowContainer3525630608978801813.tmp
2017-04-10 01:37:04,559 INFO [main] org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1
2017-04-10 07:17:47,584 INFO [main] org.apache.hadoop.hive.ql.exec.persistence.RowContainer: RowContainer created temp file
/data9/hadoop/local/usercache/xx/appcache/application_1482905245692_7777364/container_1482905245692_7777364_01_000330/tmp/hive-rowcontainer8292833982081568523/RowContainer734749216866467280.tmp
2017-04-10 07:17:47,775 INFO [main] org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1
2017-04-10 07:21:57,890 INFO [main] org.apache.hadoop.hive.ql.exec.persistence.RowContainer: RowContainer created temp file
/data9/hadoop/local/usercache/xx/appcache/application_1482905245692_7777364/container_1482905245692_7777364_01_000330/tmp/hive-rowcontainer3072958941479299308/RowContainer1838954978169271208.tmp
2017-04-10 07:21:58,119 INFO [main] org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1
2017-04-10 07:24:07,796 INFO [main] org.apach
=========
What I know is that there is a join operation, but what does
"org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1"
mean? Is there some data it needs to read? From HDFS? More critically, why is
it so slow? Nothing was logged from 2017-04-10 01:37:04 to 2017-04-10 07:17:47.
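
For reference, the length of that stall can be measured directly from the
timestamps in the log. The following is only a minimal Python sketch and is
not part of the original job: the helper names parse_timestamps and
largest_gap are made up for illustration, and the sample lines are copied
from the log above with the temp-file paths left out.

```python
import re
from datetime import datetime

# Timestamp prefix as it appears in the log, e.g. "2017-04-10 01:37:04,559".
TS_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")


def parse_timestamps(lines):
    """Return a datetime for every line that starts with a log timestamp."""
    stamps = []
    for line in lines:
        match = TS_PATTERN.match(line)
        if match:
            base = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            # The three digits after the comma are milliseconds.
            stamps.append(base.replace(microsecond=int(match.group(2)) * 1000))
    return stamps


def largest_gap(stamps):
    """Return (gap, start, end) for the longest pause between consecutive entries."""
    pairs = zip(stamps, stamps[1:])
    return max(((end - start, start, end) for start, end in pairs),
               key=lambda item: item[0])


# Entries taken from the log above (temp-file paths omitted).
sample = [
    "2017-04-10 01:37:03,379 INFO [main] org.apache.hadoop.hive.ql.exec.persistence.RowContainer: RowContainer created temp file",
    "2017-04-10 01:37:04,559 INFO [main] org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1",
    "2017-04-10 07:17:47,584 INFO [main] org.apache.hadoop.hive.ql.exec.persistence.RowContainer: RowContainer created temp file",
    "2017-04-10 07:17:47,775 INFO [main] org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1",
]

gap, start, end = largest_gap(parse_timestamps(sample))
print(f"longest pause: {gap} (from {start} to {end})")
```

Running the same helpers over the full container log, one line per entry,
would show whether every stall falls between a "Total input paths to
process" entry and the next "RowContainer created temp file" entry.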
