Wei Zheng created TEZ-2684:
------------------------------
Summary: ShuffleVertexManager.parsePartitionStats throws
IllegalStateException: Stats should be initialized
Key: TEZ-2684
URL: https://issues.apache.org/jira/browse/TEZ-2684
Project: Apache Tez
Issue Type: Bug
Affects Versions: 0.8.0
Environment: Hive on Tez
Reporter: Wei Zheng
This failure occurs when I run the attached Hive qfile test using TestMiniTezCliDriver. My WIP patch
is also attached so the problem can be reproduced.
Here are the EXPLAIN output and the backtrace from the qfile output:
{code}
EXPLAIN select count(*) from srcpart join srcpart_date on (srcpart.ds =
srcpart_date.ds) where srcpart_date.`date` = '2008-04-08'
POSTHOOK: type: QUERY
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
STAGE PLANS:
Stage: Stage-1
Tez
Edges:
Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
DagName: wzheng_20150803161620_55c139de-c26c-467f-b592-7d4333053ac6:38
Vertices:
Map 1
Map Operator Tree:
TableScan
alias: srcpart
filterExpr: ds is not null (type: boolean)
Statistics: Num rows: 2000 Data size: 21248 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: ds (type: string)
sort order: +
Map-reduce partition columns: ds (type: string)
Statistics: Num rows: 2000 Data size: 21248 Basic stats: COMPLETE Column stats: NONE
Map 4
Map Operator Tree:
TableScan
alias: srcpart_date
filterExpr: (ds is not null and (date = '2008-04-08')) (type: boolean)
Statistics: Num rows: 2 Data size: 42 Basic stats: COMPLETE Column stats: NONE
Filter Operator
predicate: (ds is not null and (date = '2008-04-08')) (type: boolean)
Statistics: Num rows: 1 Data size: 21 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: ds (type: string)
sort order: +
Map-reduce partition columns: ds (type: string)
Statistics: Num rows: 1 Data size: 21 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: ds (type: string)
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 21 Basic stats: COMPLETE Column stats: NONE
Group By Operator
keys: _col0 (type: string)
mode: hash
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 21 Basic stats: COMPLETE Column stats: NONE
Dynamic Partitioning Event Operator
Target Input: srcpart
Partition key expr: ds
Statistics: Num rows: 1 Data size: 21 Basic stats: COMPLETE Column stats: NONE
Target column: ds
Target Vertex: Map 1
Reducer 2
Reduce Operator Tree:
Merge Join Operator
condition map:
Inner Join 0 to 1
keys:
0 ds (type: string)
1 ds (type: string)
Statistics: Num rows: 2200 Data size: 23372 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: count()
mode: hash
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
sort order:
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: bigint)
Reducer 3
Reduce Operator Tree:
Group By Operator
aggregations: count(VALUE._col0)
mode: mergepartial
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink
PREHOOK: query: select count(*) from srcpart join srcpart_date on (srcpart.ds =
srcpart_date.ds) where srcpart_date.`date` = '2008-04-08'
PREHOOK: type: QUERY
PREHOOK: Input: default@srcpart
PREHOOK: Input: default@srcpart@ds=2008-04-08/hr=11
PREHOOK: Input: default@srcpart@ds=2008-04-08/hr=12
PREHOOK: Input: default@srcpart@ds=2008-04-09/hr=11
PREHOOK: Input: default@srcpart@ds=2008-04-09/hr=12
PREHOOK: Input: default@srcpart_date
PREHOOK: Output:
file:/Users/wzheng/bf/hive/itests/qtest/target/tmp/localscratchdir/93b335b5-3ced-4f4d-abdd-2fd5defd11e4/hive_2015-08-03_16-16-21_046_5066458626645110592-1/-mr-10001
Status: Failed
Vertex failed, vertexName=Reducer 2, vertexId=vertex_1438643776809_0001_8_02,
diagnostics=[Vertex vertex_1438643776809_0001_8_02 [Reducer 2] killed/failed
due to:AM_USERCODE_FAILURE, Exception in VertexManager,
vertex:vertex_1438643776809_0001_8_02 [Reducer 2],
java.lang.IllegalStateException: Stats should be initialized
at com.google.common.base.Preconditions.checkState(Preconditions.java:149)
at org.apache.tez.dag.library.vertexmanager.ShuffleVertexManager.parsePartitionStats(ShuffleVertexManager.java:535)
at org.apache.tez.dag.library.vertexmanager.ShuffleVertexManager.onVertexManagerEventReceived(ShuffleVertexManager.java:575)
at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEventReceived.invoke(VertexManager.java:602)
at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:643)
at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:638)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:638)
at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:627)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
]
Vertex killed, vertexName=Reducer 3, vertexId=vertex_1438643776809_0001_8_03,
diagnostics=[Vertex received Kill in INITED state., Vertex
vertex_1438643776809_0001_8_03 [Reducer 3] killed/failed due to:null]
Vertex killed, vertexName=Map 1, vertexId=vertex_1438643776809_0001_8_01,
diagnostics=[Vertex received Kill in INITED state., Vertex
vertex_1438643776809_0001_8_01 [Map 1] killed/failed due to:null]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:2
FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer
2, vertexId=vertex_1438643776809_0001_8_02, diagnostics=[Vertex
vertex_1438643776809_0001_8_02 [Reducer 2] killed/failed due
to:AM_USERCODE_FAILURE, Exception in VertexManager,
vertex:vertex_1438643776809_0001_8_02 [Reducer 2],
java.lang.IllegalStateException: Stats should be initialized
at com.google.common.base.Preconditions.checkState(Preconditions.java:149)
at org.apache.tez.dag.library.vertexmanager.ShuffleVertexManager.parsePartitionStats(ShuffleVertexManager.java:535)
at org.apache.tez.dag.library.vertexmanager.ShuffleVertexManager.onVertexManagerEventReceived(ShuffleVertexManager.java:575)
at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEventReceived.invoke(VertexManager.java:602)
at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:643)
at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:638)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:638)
at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:627)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
]Vertex killed, vertexName=Reducer 3, vertexId=vertex_1438643776809_0001_8_03,
diagnostics=[Vertex received Kill in INITED state., Vertex
vertex_1438643776809_0001_8_03 [Reducer 3] killed/failed due to:null]Vertex
killed, vertexName=Map 1, vertexId=vertex_1438643776809_0001_8_01,
diagnostics=[Vertex received Kill in INITED state., Vertex
vertex_1438643776809_0001_8_01 [Map 1] killed/failed due to:null]DAG did not
succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:2
{code}