[ https://issues.apache.org/jira/browse/HIVE-22294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Qiang.Kang updated HIVE-22294:
------------------------------
Description:
Our Hive version is 1.2.1, with several patches merged in (including those from
https://issues.apache.org/jira/browse/HIVE-14557 and
https://issues.apache.org/jira/browse/HIVE-16155).
The SQL query looks like this:
{code:java}
set hive.auto.convert.join = true;
set hive.optimize.skewjoin=true;
SELECT a.*
FROM
a
JOIN b
ON a.id=b.id AND a.uid = b.uid
LEFT JOIN c
ON b.id=c.id AND b.uid=c.uid;
{code}
The query fails with the following error:
FAILED: ClassCastException org.apache.hadoop.hive.ql.plan.ConditionalWork
cannot be cast to org.apache.hadoop.hive.ql.plan.MapredWork
The root cause is that a conditional task (*MapJoin*) appears in the task list
of another conditional task (*SkewJoin*). The exception is thrown in the
following code from
`org.apache.hadoop.hive.ql.optimizer.physical.MapJoinResolver`:
{code:java}
public Object dispatch(Node nd, Stack<Node> stack, Object... nodeOutputs)
    throws SemanticException {
  Task<? extends Serializable> currTask = (Task<? extends Serializable>) nd;
  // not map reduce task or not conditional task, just skip
  if (currTask.isMapRedTask()) {
    if (currTask instanceof ConditionalTask) {
      // get the list of task
      List<Task<? extends Serializable>> taskList =
          ((ConditionalTask) currTask).getListTasks();
      for (Task<? extends Serializable> tsk : taskList) {
        if (tsk.isMapRedTask()) {
          // ATTENTION: tsk may itself be a ConditionalTask!
          this.processCurrentTask(tsk, ((ConditionalTask) currTask));
        }
      }
    } else {
      this.processCurrentTask(currTask, null);
    }
  }
  return null;
}

private void processCurrentTask(Task<? extends Serializable> currTask,
    ConditionalTask conditionalTask) throws SemanticException {
  // get current mapred work and its local work
  // WRONG: when currTask is a ConditionalTask, getWork() returns a ConditionalWork
  MapredWork mapredWork = (MapredWork) currTask.getWork();
  MapredLocalWork localwork = mapredWork.getMapWork().getMapRedLocalWork();
{code}
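To illustrate the nesting problem, here is a minimal, self-contained sketch of one possible way to handle it: descend into nested conditional tasks and only process the non-conditional leaves. This is not the actual Hive patch. `SimpleTask`, `SimpleConditionalTask`, and `collectLeafTasks` are hypothetical stand-ins invented for this example and do not exist in Hive; the task names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NestedConditionalSketch {

    /** Hypothetical stand-in for a plain (non-conditional) task; not Hive's Task class. */
    public static class SimpleTask {
        public final String name;

        public SimpleTask(String name) {
            this.name = name;
        }
    }

    /** Hypothetical stand-in for a conditional task that only wraps child tasks. */
    public static class SimpleConditionalTask extends SimpleTask {
        public final List<SimpleTask> children;

        public SimpleConditionalTask(String name, SimpleTask... children) {
            super(name);
            this.children = Arrays.asList(children);
        }
    }

    /** Returns the names of all non-conditional tasks, descending through nested conditional tasks. */
    public static List<String> collectLeafTasks(SimpleTask task) {
        List<String> leaves = new ArrayList<>();
        collect(task, leaves);
        return leaves;
    }

    private static void collect(SimpleTask task, List<String> leaves) {
        if (task instanceof SimpleConditionalTask) {
            // A conditional task carries no work of its own here, so recurse
            // into its children instead of treating it as a leaf.
            for (SimpleTask child : ((SimpleConditionalTask) task).children) {
                collect(child, leaves);
            }
        } else {
            leaves.add(task.name);
        }
    }

    public static void main(String[] args) {
        // A conditional task (map join) nested inside another conditional task (skew join).
        SimpleTask mapJoin = new SimpleConditionalTask("mapJoinConditional",
                new SimpleTask("mapJoinTask"), new SimpleTask("backupTask"));
        SimpleTask skewJoin = new SimpleConditionalTask("skewJoinConditional",
                mapJoin, new SimpleTask("skewJoinTask"));
        System.out.println(collectLeafTasks(skewJoin));
        // prints [mapJoinTask, backupTask, skewJoinTask]
    }
}
```

In the sketch, only leaf tasks would ever reach a method like `processCurrentTask`, so a cast to a map-reduce work type would never be applied to a conditional task. Whether recursion or simply skipping nested conditional tasks is the right behaviour for `MapJoinResolver` is a design question this sketch does not settle.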
Here are the query plans under each setting:
*-- set hive.auto.convert.join = true; set hive.optimize.skewjoin=false;*
{code:java}
Stage-1 is a root stage [a join b]
Stage-12 [map join] depends on stages: Stage-1 , consists of Stage-13, Stage-2
Stage-13 has a backup stage: Stage-2
Stage-11 depends on stages: Stage-13
Stage-8 depends on stages: Stage-2, Stage-11 , consists of Stage-5, Stage-4, Stage-6
Stage-5
Stage-0 depends on stages: Stage-5, Stage-4, Stage-7
Stage-14 depends on stages: Stage-0
Stage-3 depends on stages: Stage-14
Stage-4
Stage-6
Stage-7 depends on stages: Stage-6
Stage-2
{code}
*-- set hive.auto.convert.join = false; set hive.optimize.skewjoin=true;*
{code:java}
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-12 depends on stages: Stage-1 , consists of Stage-13, Stage-2
Stage-13 [skew Join map local task]
Stage-11 depends on stages: Stage-13
Stage-2 depends on stages: Stage-11
Stage-8 depends on stages: Stage-2 , consists of Stage-5, Stage-4, Stage-6
Stage-5
Stage-0 depends on stages: Stage-5, Stage-4, Stage-7
Stage-14 depends on stages: Stage-0
Stage-3 depends on stages: Stage-14
Stage-4
Stage-6
Stage-7 depends on stages: Stage-6
{code}
was:
Our hive version is 1.2.1 which has merged some patches (including patches
mentioned in https://issues.apache.org/jira/browse/HIVE-14557,
https://issues.apache.org/jira/browse/HIVE-16155 ) .
My sql query string is like this:
```
set hive.auto.convert.join = true;
set hive.optimize.skewjoin=true;
SELECT a.*
FROM
a
JOIN b
ON a.id=b.id AND a.uid = b.uid
LEFT JOIN c
ON b.id=c.id AND b.uid=c.uid;
```
And we met some error:
FAILED: ClassCastException org.apache.hadoop.hive.ql.plan.ConditionalWork
cannot be cast to org.apache.hadoop.hive.ql.plan.MapredWork
The main reason is that there is a conditional task (*MapJoin*) in the list
tasks of another Conditional task (*SkewJoin*). Here is the code snippet where
it throws this exception:
`org.apache.hadoop.hive.ql.optimizer.physical.MapJoinResolver:`
```java
public Object dispatch(Node nd, Stack<Node> stack, Object... nodeOutputs)
throws SemanticException {
Task<? extends Serializable> currTask = (Task<? extends Serializable>) nd;
// not map reduce task or not conditional task, just skip
if (currTask.isMapRedTask()) {
if (currTask instanceof ConditionalTask) {
// get the list of task
List<Task<? extends Serializable>> taskList = ((ConditionalTask)
currTask).getListTasks();
for (Task<? extends Serializable> tsk : taskList) {
if (tsk.isMapRedTask()) {
// ATTENTION: tsk May be ConditionalTask !!!
this.processCurrentTask(tsk, ((ConditionalTask) currTask));
}
}
} else {
this.processCurrentTask(currTask, null);
}
}
return null;
}
private void processCurrentTask(Task<? extends Serializable> currTask,
ConditionalTask conditionalTask) throws SemanticException {
// get current mapred work and its local work
MapredWork mapredWork = (MapredWork) currTask.getWork(); // WRONG!!!!!!
MapredLocalWork localwork = mapredWork.getMapWork().getMapRedLocalWork();
```
Here is some detail Information about query plan:
*- set hive.auto.convert.join = true; set hive.optimize.skewjoin=false;*
```
Stage-1 is a root stage [a join b]
Stage-12 [map join]depends on stages: Stage-1 , consists of Stage-13, Stage-2
Stage-13 has a backup stage: Stage-2
Stage-11 depends on stages: Stage-13
Stage-8 depends on stages: Stage-2, Stage-11 , consists of Stage-5, Stage-4,
Stage-6
Stage-5
Stage-0 depends on stages: Stage-5, Stage-4, Stage-7
Stage-14 depends on stages: Stage-0
Stage-3 depends on stages: Stage-14
Stage-4
Stage-6
Stage-7 depends on stages: Stage-6
Stage-2
```
*- set hive.auto.convert.join = false; set hive.optimize.skewjoin=true;*
```
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-12 depends on stages: Stage-1 , consists of Stage-13, Stage-2
Stage-13 [skew Join map local task]
Stage-11 depends on stages: Stage-13
Stage-2 depends on stages: Stage-11
Stage-8 depends on stages: Stage-2 , consists of Stage-5, Stage-4, Stage-6
Stage-5
Stage-0 depends on stages: Stage-5, Stage-4, Stage-7
Stage-14 depends on stages: Stage-0
Stage-3 depends on stages: Stage-14
Stage-4
Stage-6
Stage-7 depends on stages: Stage-6
```
> ConditionalWork cannot be cast to MapredWork When both skew.join and
> auto.convert is on.
> -------------------------------------------------------------------------------------------
>
> Key: HIVE-22294
> URL: https://issues.apache.org/jira/browse/HIVE-22294
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 2.3.0, 3.1.1, 2.3.4
> Reporter: Qiang.Kang
> Assignee: Rui Li
> Priority: Critical
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)