[ 
https://issues.apache.org/jira/browse/HIVE-21111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797787#comment-16797787
 ] 

zhuwei commented on HIVE-21111:
-------------------------------

[~lirui] Since it depends on the table data size, it is not easy to reproduce 
from scratch. The root cause is that a child task of a conditional task can 
itself be a conditional task. Please take a look at the code that I pasted in 
the description; I think the bug is obvious there.
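
To make the failure mode concrete, here is a minimal, self-contained sketch. 
The classes below are hypothetical stand-ins, not the real Hive classes, and 
the isMapRedTask() behavior of the stand-in ConditionalTask (true if any child 
reports true) is an assumption based on the description. The instanceof guard 
is only one possible way to avoid the bad cast and is not necessarily what the 
attached patch does.

```java
import java.util.ArrayList;
import java.util.List;

public class DispatchSketch {
    // Hypothetical stand-in for Hive's Task base class.
    static abstract class Task {
        abstract boolean isMapRedTask();
    }

    // Hypothetical stand-in for MapRedTask.
    static class MapRedTask extends Task {
        @Override
        boolean isMapRedTask() {
            return true;
        }
    }

    // Hypothetical stand-in for ConditionalTask.
    // Assumption: it reports true when any of its children reports true.
    static class ConditionalTask extends Task {
        final List<Task> children = new ArrayList<>();

        @Override
        boolean isMapRedTask() {
            for (Task child : children) {
                if (child.isMapRedTask()) {
                    return true;
                }
            }
            return false;
        }
    }

    // Mirrors the unguarded cast from the description: trusts isMapRedTask().
    static MapRedTask unsafeCast(Task tsk) {
        if (tsk.isMapRedTask()) {
            return (MapRedTask) tsk; // fails when tsk is a ConditionalTask
        }
        return null;
    }

    // One possible guard: check the runtime type instead of isMapRedTask().
    static MapRedTask safeCast(Task tsk) {
        return (tsk instanceof MapRedTask) ? (MapRedTask) tsk : null;
    }

    public static void main(String[] args) {
        // A conditional task whose child is itself a conditional task.
        ConditionalTask inner = new ConditionalTask();
        inner.children.add(new MapRedTask());
        ConditionalTask outer = new ConditionalTask();
        outer.children.add(inner);

        for (Task tsk : outer.children) {
            System.out.println("isMapRedTask: " + tsk.isMapRedTask());
            System.out.println("safeCast is null: " + (safeCast(tsk) == null));
            try {
                unsafeCast(tsk);
            } catch (ClassCastException e) {
                System.out.println("unsafeCast threw ClassCastException");
            }
        }
    }
}
```

In this sketch the inner conditional task reports isMapRedTask() as true, so 
the unguarded cast throws, while the type check simply skips it.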

The SQL that triggered this bug in our production environment looks like this:

set hive.auto.convert.join=true;
set hive.optimize.skewjoin = true;
explain
insert overwrite table dw.dwd_tc_order_old_d_orign
select
a.order_no,
a.kdt_id,
a.store_id,
a.order_type,
a.features,
a.state,
a.close_state,
a.pay_state,
b.origin_price, 
a.buy_way,
b.goods_num,
b.goods_pay, 
a.express_type,
case when ((a.state >=6 and a.state <> 99) or a.express_time <> 0) then 1 else 0 end as express_state,
case when ((a.state >=6 and a.state <> 99) or a.express_time <> 0) then 'a' else 'b' end as express_state_name, 
if((a.order_type=6 and a.pay_state>0),1,a.stock_state) as stock_state,
a.customer_id,
a.customer_type,
a.customer_name,
a.buyer_id,
a.buyer_phone,
if(a.book_time=0 or a.book_time is null,'0',udf.format_unixtime(a.book_time)) as book_time,
if(a.pay_time=0 or a.pay_time is null,'0',udf.format_unixtime(a.pay_time)) as pay_time,
if(a.express_time=0 or a.express_time is null,'0',udf.format_unixtime(a.express_time)) as express_time,
if(a.success_time=0 or a.success_time is null,'0',udf.format_unixtime(a.success_time)) as success_time,
if(a.close_time=0 or a.close_time is null,0,udf.format_unixtime(a.close_time)) as close_time,
if(a.feedback_time=0 or a.feedback_time is null,'0',udf.format_unixtime(a.feedback_time)) as feedback_time

FROM 
(
 select order_no, kdt_id,store_id,features,state,close_state,pay_state,order_type,
 buy_way,express_type,activity_type,
 express_state,feedback,refund_state,stock_state,customer_id,customer_type,customer_name,buyer_id,buyer_phone,
 book_time,pay_time, express_time,success_time,close_time,feedback_time
 FROM ods.tc_seller_order 
 where kdt_id<>0
 and (length(order_no)<> 24 OR substr(order_no,1,1) <> 'E' OR substr(order_no,-5,1) <> '0') 
) a
join 
(
 select order_no, 
 cast(sum(price * num)as bigint) as origin_price ,
 sum(num) AS goods_num,
 cast(sum(pay_price*num) AS bigint) AS goods_pay 
 from ods.tc_order_item
 where (length(order_no)<> 24 OR substr(order_no,1,1) <> 'E' OR substr(order_no,-5,1) <> '0') 
 group by order_no
) b
on a.order_no = b.order_no;

> ConditionalTask cannot be cast to MapRedTask
> --------------------------------------------
>
>                 Key: HIVE-21111
>                 URL: https://issues.apache.org/jira/browse/HIVE-21111
>             Project: Hive
>          Issue Type: Bug
>          Components: Physical Optimizer
>    Affects Versions: 2.1.1, 3.1.1, 2.3.4
>            Reporter: zhuwei
>            Assignee: zhuwei
>            Priority: Major
>         Attachments: HIVE-21111.1.patch
>
>
> We hit the following error in our production environment:
> java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.ConditionalTask cannot be cast to org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> at org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch(AbstractJoinTaskDispatcher.java:173)
>  
> There is a bug in function 
> org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch:
> if (tsk.isMapRedTask()) {
>   Task<? extends Serializable> newTask = this.processCurrentTask(
>       (MapRedTask) tsk, ((ConditionalTask) currTask), physicalContext.getContext());
>   walkerCtx.addToDispatchList(newTask);
> }
> In the above code, when tsk is an instance of ConditionalTask, 
> tsk.isMapRedTask() can still return true, but tsk cannot be cast to MapRedTask.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
