[
https://issues.apache.org/jira/browse/HIVE-21111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797787#comment-16797787
]
zhuwei commented on HIVE-21111:
-------------------------------
[~lirui] Since it depends on the table data size, it's not easy to reproduce
from scratch. The root cause is that a child task of a conditional task can
itself be a conditional task. Please take a look at the code that I pasted in
the description; I think the bug is obvious.
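To make the failure mode concrete, here is a minimal, self-contained sketch. The classes are simplified stand-ins, not the real Hive classes, and the sketch assumes that a ConditionalTask reports isMapRedTask() as true whenever one of its child tasks does. guardedDispatch is a hypothetical guard added for illustration only; it is not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.List;

public class ConditionalCastSketch {
    // Simplified stand-ins for the Hive task classes; NOT the real API.
    static class Task {
        boolean isMapRedTask() { return false; }
    }

    static class MapRedTask extends Task {
        @Override boolean isMapRedTask() { return true; }
    }

    static class ConditionalTask extends Task {
        final List<Task> listTasks = new ArrayList<>();

        // Assumed behaviour: true if any child task is a map-reduce task.
        @Override boolean isMapRedTask() {
            for (Task t : listTasks) {
                if (t.isMapRedTask()) {
                    return true;
                }
            }
            return false;
        }
    }

    // Mirrors the check in dispatch(): trusts isMapRedTask() before casting.
    static boolean unsafeDispatch(Task tsk) {
        if (tsk.isMapRedTask()) {
            MapRedTask mr = (MapRedTask) tsk; // throws if tsk is a ConditionalTask
            return mr != null;
        }
        return false;
    }

    // Hypothetical guard: also check the runtime type before casting.
    static boolean guardedDispatch(Task tsk) {
        if (tsk instanceof MapRedTask && tsk.isMapRedTask()) {
            MapRedTask mr = (MapRedTask) tsk; // safe after the instanceof check
            return mr != null;
        }
        return false;
    }

    // A conditional task whose only child is a map-reduce task.
    static ConditionalTask nestedConditional() {
        ConditionalTask child = new ConditionalTask();
        child.listTasks.add(new MapRedTask());
        return child;
    }

    public static void main(String[] args) {
        Task tsk = nestedConditional();
        System.out.println("isMapRedTask: " + tsk.isMapRedTask());
        try {
            unsafeDispatch(tsk);
        } catch (ClassCastException e) {
            System.out.println("unsafe cast failed with ClassCastException");
        }
        System.out.println("guarded dispatch cast it: " + guardedDispatch(tsk));
    }
}
```

In this sketch the guarded version simply skips the nested conditional task. Whether skipping it, or descending into its children, is the right behaviour in Hive is a separate question for the actual fix.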
The SQL that triggered this bug in our production environment looks like this:
set hive.auto.convert.join=true;
set hive.optimize.skewjoin=true;
explain
insert overwrite table dw.dwd_tc_order_old_d_orign
select
  a.order_no,
  a.kdt_id,
  a.store_id,
  a.order_type,
  a.features,
  a.state,
  a.close_state,
  a.pay_state,
  b.origin_price,
  a.buy_way,
  b.goods_num,
  b.goods_pay,
  a.express_type,
  case when ((a.state >= 6 and a.state <> 99) or a.express_time <> 0) then 1 else 0 end as express_state,
  case when ((a.state >= 6 and a.state <> 99) or a.express_time <> 0) then 'a' else 'b' end as express_state_name,
  if((a.order_type = 6 and a.pay_state > 0), 1, a.stock_state) as stock_state,
  a.customer_id,
  a.customer_type,
  a.customer_name,
  a.buyer_id,
  a.buyer_phone,
  if(a.book_time = 0 or a.book_time is null, '0', udf.format_unixtime(a.book_time)) as book_time,
  if(a.pay_time = 0 or a.pay_time is null, '0', udf.format_unixtime(a.pay_time)) as pay_time,
  if(a.express_time = 0 or a.express_time is null, '0', udf.format_unixtime(a.express_time)) as express_time,
  if(a.success_time = 0 or a.success_time is null, '0', udf.format_unixtime(a.success_time)) as success_time,
  if(a.close_time = 0 or a.close_time is null, 0, udf.format_unixtime(a.close_time)) as close_time,
  if(a.feedback_time = 0 or a.feedback_time is null, '0', udf.format_unixtime(a.feedback_time)) as feedback_time
FROM
(
  select order_no,
    kdt_id, store_id, features, state, close_state, pay_state, order_type,
    buy_way, express_type, activity_type,
    express_state, feedback, refund_state, stock_state, customer_id, customer_type, customer_name, buyer_id, buyer_phone,
    book_time, pay_time, express_time, success_time, close_time, feedback_time
  FROM ods.tc_seller_order
  where kdt_id <> 0
    and (length(order_no) <> 24 OR substr(order_no, 1, 1) <> 'E' OR substr(order_no, -5, 1) <> '0')
) a
join
(
  select order_no,
    cast(sum(price * num) as bigint) as origin_price,
    sum(num) AS goods_num,
    cast(sum(pay_price * num) AS bigint) AS goods_pay
  from ods.tc_order_item
  where (length(order_no) <> 24 OR substr(order_no, 1, 1) <> 'E' OR substr(order_no, -5, 1) <> '0')
  group by order_no
) b
on a.order_no = b.order_no;
> ConditionalTask cannot be cast to MapRedTask
> --------------------------------------------
>
> Key: HIVE-21111
> URL: https://issues.apache.org/jira/browse/HIVE-21111
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 2.1.1, 3.1.1, 2.3.4
> Reporter: zhuwei
> Assignee: zhuwei
> Priority: Major
> Attachments: HIVE-21111.1.patch
>
>
> We hit the following error in our production environment:
> java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.ConditionalTask cannot be cast to org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>     at org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch(AbstractJoinTaskDispatcher.java:173)
>
> The bug is in the function
> org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch:
> if (tsk.isMapRedTask()) {
>   Task<? extends Serializable> newTask = this.processCurrentTask((MapRedTask) tsk,
>       ((ConditionalTask) currTask), physicalContext.getContext());
>   walkerCtx.addToDispatchList(newTask);
> }
> In the above code, when tsk is an instance of ConditionalTask,
> tsk.isMapRedTask() can still be true, but tsk cannot be cast to MapRedTask.