[
https://issues.apache.org/jira/browse/FLINK-25641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jing Zhang updated FLINK-25641:
-------------------------------
Description:
When using Flink batch SQL to run Hive SQL queries, we load the Hive module to
use Hive built-in functions.
However, some query plans are unexpected after loading the Hive module.
For example, consider the following SQL:
{code:sql}
load module hive;
use modules hive,core;
set table.sql-dialect=hive;
select
account_id,
sum(impression)
from test_db.test_table where dt = '2022-01-10' and hi = '0100' group by
account_id
{code}
The resulting plan is:
!image-2022-01-13-15-55-40-958.png!
After removing 'load module hive; use modules hive,core;', the plan is:
!image-2022-01-13-15-52-27-783.png!
After loading the Hive module, a hash aggregate is not chosen in the final plan
because the aggregate function is `HiveAggSqlFunction` and its aggregate buffer
is not fixed-length. The buffer type is as follows:
{code:java}
LEGACY('RAW',
'ANY<org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator$AggregationBuffer>')
{code}
was:
When using Flink batch SQL to run Hive SQL queries, we load the Hive module to
use Hive built-in functions.
However, some query plans are unexpected after loading the Hive module.
For example, consider the following SQL:
{code:sql}
load module hive;
use modules hive,core;
set table.sql-dialect=hive;
select
account_id,
sum(impression)
from test_db.test_table where dt = '2022-01-10' and hi = '0100' group by
account_id
{code}
The resulting plan is:
!image-2022-01-13-15-55-40-958.png!
After removing 'load module hive; use modules hive,core;', the plan is:
!image-2022-01-13-15-52-27-783.png!
After loading the Hive module, a hash aggregate is not chosen in the final plan
because the aggregate buffer is not fixed-length. The buffer type is as follows:
{code:java}
LEGACY('RAW',
'ANY<org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator$AggregationBuffer>')
{code}
> Unexpected aggregate plan after load hive module
> ------------------------------------------------
>
> Key: FLINK-25641
> URL: https://issues.apache.org/jira/browse/FLINK-25641
> Project: Flink
> Issue Type: Sub-task
> Components: Table SQL / Planner
> Reporter: Jing Zhang
> Priority: Major
> Attachments: image-2022-01-13-15-52-27-783.png,
> image-2022-01-13-15-55-40-958.png
>
>
> When using Flink batch SQL to run Hive SQL queries, we load the Hive module
> to use Hive built-in functions.
> However, some query plans are unexpected after loading the Hive module.
> For example, consider the following SQL:
> {code:sql}
> load module hive;
> use modules hive,core;
> set table.sql-dialect=hive;
> select
> account_id,
> sum(impression)
> from test_db.test_table where dt = '2022-01-10' and hi = '0100' group by
> account_id
> {code}
> The resulting plan is:
> !image-2022-01-13-15-55-40-958.png!
> After removing 'load module hive; use modules hive,core;', the plan is:
> !image-2022-01-13-15-52-27-783.png!
> After loading the Hive module, a hash aggregate is not chosen in the final
> plan because the aggregate function is `HiveAggSqlFunction` and its aggregate
> buffer is not fixed-length. The buffer type is as follows:
> {code:java}
> LEGACY('RAW',
> 'ANY<org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator$AggregationBuffer>')
> {code}