[
https://issues.apache.org/jira/browse/IMPALA-12007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758050#comment-17758050
]
daicheng edited comment on IMPALA-12007 at 8/23/23 1:51 PM:
------------------------------------------------------------
Thank [~stigahuang] for reproducing and identifying the problem
was (Author: dachn):
Thanks for reproducing and identifying the problem
> wrong result with Pushdown runtime min-max filters on bitlong and decimal
> -------------------------------------------------------------------------
>
> Key: IMPALA-12007
> URL: https://issues.apache.org/jira/browse/IMPALA-12007
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 4.1.1
> Environment: centos
> Reporter: daicheng
> Priority: Critical
> Attachments: image-2023-03-16-18-03-55-375.png,
> image-2023-03-16-18-05-03-201.png, image-2023-03-16-18-08-55-259.png,
> image-2023-03-16-18-11-28-239.png, image-2023-03-16-18-12-30-062.png,
> image-2023-03-16-18-17-57-745.png, image-2023-03-17-15-09-40-463.png,
> image-2023-03-17-15-13-49-139.png, image-2023-03-17-15-17-23-669.png
>
>
> h1. wrong result with Pushdown runtime min-max filters
> h2. table description
> I have two tables in kudu,prod_{_}tm_jira{_}_AO_69E499_REQ_RELATION is like
> below,and i create external table in impala:
> {code:java}
> TABLE prod_tm_jira_AO_69E499_REQ_RELATION (
> ID INT32 NOT NULL,
> kudu_change_time UNIXTIME_MICROS NOT NULL,
> kudu_update_time UNIXTIME_MICROS NOT NULL,
> kudu_is_deleted BOOL NOT NULL,
> CHILD INT64 NULLABLE,
> CHILD_PROJECT INT64 NULLABLE,
> PARENT INT64 NULLABLE,
> PARENT_PROJECT INT64 NULLABLE,
> SEQUENCE INT32 NULLABLE,
> SEQ_BALANCER INT32 NULLABLE,
> PRIMARY KEY (ID)
> )
> HASH (ID) PARTITIONS 2,
> RANGE (ID) (
> PARTITION UNBOUNDED
> )
> OWNER yarn
> REPLICAS 3
> COMMENT {code}
> {code:java}
> CREATE EXTERNAL TABLE prod__tm_jira.ao_69e499_req_relation STORED AS KUDU
> TBLPROPERTIES ('STATS_GENERATED'='TASK', 'accessType'='8',
> 'impala.lastComputeStatsTime'='1678903202',
> 'kudu.master_addresses'='kudu-host',
> 'kudu.table_name'='prod__tm_jira__AO_69E499_REQ_RELATION', 'numFiles'='0',
> 'numRows'='62204', 'totalSize'='0') {code}
> ‘child’ is bigint in impala,and table AO_69E499_REQ_RELATION table has 6273
> rows:
> !image-2023-03-17-15-17-23-669.png!
> !image-2023-03-16-18-05-03-201.png|width=395,height=295!
> and another table describe below:
> {code:java}
> TABLE prod_tm_jira_jiraissue (
> ID DECIMAL(18, 0) NOT NULL,
> kudu_change_time UNIXTIME_MICROS NOT NULL,
> kudu_update_time UNIXTIME_MICROS NOT NULL,
> kudu_is_deleted BOOL NOT NULL,
> pkey STRING NULLABLE,
> issuenum DECIMAL(18, 0) NULLABLE,
> PROJECT DECIMAL(18, 0) NULLABLE,
> REPORTER STRING NULLABLE,
> ASSIGNEE STRING NULLABLE,
> CREATOR STRING NULLABLE,
> issuetype STRING NULLABLE,
> SUMMARY STRING NULLABLE,
> DESCRIPTION STRING NULLABLE,
> ENVIRONMENT STRING NULLABLE,
> PRIORITY STRING NULLABLE,
> RESOLUTION STRING NULLABLE,
> issuestatus STRING NULLABLE,
> CREATED UNIXTIME_MICROS NULLABLE,
> UPDATED UNIXTIME_MICROS NULLABLE,
> DUEDATE UNIXTIME_MICROS NULLABLE,
> RESOLUTIONDATE UNIXTIME_MICROS NULLABLE,
> VOTES DECIMAL(18, 0) NULLABLE,
> WATCHES DECIMAL(18, 0) NULLABLE,
> TIMEORIGINALESTIMATE DECIMAL(18, 0) NULLABLE,
> TIMEESTIMATE DECIMAL(18, 0) NULLABLE,
> TIMESPENT DECIMAL(18, 0) NULLABLE,
> WORKFLOW_ID DECIMAL(18, 0) NULLABLE,
> SECURITY DECIMAL(18, 0) NULLABLE,
> FIXFOR DECIMAL(18, 0) NULLABLE,
> COMPONENT DECIMAL(18, 0) NULLABLE,
> ARCHIVEDBY STRING NULLABLE,
> ARCHIVEDDATE UNIXTIME_MICROS NULLABLE,
> ARCHIVED STRING NULLABLE,
> PRIMARY KEY (ID)
> )
> HASH (ID) PARTITIONS 2,
> RANGE (ID) (
> PARTITION UNBOUNDED
> )
> OWNER yarn
> REPLICAS 3
> COMMEN {code}
> this is the external table in impala:
> {code:java}
> CREATE EXTERNAL TABLE prod__tm_jira.jiraissue STORED AS KUDU TBLPROPERTIES
> ('STATS_GENERATED'='TASK', 'accessType'='8',
> 'impala.lastComputeStatsTime'='1678903206',
> 'kudu.master_addresses'='kudu-host',
> 'kudu.table_name'='prod__tm_jira__jiraissue', 'numFiles'='0',
> 'numRows'='295165', 'totalSize'='0') {code}
> 'id' is decimal in impala,and table jiraissue has 295336 rows
> !image-2023-03-17-15-13-49-139.png!!image-2023-03-16-18-03-55-375.png|width=514,height=452!
> h2. sql description
>
> {code:java}
> select
> c1.id,
> c3.child,
> c1.project,
> c1.summary,
> c1.description,
> c1.created
> from prod__tm_jira.AO_69E499_REQ_RELATION c3
> left join prod__tm_jira.jiraissue c1
> on c1.id=c3.child
> where c1.issuetype IN ('11801','10400') and c1.id=324482 {code}
> !image-2023-03-16-18-08-55-259.png|width=765,height=379!
> when i execute the sql,i got the right result ,and from execute plan i found:
> !image-2023-03-16-18-11-28-239.png|width=724,height=526!
> after i execute follow sql,i expect more rows result but i got 0 result:
> {code:java}
> select
> c1.id,
> c3.child,
> c1.project,
> c1.summary,
> c1.description,
> c1.created
> from prod__tm_jira.AO_69E499_REQ_RELATION c3
> left join prod__tm_jira.jiraissue c1
> on c1.id=c3.child
> where c1.issuetype IN ('11801','10400')
> -- and c1.id=324482 {code}
> !image-2023-03-16-18-12-30-062.png|width=545,height=316!
> and the sql plan like below,i found c3.child didn't cast to decimal,and
> runtime filter works to id.
> !image-2023-03-16-18-17-57-745.png|width=696,height=692!
> *I got 0 results,and i found impala runtime filter is worked,so is it
> possible that impala didn't identify data type with runtime filter?in this
> case 'id' is decimal but 'child' is bigint.*
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]