[ 
https://issues.apache.org/jira/browse/IMPALA-12007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daicheng updated IMPALA-12007:
------------------------------
    Description: 
h1. wrong result with Pushdown runtime min-max filters
h2. table description

I have two tables in kudu,prod_{_}tm_jira{_}_AO_69E499_REQ_RELATION is like 
below,and i create external table in impala:
{code:java}
TABLE prod_tm_jira_AO_69E499_REQ_RELATION (
    ID INT32 NOT NULL,
    kudu_change_time UNIXTIME_MICROS NOT NULL,
    kudu_update_time UNIXTIME_MICROS NOT NULL,
    kudu_is_deleted BOOL NOT NULL,
    CHILD INT64 NULLABLE,
    CHILD_PROJECT INT64 NULLABLE,
    PARENT INT64 NULLABLE,
    PARENT_PROJECT INT64 NULLABLE,
    SEQUENCE INT32 NULLABLE,
    SEQ_BALANCER INT32 NULLABLE,
    PRIMARY KEY (ID)
)
HASH (ID) PARTITIONS 2,
RANGE (ID) (
    PARTITION UNBOUNDED
)
OWNER yarn
REPLICAS 3
COMMENT {code}
{code:java}
CREATE EXTERNAL TABLE prod__tm_jira.ao_69e499_req_relation STORED AS KUDU 
TBLPROPERTIES ('STATS_GENERATED'='TASK', 'accessType'='8', 
'impala.lastComputeStatsTime'='1678903202', 
'kudu.master_addresses'='kudu-host', 
'kudu.table_name'='prod__tm_jira__AO_69E499_REQ_RELATION', 'numFiles'='0', 
'numRows'='62204', 'totalSize'='0') {code}
‘child’ is bigint in impala,and table AO_69E499_REQ_RELATION table has 6273 
rows:

!image-2023-03-17-15-17-23-669.png!    
!image-2023-03-16-18-05-03-201.png|width=395,height=295!

and another table describe below:
{code:java}
TABLE prod_tm_jira_jiraissue (
    ID DECIMAL(18, 0) NOT NULL,
    kudu_change_time UNIXTIME_MICROS NOT NULL,
    kudu_update_time UNIXTIME_MICROS NOT NULL,
    kudu_is_deleted BOOL NOT NULL,
    pkey STRING NULLABLE,
    issuenum DECIMAL(18, 0) NULLABLE,
    PROJECT DECIMAL(18, 0) NULLABLE,
    REPORTER STRING NULLABLE,
    ASSIGNEE STRING NULLABLE,
    CREATOR STRING NULLABLE,
    issuetype STRING NULLABLE,
    SUMMARY STRING NULLABLE,
    DESCRIPTION STRING NULLABLE,
    ENVIRONMENT STRING NULLABLE,
    PRIORITY STRING NULLABLE,
    RESOLUTION STRING NULLABLE,
    issuestatus STRING NULLABLE,
    CREATED UNIXTIME_MICROS NULLABLE,
    UPDATED UNIXTIME_MICROS NULLABLE,
    DUEDATE UNIXTIME_MICROS NULLABLE,
    RESOLUTIONDATE UNIXTIME_MICROS NULLABLE,
    VOTES DECIMAL(18, 0) NULLABLE,
    WATCHES DECIMAL(18, 0) NULLABLE,
    TIMEORIGINALESTIMATE DECIMAL(18, 0) NULLABLE,
    TIMEESTIMATE DECIMAL(18, 0) NULLABLE,
    TIMESPENT DECIMAL(18, 0) NULLABLE,
    WORKFLOW_ID DECIMAL(18, 0) NULLABLE,
    SECURITY DECIMAL(18, 0) NULLABLE,
    FIXFOR DECIMAL(18, 0) NULLABLE,
    COMPONENT DECIMAL(18, 0) NULLABLE,
    ARCHIVEDBY STRING NULLABLE,
    ARCHIVEDDATE UNIXTIME_MICROS NULLABLE,
    ARCHIVED STRING NULLABLE,
    PRIMARY KEY (ID)
)
HASH (ID) PARTITIONS 2,
RANGE (ID) (
    PARTITION UNBOUNDED
)
OWNER yarn
REPLICAS 3
COMMEN {code}
this is the external table in impala:
{code:java}
CREATE EXTERNAL TABLE prod__tm_jira.jiraissue STORED AS KUDU TBLPROPERTIES 
('STATS_GENERATED'='TASK', 'accessType'='8', 
'impala.lastComputeStatsTime'='1678903206', 
'kudu.master_addresses'='kudu-host', 
'kudu.table_name'='prod__tm_jira__jiraissue', 'numFiles'='0', 
'numRows'='295165', 'totalSize'='0') {code}
'id' is decimal in impala,and table jiraissue has 295336 rows

!image-2023-03-17-15-13-49-139.png!!image-2023-03-16-18-03-55-375.png|width=514,height=452!
h2. sql description

 
{code:java}
select
c1.id,
c3.child,
c1.project,
c1.summary,
c1.description,
c1.created
 from prod__tm_jira.AO_69E499_REQ_RELATION c3
left join prod__tm_jira.jiraissue c1
 on c1.id=c3.child 
where c1.issuetype IN ('11801','10400') and c1.id=324482 {code}
!image-2023-03-16-18-08-55-259.png|width=765,height=379!

when i execute the sql,i got the right result ,and from execute plan i found:

!image-2023-03-16-18-11-28-239.png|width=724,height=526!

after i execute follow sql,i expect more rows result but i got 0 result: 
{code:java}
select
c1.id,
c3.child,
c1.project,
c1.summary,
c1.description,
c1.created
 from prod__tm_jira.AO_69E499_REQ_RELATION c3
left join prod__tm_jira.jiraissue c1
 on c1.id=c3.child 
where c1.issuetype IN ('11801','10400')
-- and c1.id=324482 {code}
!image-2023-03-16-18-12-30-062.png|width=545,height=316!

and the sql plan like below,i found c3.child didn't cast to decimal,and runtime 
filter works to id.

!image-2023-03-16-18-17-57-745.png|width=696,height=692!

*I got 0 results,and i found impala runtime filter is worked,so is it possible 
that  impala didn't identify data type with runtime filter?in this case 'id' is 
decimal but 'child' is bigint.*

  was:
h1. wrong result with Pushdown runtime min-max filters
h2. table description

I have two tables in kudu,prod_{_}tm_jira{_}_AO_69E499_REQ_RELATION is like 
this:

 
{code:java}
TABLE prod_tm_jira_AO_69E499_REQ_RELATION (
    ID INT32 NOT NULL,
    kudu_change_time UNIXTIME_MICROS NOT NULL,
    kudu_update_time UNIXTIME_MICROS NOT NULL,
    kudu_is_deleted BOOL NOT NULL,
    CHILD INT64 NULLABLE,
    CHILD_PROJECT INT64 NULLABLE,
    PARENT INT64 NULLABLE,
    PARENT_PROJECT INT64 NULLABLE,
    SEQUENCE INT32 NULLABLE,
    SEQ_BALANCER INT32 NULLABLE,
    PRIMARY KEY (ID)
)
HASH (ID) PARTITIONS 2,
RANGE (ID) (
    PARTITION UNBOUNDED
)
OWNER yarn
REPLICAS 3
COMMENT {code}
‘child’ is bigint in impala,and table AO_69E499_REQ_RELATION table has 6273 
rows:

!image-2023-03-17-15-17-23-669.png!   
!image-2023-03-16-18-05-03-201.png|width=395,height=295!

and another table describe below:
{code:java}
TABLE prod_tm_jira_jiraissue (
    ID DECIMAL(18, 0) NOT NULL,
    kudu_change_time UNIXTIME_MICROS NOT NULL,
    kudu_update_time UNIXTIME_MICROS NOT NULL,
    kudu_is_deleted BOOL NOT NULL,
    pkey STRING NULLABLE,
    issuenum DECIMAL(18, 0) NULLABLE,
    PROJECT DECIMAL(18, 0) NULLABLE,
    REPORTER STRING NULLABLE,
    ASSIGNEE STRING NULLABLE,
    CREATOR STRING NULLABLE,
    issuetype STRING NULLABLE,
    SUMMARY STRING NULLABLE,
    DESCRIPTION STRING NULLABLE,
    ENVIRONMENT STRING NULLABLE,
    PRIORITY STRING NULLABLE,
    RESOLUTION STRING NULLABLE,
    issuestatus STRING NULLABLE,
    CREATED UNIXTIME_MICROS NULLABLE,
    UPDATED UNIXTIME_MICROS NULLABLE,
    DUEDATE UNIXTIME_MICROS NULLABLE,
    RESOLUTIONDATE UNIXTIME_MICROS NULLABLE,
    VOTES DECIMAL(18, 0) NULLABLE,
    WATCHES DECIMAL(18, 0) NULLABLE,
    TIMEORIGINALESTIMATE DECIMAL(18, 0) NULLABLE,
    TIMEESTIMATE DECIMAL(18, 0) NULLABLE,
    TIMESPENT DECIMAL(18, 0) NULLABLE,
    WORKFLOW_ID DECIMAL(18, 0) NULLABLE,
    SECURITY DECIMAL(18, 0) NULLABLE,
    FIXFOR DECIMAL(18, 0) NULLABLE,
    COMPONENT DECIMAL(18, 0) NULLABLE,
    ARCHIVEDBY STRING NULLABLE,
    ARCHIVEDDATE UNIXTIME_MICROS NULLABLE,
    ARCHIVED STRING NULLABLE,
    PRIMARY KEY (ID)
)
HASH (ID) PARTITIONS 2,
RANGE (ID) (
    PARTITION UNBOUNDED
)
OWNER yarn
REPLICAS 3
COMMEN {code}
'id' is decimal in impala,and table jiraissue has 295336 rows

!image-2023-03-17-15-13-49-139.png!!image-2023-03-16-18-03-55-375.png|width=514,height=452!
h2. sql description

 
{code:java}
select
c1.id,
c3.child,
c1.project,
c1.summary,
c1.description,
c1.created
 from prod__tm_jira.AO_69E499_REQ_RELATION c3
left join prod__tm_jira.jiraissue c1
 on c1.id=c3.child 
where c1.issuetype IN ('11801','10400') and c1.id=324482 {code}
!image-2023-03-16-18-08-55-259.png|width=765,height=379!

when i execute the sql,i got the right result ,and from execute plan i found:

!image-2023-03-16-18-11-28-239.png|width=724,height=526!

after i execute follow sql,i expect more rows result but i got 0 result: 
{code:java}
select
c1.id,
c3.child,
c1.project,
c1.summary,
c1.description,
c1.created
 from prod__tm_jira.AO_69E499_REQ_RELATION c3
left join prod__tm_jira.jiraissue c1
 on c1.id=c3.child 
where c1.issuetype IN ('11801','10400')
-- and c1.id=324482 {code}
!image-2023-03-16-18-12-30-062.png|width=545,height=316!

and the sql plan like below,i found c3.child didn't cast to decimal,and runtime 
filter works to id.

!image-2023-03-16-18-17-57-745.png|width=696,height=692!

*I got 0 results,and i found impala runtime filter is worked,so is it possible 
that  impala didn't identify data type with runtime filter?in this case 'id' is 
decimal but 'child' is bigint.*


> wrong result with Pushdown runtime min-max filters on bitlong and decimal
> -------------------------------------------------------------------------
>
>                 Key: IMPALA-12007
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12007
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 4.1.1
>         Environment: centos
>            Reporter: daicheng
>            Priority: Major
>         Attachments: image-2023-03-16-18-03-55-375.png, 
> image-2023-03-16-18-05-03-201.png, image-2023-03-16-18-08-55-259.png, 
> image-2023-03-16-18-11-28-239.png, image-2023-03-16-18-12-30-062.png, 
> image-2023-03-16-18-17-57-745.png, image-2023-03-17-15-09-40-463.png, 
> image-2023-03-17-15-13-49-139.png, image-2023-03-17-15-17-23-669.png
>
>
> h1. wrong result with Pushdown runtime min-max filters
> h2. table description
> I have two tables in kudu,prod_{_}tm_jira{_}_AO_69E499_REQ_RELATION is like 
> below,and i create external table in impala:
> {code:java}
> TABLE prod_tm_jira_AO_69E499_REQ_RELATION (
>     ID INT32 NOT NULL,
>     kudu_change_time UNIXTIME_MICROS NOT NULL,
>     kudu_update_time UNIXTIME_MICROS NOT NULL,
>     kudu_is_deleted BOOL NOT NULL,
>     CHILD INT64 NULLABLE,
>     CHILD_PROJECT INT64 NULLABLE,
>     PARENT INT64 NULLABLE,
>     PARENT_PROJECT INT64 NULLABLE,
>     SEQUENCE INT32 NULLABLE,
>     SEQ_BALANCER INT32 NULLABLE,
>     PRIMARY KEY (ID)
> )
> HASH (ID) PARTITIONS 2,
> RANGE (ID) (
>     PARTITION UNBOUNDED
> )
> OWNER yarn
> REPLICAS 3
> COMMENT {code}
> {code:java}
> CREATE EXTERNAL TABLE prod__tm_jira.ao_69e499_req_relation STORED AS KUDU 
> TBLPROPERTIES ('STATS_GENERATED'='TASK', 'accessType'='8', 
> 'impala.lastComputeStatsTime'='1678903202', 
> 'kudu.master_addresses'='kudu-host', 
> 'kudu.table_name'='prod__tm_jira__AO_69E499_REQ_RELATION', 'numFiles'='0', 
> 'numRows'='62204', 'totalSize'='0') {code}
> ‘child’ is bigint in impala,and table AO_69E499_REQ_RELATION table has 6273 
> rows:
> !image-2023-03-17-15-17-23-669.png!    
> !image-2023-03-16-18-05-03-201.png|width=395,height=295!
> and another table describe below:
> {code:java}
> TABLE prod_tm_jira_jiraissue (
>     ID DECIMAL(18, 0) NOT NULL,
>     kudu_change_time UNIXTIME_MICROS NOT NULL,
>     kudu_update_time UNIXTIME_MICROS NOT NULL,
>     kudu_is_deleted BOOL NOT NULL,
>     pkey STRING NULLABLE,
>     issuenum DECIMAL(18, 0) NULLABLE,
>     PROJECT DECIMAL(18, 0) NULLABLE,
>     REPORTER STRING NULLABLE,
>     ASSIGNEE STRING NULLABLE,
>     CREATOR STRING NULLABLE,
>     issuetype STRING NULLABLE,
>     SUMMARY STRING NULLABLE,
>     DESCRIPTION STRING NULLABLE,
>     ENVIRONMENT STRING NULLABLE,
>     PRIORITY STRING NULLABLE,
>     RESOLUTION STRING NULLABLE,
>     issuestatus STRING NULLABLE,
>     CREATED UNIXTIME_MICROS NULLABLE,
>     UPDATED UNIXTIME_MICROS NULLABLE,
>     DUEDATE UNIXTIME_MICROS NULLABLE,
>     RESOLUTIONDATE UNIXTIME_MICROS NULLABLE,
>     VOTES DECIMAL(18, 0) NULLABLE,
>     WATCHES DECIMAL(18, 0) NULLABLE,
>     TIMEORIGINALESTIMATE DECIMAL(18, 0) NULLABLE,
>     TIMEESTIMATE DECIMAL(18, 0) NULLABLE,
>     TIMESPENT DECIMAL(18, 0) NULLABLE,
>     WORKFLOW_ID DECIMAL(18, 0) NULLABLE,
>     SECURITY DECIMAL(18, 0) NULLABLE,
>     FIXFOR DECIMAL(18, 0) NULLABLE,
>     COMPONENT DECIMAL(18, 0) NULLABLE,
>     ARCHIVEDBY STRING NULLABLE,
>     ARCHIVEDDATE UNIXTIME_MICROS NULLABLE,
>     ARCHIVED STRING NULLABLE,
>     PRIMARY KEY (ID)
> )
> HASH (ID) PARTITIONS 2,
> RANGE (ID) (
>     PARTITION UNBOUNDED
> )
> OWNER yarn
> REPLICAS 3
> COMMEN {code}
> this is the external table in impala:
> {code:java}
> CREATE EXTERNAL TABLE prod__tm_jira.jiraissue STORED AS KUDU TBLPROPERTIES 
> ('STATS_GENERATED'='TASK', 'accessType'='8', 
> 'impala.lastComputeStatsTime'='1678903206', 
> 'kudu.master_addresses'='kudu-host', 
> 'kudu.table_name'='prod__tm_jira__jiraissue', 'numFiles'='0', 
> 'numRows'='295165', 'totalSize'='0') {code}
> 'id' is decimal in impala,and table jiraissue has 295336 rows
> !image-2023-03-17-15-13-49-139.png!!image-2023-03-16-18-03-55-375.png|width=514,height=452!
> h2. sql description
>  
> {code:java}
> select
> c1.id,
> c3.child,
> c1.project,
> c1.summary,
> c1.description,
> c1.created
>  from prod__tm_jira.AO_69E499_REQ_RELATION c3
> left join prod__tm_jira.jiraissue c1
>  on c1.id=c3.child 
> where c1.issuetype IN ('11801','10400') and c1.id=324482 {code}
> !image-2023-03-16-18-08-55-259.png|width=765,height=379!
> when i execute the sql,i got the right result ,and from execute plan i found:
> !image-2023-03-16-18-11-28-239.png|width=724,height=526!
> after i execute follow sql,i expect more rows result but i got 0 result: 
> {code:java}
> select
> c1.id,
> c3.child,
> c1.project,
> c1.summary,
> c1.description,
> c1.created
>  from prod__tm_jira.AO_69E499_REQ_RELATION c3
> left join prod__tm_jira.jiraissue c1
>  on c1.id=c3.child 
> where c1.issuetype IN ('11801','10400')
> -- and c1.id=324482 {code}
> !image-2023-03-16-18-12-30-062.png|width=545,height=316!
> and the sql plan like below,i found c3.child didn't cast to decimal,and 
> runtime filter works to id.
> !image-2023-03-16-18-17-57-745.png|width=696,height=692!
> *I got 0 results,and i found impala runtime filter is worked,so is it 
> possible that  impala didn't identify data type with runtime filter?in this 
> case 'id' is decimal but 'child' is bigint.*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to