[
https://issues.apache.org/jira/browse/KYLIN-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709221#comment-17709221
]
longfeiJiang edited comment on KYLIN-5496 at 4/6/23 6:47 AM:
-------------------------------------------------------------
*Root Cause*
The CONTRACT_DATE column of the table used by the SQL is of type varchar, and
the where condition filters it for equality with the string
'2022-12-01 00:00:00'.
The compare method in
org.apache.spark.sql.execution.datasource.SegDimFilters#foldFilter compares
the filter value with the min and max values of the column.
The values are expected to be compared as timestamps. However, they are
compared as strings.
Therefore, the segment is filtered out (the string '2022-12-01 00:00:00'
is greater than the max value '2022-12-01'), and the matching data is not
returned by the query.
!image-2023-04-06-14-43-30-194.png!
!image-2023-04-06-14-47-13-247.png!
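The comparison described above can be illustrated with a minimal, self-contained sketch. It is not the Kylin code: the class SegmentPruningSketch and its methods are hypothetical names, and the segment min/max of '2022-12-01' is assumed from the reproduction steps. It only shows why a string comparison excludes the segment while a timestamp comparison keeps it.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; not part of Kylin.
class SegmentPruningSketch {
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Assumption: values are either 'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'.
    public static LocalDateTime toTimestamp(String s) {
        return s.length() == 10
                ? LocalDate.parse(s).atStartOfDay()
                : LocalDateTime.parse(s, TS);
    }

    // Range check using lexicographic string order (the problematic behavior).
    public static boolean mayMatchAsString(String min, String max, String literal) {
        return min.compareTo(literal) <= 0 && literal.compareTo(max) <= 0;
    }

    // Range check after converting every value to a timestamp (the expected behavior).
    public static boolean mayMatchAsTimestamp(String min, String max, String literal) {
        LocalDateTime v = toTimestamp(literal);
        return !toTimestamp(min).isAfter(v) && !v.isAfter(toTimestamp(max));
    }

    public static void main(String[] args) {
        String min = "2022-12-01";
        String max = "2022-12-01";
        String literal = "2022-12-01 00:00:00";
        // The literal is longer than max and shares its prefix, so it sorts after max.
        System.out.println(mayMatchAsString(min, max, literal));    // false: segment pruned
        System.out.println(mayMatchAsTimestamp(min, max, literal)); // true: segment kept
    }
}
```

With string ordering, '2022-12-01 00:00:00' sorts after '2022-12-01' because the shorter string is a prefix of the longer one, so the equality filter appears to fall outside the segment's range.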
> The query result is incorrect after converting the string type data in
> 'yyyy-mm-dd' format to timestamp type and querying with filter of this column
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: KYLIN-5496
> URL: https://issues.apache.org/jira/browse/KYLIN-5496
> Project: Kylin
> Issue Type: Bug
> Reporter: longfeiJiang
> Assignee: longfeiJiang
> Priority: Major
> Attachments: image-2023-04-06-14-43-30-194.png,
> image-2023-04-06-14-47-13-247.png
>
>
> The query result is incorrect after casting string data in 'yyyy-mm-dd'
> format to timestamp type and filtering the query on this column.
>
> Steps to reproduce:
> 1. Create a Hive table with a column of type string and insert the value 2022-12-01
> {code:java}
> create table test(dt string);
> insert into test values('2022-12-01'); {code}
> 2. Use Kylin to load the table and build the model
> 3. Run the following SQL; the result is empty:
> select dt,cast(dt as timestamp) from TEST.TEST where cast(dt as
> timestamp)='2022-12-01 00:00:00.0'
>