Hrongrong Cao created KYLIN-5704:
------------------------------------

             Summary: For ‘in’ condition query of non-time partition columns, when the data type of the value in 'in' condition is inconsistent with that of the non-time partition column, the segment pruner fails, resulting in full Segment scanning
                 Key: KYLIN-5704
                 URL: https://issues.apache.org/jira/browse/KYLIN-5704
             Project: Kylin
          Issue Type: Bug
    Affects Versions: 5.0-alpha
            Reporter: Hrongrong Cao
             Fix For: 5.0-beta

The query column is a non-time partition column, i.e. a common dimension column, and the filter condition on it has the form col in (x1, x2, ...).

In this case, because the types of col and x1 do not match, the condition is automatically converted to cast(col as string) in (x1, x2, ...). FilePruner then reports an error, because org.apache.spark.sql.execution.datasource.FilePruner#convertCastFilter does not handle 'in'.

For context: the purpose of convertCastFilter is to remove the cast, so that the filter condition can be matched when DataSourceStrategy.translateFilter is called and the Segments can then be filtered. Currently convertCastFilter does not process the 'in' condition, so translateFilter cannot match the filter and returns an empty result, and the query incorrectly throws an error.

In addition: if the column is a time partition column, it does not matter that an error is reported here, because in the earlier steps the Calcite file pruner has already completed Segment pruning on the time partition column.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
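
The cast-stripping step described in this issue can be illustrated with a small toy model. This is only a sketch in plain Python, not Kylin or Spark code: the classes `Col`, `Cast`, `EqualTo`, `In`, the functions `strip_cast` and `translate`, the `handle_in` flag, and the column name `region_id` are all hypothetical. It also assumes that removing the cast means converting the literals back to the column's own type, which the issue text does not state.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Col:
    name: str
    dtype: str  # "int" or "string" in this toy model


@dataclass(frozen=True)
class Cast:
    child: Col
    to: str


@dataclass(frozen=True)
class EqualTo:
    left: Union[Col, Cast]
    value: object


@dataclass(frozen=True)
class In:
    left: Union[Col, Cast]
    values: Tuple[object, ...]


def to_column_type(col, value):
    """Convert a literal back to the column's own type (assumed toy rule)."""
    return int(value) if col.dtype == "int" else str(value)


def strip_cast(expr, handle_in):
    """Remove a cast around the column so the filter can be translated."""
    if isinstance(expr, EqualTo) and isinstance(expr.left, Cast):
        col = expr.left.child
        return EqualTo(col, to_column_type(col, expr.value))
    # The branch below models the handling the issue reports as missing.
    if handle_in and isinstance(expr, In) and isinstance(expr.left, Cast):
        col = expr.left.child
        return In(col, tuple(to_column_type(col, v) for v in expr.values))
    return expr  # anything unhandled passes through unchanged


def translate(expr):
    """Stand-in for filter translation: only bare columns are translatable."""
    if isinstance(expr.left, Col):
        if isinstance(expr, In):
            return (expr.left.name, "in", expr.values)
        return (expr.left.name, "=", expr.value)
    return None  # a surviving cast means nothing can be used for pruning


# Models: cast(region_id as string) in ('1', '2') on an int dimension column
query = In(Cast(Col("region_id", "int"), "string"), ("1", "2"))

before = translate(strip_cast(query, handle_in=False))  # cast survives
after = translate(strip_cast(query, handle_in=True))    # cast removed
```

With `handle_in=False` the cast survives and `translate` yields nothing usable for pruning, which mirrors the reported behaviour. With `handle_in=True` the filter becomes a plain condition on the column that a pruner could match against Segment metadata.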