venusMask commented on issue #6931:
URL: https://github.com/apache/paimon/issues/6931#issuecomment-3702380317
This happens because Paimon's file-level statistics cause a data file that
contains matching rows to be wrongly filtered out. The problem occurs at the
LeafPredicate node. Consider the following scenario:
```sql
create table t(id int, age int, name int, primary key (id) not enforced);
insert into t select 1 as id, 2 as age, cast(null as int) as name union all
select 2,2,2;
select * from t where !(name <=> 2);
```
Statistical information of the Paimon data file:
```java
minValue: 1, 2, 2
maxValue: 2, 2, 2
```
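According to the current logic, this data file will be filtered out: for the
name column, min, max, and the literal are all 2, so the test returns false.
However, it should not be filtered, because the row whose name is NULL
satisfies the predicate.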
```java
// LeafPredicate.test()
public boolean test(DataType type, long rowCount, Object min, Object max,
        Long nullCount, Object literal) {
    // type: int, rowCount: 2, min: 2, max: 2, nullCount: 1, literal: 2
    return CompareUtils.compareLiteral(type, literal, min) != 0
            || CompareUtils.compareLiteral(type, literal, max) != 0;
}
```
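To make the mismatch concrete, here is a minimal, self-contained sketch. It is not Paimon code: the class `NullSafeNotEqualSketch` and the methods `rowMatches`, `currentTest`, and `nullAwareTest` are hypothetical names for illustration, the check is specialised to int columns, and the null-aware variant is only one possible direction for a fix, assuming the predicate really is `NOT (col <=> literal)`.

```java
import java.util.Objects;

public class NullSafeNotEqualSketch {

    /** Row-level semantics of NOT (col <=> literal): NULL is never null-safe-equal to 2. */
    static boolean rowMatches(Integer value, Integer literal) {
        return !Objects.equals(value, literal);
    }

    /** Same shape as the quoted check, specialised to int: keep the file if literal != min or literal != max. */
    static boolean currentTest(int min, int max, int literal) {
        return Integer.compare(literal, min) != 0 || Integer.compare(literal, max) != 0;
    }

    /**
     * Hypothetical null-aware variant (an assumption, not the actual fix):
     * min/max ignore NULLs, so a file with NULLs, or with an unknown null
     * count, may still contain rows that satisfy NOT (col <=> literal).
     */
    static boolean nullAwareTest(int min, int max, Long nullCount, int literal) {
        if (nullCount == null || nullCount > 0) {
            return true; // conservatively keep the file
        }
        return currentTest(min, max, literal);
    }

    public static void main(String[] args) {
        // Row (1, 2, NULL) satisfies the predicate on name.
        System.out.println(rowMatches(null, 2));        // true
        // Stats for name: min = 2, max = 2, literal = 2 -> file is skipped.
        System.out.println(currentTest(2, 2, 2));       // false
        // Taking nullCount = 1 into account keeps the file.
        System.out.println(nullAwareTest(2, 2, 1L, 2)); // true
    }
}
```

Note that for a plain `name <> 2` the current check is fine, since a NULL name does not satisfy that predicate; the problem only shows up when the null-safe form is evaluated with the same statistics test.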