ruofan created HUDI-5557:
----------------------------

             Summary: Wrong candidate files found in metadata table 
                 Key: HUDI-5557
                 URL: https://issues.apache.org/jira/browse/HUDI-5557
             Project: Apache Hudi
          Issue Type: Bug
          Components: metadata, spark-sql
    Affects Versions: 0.12.1
            Reporter: ruofan


Suppose a Hudi table has five fields, of which only two are indexed. When a 
SQL filter combines a condition on an indexed field with a condition on a 
non-indexed field, the candidate files returned from the metadata table are 
wrong.

For example, take the following Hudi table schema:
{code:java}
name: varchar(128)
age: int
addr: varchar(128)
city: varchar(32)
job: varchar(32) {code}
Table properties:
{code:java}
hoodie.table.type=MERGE_ON_READ
hoodie.metadata.enable=true
hoodie.metadata.index.column.stats.enable=true
hoodie.metadata.index.column.stats.column.list='name,city'
hoodie.enable.data.skipping=true {code}
SQL query:
{code:java}
select * from hudi_table where name='tom' and age=18;  {code}
If we set hoodie.enable.data.skipping=false, the expected data is returned. 
But if we set hoodie.enable.data.skipping=true, the expected data is not found.
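
The expected pruning behavior can be illustrated with a minimal Python sketch. 
This is not Hudi's implementation; the names ColumnRange, may_match and 
candidate_files, the file names, and the min/max statistics are all made up for 
illustration. The point is only that a condition on a non-indexed column (here 
age) has no statistics to check against, so it must never exclude a file; 
pruning may rely on the indexed columns alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRange:
    """Hypothetical min/max statistics for one column in one file."""
    min_value: object
    max_value: object


def may_match(file_stats, column, value):
    """Return False only when the stats prove the file cannot contain `value`."""
    col_range = file_stats.get(column)
    if col_range is None:
        # No stats for this column (it is not indexed): we cannot prune on it.
        return True
    return col_range.min_value <= value <= col_range.max_value


def candidate_files(stats_by_file, equality_filters):
    """Keep every file that might satisfy all `column = value` filters."""
    return sorted(
        name
        for name, stats in stats_by_file.items()
        if all(may_match(stats, col, val) for col, val in equality_filters.items())
    )


# Assumed stats: only `name` and `city` are indexed, as in the report.
stats_by_file = {
    "file-1.parquet": {
        "name": ColumnRange("alice", "zoe"),
        "city": ColumnRange("austin", "paris"),
    },
    "file-2.parquet": {
        "name": ColumnRange("adam", "bob"),
        "city": ColumnRange("berlin", "rome"),
    },
}

# Mirrors: where name='tom' and age=18
print(candidate_files(stats_by_file, {"name": "tom", "age": 18}))
# ['file-1.parquet']
```

In this sketch file-2.parquet is pruned by the name range alone, while the age 
condition is left for the regular scan to evaluate on the remaining file.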



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
