hudi-bot opened a new issue, #16360:
URL: https://github.com/apache/hudi/issues/16360

   As the picture shows, the column stats index (CSI) uses the Parquet column 
chunk metadata to calculate min/max values and saves them to the metadata 
table (MDT) column stats. For complex columns, such as **info 
array<struct<name: string, age: int>>**, the Parquet metadata contains only 
the leaf paths `info.array.name` and `info.array.age`, but Hudi looks up stats 
only for the top-level `info` column, so its stats in the MDT are null.
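
The mismatch described above can be sketched as follows. This is only an illustrative model, not Hudi or Parquet code: `ColumnStats`, `footer_stats`, and `stats_for_top_level` are hypothetical names, and the statistic values are made up. It shows why a lookup keyed by the top-level column name finds nothing when the footer only carries leaf paths.

```python
from typing import Dict, NamedTuple, Optional


class ColumnStats(NamedTuple):
    """Hypothetical per-column statistics, as read from a file footer."""
    min_value: object
    max_value: object
    null_count: int


# Assumed footer content: statistics are keyed by leaf column path,
# so the nested column `info` only appears through its leaves.
footer_stats: Dict[str, ColumnStats] = {
    "id": ColumnStats(1, 100, 0),
    "info.array.name": ColumnStats("alice", "zoe", 0),
    "info.array.age": ColumnStats(18, 65, 0),
}


def stats_for_top_level(column: str,
                        stats: Dict[str, ColumnStats]) -> Optional[ColumnStats]:
    """Look up statistics by top-level column name only (the problematic path)."""
    return stats.get(column)


id_stats = stats_for_top_level("id", footer_stats)      # found
info_stats = stats_for_top_level("info", footer_stats)  # None: only leaves exist
```

In this model `info_stats` is `None`, which corresponds to the null entry that ends up in the MDT column stats.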
   
   And if a SQL expression contains `IsNotNull(info)`, all files will be skipped.
   
   The same can happen with ordinary columns that are added in the future: 
old files will not contain the new column, which may cause similar problems. 
So, to keep the code logic clean, check for null before evaluating the stats 
values: min/max/nullValue.
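
A minimal sketch of the proposed null check, under stated assumptions: this is not Hudi's data skipping code, and `ColStatEntry`, `may_contain_non_null`, and `may_contain_greater_than` are hypothetical names. The idea is that a file may only be skipped when its stats prove it cannot match; missing stats must always keep the file.

```python
from typing import NamedTuple, Optional


class ColStatEntry(NamedTuple):
    """Hypothetical column stats entry; any field may be missing (None)."""
    min_value: Optional[int]
    max_value: Optional[int]
    null_count: Optional[int]
    value_count: Optional[int]


def may_contain_non_null(entry: Optional[ColStatEntry]) -> bool:
    """IsNotNull(col): skip the file only if stats prove every value is null."""
    if entry is None or entry.null_count is None or entry.value_count is None:
        return True  # no usable stats, so the file must be kept
    return entry.null_count < entry.value_count


def may_contain_greater_than(entry: Optional[ColStatEntry], literal: int) -> bool:
    """col > literal: skip the file only if the known max rules it out."""
    if entry is None or entry.max_value is None:
        return True  # no usable stats, so the file must be kept
    return entry.max_value > literal
```

With this guard, a complex column such as `info` or a column missing from old files (all stats null) keeps every file as a candidate instead of pruning it.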
   
   !image-2023-12-28-13-29-15-943.png|width=1458,height=798!
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-7267
   - Type: Bug
   - Fix version(s):
     - 1.1.0
   - Attachment(s):
     - 28/Dec/23 05:29;knightchess;image-2023-12-28-13-29-15-943.png;https://issues.apache.org/jira/secure/attachment/13065636/image-2023-12-28-13-29-15-943.png


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
