PengleiShi opened a new issue #1053:
URL: https://github.com/apache/orc/issues/1053


   Hi, I have a problem with predicate pushdown. My test data is TPC-DS at 1 GB, with Spark 3.2 and ORC version 1.6.11.
   The test SQL is `select count(1) from call_center_orc where cc_call_center_sk > 100;`
   `cc_call_center_sk` is the first column in `call_center_orc`, and predicate pushdown takes effect:
   
![image](https://user-images.githubusercontent.com/23723660/156520112-c3d6eb0c-6fda-42b9-9677-00ef9af4c216.png)
   But when I test `select count(1) from call_center_orc where cc_company > 100;`,
   where `cc_company` is not the first column, predicate pushdown does not work:
   
![image](https://user-images.githubusercontent.com/23723660/156520469-f43822e8-2107-4737-be30-662e2864d529.png)
   
   I debugged the code and found that the problem is in `SchemaEvolution.ppdSafeConversion`:
   
![image](https://user-images.githubusercontent.com/23723660/156521600-d5dfb4ed-3454-464a-8340-0064da5d343e.png)
   In my case, `result.size` is 2, but in `pickRowGroups`
   
![image](https://user-images.githubusercontent.com/23723660/156521946-dfc99cda-439a-4508-8afc-eaf32d719c6d.png)
   the `columnIx` is the column index in the ORC metadata, which is 19 for `cc_company` and therefore larger than `result.size`. As a result, ORC does not evaluate the pushdown filter against the row group statistics and cannot skip the row group.
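
   To make the suspected mismatch concrete, here is a minimal, self-contained Java sketch. It is not the actual ORC code: the class `PpdIndexMismatchSketch` and the methods `safeFlagsForSelectedColumns` and `isSafe` are hypothetical names, and it assumes the flag array is sized by the selected columns (2 here) while the lookup uses the file column id (assumed to be 1 for the first column, and 19 for `cc_company` as reported above).

```java
import java.util.Arrays;

public class PpdIndexMismatchSketch {

    // Hypothetical stand-in for the per-column "safe for pushdown" flags.
    // Assumption: the array is sized by the number of selected columns.
    static boolean[] safeFlagsForSelectedColumns(int selectedColumnCount) {
        boolean[] flags = new boolean[selectedColumnCount];
        Arrays.fill(flags, true);
        return flags;
    }

    // Hypothetical lookup: an id outside the array is treated as "not safe",
    // so the predicate on that column is never checked against statistics.
    static boolean isSafe(boolean[] flags, int fileColumnId) {
        return fileColumnId >= 0
                && fileColumnId < flags.length
                && flags[fileColumnId];
    }

    public static void main(String[] args) {
        boolean[] flags = safeFlagsForSelectedColumns(2);

        // Assumed file column id 1 for the first column: inside the array.
        System.out.println("cc_call_center_sk safe: " + isSafe(flags, 1));

        // File column id 19 for cc_company: outside an array of size 2.
        System.out.println("cc_company safe: " + isSafe(flags, 19));
    }
}
```

   Under these assumptions, the lookup succeeds only for a column whose file column id happens to fall inside the small array, which would match the behavior I see for the first column versus `cc_company`.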
   



