[ 
https://issues.apache.org/jira/browse/DRILL-8526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17987226#comment-17987226
 ] 

ASF GitHub Bot commented on DRILL-8526:
---------------------------------------

shfshihuafeng commented on PR #2995:
URL: https://github.com/apache/drill/pull/2995#issuecomment-3024167433

   > I submitted some minor changes. Just a reminder, but we really need unit tests in order to merge this.
   > 
   > Also, have you considered adding a limit pushdown? It is usually pretty easy to do and only involves:
   > 
   > * Implementing two methods in the group scan (`HiveScan`): `supportsLimitPushdown` and `applyLimit`.
   > * Passing the limit through the subscans.
   > * Adding some logic in the readers to stop when the limit is reached.
   >   Maybe it would be best to open a new JIRA for that, but IMHO, it is one of the easiest and most effective pushdowns that can be implemented, yet Drill does not seem to do it for all the plugins.
   
   @cgivre I think it is best to open a new JIRA. You can refer to the following links for the Hive limit pushdown:
     PR: https://github.com/apache/drill/pull/2997
     JIRA: https://issues.apache.org/jira/browse/DRILL-8527
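
   The three steps listed in the quoted comment can be sketched roughly as follows. This is a self-contained illustration only, not the code from the PR: `SketchGroupScan`, `SketchSubScan`, and `SketchReader` are hypothetical stand-ins for the group scan, subscan, and reader, and the convention of returning `null` when the limit changes nothing is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LimitPushdownSketch {

    /** Hypothetical stand-in for a group scan such as HiveScan. */
    static final class SketchGroupScan {
        private final int maxRecords; // -1 means no limit has been pushed down

        SketchGroupScan(int maxRecords) {
            this.maxRecords = maxRecords;
        }

        /** Step 1a: tell the planner this scan can accept a limit. */
        boolean supportsLimitPushdown() {
            return true;
        }

        /** Step 1b: return a copy carrying the limit, or null if nothing would change (assumed convention). */
        SketchGroupScan applyLimit(int limit) {
            if (maxRecords >= 0 && maxRecords <= limit) {
                return null;
            }
            return new SketchGroupScan(limit);
        }

        /** Step 2: pass the limit through to the subscan. */
        SketchSubScan toSubScan() {
            return new SketchSubScan(maxRecords);
        }
    }

    /** Hypothetical subscan that only carries the limit to the reader. */
    static final class SketchSubScan {
        final int maxRecords;

        SketchSubScan(int maxRecords) {
            this.maxRecords = maxRecords;
        }
    }

    /** Hypothetical reader that stops once the limit is reached. */
    static final class SketchReader {
        private final List<String> rows;
        private final int maxRecords;

        SketchReader(List<String> rows, SketchSubScan subScan) {
            this.rows = rows;
            this.maxRecords = subScan.maxRecords;
        }

        /** Step 3: stop reading when the limit is reached. */
        List<String> readAll() {
            List<String> out = new ArrayList<>();
            for (String row : rows) {
                if (maxRecords >= 0 && out.size() >= maxRecords) {
                    break;
                }
                out.add(row);
            }
            return out;
        }
    }

    public static void main(String[] args) {
        SketchGroupScan scan = new SketchGroupScan(-1);
        if (scan.supportsLimitPushdown()) {
            scan = scan.applyLimit(2); // e.g. for a query ending in LIMIT 2
        }
        SketchReader reader = new SketchReader(Arrays.asList("a", "b", "c", "d"), scan.toSubScan());
        System.out.println(reader.readAll()); // prints [a, b]
    }
}
```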




> Hive Predicate Push Down for ORC and Parquet
> --------------------------------------------
>
>                 Key: DRILL-8526
>                 URL: https://issues.apache.org/jira/browse/DRILL-8526
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Hive
>    Affects Versions: 1.22.0
>            Reporter: shihuafeng
>            Priority: Major
>             Fix For: 1.23.0
>
>         Attachments: image-2025-06-24-18-08-34-427.png, 
> image-2025-06-24-18-08-54-768.png
>
>
> Drill does not support filter pushdown for the ORC format. I implemented it and tested it.
> When a large amount of data is filtered out, predicate pushdown can 
> significantly improve query performance for the ORC format.
> In comparative testing with the following TPC-H SQL query, the ORC format 
> with filter pushdown achieves roughly a 5-20x performance improvement over 
> execution without pushdown.
> SQL: select * from hive.lineitem_o where L_ORDERKEY=1;
> Number of rows in table lineitem_o: 6001215
> Without pushdown:
> !image-2025-06-24-18-08-34-427.png!
> With pushdown:
> !image-2025-06-24-18-08-54-768.png!
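
Why a selective filter benefits from pushdown can be illustrated with a minimal sketch. ORC files keep min/max column statistics per stripe, so a reader that receives the predicate can skip stripes whose range cannot contain the requested key. The `Stripe` class and the stripe sizes below are hypothetical and chosen only for illustration; this is not the reader code from the patch, and the numbers are not measurements.

```java
import java.util.Arrays;
import java.util.List;

public class StripeSkipSketch {

    /** Hypothetical summary of one stripe: min/max of the filter column and the row count. */
    static final class Stripe {
        final long minKey;
        final long maxKey;
        final long rowCount;

        Stripe(long minKey, long maxKey, long rowCount) {
            this.minKey = minKey;
            this.maxKey = maxKey;
            this.rowCount = rowCount;
        }
    }

    /** Rows read without pushdown: every stripe is scanned, then filtered. */
    static long rowsReadWithoutPushdown(List<Stripe> stripes) {
        long rows = 0;
        for (Stripe s : stripes) {
            rows += s.rowCount;
        }
        return rows;
    }

    /** Rows read with pushdown of "column = key": only stripes whose range may contain the key. */
    static long rowsReadWithPushdown(List<Stripe> stripes, long key) {
        long rows = 0;
        for (Stripe s : stripes) {
            if (key >= s.minKey && key <= s.maxKey) {
                rows += s.rowCount;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Illustrative stripes only, sorted by key so that ranges do not overlap.
        List<Stripe> stripes = Arrays.asList(
                new Stripe(1, 1000, 1000),
                new Stripe(1001, 2000, 1000),
                new Stripe(2001, 3000, 1000));
        System.out.println(rowsReadWithoutPushdown(stripes));  // prints 3000
        System.out.println(rowsReadWithPushdown(stripes, 1));  // prints 1000
    }
}
```

The benefit depends on how well the data is clustered on the filter column: if every stripe's range contains the key, no stripe can be skipped.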



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
