[ 
https://issues.apache.org/jira/browse/DRILL-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409580#comment-16409580
 ] 

Arina Ielchiieva commented on DRILL-6248:
-----------------------------------------

Currently limit push down is not applied to any system tables. Generally they 
are fairly small, so such optimizations are not critical.
The profiles table, however, can be fairly large. System tables are designed in 
such a way that all data is loaded into memory (this can be a subject for 
further optimization, which I am not tackling here, since it may require a 
re-design of system table handling; also, since the profiles table is primarily 
used for testing purposes, I do not think such an optimization is highly 
critical).

Adding limit push down support can be done by selecting profiles in ranges; 
this will definitely improve performance if the limit is low.
Also, while adding push down support I noticed that the pojo record reader does 
not leverage batches, so such support should be added as well.
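
To illustrate the range-based idea, here is a minimal sketch, not the actual 
Drill code. The ProfileStore interface, its getRange(skip, take) method, the 
InMemoryStore class and readWithLimit are all hypothetical names introduced 
only for this example. The reader asks the store for one range at a time and 
stops as soon as the limit is satisfied, so a low limit touches only a few 
entries.

```java
import java.util.ArrayList;
import java.util.List;

public class LimitPushdownSketch {

  /** Hypothetical stand-in for a profile store that can return a slice of its entries. */
  public interface ProfileStore {
    List<String> getRange(int skip, int take);
  }

  /** In-memory store (assumption for the demo) that counts how many entries were fetched. */
  public static class InMemoryStore implements ProfileStore {
    private final List<String> profiles = new ArrayList<>();
    public int fetched = 0;

    public InMemoryStore(int size) {
      for (int i = 0; i < size; i++) {
        profiles.add("profile-" + i);
      }
    }

    @Override
    public List<String> getRange(int skip, int take) {
      int from = Math.min(skip, profiles.size());
      int to = Math.min(from + take, profiles.size());
      List<String> slice = new ArrayList<>(profiles.subList(from, to));
      fetched += slice.size();
      return slice;
    }
  }

  /** Reads at most `limit` profiles, one range of up to `batchSize` entries at a time. */
  public static List<String> readWithLimit(ProfileStore store, int limit, int batchSize) {
    List<String> result = new ArrayList<>();
    int skip = 0;
    while (result.size() < limit) {
      // Never ask for more than is still needed to satisfy the limit.
      int take = Math.min(batchSize, limit - result.size());
      List<String> batch = store.getRange(skip, take);
      if (batch.isEmpty()) {
        break; // store exhausted before the limit was reached
      }
      result.addAll(batch);
      skip += batch.size();
    }
    return result;
  }

  public static void main(String[] args) {
    InMemoryStore store = new InMemoryStore(1000);
    List<String> rows = readWithLimit(store, 1, 100);
    System.out.println(rows + " fetched=" + store.fetched); // [profile-0] fetched=1
  }
}
```

With a limit of 1, only a single entry is fetched from the 1000-entry store. 
Without a limit (or with a very large one), the loop still walks the whole 
store, just in batches rather than in one pass.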

Regarding the second issue with count, the ConvertCountToDirectScan rule is 
responsible for such optimizations. During planning we don't know the exact 
number of profiles to fetch, plus we'll need to fetch all the data anyway 
(especially from ZK) before we can get the exact count, so there will be no 
significant performance improvement.
Actually, I had 80K+ profiles on my cluster and it took several minutes to 
return their count. It can definitely take longer if ZK is overloaded (for 
example, when many queries are submitted / executed at that time). Anyway, ZK 
is not designed to store such information, and we won't be able to get high 
performance while querying it. In the future, when a proper metastore based on 
some DB is implemented, it will definitely resolve the performance issues.

> Support pushdown into System Table
> ----------------------------------
>
>                 Key: DRILL-6248
>                 URL: https://issues.apache.org/jira/browse/DRILL-6248
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.13.0
>            Reporter: Kunal Khatua
>            Assignee: Arina Ielchiieva
>            Priority: Minor
>             Fix For: 1.14.0
>
>
> When querying a profile store, we fetch all the records before applying a 
> limit.
> {code:sql}
> select * from sys.profiles limit 1
> {code}
> For a test scenario with 120K+ profiles in the store, a 
> {code:sql}
> select count(*) from sys.profiles
> {code}
> took 90 minutes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
