[ https://issues.apache.org/jira/browse/DRILL-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409580#comment-16409580 ]
Arina Ielchiieva commented on DRILL-6248:
-----------------------------------------

Currently, limit push-down is not applied to any system tables. Generally they are fairly small, so this optimization is not critical. The profiles tables, however, can be fairly large.

System tables are designed so that all data is loaded into memory. This could be a subject for further optimization, but I am not tackling it here: it may require a re-design of system table handling, and since the profiles table is used primarily for testing purposes, I do not think such an optimization is highly critical.

Limit push-down support can be added by selecting profiles in ranges, which will definitely improve performance when the limit is low. While adding push-down support I also noticed that the pojo record reader does not leverage batches, so that support should be added as well.

Regarding the second issue, with count: the ConvertCountToDirectScan rule is responsible for such optimizations. During planning we don't know the exact number of profiles to fetch, and we'll need to fetch all the data anyway (especially from ZK) before we can get an exact count, so there will be no significant performance improvement. I actually had 80K+ profiles on my cluster and it took several minutes to return their count. It can definitely take longer if ZK is overloaded (for example, when many queries are being submitted or executed at that time).

In any case, ZK is not designed to store such information, and we won't be able to get high performance while querying it. In the future, when a proper metastore backed by some DB is implemented, it will definitely resolve these performance issues.
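To illustrate the two changes proposed above (fetching profiles in ranges so a limit can be pushed down, and having the reader emit batches), here is a minimal sketch in Python. It is not Drill's code: `ProfileStore`, `get_range`, and `read_batches` are hypothetical names, and an in-memory list stands in for the real ZK- or file-backed profile store.

```python
from typing import Iterator, List, Optional


class ProfileStore:
    """Hypothetical stand-in for a profile store (an in-memory list here)."""

    def __init__(self, profiles: List[str]) -> None:
        self._profiles = profiles
        self.fetched = 0  # how many profiles were actually read from the store

    def get_range(self, skip: int, take: int) -> List[str]:
        # Fetch only the requested slice instead of loading the whole store.
        chunk = self._profiles[skip:skip + take]
        self.fetched += len(chunk)
        return chunk


def read_batches(store: ProfileStore, batch_size: int,
                 limit: Optional[int] = None) -> Iterator[List[str]]:
    """Yield profiles batch by batch, stopping once `limit` records are produced."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    produced = 0
    while limit is None or produced < limit:
        # With a pushed-down limit, never ask the store for more than is still needed.
        take = batch_size if limit is None else min(batch_size, limit - produced)
        batch = store.get_range(produced, take)
        if not batch:
            break  # store exhausted
        produced += len(batch)
        yield batch


# Rough equivalent of: select * from sys.profiles limit 1
store = ProfileStore([f"profile-{i}" for i in range(1000)])
rows = [p for batch in read_batches(store, batch_size=100, limit=1) for p in batch]
```

With the limit pushed into the reader, `store.fetched` stays at 1 for this query rather than 1000. A count, by contrast, still has to walk every range, which matches the point above that count gains little from this change.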
> Support pushdown into System Table
> ----------------------------------
>
> Key: DRILL-6248
> URL: https://issues.apache.org/jira/browse/DRILL-6248
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.13.0
> Reporter: Kunal Khatua
> Assignee: Arina Ielchiieva
> Priority: Minor
> Fix For: 1.14.0
>
>
> When querying a profile store, we fetch all the records before applying a
> limit.
> {code:sql}
> select * from sys.profiles limit 1
> {code}
> For a test scenario with 120K+ profiles in the store, a
> {code:sql}
> select count(*) from sys.profiles
> {code}
> took 90 minutes.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)