Arina Ielchiieva created DRILL-7418:
---------------------------------------

             Summary: MetadataDirectGroupScan improvements
                 Key: DRILL-7418
                 URL: https://issues.apache.org/jira/browse/DRILL-7418
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.16.0
            Reporter: Arina Ielchiieva
            Assignee: Arina Ielchiieva
             Fix For: 1.17.0


When a count is converted to a direct scan (the case when statistics and table 
metadata are available and there is no need to perform the count operation), 
{{MetadataDirectGroupScan}} is used. Proposed {{MetadataDirectGroupScan}} 
enhancements:
1. Show the table root instead of listing all table files. If the table has lots 
of files, the query plan gets polluted with the file enumeration. Since the files 
are not used for the calculation (only metadata is), they are not relevant and 
can be excluded from the plan.

Before:
{noformat}
| 00-00    Screen
00-01      Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3])
00-02        DirectScan(groupscan=[files = 
[/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_0.parquet, 
/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_5.parquet, 
/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_4.parquet, 
/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_9.parquet, 
/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_3.parquet, 
/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_6.parquet, 
/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_7.parquet, 
/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_10.parquet, 
/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_2.parquet, 
/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_1.parquet, 
/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_8.parquet], 
numFiles = 11, usedMetadataSummaryFile = false, DynamicPojoRecordReader{records 
= [[1560060, 2880404, 2880404, 0]]}])
{noformat}


After:
{noformat}
| 00-00    Screen
00-01      Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3])
00-02        DirectScan(groupscan=[selectionRoot = 
/drill/testdata/metadata_cache/store_sales_null_blocks_all, numFiles = 11, 
usedMetadataSummaryFile = false, DynamicPojoRecordReader{records = [[1560060, 
2880404, 2880404, 0]]}])
{noformat}

2. Submission of a physical plan that contains {{MetadataDirectGroupScan}} fails 
with deserialization errors. Proper serialization / deserialization (ser / de) 
should be implemented.
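
For enhancement 1, the difference between the two plan digests can be sketched roughly as follows. This is only an illustration, not Drill's actual code: the class {{DirectScanDigest}} and both method names are hypothetical, and the real {{MetadataDirectGroupScan}} may build its digest differently. Only the field labels ({{files}}, {{selectionRoot}}, {{numFiles}}, {{usedMetadataSummaryFile}}) are taken from the plans above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the group scan's plan digest; not Drill's real class.
public class DirectScanDigest {

  // "Before" style: enumerates every file, so the digest grows with the table.
  public static String digestWithFiles(List<String> files, boolean usedMetadataSummaryFile) {
    return "files = " + files
        + ", numFiles = " + files.size()
        + ", usedMetadataSummaryFile = " + usedMetadataSummaryFile;
  }

  // "After" style: prints only the table root and the file count.
  public static String digestWithRoot(String selectionRoot, int numFiles,
                                      boolean usedMetadataSummaryFile) {
    return "selectionRoot = " + selectionRoot
        + ", numFiles = " + numFiles
        + ", usedMetadataSummaryFile = " + usedMetadataSummaryFile;
  }

  public static void main(String[] args) {
    String root = "/drill/testdata/metadata_cache/store_sales_null_blocks_all";
    List<String> files = Arrays.asList(root + "/0_0_0.parquet", root + "/0_0_1.parquet");

    // The first digest repeats the root once per file; the second does not.
    System.out.println(digestWithFiles(files, false));
    System.out.println(digestWithRoot(root, files.size(), false));
  }
}
```

The length of the root-based digest does not depend on the number of files, which is the point of the proposal: the file list carries no information that the plan consumer needs when only metadata is used.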



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
