[ 
https://issues.apache.org/jira/browse/DRILL-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961955#comment-16961955
 ] 

ASF GitHub Bot commented on DRILL-7418:
---------------------------------------

arina-ielchiieva commented on pull request #1883: DRILL-7418: MetadataDirectGroupScan improvements
URL: https://github.com/apache/drill/pull/1883#discussion_r340045287
 
 

 ##########
 File path: exec/java-exec/src/test/java/org/apache/drill/test/QueryBuilder.java
 ##########
 @@ -764,4 +765,45 @@ protected String queryPlan(String columnName) throws Exception {
 
     return builder.toString();
   }
+
+  /**
+   * Collects expected and non-expected query patterns.
+   * Upon {@link #match()} method call, matches given patterns to the query plan.
+   */
+  public static class PlanMatcher {
+
+    private static final String EXPECTED_NOT_FOUND = "Did not find expected pattern";
+    private static final String UNEXPECTED_FOUND = "Found unwanted pattern";
+
+    private final String plan;
+    private final List<String> included = new ArrayList<>();
+    private final List<String> excluded = new ArrayList<>();
+
+    public PlanMatcher(String plan) {
+      this.plan = plan;
+    }
+
+    public PlanMatcher include(String... patterns) {
+      included.addAll(Arrays.asList(patterns));
+      return this;
+    }
+
+    public PlanMatcher exclude(String... patterns) {
+      excluded.addAll(Arrays.asList(patterns));
+      return this;
+    }
+
+    public void match() {
 
 Review comment:
   Added.
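
The hunk above is cut off at the opening of `match()`, so its body is not visible here. As a rough illustration only, the sketch below shows one way such a matcher could behave, assuming each pattern is treated as a regular expression searched for anywhere in the plan text. The class name `PlanMatcherSketch`, the helper `check`, and the use of `AssertionError` in place of a test-framework assertion are all hypothetical and are not taken from the pull request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the PlanMatcher in the diff; the match() logic is an assumption.
final class PlanMatcherSketch {

  private static final String EXPECTED_NOT_FOUND = "Did not find expected pattern";
  private static final String UNEXPECTED_FOUND = "Found unwanted pattern";

  private final String plan;
  private final List<String> included = new ArrayList<>();
  private final List<String> excluded = new ArrayList<>();

  PlanMatcherSketch(String plan) {
    this.plan = plan;
  }

  PlanMatcherSketch include(String... patterns) {
    included.addAll(Arrays.asList(patterns));
    return this;
  }

  PlanMatcherSketch exclude(String... patterns) {
    excluded.addAll(Arrays.asList(patterns));
    return this;
  }

  // Assumed behavior: every included pattern must occur in the plan,
  // and no excluded pattern may occur.
  void match() {
    for (String pattern : included) {
      check(pattern, true);
    }
    for (String pattern : excluded) {
      check(pattern, false);
    }
  }

  // Hypothetical helper: searches the plan for the regex and fails on a mismatch.
  private void check(String regex, boolean expectedFound) {
    boolean found = Pattern.compile(regex).matcher(plan).find();
    if (found != expectedFound) {
      String reason = expectedFound ? EXPECTED_NOT_FOUND : UNEXPECTED_FOUND;
      throw new AssertionError(String.format("%s in plan: %s%n%s", reason, regex, plan));
    }
  }
}
```

Under these assumptions, a test for the plan format described in the issue could call `new PlanMatcherSketch(plan).include("selectionRoot").exclude("files = \\[").match();` to require the selection root and reject the per-file listing in a single chained check.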
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> MetadataDirectGroupScan improvements
> ------------------------------------
>
>                 Key: DRILL-7418
>                 URL: https://issues.apache.org/jira/browse/DRILL-7418
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.16.0
>            Reporter: Arina Ielchiieva
>            Assignee: Arina Ielchiieva
>            Priority: Minor
>             Fix For: 1.17.0
>
>
> When a count is converted to a direct scan (the case when statistics or table 
> metadata are available and there is no need to perform the count operation), 
> {{MetadataDirectGroupScan}} is used. Proposed {{MetadataDirectGroupScan}} 
> enhancements:
> 1. Show the table selection root instead of listing all table files. If a table 
> has many files, the query plan gets polluted with the full file enumeration. 
> Since the files are not used for the calculation (only the metadata is), they 
> are not relevant and can be excluded from the plan.
> Before:
> {noformat}
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3])
> 00-02        DirectScan(groupscan=[files = 
> [/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_0.parquet, 
> /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_5.parquet, 
> /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_4.parquet, 
> /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_9.parquet, 
> /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_3.parquet, 
> /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_6.parquet, 
> /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_7.parquet, 
> /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_10.parquet, 
> /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_2.parquet, 
> /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_1.parquet, 
> /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_8.parquet], 
> numFiles = 11, usedMetadataSummaryFile = false, 
> DynamicPojoRecordReader{records = [[1560060, 2880404, 2880404, 0]]}])
> {noformat}
> After:
> {noformat}
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3])
> 00-02        DirectScan(groupscan=[selectionRoot = 
> /drill/testdata/metadata_cache/store_sales_null_blocks_all, numFiles = 11, 
> usedMetadataSummaryFile = false, DynamicPojoRecordReader{records = [[1560060, 
> 2880404, 2880404, 0]]}])
> {noformat}
> For Hive tables that are scanned directly, the selection root is not available 
> and is therefore omitted.
> 2. Submission of a physical plan that contains {{MetadataDirectGroupScan}} 
> fails with deserialization errors; proper serialization and deserialization 
> (ser / de) should be implemented.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
