[ https://issues.apache.org/jira/browse/DRILL-8182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562734#comment-17562734 ]

ASF GitHub Bot commented on DRILL-8182:
---------------------------------------

vvysotskyi commented on code in PR #2583:
URL: https://github.com/apache/drill/pull/2583#discussion_r913963349


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillTableSelection.java:
##########
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+public interface DrillTableSelection {
+  public static final String SELECTION_DIGEST_NONE = "NONE";
+
+  /**
+   * The digest of the selection represented by the implementation. The
+   * selections that accompany Tables can modify the contained dataset, e.g.
+   * a file selection can restrict to a subset of the available data and a
+   * format selection can include options that affect the behaviour of the
+   * underlying reader. Two scans will end up being considered identical during
+   * logical planning if their digests are the same so selection
+   * implementations should override this method so that exactly those scans
+   * that really are identical (in terms of the data they produce) have matching
+   * digests.
+   *
+   * @return this selection's digest, normally a string built from its properties.
+   */
+  default String digest() {

Review Comment:
   Could you please use an abstract method here instead of a default one?
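   To illustrate the suggestion, here is a minimal sketch of what an abstract `digest()` could look like. Only `digest()` and `SELECTION_DIGEST_NONE` come from the diff above; `TableSelectionSketch`, `SheetSelectionSketch`, `EmptySelectionSketch` and their fields are hypothetical names made up for this example and are not classes in Drill. With no default body, an implementation that carries distinguishing options cannot silently inherit a shared digest.

```java
import java.util.Objects;

// Hypothetical stand-in for the interface under review. Only digest() and
// SELECTION_DIGEST_NONE are taken from the diff; the type name is illustrative.
interface TableSelectionSketch {
  String SELECTION_DIGEST_NONE = "NONE";

  // Abstract: each implementation must state what identifies its selection.
  String digest();
}

// Hypothetical selection whose options change the data that is read,
// e.g. which sheet of a workbook is scanned.
final class SheetSelectionSketch implements TableSelectionSketch {
  private final String path;
  private final String sheetName;

  SheetSelectionSketch(String path, String sheetName) {
    this.path = Objects.requireNonNull(path);
    this.sheetName = Objects.requireNonNull(sheetName);
  }

  @Override
  public String digest() {
    // Built from every property that affects the produced data.
    return "path=" + path + ", sheetName=" + sheetName;
  }
}

// Hypothetical selection with nothing to distinguish it. It has to opt in
// to the "NONE" digest explicitly instead of inheriting it.
final class EmptySelectionSketch implements TableSelectionSketch {
  @Override
  public String digest() {
    return SELECTION_DIGEST_NONE;
  }
}
```

   In this sketch, two selections over the same file but different sheets yield different digests, so a planner that compares digests would keep the two scans apart.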





> File scan nodes not differentiated by format config
> ---------------------------------------------------
>
>                 Key: DRILL-8182
>                 URL: https://issues.apache.org/jira/browse/DRILL-8182
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Other
>    Affects Versions: 1.20.0
>            Reporter: James Turton
>            Assignee: Charles Givre
>            Priority: Major
>             Fix For: 1.20.2
>
>         Attachments: Products_Customers_Orders.xlsx
>
>
> Two file scans that differ only by format config overridden with table 
> functions may be genuinely different in terms of the data they return. The 
> format config options may affect the behaviour of the format parser (date 
> strings, delimiters, etc.), possibly directing the format plugin to entirely 
> different data within the file. Such scans should not be considered the same 
> by the query planner. This is illustrated by the following example based on 
> the Excel format plugin.
> When a query includes multiple SELECTs against a workbook by using TABLE 
> functions to access different sheets, and those sheets contain a column with 
> the same name, then values for that column come from a single sheet for both 
> SELECTs.  To reproduce, run the following query against the attachment and 
> note that the `Name` values returned from the Products sheet are in fact the 
> `Name` values from the Customers sheet.
>  
> {code:sql}
> with
> prod as (
>     select Id, Name from TABLE(dfs.tmp.`/Products_Customers_Orders.xlsx`
>         (type => 'excel', sheetName => 'Products'))
> )
> , cust as (
>     select Id, Name from TABLE(dfs.tmp.`/Products_Customers_Orders.xlsx`
>         (type => 'excel', sheetName => 'Customers'))
> )
> select * from cust join prod on cust.Id = prod.Id;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
