vvysotskyi commented on a change in pull request #1914: DRILL-7458: Base framework for storage plugins
URL: https://github.com/apache/drill/pull/1914#discussion_r369112612
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/base/BaseGroupScan.java
 ##########
 @@ -0,0 +1,481 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.base;
+
+import java.util.List;
+
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.server.options.OptionManager;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.SchemaFactory;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.base.BaseStoragePlugin.StoragePluginOptions;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+/**
+ * Base group scan for storage plugins. A group scan is a "logical" scan: it is
+ * the representation of the scan used during the logical and physical planning
+ * phases. The group scan is converted to a "sub scan" (an executable scan
+ * specification) for inclusion in the physical plan sent to Drillbits for
+ * execution. The group scan represents the entire scan of a table or data
+ * source. The sub scan divides that scan into files, storage blocks or other
+ * forms of parallelism.
+ * <p>
+ * The group scan participates in both logical and physical planning. As
+ * noted below, logical plan information is JSON serialized, but physical
+ * plan information is not. The transition from logical
+ * to physical planning is not clear. The first call to
+ * {@link #getMinParallelizationWidth()} or
+ * {@link #getMaxParallelizationWidth()} is a good signal. The group scan
+ * is copied multiple times (each with more information) during logical
+ * planning, but is not copied during physical planning. This means that
+ * physical planning state can be thought of as transient: it should not
+ * be serialized or copied.
+ * <p>
+ * Because the group scan is part of the Calcite planning process, it is
+ * very helpful to understand the basics of query planning and how
+ * Calcite implements that process.
+ *
+ * <h4>Serialization</h4>
+ *
+ * Drill provides the ability to serialize the logical plan. This is most
+ * easily seen by issuing the <code>EXPLAIN PLAN FOR</code> command for a
+ * query. The Jackson-serialized representation of the group scan appears
+ * in the JSON portion of the <code>EXPLAIN</code> output, while the
+ * <code>toString()</code> output appears in the text output of that
+ * command.
+ * <p>
+ * Care must be taken when serializing: include only the <i>logical</i>
+ * plan information, omit the physical plan information.
+ * <p>
+ * Any fields that are part of the <i>logical</i> plan must be Jackson
+ * serializable. The following information should be serialized in the
+ * logical plan:
+ * <ul>
+ * <li>Information from the scan spec (from the table lookup in the
+ * schema) if relevant to the plugin.</li>
+ * <li>The list of columns projected in the scan.</li>
+ * <li>Any filters pushed into the query (in whatever form makes sense
+ * for the plugin).</li>
+ * <li>Any other plugin-specific logical plan information.</li>
+ * </ul>
+ * This base class (and its superclasses) serialize the plugin config,
+ * user name, column list and scan stats. The derived class should handle
+ * other fields.
+ * <p>
+ * On the other hand, the following kinds of information should <i>not</i> be
+ * serialized:
+ * <ul>
+ * <li>The set of drillbits on which queries will run.</li>
+ * <li>The actual number of minor fragments that run scans.</li>
+ * <li>Other physical plan information.</li>
+ * <li>Cached data (such as the cached scan stats).</li>
+ * </ul>
+ * <p>
+ * Jackson will use the constructor marked with <code>@JsonCreator</code> to
+ * deserialize your group scan. If you create a subclass, and add fields, start
+ * with the constructor from this class, then add your custom fields after the
+ * fields defined here. Make liberal use of <code>@JsonProperty</code> to
+ * identify fields (getters) to be serialized, and <code>@JsonIgnore</code>
+ * for those that should not be serialized.
+ *
+ * <h4>Life Cycle</h4>
+ *
+ * Drill uses Calcite for planning. Calcite is a bit complex: it applies a series
+ * of rules to transform the query, then chooses the lowest cost among the
+ * available transforms. As a result, the group scan object is continually
+ * created and recreated. The following is a rough outline of these events.
+ *
+ * <h5>Create the Group Scan</h5>
+ *
+ * The storage plugin provides a {@link SchemaFactory} which provides a
+ * {@link SchemaConfig} which represents the set of (logical) tables available
+ * from the plugin.
+ * <p>
+ * Calcite offers table names to the schema. If the table is valid, the schema
+ * creates a {@link BaseScanSpec} to describe the schema, table and other
 
 Review comment:
   There is no such class, and some of the constructors and methods mentioned in the JavaDoc below don't exist.
