luocooong commented on a change in pull request #2419:
URL: https://github.com/apache/drill/pull/2419#discussion_r790131253



##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/scan/convert/WriterBuilder.java
##########
@@ -0,0 +1,230 @@
+/*

Review comment:
       I could not find any code that references this builder. To keep developers from overlooking this useful helper, could you add a usage sample (perhaps in a unit test)?

##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/scan/project/ReaderSchemaOrchestrator.java
##########
@@ -68,10 +71,11 @@ public void setBatchSize(int size) {
 
   @VisibleForTesting
   public ResultSetLoader makeTableLoader(TupleMetadata readerSchema) {
-    return makeTableLoader(scanOrchestrator.scanProj.context(), readerSchema);
+    return makeTableLoader(scanOrchestrator.scanProj.context(), readerSchema, -1);
   }
 
-  public ResultSetLoader makeTableLoader(CustomErrorContext errorContext, TupleMetadata readerSchema) {
+  public ResultSetLoader makeTableLoader(CustomErrorContext errorContext,
+      TupleMetadata readerSchema, long localLimit) {

Review comment:
       Sorry for the question, but how should I understand the variable name `localLimit`? If the query needs 50 rows (spanning two batches), and batch A holds only 40 rows, then batch B only needs to fetch (50 - 40 =) 10 rows; is that 10 the `localLimit`?
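If that reading is right, the per-reader limit is just the rows remaining after earlier batches are counted. A minimal sketch of that arithmetic follows; the class and method names are my own illustration, not the actual Drill API, and -1 stands for "no limit", matching the sentinel in the diff above:

```java
public class LocalLimitSketch {

  // Rows this reader may still produce, or -1 when no LIMIT was pushed down.
  static long localLimit(long queryLimit, long rowsAlreadyRead) {
    if (queryLimit < 0) {
      return -1;  // no LIMIT: reader may produce unbounded rows
    }
    // Never negative: once the limit is satisfied, the next reader gets 0.
    return Math.max(0, queryLimit - rowsAlreadyRead);
  }

  public static void main(String[] args) {
    // LIMIT 50: batch A already produced 40 rows, so batch B may read 10.
    System.out.println(localLimit(50, 40)); // 10
    // No limit pushed down: sentinel passes through.
    System.out.println(localLimit(-1, 40)); // -1
    // Limit already exhausted: remaining readers read nothing.
    System.out.println(localLimit(50, 60)); // 0
  }
}
```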

##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/scan/ScanOperatorExec.java
##########
@@ -32,6 +32,24 @@
  * Implementation of the revised scan operator that uses a mutator aware of
  * batch sizes. This is the successor to {@link ScanBatch} and should be used
  * by all new scan implementations.
+ * <p>
+ * The basic concept is to split the scan operator into layers:
+ * <ul>
+ * <li>The {@code OperatorRecordBatch} which implements Drill's Volcano-like
+ * protocol.</li>
+ * <li>The scan operator "wrapper" (this class) which implements actions for 
the
+ * operator record batch specifically for scan. It iterates over readers,
+ * delegating semantic work to other classes.</li>
+ * <li>The implementation of per-reader semantics in the two EVF versions and
+ * other ad-hoc implementations.</li>
+ * <li>The result set loader and related classes which pack values into
+ * value vectors.</li>
+ * <li>Value vectors, which store the data.</li>
+ * </ul>
+ * <p>
+ * The layered format can be confusing. However, each layer is somewhat
+ * complex, so dividing the work into layers keeps the overall complexity
+ * somewhat under control.

Review comment:
       @paul-rogers Thank you for updating these descriptions.
   If there is ever an opportunity to publish a new book ("Learning Apache Drill, 2nd edition"), we could record these explanations there. As I understand it, newcomers mostly want to know how to use the new, easy framework to develop new connectors, while experienced developers want to focus on the internals of the scan workflow.
   Would it be possible to record a tutorial and the internal mechanisms (the scan workflow) of the new framework (based on V2) in the current wiki? I can provide a Chinese translation.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
