paul-rogers commented on a change in pull request #2419:
URL: https://github.com/apache/drill/pull/2419#discussion_r795379597



##########
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/scan/ScanOperatorExec.java
##########
@@ -32,6 +32,24 @@
  * Implementation of the revised scan operator that uses a mutator aware of
  * batch sizes. This is the successor to {@link ScanBatch} and should be used
  * by all new scan implementations.
+ * <p>
+ * The basic concept is to split the scan operator into layers:
+ * <ul>
+ * <li>The {@code OperatorRecordBatch} which implements Drill's Volcano-like
+ * protocol.</li>
+ * <li>The scan operator "wrapper" (this class) which implements actions for the
+ * operator record batch specifically for scan. It iterates over readers,
+ * delegating semantic work to other classes.</li>
+ * <li>The implementation of per-reader semantics in the two EVF versions and
+ * other ad-hoc implementations.</li>
+ * <li>The result set loader and related classes which pack values into
+ * value vectors.</li>
+ * <li>Value vectors, which store the data.</li>
+ * </ul>
+ * <p>
+ * The layered design can be confusing. However, each layer is itself
+ * fairly complex, so dividing the work into layers keeps the overall
+ * complexity under control.

Review comment:
       @luocooong, good suggestion. I think there is material that covers 
creating a plugin using the old-style, low-level vector approach, and there may 
be material on how to use EVF. I'll see if I can find that material and either 
update it for EVF V2, or at least write up the differences from those earlier 
versions.

##########
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/scan/project/ReaderSchemaOrchestrator.java
##########
@@ -68,10 +71,11 @@ public void setBatchSize(int size) {
 
   @VisibleForTesting
   public ResultSetLoader makeTableLoader(TupleMetadata readerSchema) {
-    return makeTableLoader(scanOrchestrator.scanProj.context(), readerSchema);
+    return makeTableLoader(scanOrchestrator.scanProj.context(), readerSchema, -1);
   }
 
-  public ResultSetLoader makeTableLoader(CustomErrorContext errorContext, TupleMetadata readerSchema) {
+  public ResultSetLoader makeTableLoader(CustomErrorContext errorContext,
+      TupleMetadata readerSchema, long localLimit) {

Review comment:
       Good question. Let's start at the top. The user specifies a query limit, 
say, `LIMIT 50000`. Drill is distributed, so the limit must be pushed to each 
of the scan operators. Let's say there are 2, each running in a different 
thread. I hope the Calcite planner pushes the full limit to each, so each scan 
gets a limit of 50,000. (Reason: we can't know ahead of time if either scan can 
produce that many rows, so we have to pass the full amount in case the other 
scan produces no rows.)
   
    I just realized one other issue: the limit can be pushed into the scan only 
if the query has no `WHERE` clause. It is actually rather unusual to *not* have 
a `WHERE` clause, so this whole limit-pushdown feature targets a rather simple, 
and somewhat unusual, use case.
   
    Anyway, assume we get past the above issues, and we pass that 50K limit down 
into the scan. That is the "global limit" for the scan. (Or maybe a better term 
is "scan limit", since the limit is not global to all scans, just this one.)
   
   Now, every scan can process any number of readers. Let's say this scan has 3 
readers. Each reader has its own "local limit". Let's see how that works. The 
first reader gets the full limit of 50K rows. Let's say the reader actually 
returns 10K rows. In that case, the second reader gets a local limit of 40K 
rows. And so on.
   
   Once the local limit goes to zero, we skip any remaining readers: they can't 
give us any new rows, so no need to even start those readers.
   
    So, this function accepts the per-reader ("local") scan limit.
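    The bookkeeping above can be sketched in a few lines. This is a hypothetical 
standalone illustration, not Drill's actual `ReaderSchemaOrchestrator` code; the 
class and method names are invented. A local limit of 0 means the reader is 
skipped without ever being opened.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the scan-limit / local-limit bookkeeping described
// above. NOT Drill's actual implementation; names are invented for clarity.
public class ScanLimitSketch {

  /** Computes the local limit handed to each reader in turn. */
  public static List<Long> localLimits(long scanLimit, long[] readerRowCounts) {
    List<Long> limits = new ArrayList<>();
    long remaining = scanLimit;            // the "scan limit" for this scan
    for (long rowsAvailable : readerRowCounts) {
      limits.add(Math.max(remaining, 0));  // 0 => skip this reader entirely
      if (remaining > 0) {
        // A reader returns at most its local limit; deduct what it produced.
        remaining -= Math.min(rowsAvailable, remaining);
      }
    }
    return limits;
  }

  public static void main(String[] args) {
    // Scan limit of 50K over three readers holding 10K, 60K and 5K rows:
    // the first gets 50K, the second 40K, and the third is skipped.
    System.out.println(localLimits(50_000, new long[] {10_000, 60_000, 5_000}));
    // prints [50000, 40000, 0]
  }
}
```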

##########
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/scan/convert/WriterBuilder.java
##########
@@ -0,0 +1,230 @@
+/*

Review comment:
       Well, that is odd. Maybe this is an earlier draft of something that 
ended up somewhere else? I'll have to hunt down the answer. I'll do that when I 
update this PR after doing the smaller PRs that James requested.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

