Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/597#discussion_r81196052
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScan.java ---
    @@ -926,16 +952,22 @@ public GroupScan applyLimit(long maxRecords) {
           fileNames.add(rowGroupInfo.getPath());
         }
     
    -    if (fileNames.size() == fileSet.size() ) {
    +    // If there is no change in fileSet and maxRecords is >= batchSize, no need to create new groupScan.
    +    if (fileNames.size() == fileSet.size() && (maxRecords >= recommendedBatchSize) ) {
           // There is no reduction of rowGroups. Return the original groupScan.
           logger.debug("applyLimit() does not apply!");
           return null;
         }
     
    +    // If limit maxRecords is less than batchSize, update batchSize to the limit size.
    +    if (maxRecords < recommendedBatchSize) {
    +      recommendedBatchSize = (int) maxRecords;
    +    }
    +
         try {
           FileSelection newSelection = new FileSelection(null, Lists.newArrayList(fileNames), getSelectionRoot(), cacheFileRoot, false);
           logger.debug("applyLimit() reduce parquet file # from {} to {}", fileSet.size(), fileNames.size());
    -      return this.clone(newSelection);
    +      return this.clone(newSelection, recommendedBatchSize);
    --- End diff --
    
    I think that when the file selection is unchanged and maxRecords <
    recommendedBatchSize, we do not have to create a new ParquetGroupScan. In
    that case, all we need to do is reset the batch size. Recreating a
    ParquetGroupScan with the same file selection would incur the overhead of
    reading the parquet metadata again.
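
To make the suggestion concrete, here is a minimal, self-contained sketch of the branching I have in mind. It is not Drill code: `Scan`, `withBatchSize`, and the `metadataReads` counter are hypothetical stand-ins used only to show that the "same files, smaller limit" case can skip the metadata read. Whether the batch size is reset in place or through a lightweight copy, as assumed here, is a separate design choice.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class ApplyLimitSketch {

  /** Counts simulated metadata reads so the cost difference is observable. */
  static int metadataReads = 0;

  /** Hypothetical, simplified stand-in for ParquetGroupScan (not Drill's class). */
  static final class Scan {
    final Set<String> fileSet;
    final int recommendedBatchSize;

    /** Expensive path: building a scan from a file selection reads metadata. */
    Scan(Set<String> fileSet, int recommendedBatchSize) {
      this(fileSet, recommendedBatchSize, true);
    }

    private Scan(Set<String> fileSet, int recommendedBatchSize, boolean readMetadata) {
      if (readMetadata) {
        metadataReads++;
      }
      this.fileSet = fileSet;
      this.recommendedBatchSize = recommendedBatchSize;
    }

    /** Cheap path (assumption): the files are the same, so reuse loaded metadata. */
    Scan withBatchSize(int newBatchSize) {
      return new Scan(fileSet, newBatchSize, false);
    }

    /** Returns null when the limit changes nothing, mirroring the diff above. */
    Scan applyLimit(long maxRecords, Set<String> fileNames) {
      boolean sameFiles = fileNames.size() == fileSet.size();
      if (sameFiles && maxRecords >= recommendedBatchSize) {
        return null; // no reduction of files and no smaller batch size
      }
      int newBatchSize = (int) Math.min(maxRecords, recommendedBatchSize);
      if (sameFiles) {
        return withBatchSize(newBatchSize); // only the batch size changes
      }
      return new Scan(fileNames, newBatchSize); // fewer files: rebuild the scan
    }
  }

  public static void main(String[] args) {
    Set<String> files = new LinkedHashSet<>(Arrays.asList("a.parquet", "b.parquet"));
    Scan scan = new Scan(files, 4096);
    Scan limited = scan.applyLimit(10, files);
    System.out.println("batch size: " + limited.recommendedBatchSize
        + ", metadata reads: " + metadataReads);
  }
}
```

In the sketch, only the branch where the file set actually shrinks pays for a metadata read; the other two branches either return null or reuse what is already loaded.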

