Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/749#discussion_r102812251
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenBinaryReader.java ---
    @@ -70,33 +70,31 @@ public long readFields(long recordsToReadInThisPass, ColumnReader<?> firstColumn
         return recordsReadInCurrentPass;
       }
     
    -
       private long determineSizesSerial(long recordsToReadInThisPass) throws IOException {
    -    int lengthVarFieldsInCurrentRecord = 0;
    -    boolean exitLengthDeterminingLoop = false;
    -    long totalVariableLengthData = 0;
    -    long recordsReadInCurrentPass = 0;
    -    do {
    +
    +    // Can't read any more records than fixed width fields will fit.
    +    // Note: this calculation is very likely wrong; it is a simplified
    +    // version of earlier code, but probably needs even more attention.
    +
    +    int totalFixedFieldWidth = parentReader.getBitWidthAllFixedFields() / 8;
    +    long batchSize = parentReader.getBatchSize();
    +    if (totalFixedFieldWidth > 0) {
    +      recordsToReadInThisPass = Math.min(recordsToReadInThisPass, batchSize / totalFixedFieldWidth);
    --- End diff --
    
    Fixed. Moved to the constructor. Changed the batch size in the reader to 
final to guarantee that it can be treated as an invariant.
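    
    A minimal sketch of the shape of this change, not the actual revised code: 
the class name `FixedWidthLimitSketch`, the field `maxRecordsForFixedWidth`, and 
the method `limitRecords` are hypothetical, and the constructor arguments stand 
in for `parentReader.getBitWidthAllFixedFields()` and 
`parentReader.getBatchSize()`. The unit of the batch size is not stated in the 
diff, so the sketch only mirrors the division shown there.

```java
public class FixedWidthLimitSketch {
  // Final, so the value computed from it in the constructor stays valid.
  private final long batchSize;
  // Hypothetical field: cap on records implied by the fixed-width fields.
  private final long maxRecordsForFixedWidth;

  public FixedWidthLimitSketch(int bitWidthAllFixedFields, long batchSize) {
    this.batchSize = batchSize;
    // Same arithmetic as the diff: bits to bytes, then batch size / width.
    int totalFixedFieldWidth = bitWidthAllFixedFields / 8;
    this.maxRecordsForFixedWidth = totalFixedFieldWidth > 0
        ? batchSize / totalFixedFieldWidth
        : Long.MAX_VALUE; // no fixed-width fields, so no cap from them
  }

  public long getBatchSize() {
    return batchSize;
  }

  // Clamp the requested record count to what the fixed-width fields allow.
  public long limitRecords(long recordsToReadInThisPass) {
    return Math.min(recordsToReadInThisPass, maxRecordsForFixedWidth);
  }

  public static void main(String[] args) {
    FixedWidthLimitSketch sketch = new FixedWidthLimitSketch(256, 1024);
    System.out.println(sketch.limitRecords(100)); // prints 32
  }
}
```

    With 256 bits (32 bytes) of fixed-width fields and a batch size of 1024, 
the cap is 1024 / 32 = 32 records, so a request for 100 is reduced to 32. 
Because both inputs are fixed at construction, the cap need not be recomputed 
on every pass.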
    
    Please give the revised code a good inspection.
    
    As noted in the code comment, this area is a target-rich environment 
for improvements...


