[ 
https://issues.apache.org/jira/browse/DRILL-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16335119#comment-16335119
 ] 

ASF GitHub Bot commented on DRILL-6080:
---------------------------------------

Github user vrozov commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1090#discussion_r162772482
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/selection/SelectionVector4.java ---
    @@ -100,8 +101,8 @@ public boolean next() {
           return false;
         }
     
    -    start = start+length;
    -    int newEnd = Math.min(start+length, recordCount);
    +    start = start + length;
    --- End diff --
    
    This code does not look right to me. It tries to enforce the invariant that 
`start + length <= recordCount`, but judging by the check on line 96, that 
invariant is not enforced elsewhere, so it is not clear why it needs to be 
enforced here. If the invariant does need to be enforced, would it be better 
to use:
    ```
    start += length;
    length = Math.min(length, recordCount - start);
    ```
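
To make the suggestion concrete, here is a minimal, self-contained sketch of how the two lines would behave when stepping through a record set. `BatchWindow` and its accessors are hypothetical names used only for illustration, and the guard at the top of `next()` is an assumption about what the check on line 96 looks like; this is not the actual `SelectionVector4` code.

```java
/**
 * Hypothetical stand-in for the start/length/recordCount bookkeeping
 * discussed above. Not Drill's SelectionVector4.
 */
public class BatchWindow {
    private final int recordCount;
    private int start;
    private int length;

    public BatchWindow(int recordCount, int length) {
        this.recordCount = recordCount;
        this.start = 0;
        // Clamp the first window so the invariant holds from the beginning.
        this.length = Math.min(length, recordCount);
    }

    /** Advances to the next window; returns false when no records remain. */
    public boolean next() {
        // Assumed shape of the existing guard: nothing left past this window.
        if (start + length >= recordCount) {
            start = recordCount;
            length = 0;
            return false;
        }
        // Suggested update: advance, then clamp the length so that
        // start + length <= recordCount always holds afterwards.
        start += length;
        length = Math.min(length, recordCount - start);
        return true;
    }

    public int getStart() { return start; }

    public int getLength() { return length; }
}
```

With 10 records and a window length of 4, the windows in this sketch are [0, 4), [4, 8) and [8, 10), so the last window shrinks instead of running past `recordCount`.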


> Sort incorrectly limits batch size to 65535 records rather than 65536
> ---------------------------------------------------------------------
>
>                 Key: DRILL-6080
>                 URL: https://issues.apache.org/jira/browse/DRILL-6080
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.12.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>             Fix For: 1.13.0
>
>
> Drill places an upper limit of 64K (65,536) on the number of rows in a 
> batch. Record indexes therefore run from 0 to 64K-1, that is, 0 to 65,535.
> The sort code incorrectly uses `Character.MAX_VALUE` (65,535) as the maximum 
> row count. So, if an incoming batch uses the full 64K size, sort ends up 
> splitting batches unnecessarily.
> The fix is to use the correct constant, `ValueVector.MAX_ROW_COUNT`, instead.
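
The off-by-one in the issue description can be shown with a small standalone sketch. The local `MAX_ROW_COUNT` constant is an assumption that mirrors the 64K value the description gives for `ValueVector.MAX_ROW_COUNT`, and `batchesNeeded` is a hypothetical helper, not Drill code.

```java
public class RowLimitDemo {
    // Assumed value of ValueVector.MAX_ROW_COUNT per the issue text: 64K rows.
    static final int MAX_ROW_COUNT = 1 << 16; // 65,536

    /** Number of batches needed to hold `rows` rows under a per-batch limit. */
    static int batchesNeeded(int rows, int limit) {
        return (rows + limit - 1) / limit; // ceiling division
    }

    public static void main(String[] args) {
        // Character.MAX_VALUE is '\uffff', i.e. 65,535: the largest index,
        // not the largest row count.
        System.out.println((int) Character.MAX_VALUE);

        // A full 64K batch is split in two when the limit is 65,535 ...
        System.out.println(batchesNeeded(MAX_ROW_COUNT, Character.MAX_VALUE));

        // ... but fits in a single batch when the limit is 65,536.
        System.out.println(batchesNeeded(MAX_ROW_COUNT, MAX_ROW_COUNT));
    }
}
```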



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
