[ 
https://issues.apache.org/jira/browse/DRILL-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16340469#comment-16340469
 ] 

ASF GitHub Bot commented on DRILL-6080:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1090#discussion_r164021818
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/selection/SelectionVector4.java ---
    @@ -31,8 +31,9 @@
       private int length;
     
       public SelectionVector4(ByteBuf vector, int recordCount, int batchRecordCount) throws SchemaChangeException {
    -    if (recordCount > Integer.MAX_VALUE /4) {
    -      throw new SchemaChangeException(String.format("Currently, Drill can only support allocations up to 2gb in size.  You requested an allocation of %d bytes.", recordCount * 4));
    +    if (recordCount > Integer.MAX_VALUE / 4) {
    +      throw new SchemaChangeException(String.format("Currently, Drill can only support allocations up to 2gb in size. " +
    +          "You requested an allocation of %d bytes.", recordCount * 4));
    --- End diff --
    
    Sounds like two issues.
    
    First, while I pointed out opportunities to make the code consistent with 
work elsewhere, the code as it stands has worked for the last two years.
    
    Second, if it helps to move this PR ahead for @ilooner, I can back out the 
formatting changes to this file so that it drops out of the PR. That said, our 
general policy has been to include code cleanup within other commits rather 
than incurring the cost and delay of doing two commits for each bit of work 
(one for code cleanup, another for substantive changes).
    
    Besides this issue, anything else needed?


> Sort incorrectly limits batch size to 65535 records rather than 65536
> ---------------------------------------------------------------------
>
>                 Key: DRILL-6080
>                 URL: https://issues.apache.org/jira/browse/DRILL-6080
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.12.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>             Fix For: 1.13.0
>
>
> Drill limits the number of rows in a batch to 64K, that is, 65,536. Record 
> indexes therefore run from 0 to 64K-1, or 0 to 65,535.
> The sort code incorrectly uses {{Character.MAX_VALUE}} (65,535, the largest 
> index rather than the largest row count) as the maximum row count. So, if an 
> incoming batch uses the full 64K size, sort ends up splitting batches 
> unnecessarily.
> The fix is to instead use the correct constant {{ValueVector.MAX_ROW_COUNT}}.
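
The off-by-one described in the issue can be illustrated with a small standalone sketch. It is not Drill code: `MAX_ROW_COUNT` below is a local stand-in for `ValueVector.MAX_ROW_COUNT`, with its value assumed from the 64K limit stated above, and `batchesNeeded` is a hypothetical helper that only counts how many output batches a given cap would force.

```java
class BatchLimitDemo {
    // Assumed stand-in for Drill's ValueVector.MAX_ROW_COUNT, using the
    // 64K (65,536) limit stated in the issue; not copied from Drill source.
    static final int MAX_ROW_COUNT = 65536;

    // Hypothetical helper: number of output batches needed to hold
    // rowCount rows when each batch holds at most maxRowsPerBatch rows
    // (integer ceiling division).
    static int batchesNeeded(int rowCount, int maxRowsPerBatch) {
        return (rowCount + maxRowsPerBatch - 1) / maxRowsPerBatch;
    }

    public static void main(String[] args) {
        int incoming = MAX_ROW_COUNT; // a completely full incoming batch

        // Character.MAX_VALUE is '\uffff', i.e. 65,535: the largest row
        // index, one less than the largest row count.
        System.out.println((int) Character.MAX_VALUE);

        // Capping at Character.MAX_VALUE forces a second, one-row batch.
        System.out.println(batchesNeeded(incoming, Character.MAX_VALUE));

        // Capping at the full row count keeps the batch intact.
        System.out.println(batchesNeeded(incoming, MAX_ROW_COUNT));
    }
}
```

With the smaller cap, a full 65,536-row batch needs two output batches (65,535 rows plus 1 row); with the full cap it needs one, which is the unnecessary split the issue describes.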



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)