[ 
https://issues.apache.org/jira/browse/DRILL-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906010#comment-15906010
 ] 

ASF GitHub Bot commented on DRILL-5344:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/778#discussion_r105522650
  
    --- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/PriorityQueueCopierTemplate.java
 ---
    @@ -57,9 +57,12 @@ public void setup(FragmentContext context, 
BufferAllocator allocator, VectorAcce
     
         queueSize = 0;
         for (int i = 0; i < size; i++) {
    -      vector4.set(i, i, batchGroups.get(i).getNextIndex());
    -      siftUp();
    -      queueSize++;
    +      int index = batchGroups.get(i).getNextIndex();
    +      vector4.set(i, i, index);
    --- End diff --
    
    This is tricky, and likely to due conflicting design decisions.
    
    It seems that the vector container retrieved by `get()` has a built-in 
iterator. Not a good design as it will cause havoc if two bits of code try to 
iterate at the same time. Sigh...
    
    The iterator decides to return -1 when it reaches the end. That is OK, as 
long as code checks.
    
    The original code did not check, it just used -1 (truncated to 16 bits, 
stored as a char, so 65535) as a row index.
    
    The solution is to not add the batch to the "queue" when the iterator 
returns -1 in the first position.
    
    Note that if the iterator has simply returned the count at the end, and we 
used the classic "index >= count" as the terminal condition, then none of this 
would be necessary.


> External sort priority queue copier fails with an empty batch
> -------------------------------------------------------------
>
>                 Key: DRILL-5344
>                 URL: https://issues.apache.org/jira/browse/DRILL-5344
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>             Fix For: 1.11.0
>
>
> The external sort uses a "priority queue copier" to merge batches when 
> spilling or when merging spilled batches.
> The code will fail with an {{IndexOutOfBoundsException}} if any record batch 
> is empty. The reason is a faulty assumption in generated code:
> {code}
>   public void setup(...) {
>     ...
>       vector4.set(i, i, batchGroups.get(i).getNextIndex());
>     ...
>   }
>   public int getNextIndex() {
>     if (pointer == getRecordCount()) {
>       return -1;
>     }
>     ...
>   }
> {code}
> The code to get the next index returns -1 when the "position" in a record 
> batch is zero. The -1 position translates (when truncated) into 65535 which 
> produces the index exception.
> The workaround has been to special case empty batches elsewhere in the code, 
> apparently to avoid hitting this error.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to