[jira] [Commented] (DRILL-5839) Handle Empty Batches in Merge Receiver

ASF GitHub Bot (JIRA) Thu, 05 Oct 2017 13:39:48 -0700

    [ https://issues.apache.org/jira/browse/DRILL-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16193607#comment-16193607 ]


ASF GitHub Bot commented on DRILL-5839:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/974#discussion_r143049947
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/mock/MockRecordReader.java ---
    @@ -52,7 +52,7 @@ public MockRecordReader(FragmentContext context, MockScanEntry config) {
     
       private int getEstimatedRecordSize(MockColumn[] types) {
         int x = 0;
    -    for (int i = 0; i < types.length; i++) {
    +    for (int i = 0; i < (types == null ? 0 : types.length); i++) {
    --- End diff --
    
    Nit: maybe use an early return instead of the inline null check:
    
    ```
    if (types == null) { return 0; }
    // Original code here..
    ```
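
For illustration, the early-return form suggested in the nit could look like the sketch below. It is only a sketch under stated assumptions: `EstimatedSizeSketch`, `Column`, and its `width` field are hypothetical stand-ins, and the loop body is assumed, since the diff does not show the real one.

```java
final class EstimatedSizeSketch {

  /** Hypothetical stand-in for MockColumn; it only carries an assumed byte width. */
  static final class Column {
    final int width;

    Column(int width) {
      this.width = width;
    }
  }

  /** Returns 0 for a null column array; otherwise sums the assumed widths. */
  static int getEstimatedRecordSize(Column[] types) {
    // Guard once up front so the loop condition stays simple.
    if (types == null) {
      return 0;
    }
    int x = 0;
    for (int i = 0; i < types.length; i++) {
      // Assumed body for illustration; the real loop body is not shown in the diff.
      x += types[i].width;
    }
    return x;
  }
}
```

The behavior matches the ternary in the loop condition; the guard just makes the null case visible at a glance.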


> Handle Empty Batches in Merge Receiver
> --------------------------------------
>
>                 Key: DRILL-5839
>                 URL: https://issues.apache.org/jira/browse/DRILL-5839
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.11.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>             Fix For: 1.12.0
>
>
> The merge receiver throws an exception when the first batch it receives from
> any of its senders is empty (no rows and no schema). The problem is that the
> operator expects at least one batch with a schema (0 rows is OK, 0 columns is
> not) from each of its senders.
> The algorithm works as follows:
> 1. Get the first batch from each of the senders.
> 2. Create a hyper vector container from these first batches.
> 3. Add the batches from the senders to the priority queue.
> 4. Pop from the priority queue, get the index of the current batch from that
>    sender, and use it to copy from the hyper vector to the outgoing vector.
> 5. When the end of a batch from a sender is reached, load the next batch from
>    that sender.
> 6. Stop when there are no more batches from any of the senders.
> If any sender does not send a first batch with a schema and we skip adding
> that batch to the hyper vector, the hyper vector is not set up correctly and
> all the offsets from the selection vector to the individual sender batches
> within the hyper vector are wrong.
> The fix is that when we receive an empty batch from any sender, we create a
> dummy batch with the schema from one of the other senders and add it to the
> hyper vector.
> If all senders send empty first batches, we just return NONE to the
> downstream operator.
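
The algorithm and the fix described above can be sketched roughly as follows. This is a minimal illustration, not Drill's implementation: `MergeReceiverSketch`, `Batch`, `Cursor`, `batch`, and `emptyBatch` are hypothetical names, each sender contributes a single batch of sorted integers (so loading a sender's next batch is omitted), a plain list stands in for the hyper vector, and an empty `Optional` stands in for NONE.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.PriorityQueue;

final class MergeReceiverSketch {

  /** One sender's batch; a null schema models "no rows and no schema". */
  static final class Batch {
    final List<String> schema;
    final List<Integer> rows; // sorted values of a single key column

    Batch(List<String> schema, List<Integer> rows) {
      this.schema = schema;
      this.rows = rows;
    }

    boolean hasSchema() {
      return schema != null;
    }
  }

  /** Queue entry: a sender's slot and the current row within that slot. */
  static final class Cursor {
    final int sender;
    final int row;

    Cursor(int sender, int row) {
      this.sender = sender;
      this.row = row;
    }
  }

  static Batch batch(int... sortedValues) {
    List<Integer> rows = new ArrayList<>();
    for (int v : sortedValues) {
      rows.add(v);
    }
    return new Batch(Collections.singletonList("key"), rows);
  }

  static Batch emptyBatch() {
    return new Batch(null, Collections.<Integer>emptyList());
  }

  /** Merges the senders' first batches; Optional.empty() models NONE. */
  static Optional<List<Integer>> merge(List<Batch> firstBatches) {
    // Borrow a schema from any sender that actually sent one.
    List<String> schema = null;
    for (Batch b : firstBatches) {
      if (b.hasSchema()) {
        schema = b.schema;
        break;
      }
    }
    if (schema == null) {
      return Optional.empty(); // every first batch was empty
    }

    // One slot per sender, so a sender's index always equals its slot index.
    // Empty batches become zero-row dummies carrying the borrowed schema.
    List<Batch> hyper = new ArrayList<>();
    for (Batch b : firstBatches) {
      hyper.add(b.hasSchema() ? b : new Batch(schema, Collections.<Integer>emptyList()));
    }

    PriorityQueue<Cursor> queue = new PriorityQueue<>(
        Comparator.comparingInt((Cursor c) -> hyper.get(c.sender).rows.get(c.row)));
    for (int s = 0; s < hyper.size(); s++) {
      if (!hyper.get(s).rows.isEmpty()) {
        queue.add(new Cursor(s, 0));
      }
    }

    // Pop the smallest row, copy it out, then advance that sender's cursor.
    List<Integer> outgoing = new ArrayList<>();
    while (!queue.isEmpty()) {
      Cursor c = queue.poll();
      Batch b = hyper.get(c.sender);
      outgoing.add(b.rows.get(c.row));
      if (c.row + 1 < b.rows.size()) {
        queue.add(new Cursor(c.sender, c.row + 1));
      }
    }
    return Optional.of(outgoing);
  }
}
```

In this sketch, calling `merge` with `batch(1, 4, 7)`, `emptyBatch()`, and `batch(2, 3, 9)` keeps three slots in the container, so the third sender still maps to slot 2. Skipping the empty batch instead would shift that sender to slot 1, which is the offset problem the description refers to.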



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
