[jira] [Commented] (DRILL-5839) Handle Empty Batches in Merge Receiver

ASF GitHub Bot (JIRA) Wed, 04 Oct 2017 22:21:53 -0700

    [ 
https://issues.apache.org/jira/browse/DRILL-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192477#comment-16192477
 ]


ASF GitHub Bot commented on DRILL-5839:
---------------------------------------

GitHub user ppadma opened a pull request:

    https://github.com/apache/drill/pull/974

    DRILL-5839: Handle Empty Batches in Merge Receiver

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ppadma/drill DRILL-5839

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/974.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #974
    
----
commit ae259534d5ebabfe7f64e170012bea96fd655943
Author: Padma Penumarthy <[email protected]>
Date:   2017-10-01T00:33:17Z

    DRILL-5839: Handle Empty Batches in Merge Receiver

----


> Handle Empty Batches in Merge Receiver
> --------------------------------------
>
>                 Key: DRILL-5839
>                 URL: https://issues.apache.org/jira/browse/DRILL-5839
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.11.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>             Fix For: 1.12.0
>
>
> The merge receiver throws an exception when the first batch it receives 
> from any of its senders is an empty batch (no rows and no schema). The 
> problem is that the operator expects at least one batch with a schema 
> (0 rows is OK, 0 columns is not) from each of its senders. 
> The algorithm works as follows:
> Get the first batch from each of the senders.
> Create a hyper vector container with the first batch from each of the senders.
> Add the batches from the senders to the priority queue.
> Pop from the priority queue, get the index of the current batch from that 
> sender, and use it to copy from the hyper vector to the outgoing vector.
> When the end of a batch from a sender is reached, load the next batch from 
> that sender.
> Stop when there are no more batches from any of the senders.
> If any sender does not send a first batch with a schema and we skip 
> adding that batch to the hyper vector, the hyper vector is not set up 
> correctly, and all the offsets from the selection vector to the individual 
> sender batches within the hyper vector are wrong. 
> The fix is that, when we receive an empty batch from any of the senders, we 
> create a dummy batch with the schema from one of the other senders and add 
> it to the hyper vector. 
> If all senders send empty first batches, we just return NONE to the 
> downstream operator.
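
The algorithm and fix described above can be illustrated with a small, self-contained sketch. This is not the Drill implementation (which is Java and works on value vectors); it is a simplified Python model under stated assumptions: `Batch`, `merge_receive`, and `push_next` are hypothetical names, a batch with `schema=None` stands in for an empty batch (no rows and no schema), a plain list stands in for the hyper vector, and returning `None` stands in for returning NONE downstream.

```python
import heapq
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Batch:
    # Hypothetical stand-in for a record batch; schema=None models "no schema".
    schema: Optional[Tuple[str, ...]]
    rows: List[tuple] = field(default_factory=list)


def merge_receive(senders: List[Iterator[Batch]]) -> Optional[List[tuple]]:
    """Merge sorted sender streams; None models NONE (all first batches empty)."""
    # Get the first batch from each sender (an exhausted sender counts as empty).
    first = [next(s, Batch(None)) for s in senders]
    schema = next((b.schema for b in first if b.schema is not None), None)
    if schema is None:
        return None  # every sender sent an empty first batch

    # The fix: substitute a zero-row dummy batch that borrows the schema of
    # another sender, so slot i of the "hyper vector" always maps to sender i.
    hyper = [b if b.schema is not None else Batch(schema) for b in first]

    heap: List[tuple] = []

    def push_next(i: int, pos: int) -> None:
        # When the current batch of sender i is used up, load its next batch.
        while pos >= len(hyper[i].rows):
            nxt = next(senders[i], None)
            if nxt is None:
                return  # sender i has no more batches
            if nxt.schema is None:
                continue  # assumption: later schema-less batches are skipped
            hyper[i], pos = nxt, 0
        heapq.heappush(heap, (hyper[i].rows[pos], i, pos))

    for i in range(len(senders)):
        push_next(i, 0)

    out: List[tuple] = []
    while heap:
        _, i, pos = heapq.heappop(heap)
        out.append(hyper[i].rows[pos])  # copy by (sender slot, row index)
        push_next(i, pos + 1)
    return out
```

The point of the dummy batch is that the sender index `i` stays a valid index into `hyper` for every sender. Dropping the empty sender's slot instead would shift all later slots by one, which is the offset corruption the description refers to.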



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
