When next() is called on an upstream operator, one of two things can happen:

   1. The current operator works on the incoming batch and produces/copies
   the records into its own outgoing batch. In this case there is no
   transfer of ownership of the incoming batch buffers from the upstream
   operator's allocator to the current operator's allocator. The current
   operator allocates separate memory for its outgoing batch buffers, and
   it is supposed to release the incoming batch buffers once it is done
   working on them.
   2. The current operator transfers buffers from the incoming batch's value
   vectors to its outgoing value vectors (as in Filter, Limit (see [1]),
   etc.). In this case ownership of the buffers in the incoming batch is
   transferred to the current operator's allocator.
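
To make the two cases concrete, here is a minimal sketch in Java. It is a toy
model under my own assumptions, not Drill's actual API: Allocator, Buffer,
copyIncoming and transferIncoming are hypothetical names used only for
illustration, and the "copy" step moves no real data. It only shows which
allocator accounts for the memory in each case.

```java
public class OwnershipSketch {

    /** Hypothetical allocator: only tracks how many bytes it accounts for. */
    static final class Allocator {
        final String name;
        long allocatedBytes = 0;

        Allocator(String name) { this.name = name; }

        Buffer allocate(int size) {
            allocatedBytes += size;
            return new Buffer(this, size);
        }
    }

    /** Hypothetical buffer: remembers which allocator currently owns it. */
    static final class Buffer {
        Allocator owner;
        final int size;
        boolean released = false;

        Buffer(Allocator owner, int size) {
            this.owner = owner;
            this.size = size;
        }

        /** Moves accounting for this buffer to another allocator, no copy. */
        Buffer transferTo(Allocator target) {
            owner.allocatedBytes -= size;
            target.allocatedBytes += size;
            owner = target;
            return this;
        }

        /** Returns the memory to whichever allocator owns it right now. */
        void release() {
            if (released) {
                throw new IllegalStateException("buffer already released");
            }
            released = true;
            owner.allocatedBytes -= size;
        }
    }

    /** Case 1: allocate a separate outgoing buffer, then release incoming. */
    static Buffer copyIncoming(Buffer incoming, Allocator current) {
        Buffer outgoing = current.allocate(incoming.size);
        // A real operator would copy or compute record data here.
        incoming.release(); // the current operator releases the incoming buffer
        return outgoing;
    }

    /** Case 2: reuse the incoming buffer; ownership moves to current. */
    static Buffer transferIncoming(Buffer incoming, Allocator current) {
        return incoming.transferTo(current);
    }

    public static void main(String[] args) {
        Allocator upstream = new Allocator("upstream");
        Allocator current = new Allocator("current");

        Buffer copied = copyIncoming(upstream.allocate(1024), current);
        System.out.println("copy:     upstream=" + upstream.allocatedBytes
                + " current=" + current.allocatedBytes);
        copied.release();

        Buffer moved = transferIncoming(upstream.allocate(1024), current);
        System.out.println("transfer: upstream=" + upstream.allocatedBytes
                + " current=" + current.allocatedBytes);
        moved.release();
    }
}
```

In both cases the upstream allocator ends up accounting for nothing; the
difference is whether the current operator pays for a second buffer (case 1)
or takes over the existing one (case 2).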

But I have seen different operators behave differently. For Hash Join, since
the join operator has to evaluate the join condition for each probe-side
row, I don't think it does any transfers. For the build side, it builds a
hash table on the columns involved in the join condition, but it also has to
store the other columns if they are in the projection list of the query. So
it might do a transfer for those columns only (I haven't looked into the
code, though).

[1]:
https://github.com/apache/drill/blob/006dc10a88c1708b793e3a38ac52a0266bb07deb/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/limit/LimitRecordBatch.java#L181

Thanks,
Sorabh

On Thu, Aug 2, 2018 at 9:43 PM, Timothy Farkas <tfar...@mapr.com> wrote:

> Hi All,
>
> What is the expected behavior for HashJoin when it calls next for its left
> or right upstream record batches? Is ownership of an upstream
> VectorContainer supposed to pass from the left or right upstream
> record batches to HashJoin immediately after a call to next? Or is
> ownership of a VectorContainer supposed to stay with an upstream record
> batch immediately after a call to next?
>
> Thanks,
> Tim
>
