When next() is called on an upstream operator, one of two things can happen:

1. The current operator works on the incoming batch and produces or copies the records into its own outgoing batch. In this case there is no transfer of ownership of the incoming batch's buffers from the upstream allocator to the current operator's allocator. The current operator allocates separate memory for its outgoing batch buffers, and it is expected to release the incoming batch's buffers once it is done working on them.

2. The current operator transfers buffers from the incoming batch's value vectors to its outgoing value vectors (as in Filter, Limit (see [1]), etc.). In this case ownership of the buffers in the incoming batch is transferred to the current operator's allocator.
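To make the difference between the two cases concrete, here is a toy model in Java. It does not use Drill's classes: ToyAllocator, ToyBuffer and their methods are hypothetical names I made up for illustration, and the only thing modeled is which allocator is charged for the bytes after a copy versus after a transfer.

```java
// Toy model only: ToyAllocator and ToyBuffer are hypothetical, not Drill classes.
final class OwnershipSketch {
    public static void main(String[] args) {
        ToyAllocator upstream = new ToyAllocator("upstream");
        ToyAllocator current = new ToyAllocator("current");

        // Case 1: copy into a new outgoing buffer, then release the incoming one.
        ToyBuffer incoming = upstream.allocate(64);
        ToyBuffer outgoing = incoming.copyTo(current);
        incoming.release();
        System.out.println("after copy:     upstream=" + upstream.allocatedBytes()
                + " current=" + current.allocatedBytes());

        // Case 2: transfer the same bytes; only the accounting moves.
        ToyBuffer incoming2 = upstream.allocate(32);
        ToyBuffer moved = incoming2.transferTo(current);
        System.out.println("after transfer: upstream=" + upstream.allocatedBytes()
                + " current=" + current.allocatedBytes());

        outgoing.release();
        moved.release();
    }
}

/** Tracks how many bytes are currently charged to one operator. */
final class ToyAllocator {
    final String name;
    private long allocatedBytes;

    ToyAllocator(String name) { this.name = name; }

    ToyBuffer allocate(int size) {
        allocatedBytes += size;
        return new ToyBuffer(this, new byte[size]);
    }

    void adjust(long delta) { allocatedBytes += delta; }

    long allocatedBytes() { return allocatedBytes; }
}

/** A handle to some bytes plus the allocator that is charged for them. */
final class ToyBuffer {
    private ToyAllocator owner;
    private byte[] data;

    ToyBuffer(ToyAllocator owner, byte[] data) {
        this.owner = owner;
        this.data = data;
    }

    ToyAllocator owner() { return owner; }

    private void requireLive() {
        if (data == null) {
            throw new IllegalStateException("buffer already released or transferred");
        }
    }

    /** Case 1: the target allocates fresh memory; this buffer stays with its owner. */
    ToyBuffer copyTo(ToyAllocator target) {
        requireLive();
        ToyBuffer copy = target.allocate(data.length);
        System.arraycopy(data, 0, copy.data, 0, data.length);
        return copy;
    }

    /** Case 2: no bytes are copied; the charge moves from the old owner to the target. */
    ToyBuffer transferTo(ToyAllocator target) {
        requireLive();
        owner.adjust(-data.length);
        target.adjust(data.length);
        ToyBuffer moved = new ToyBuffer(target, data);
        owner = null;
        data = null; // this handle is no longer usable
        return moved;
    }

    void release() {
        if (data != null) {
            owner.adjust(-data.length);
            owner = null;
            data = null;
        }
    }
}
```

In the copy case both allocators are charged until the incoming buffer is released, which is why the current operator has to release it. In the transfer case the upstream allocator's charge drops to zero as soon as the transfer happens, and the current operator becomes responsible for releasing the buffer.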
But I have seen different operators behave differently. For HashJoin, since the join operator has to evaluate the join condition for each probe-side row, I don't think it will do any transfers. For the build side, it builds a hash table on the columns involved in the join condition, but it also has to store the other columns if they are in the query's projection list. So it might do a transfer for those columns only (I haven't looked into the code, though).

[1]: https://github.com/apache/drill/blob/006dc10a88c1708b793e3a38ac52a0266bb07deb/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/limit/LimitRecordBatch.java#L181

Thanks,
Sorabh

On Thu, Aug 2, 2018 at 9:43 PM, Timothy Farkas <tfar...@mapr.com> wrote:

> Hi All,
>
> What is the expected behavior for HashJoin when it calls next for its left
> or right upstream record batches? Is ownership of an upstream
> VectorContainer supposed to pass from the left or right upstream
> record batches to HashJoin immediately after a call to next? Or is
> ownership of a VectorContainer supposed to stay with an upstream record
> batch immediately after a call to next?
>
> Thanks,
> Tim