Hi Tim,

Here's a bit more background info from my experience.

As Sorabh said, the downstream operator has visibility to the incoming record 
batch regardless of ownership.

If the downstream operator will retain the incoming vectors (buffers), then it 
must take ownership of them. This is the case for pass-through operators such 
as filters, or for buffering operators such as sort.

If the downstream operator will observe, but not reuse, the incoming data, then 
ownership is (mostly) moot. Sounds like this is the case for HashJoin.

The key item I wish to add here is timing. Accepting ownership of the incoming 
buffer shifts memory accounting from the incoming to the downstream operator. 
As we have seem over and over, this can then cause the downstream operator to 
suffer OOM errors if, say, the downstream operator then wants to allocate an 
SV2 or some such, but finds itself over the memory limit. 

I can imagine that, to avoid this issue, downstream operators may delay taking 
ownership of the incoming vectors as long as possible. (The sort, for example, 
first checks the incoming batch size, then spills as needed, to ensure that 
there capacity in the operator allocator to accept ownership.)

Other operators may have decided to never take ownership if they will simply 
observe, then discard incoming batches (or vectors.) One very likely candidate 
for such an approach would be the SelectionVectorRemover since it copies data 
from incoming to a new set of outgoing batches. (Have not checked if, in fact, 
the SVR does avoid taking ownership.)

Would be great to adjust memory management so that we can use a more uniform 
approach. As we've discussed multiple times, there is no good reason for us to 
hold a gun to our own head with the current per-operator allocators, especially 
when we issue an OOM for memory already located.

Perhaps you have some ideas for how to improve this as part of your resource 
management work?

Thanks,
- Paul

 

    On Friday, August 3, 2018, 12:39:35 AM PDT, Timothy Farkas 
<[email protected]> wrote:  
 
 Thanks for the explanation Sorabh, that clears things up for me.

Tim

On Thu, Aug 2, 2018 at 11:38 PM, Sorabh Hamirwasia <[email protected]>
wrote:

> When next is called on upstream operator then 2 things can happen:
>
>    1. Current operator either work on incoming and produces/copy the
>    records to its outgoing batch. In this case there is no transfer of
>    ownership of incoming batch buffer from upstream to current operator
>    allocator. Current operator will allocate separate memory footprint for
> its
>    outgoing batch buffer. Also current operator is supposed to release the
>    incoming batch buffer once its done working on it.
>    2. Current operator does a transfer of buffers from incoming batch value
>    vectors to outgoing value vectors (like in Filter, limit (see [1]),
> etc).
>    In this case ownership of buffers in incoming batch is transferred to
>    current operator allocator.
>
> But I have seen different operator behaving differently. For Hash Join
> since join operators has to evaluate join condition for each probe side
> row, I don't think it will do any transfers. For build side it will build
> hash table on column involved in join condition but also has to store other
> columns if in projection list of query. So probably it might do transfer
> for those columns only (haven't looked into code though).
>
> [1]:
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.
> com_apache_drill_blob_006dc10a88c1708b793e3a38ac52a0
> 266bb07deb_exec_java-2Dexec_src_main_java_org_apache_
> drill_exec_physical_impl_limit_LimitRecordBatch.java-23L181&d=DwIBaQ&c=
> cskdkSMqhcnjZxdQVpwTXg&r=4eQVr8zB8ZBff-yxTimdOQ&m=
> 3PkPHZrX33YZhmngIGbjYtgGn_KYzXNVUyJPEUQVFyo&s=7YyEpHwWOp2I8zQjNEdFXLu-
> QnVCq7neR7lyg7bfPVs&e=
>
> Thanks,
> Sorabh
>
> On Thu, Aug 2, 2018 at 9:43 PM, Timothy Farkas <[email protected]> wrote:
>
> > Hi All,
> >
> > What is the expected behavior for HashJoin when it calls next for its
> left
> > or right upstream record batches. Is ownership of an upstream
> > VectorContainer supposed to pass from from the left or right upstream
> > record batches to HashJoin immediately after a call to next? Or is
> > ownership of a VectorContainer supposed to stay with an upstream record
> > batch immediately after a call to next?
> >
> > Thanks,
> > Tim
> >
>
  

Reply via email to