My gut is that any changes other than correct accounting will have little impact on real world situations. Do you think the change will make enough difference to be valuable?
It seems like capacity should return the length of the slice (since I believe that fits with the general ByteBuf interface). If I reset writerIndex(0), I should be able write to capacity() without issue. Seems weird to return some other (mostly disconnect) value. -- Jacques Nadeau CTO and Co-Founder, Dremio On Fri, Nov 20, 2015 at 3:28 PM, Abdel Hakim Deneche <[email protected]> wrote: > @Steven, I think DrillBuf.capacity() was changed at some point, I was > looking at the code and it seems to only return the size of the "slice" and > not the root buffer. > > While waiting for the new allocator and transfer of ownership, would it > make sense to remove the check for root buffers like this ? > > private long getBufferSize(VectorAccessible batch) { > > long size = 0; > > for (VectorWrapper<?> w : batch) { > > DrillBuf[] bufs = w.getValueVector().getBuffers(false); > > for (DrillBuf buf : bufs) { > > size += buf.capacity(); > > } > > } > > return size; > > } > > > > On Fri, Nov 20, 2015 at 2:50 PM, Hanifi GUNES <[email protected]> > wrote: > > > Problem with the above code is that not all vectors operate on root > buffers > > rendering the accounting above inaccurate. In fact your example is one > > perfect instance where vectors would point to non-root buffers for sure > > because of the slicing taking place at RecordBatchLoader#load [1] > > > > > > 1: > > > > > https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchLoader.java#L117 > > > > 2015-11-20 11:41 GMT-08:00 Steven Phillips <[email protected]>: > > > > > I think it is because we can't actually properly account for sliced > > > buffers. I don't remember for sure, but I think it might be because > > calling > > > buf.capacity() on a sliced buffer returns the the capacity of root > > buffer, > > > not the size of the slice. That may not be correct, but I think it was > > > something like that. Whatever it is, I am pretty sure it was giving > wrong > > > results when they are sliced buffers. > > > > > > I think we need to get the new allocator, along with proper transfer of > > > ownership in order to do this correctly. Then we can just query the > > > allocator rather than trying to track it separately. > > > > > > On Fri, Nov 20, 2015 at 11:25 AM, Abdel Hakim Deneche < > > > [email protected] > > > > wrote: > > > > > > > I'm looking at the external sort code and it uses the following > method > > to > > > > compute the allocated size of a batch: > > > > > > > > private long getBufferSize(VectorAccessible batch) { > > > > > long size = 0; > > > > > for (VectorWrapper<?> w : batch) { > > > > > DrillBuf[] bufs = w.getValueVector().getBuffers(false); > > > > > for (DrillBuf buf : bufs) { > > > > > if (*buf.isRootBuffer()*) { > > > > > size += buf.capacity(); > > > > > } > > > > > } > > > > > } > > > > > return size; > > > > > } > > > > > > > > > > > > This method only accounts for root buffers, but when we have a > receiver > > > > below the sort, most of (if not all) buffers are child buffers. This > > may > > > > delay spilling, and increase the memory usage of the drillbit. If my > > > > computations are correct, for a single query, one drillbit can > allocate > > > up > > > > to 40GB without spilling once to disk. > > > > > > > > Is there a specific reason we only account for root buffers ? > > > > > > > > -- > > > > > > > > Abdelhakim Deneche > > > > > > > > Software Engineer > > > > > > > > <http://www.mapr.com/> > > > > > > > > > > > > Now Available - Free Hadoop On-Demand Training > > > > < > > > > > > > > > > http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available > > > > > > > > > > > > > > > > > > -- > > Abdelhakim Deneche > > Software Engineer > > <http://www.mapr.com/> > > > Now Available - Free Hadoop On-Demand Training > < > http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available > > >
