(Adding Java to thread title)

For more context, I pushed back on the changes in https://github.com/apache/arrow/pull/4358 because they don't seem typical of memory management systems (i.e., they expose internal implementation details of the allocator).

I think https://github.com/apache/arrow/pull/4400, which makes the rounding policy a parameter of the allocator at construction time, is a better solution because it provides better encapsulation. I don't know if there is something subtle about the current power-of-2 allocation or if it was just the easiest possible option to begin with.

Feedback on this would be welcome.

Thanks,
Micah

On Tue, May 28, 2019 at 9:23 PM Fan Liya <liya.fa...@gmail.com> wrote:

> Hi all,
>
> We frequently encounter this problem in our code: Arrow throws
> OutOfMemoryException even though there is sufficient memory.
>
> Let me illustrate with the following example, which is frequently used in
> our code:
>
> int requestSize = ...;
> if (requestSize <= allocator.getLimit() - allocator.getAllocatedMemory()) {
>   ArrowBuf buffer = allocator.buffer(requestSize);
> }
>
> This code occasionally throws OOM. The reason is Arrow's rounding behavior.
> In particular, if the requested size is within the chunk size, the buffer
> size will be rounded up to the next power of 2 (we believe this is an overly
> aggressive rounding strategy).
>
> For example, suppose we have 12 MB of memory left and request a buffer of
> size 10 MB. Apparently, there is sufficient memory to meet the request.
> However, the rounding behavior rounds the request size from 10 MB up to
> 16 MB. Since 16 MB is not available, an OutOfMemoryException is thrown.
>
> We propose two ways to solve this problem:
>
> 1) We provide a rounding option as an argument to the BaseAllocator#buffer
> method. There are two possible values for the rounding option: rounding up
> (to the next power of 2) and rounding down (to the previous power of 2). In
> the above code, the rounding down option can solve the problem.
>
> 2) We add a method to the allocator:
>
> int getRoundedSize(final int requestSize)
>
> This method returns the rounded buffer size for a given request size.
> With this method, the user can adjust their request size to avoid OOM.
>
> We have opened ARROW-5386 <https://issues.apache.org/jira/browse/ARROW-5386>
> and PR-4358 <https://github.com/apache/arrow/pull/4358> to track this
> issue. Many thanks to @emkornfield and @praveenbingo for all the valuable
> comments.
>
> The current status is that solution 1) has been dropped, because it may
> return less memory than requested.
>
> So would you please give your valuable comments?
> 1) Do you think solution 2) is OK?
> 2) Do you have any other suggested solutions?
>
> Thank you in advance.
>
> Best,
> Liya Fan
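
To make the 12 MB / 10 MB example above concrete, here is a minimal standalone sketch in Java. It is not Arrow code: `RoundingSketch`, `SketchAllocator`, and `tryReserve` are hypothetical names, and the allocator only tracks byte counts. It assumes the rounding policy can be modelled as a plain function from requested size to reserved size, passed in at construction time, which is roughly the shape of the idea in PR 4400 but not necessarily its actual interface.

```java
import java.util.function.LongUnaryOperator;

/** Standalone illustration; none of these types are Arrow classes. */
final class RoundingSketch {
    static final long MB = 1024L * 1024L;

    /** Smallest power of two that is >= size, for 1 <= size <= 2^62. */
    static long nextPowerOfTwo(long size) {
        if (size < 1 || size > (1L << 62)) {
            throw new IllegalArgumentException("size out of range: " + size);
        }
        long highest = Long.highestOneBit(size);
        return highest == size ? size : highest << 1;
    }

    /** Hypothetical allocator that only does the accounting, no real memory. */
    static final class SketchAllocator {
        private final long limit;
        private final LongUnaryOperator roundingPolicy; // fixed at construction
        private long allocated;

        SketchAllocator(long limit, LongUnaryOperator roundingPolicy) {
            this.limit = limit;
            this.roundingPolicy = roundingPolicy;
        }

        /**
         * Reserves the rounded size if it fits under the limit.
         * Returns false where a real allocator would throw an OOM exception.
         */
        boolean tryReserve(long requestSize) {
            long rounded = roundingPolicy.applyAsLong(requestSize);
            if (rounded > limit - allocated) {
                return false;
            }
            allocated += rounded;
            return true;
        }
    }

    public static void main(String[] args) {
        // 12 MB of headroom and a 10 MB request, as in the quoted example.
        SketchAllocator pow2 =
            new SketchAllocator(12 * MB, RoundingSketch::nextPowerOfTwo);
        SketchAllocator exact =
            new SketchAllocator(12 * MB, LongUnaryOperator.identity());

        System.out.println(nextPowerOfTwo(10 * MB) / MB); // 16
        System.out.println(pow2.tryReserve(10 * MB));     // false: 16 MB > 12 MB
        System.out.println(exact.tryReserve(10 * MB));    // true: 10 MB <= 12 MB
    }
}
```

With the policy fixed at construction, the caller's pre-check against the remaining headroom stays valid only if it is applied to the rounded size, which is what solution 2) exposes and what a constructor-level policy keeps inside the allocator.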