On Thu, May 30, 2019 at 12:23 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
> (Adding Java to thread title) > > For more context, I pushed back on the changes in > https://github.com/apache/arrow/pull/4358 because they don't seem typical > in memory management systems (i.e. they expose internal implementation > details of the allocator). > > I think https://github.com/apache/arrow/pull/4400 which makes the rounding > policy a parameter of the allocator at construction time is a better > solution because it provides better encapsulation. I don't know if there > is something subtle about the current power of 2 allocation or if it was > just the easiest possible option to begin with. I too don't know the reason for the power of 2 allocation. @Wes McKinney <wesmck...@gmail.com> or @Jacques Nadeau <jacq...@dremio.com> may know more about this. However, there is likely to be more code that assumes or optimises for this case. eg. in the vector allocation code, we round-up the request to make full use of the power-of-2 buffer. > Feedback on this would be > welcome. > > Thanks, > Micah > > On Tue, May 28, 2019 at 9:23 PM Fan Liya <liya.fa...@gmail.com> wrote: > > > Hi all, > > > > We frequently encounter this problem in our code: Arrow throws > > OutOfMemoryException even though there is sufficient memory. > > > > Let me illustrate with the following example, which is frequently used in > > our code: > > > > int requestSize = ...; > > if (requestSize <= allocator.getLimit() - > allocator.getAllocatedMemory()) { > > ArrowBuf buffer = allocator.buffer(requestSize); > > } > > > > This code occasionally throws OOM. The reason is Arrow's rounding > behavior. > > In particular, if the requested size is within the chunk size, the buffer > > size will be rounded to the next power of 2 (We believe this is an overly > > aggressive rounding strategy). > > > > For example, suppose we have 12 MB memory left, and request a buffer with > > size 10 MB. Appearantly, there is sufficient memory to meet the request. > > However, the rounding behavior rounds the request size from 10 MB to 16 > MB. > > Since there is no 16 MB memory, an OutOfMemoryException will be thrown. > > > > We propose two ways to solve this problem: > > > > 1) We provide a rounding option as an argument to the > BaseAllocator#buffer > > method. There are two possible values for the rounding option: rounding > up > > (to the next power of 2) and rounding down (to the previous power of 2). > In > > the above code, the rounding down option can solve the problem. > > > > 2) We add a method to the allocator: > > > > int getRoundedSize(final int requestSize) > > > > This method will give the rounded buffer size, given the initial request > > size. With this method, the user can adjust their request size to avoid > > OOM. > > > > We have opened ARROW-5386 < > > https://issues.apache.org/jira/browse/ARROW-5386> > > and PR-4358 <https://github.com/apache/arrow/pull/4358> to track this > > issue. Many thanks to @emkornfield and @praveenbingo for all the valuable > > comments. > > > > The current status is that, solution 1) is deprecated, because it may > > return less memory than requested. > > > > So would you please give your valuable comments? > > 1) Do you think solution 2) is OK? > > 2) Do you have any other suggested solutions? > > > > Thank you in advance. > > > > Best, > > Liya Fan > > > -- Thanks and regards, Ravindra.