Hi all,

We frequently encounter this problem in our code: Arrow throws
OutOfMemoryException even though there is sufficient memory.

Let me illustrate with the following example, which is frequently used in
our code:

int requestSize = ...;
if (requestSize <= allocator.getLimit() - allocator.getAllocatedMemory()) {
  ArrowBuf buffer = allocator.buffer(requestSize);
}

This code occasionally throws OOM. The reason is Arrow's rounding behavior.
In particular, if the requested size is within the chunk size, the buffer
size is rounded up to the next power of 2. We believe this rounding
strategy is overly aggressive.

For example, suppose we have 12 MB of memory left and request a buffer of
size 10 MB. Apparently, there is sufficient memory to meet the request.
However, the request size is rounded up from 10 MB to 16 MB. Since 16 MB
of memory is not available, an OutOfMemoryException is thrown.
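
To make the arithmetic concrete, here is a minimal, self-contained sketch.
The helper nextPowerOfTwo is our own illustration of "round up to the next
power of 2"; it is not Arrow's actual implementation.

```java
public class RoundingDemo {
    // Illustrative helper (not Arrow's code): smallest power of two >= size.
    static long nextPowerOfTwo(long size) {
        if (size <= 1) {
            return 1;
        }
        return Long.highestOneBit(size - 1) << 1;
    }

    public static void main(String[] args) {
        final long mb = 1024L * 1024L;
        long available = 12 * mb;  // memory left in the allocator
        long requested = 10 * mb;  // size the caller asks for
        long rounded = nextPowerOfTwo(requested);

        // The naive check passes, but the rounded size does not fit.
        System.out.println("requested fits: " + (requested <= available)); // true
        System.out.println("rounded size (MB): " + rounded / mb);          // 16
        System.out.println("rounded fits: " + (rounded <= available));     // false
    }
}
```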

We propose two ways to solve this problem:

1) We provide a rounding option as an argument to the BaseAllocator#buffer
method. There are two possible values for the rounding option: rounding up
(to the next power of 2) and rounding down (to the previous power of 2). In
the above code, the rounding down option can solve the problem.

2) We add a method to the allocator:

int getRoundedSize(final int requestSize)

This method will give the rounded buffer size, given the initial request
size. With this method, the user can adjust their request size to avoid OOM.
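
As a rough sketch of how a caller might use such a method: SketchAllocator
below is a hypothetical stand-in, not Arrow's BaseAllocator, and its
getRoundedSize simply assumes power-of-2 rounding for every request
(ignoring the chunk-size threshold).

```java
public class RoundedSizeSketch {
    /** Hypothetical stand-in for an allocator; only what the check needs. */
    static final class SketchAllocator {
        private final long limit;
        private final long allocated;

        SketchAllocator(long limit, long allocated) {
            this.limit = limit;
            this.allocated = allocated;
        }

        long getLimit() { return limit; }

        long getAllocatedMemory() { return allocated; }

        // Assumed behavior of the proposed method: round up to a power of 2.
        // Only valid for requestSize <= 2^30 (larger values overflow int).
        int getRoundedSize(final int requestSize) {
            if (requestSize <= 1) {
                return 1;
            }
            return Integer.highestOneBit(requestSize - 1) << 1;
        }
    }

    // Compare the rounded size, not the raw request, with the free memory.
    static boolean fits(SketchAllocator allocator, int requestSize) {
        long free = allocator.getLimit() - allocator.getAllocatedMemory();
        return allocator.getRoundedSize(requestSize) <= free;
    }

    public static void main(String[] args) {
        final int mb = 1024 * 1024;
        SketchAllocator allocator = new SketchAllocator(12L * mb, 0L);
        System.out.println(fits(allocator, 10 * mb)); // false: rounds to 16 MB
        System.out.println(fits(allocator, 8 * mb));  // true: stays at 8 MB
    }
}
```

With a check like this, the caller can shrink the request (for example,
to 8 MB) instead of hitting an OutOfMemoryException.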

We have opened ARROW-5386 <https://issues.apache.org/jira/browse/ARROW-5386>
and PR-4358 <https://github.com/apache/arrow/pull/4358> to track this
issue. Many thanks to @emkornfield and @praveenbingo for all the valuable
comments.

The current status is that solution 1) has been dropped, because rounding
down may return less memory than requested.
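
The following small sketch shows why rounding down is a problem. The
helper previousPowerOfTwo is our own illustration of the rounding-down
option, not code from the PR.

```java
public class RoundDownDemo {
    // Illustrative helper: largest power of two <= size (size must be > 0).
    static long previousPowerOfTwo(long size) {
        return Long.highestOneBit(size);
    }

    public static void main(String[] args) {
        final long mb = 1024L * 1024L;
        long requested = 10 * mb;
        long granted = previousPowerOfTwo(requested);

        // The caller asked for 10 MB but would receive only 8 MB.
        System.out.println("granted (MB): " + granted / mb);             // 8
        System.out.println("short by (MB): " + (requested - granted) / mb); // 2
    }
}
```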

Could you please share your comments on the following?
1) Do you think solution 2) is OK?
2) Do you have any other suggested solutions?

Thank you in advance.

Best,
Liya Fan
