Hi @Ravindra Pindikura and @Wes McKinney,

Thanks for your feedback.
According to [1], the major benefit of a buddy allocator is that it avoids
external memory fragmentation, and thus reduces maintenance cost and
compaction overhead.
However, for applications that know exactly how much memory they need
(and the amount is not a power of 2), this effectively moves the
fragmentation inside the buffer: the allocator rounds the request up, and
the extra bytes sit unused inside the returned buffer.
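
To make the problem concrete, below is a minimal, self-contained sketch of the
behavior discussed in this thread. It is only an illustration: the class name
RoundingDemo is made up, and the package/method names are taken from the Arrow
Java memory module, so they may differ slightly between Arrow versions.

import org.apache.arrow.memory.ArrowBuf;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.OutOfMemoryException;
import org.apache.arrow.memory.RootAllocator;

public class RoundingDemo {
  public static void main(String[] args) {
    // 12 MB limit: on paper, more than enough for a 10 MB request.
    try (BufferAllocator allocator = new RootAllocator(12L * 1024 * 1024)) {
      long free = allocator.getLimit() - allocator.getAllocatedMemory();
      System.out.println("free before request: " + free); // 12582912 bytes

      try {
        // The 10 MB request is rounded up to the next power of 2 (16 MB),
        // which exceeds the 12 MB limit, so this throws even though 12 MB
        // of accounted memory is still free.
        ArrowBuf buf = allocator.buffer(10 * 1024 * 1024);
        // With a larger limit, 16 MB would be accounted for a 10 MB request;
        // the extra 6 MB the caller never asked for is the fragmentation
        // that ends up inside the buffer rather than in the allocator.
        buf.close();
      } catch (OutOfMemoryException e) {
        System.out.println("OOM despite sufficient free memory: " + e.getMessage());
      }
    }
  }
}

This is the same arithmetic as in the example from the earlier message (quoted
below): the requestSize check against getLimit() - getAllocatedMemory() passes,
yet the allocation still fails.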

Best,
Liya Fan

[1] Buddy memory allocator.
https://www.cs.fsu.edu/~engelen/courses/COP402003/p827.pdf

On Tue, Jun 4, 2019 at 5:38 AM Wes McKinney <wesmck...@gmail.com> wrote:

> Note that we prefer using jemalloc in C++, which also employs the
> buddy allocator strategy
>
> On Mon, Jun 3, 2019 at 4:33 PM Jacques Nadeau <jacq...@dremio.com> wrote:
> >
> > The Netty allocator is a buddy allocator. It always allocates in power of
> > two sizes. As far as I know, this hasn't changed recently.
> > --
> > Jacques Nadeau
> > CTO and Co-Founder, Dremio
> >
> >
> > On Thu, May 30, 2019 at 12:30 AM Ravindra Pindikura <ravin...@dremio.com>
> > wrote:
> >>
> >>
> >>
> >> On Thu, May 30, 2019 at 12:23 PM Micah Kornfield <emkornfi...@gmail.com>
> >> wrote:
> >>>
> >>> (Adding Java to thread title)
> >>>
> >>> For more context, I pushed back on the changes in
> >>> https://github.com/apache/arrow/pull/4358 because they don't seem typical
> >>> in memory management systems (i.e. they expose internal implementation
> >>> details of the allocator).
> >>>
> >>> I think https://github.com/apache/arrow/pull/4400 which makes the rounding
> >>> policy a parameter of the allocator at construction time is a better
> >>> solution because it provides better encapsulation.  I don't know if there
> >>> is something subtle about the current power of 2 allocation or if it was
> >>> just the easiest possible option to begin with.
> >>
> >>
> >> I too don't know the reason for the power of 2 allocation. @Wes McKinney
> >> or @Jacques Nadeau may know more about this.
> >>
> >> However, there is likely to be more code that assumes or optimises for
> >> this case. eg. in the vector allocation code, we round-up the request to
> >> make full use of the power-of-2 buffer.
> >>
> >>>
> >>>   Feedback on this would be
> >>> welcome.
> >>>
> >>> Thanks,
> >>> Micah
> >>>
> >>> On Tue, May 28, 2019 at 9:23 PM Fan Liya <liya.fa...@gmail.com> wrote:
> >>>
> >>> > Hi all,
> >>> >
> >>> > We frequently encounter this problem in our code: Arrow throws
> >>> > OutOfMemoryException even though there is sufficient memory.
> >>> >
> >>> > Let me illustrate with the following example, which is frequently used in
> >>> > our code:
> >>> >
> >>> > int requestSize = ...;
> >>> > if (requestSize <= allocator.getLimit() - allocator.getAllocatedMemory()) {
> >>> >   ArrowBuf buffer = allocator.buffer(requestSize);
> >>> > }
> >>> >
> >>> > This code occasionally throws OOM. The reason is Arrow's rounding behavior.
> >>> > In particular, if the requested size is within the chunk size, the buffer
> >>> > size will be rounded to the next power of 2 (We believe this is an overly
> >>> > aggressive rounding strategy).
> >>> >
> >>> > For example, suppose we have 12 MB memory left, and request a buffer with
> >>> > size 10 MB. Apparently, there is sufficient memory to meet the request.
> >>> > However, the rounding behavior rounds the request size from 10 MB to 16 MB.
> >>> > Since there is no 16 MB memory, an OutOfMemoryException will be thrown.
> >>> >
> >>> > We propose two ways to solve this problem:
> >>> >
> >>> > 1) We provide a rounding option as an argument to the BaseAllocator#buffer
> >>> > method. There are two possible values for the rounding option: rounding up
> >>> > (to the next power of 2) and rounding down (to the previous power of 2). In
> >>> > the above code, the rounding down option can solve the problem.
> >>> >
> >>> > 2) We add a method to the allocator:
> >>> >
> >>> > int getRoundedSize(final int requestSize)
> >>> >
> >>> > This method will give the rounded buffer size, given the initial request
> >>> > size. With this method, the user can adjust their request size to avoid
> >>> > OOM.
> >>> >
> >>> > We have opened ARROW-5386 <https://issues.apache.org/jira/browse/ARROW-5386>
> >>> > and PR-4358 <https://github.com/apache/arrow/pull/4358> to track this
> >>> > issue. Many thanks to @emkornfield and @praveenbingo for all the valuable
> >>> > comments.
> >>> >
> >>> > The current status is that, solution 1) is deprecated, because it may
> >>> > return less memory than requested.
> >>> >
> >>> > So would you please give your valuable comments?
> >>> > 1) Do you think solution 2) is OK?
> >>> > 2) Do you have any other suggested solutions?
> >>> >
> >>> > Thank you in advance.
> >>> >
> >>> > Best,
> >>> > Liya Fan
> >>> >
> >>
> >>
> >>
> >> --
> >> Thanks and regards,
> >> Ravindra.
>
