I can think of an example in Flink SQL: A hash table (for hash join) consists of two parts: 1. The bucket area, which is used to store the hash code & data address. 2. The record area, which stores the actual data.
The size of the record area can be estimated by: estimated_record_count * average_record_size. The estimation is based on input statistics, so the estimated count may differ from the actual count, and this may not be a good example. However, the point is that, the record area size may not be a power of 2, and it may be much smaller than the next power of 2. If we allocate the buffer for record area by power-of-2, that can lead to waste of memory and fragment in the buffer. Best, Liya Fan On Tue, Jun 4, 2019 at 10:45 AM Jacques Nadeau <jacq...@dremio.com> wrote: > I'd like to see an application that always knows exactly how much memory > they need. Do the aggregation of the concatenation of two string columns > and you'll see what I mean... :) > > Arrow always does part of this by sizing record batches to avoid this > exact fragmentation... > > > -- > Jacques Nadeau > CTO and Co-Founder, Dremio > > > On Mon, Jun 3, 2019 at 7:36 PM Fan Liya <liya.fa...@gmail.com> wrote: > >> Hi @Ravindra Pindikura and @Wes McKinney, >> >> Thanks for your feedback. >> According to [1], the major benefit of buddy allocator is that, it avoids >> memory fragmentation, and thus reduces the maintenance cost and compaction >> overhead. >> However, for applications that know exactly how much memory they need >> (which is not a power of 2), this effectively move the fragmentation inside >> the buffer. >> >> Best, >> Liya Fan >> >> [1] Buddy memory allocator. >> https://www.cs.fsu.edu/~engelen/courses/COP402003/p827.pdf >> >> On Tue, Jun 4, 2019 at 5:38 AM Wes McKinney <wesmck...@gmail.com> wrote: >> >>> Note that we prefer using jemalloc in C++, which also employs the >>> buddy allocator strategy >>> >>> On Mon, Jun 3, 2019 at 4:33 PM Jacques Nadeau <jacq...@dremio.com> >>> wrote: >>> > >>> > The Netty allocator is a buddy allocator. It always allocates in power >>> of two sizes. As far as I know, this hasn't changed recently. >>> > -- >>> > Jacques Nadeau >>> > CTO and Co-Founder, Dremio >>> > >>> > >>> > On Thu, May 30, 2019 at 12:30 AM Ravindra Pindikura < >>> ravin...@dremio.com> wrote: >>> >> >>> >> >>> >> >>> >> On Thu, May 30, 2019 at 12:23 PM Micah Kornfield < >>> emkornfi...@gmail.com> wrote: >>> >>> >>> >>> (Adding Java to thread title) >>> >>> >>> >>> For more context, I pushed back on the changes in >>> >>> https://github.com/apache/arrow/pull/4358 because they don't seem >>> typical >>> >>> in memory management systems (i.e. they expose internal >>> implementation >>> >>> details of the allocator). >>> >>> >>> >>> I think https://github.com/apache/arrow/pull/4400 which makes the >>> rounding >>> >>> policy a parameter of the allocator at construction time is a better >>> >>> solution because it provides better encapsulation. I don't know if >>> there >>> >>> is something subtle about the current power of 2 allocation or if it >>> was >>> >>> just the easiest possible option to begin with. >>> >> >>> >> >>> >> I too don't know the reason for the power of 2 allocation. @Wes >>> McKinney or @Jacques Nadeau may know more about this. >>> >> >>> >> However, there is likely to be more code that assumes or optimises >>> for this case. eg. in the vector allocation code, we round-up the request >>> to make full use of the power-of-2 buffer. >>> >> >>> >>> >>> >>> Feedback on this would be >>> >>> welcome. >>> >>> >>> >>> Thanks, >>> >>> Micah >>> >>> >>> >>> On Tue, May 28, 2019 at 9:23 PM Fan Liya <liya.fa...@gmail.com> >>> wrote: >>> >>> >>> >>> > Hi all, >>> >>> > >>> >>> > We frequently encounter this problem in our code: Arrow throws >>> >>> > OutOfMemoryException even though there is sufficient memory. >>> >>> > >>> >>> > Let me illustrate with the following example, which is frequently >>> used in >>> >>> > our code: >>> >>> > >>> >>> > int requestSize = ...; >>> >>> > if (requestSize <= allocator.getLimit() - >>> allocator.getAllocatedMemory()) { >>> >>> > ArrowBuf buffer = allocator.buffer(requestSize); >>> >>> > } >>> >>> > >>> >>> > This code occasionally throws OOM. The reason is Arrow's rounding >>> behavior. >>> >>> > In particular, if the requested size is within the chunk size, the >>> buffer >>> >>> > size will be rounded to the next power of 2 (We believe this is an >>> overly >>> >>> > aggressive rounding strategy). >>> >>> > >>> >>> > For example, suppose we have 12 MB memory left, and request a >>> buffer with >>> >>> > size 10 MB. Appearantly, there is sufficient memory to meet the >>> request. >>> >>> > However, the rounding behavior rounds the request size from 10 MB >>> to 16 MB. >>> >>> > Since there is no 16 MB memory, an OutOfMemoryException will be >>> thrown. >>> >>> > >>> >>> > We propose two ways to solve this problem: >>> >>> > >>> >>> > 1) We provide a rounding option as an argument to the >>> BaseAllocator#buffer >>> >>> > method. There are two possible values for the rounding option: >>> rounding up >>> >>> > (to the next power of 2) and rounding down (to the previous power >>> of 2). In >>> >>> > the above code, the rounding down option can solve the problem. >>> >>> > >>> >>> > 2) We add a method to the allocator: >>> >>> > >>> >>> > int getRoundedSize(final int requestSize) >>> >>> > >>> >>> > This method will give the rounded buffer size, given the initial >>> request >>> >>> > size. With this method, the user can adjust their request size to >>> avoid >>> >>> > OOM. >>> >>> > >>> >>> > We have opened ARROW-5386 < >>> >>> > https://issues.apache.org/jira/browse/ARROW-5386> >>> >>> > and PR-4358 <https://github.com/apache/arrow/pull/4358> to track >>> this >>> >>> > issue. Many thanks to @emkornfield and @praveenbingo for all the >>> valuable >>> >>> > comments. >>> >>> > >>> >>> > The current status is that, solution 1) is deprecated, because it >>> may >>> >>> > return less memory than requested. >>> >>> > >>> >>> > So would you please give your valuable comments? >>> >>> > 1) Do you think solution 2) is OK? >>> >>> > 2) Do you have any other suggested solutions? >>> >>> > >>> >>> > Thank you in advance. >>> >>> > >>> >>> > Best, >>> >>> > Liya Fan >>> >>> > >>> >> >>> >> >>> >> >>> >> -- >>> >> Thanks and regards, >>> >> Ravindra. >>> >>