Re: Chunk Table into RecordBatches of at most 512MB each

Greg Lowe Thu, 14 Mar 2024 17:52:50 -0700

I had a quick look at the Arrow Go source code. In the IPC case, when using
the Go allocator, it looks like it allocates to the nearest multiple of 64
bytes. I'm not very familiar with the details of how the Go runtime handles
large byte array allocations. But with a quick scan of the docs, I believe
these get rounded to the nearest page size of 4K. So I don't think there's
a power of two issue when reading record batches via IPC.
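
To make the difference concrete, here is a minimal sketch comparing 64-byte padding with power-of-two growth. The helper names `roundUp64` and `nextPowerOfTwo` are made up for illustration and are not Arrow Go functions; rounding *up* to the 64-byte boundary is an assumption about what "nearest multiple" means here.

```go
package main

import "fmt"

// alignment is the 64-byte boundary discussed above. Treating it as the
// padding unit for every buffer is an assumption made for illustration.
const alignment = 64

// roundUp64 returns the smallest multiple of 64 that is >= n (n >= 0).
// Hypothetical helper, not part of the Arrow Go API.
func roundUp64(n int) int {
	return (n + alignment - 1) &^ (alignment - 1)
}

// nextPowerOfTwo returns the smallest power of two that is >= n (n >= 1).
// Hypothetical helper that models a doubling growth strategy.
func nextPowerOfTwo(n int) int {
	p := 1
	for p < n {
		p <<= 1
	}
	return p
}

func main() {
	// Compare the two rounding strategies for a few requested sizes.
	for _, n := range []int{1, 64, 65, 1000, 1<<20 + 1} {
		fmt.Printf("requested=%d padded64=%d pow2=%d\n",
			n, roundUp64(n), nextPowerOfTwo(n))
	}
}
```

With 64-byte padding the overhead per buffer is at most 63 bytes, whereas power-of-two growth can nearly double the requested size just past a boundary.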



On Fri, 15 Mar 2024 at 13:34, Jacques Nadeau <[email protected]> wrote:

> I would expect go to allocate to IPC size but the underlying allocator
> behavior will still be present. It seems like golang uses tcmalloc so it
> would probably round up to the next tcmalloc size. I'd assume waste
> increases at larger allocation sizes but you'd have to review the detail to
> better understand.
>
> On Thu, Mar 14, 2024, 2:15 PM Greg Lowe <[email protected]> wrote:
>
>> Note, I'm mostly concerned about constraining the memory use when reading
>> record batches from the IPC format. I'm not so concerned about memory use
>> by the builders while writing them.
>>
>> Is the power of two allocation also used when reading a record batch from
>> an IPC record? I would have assumed that wouldn't be necessary since the
>> required sizes would be known up front and be encoded in the IPC format.
>>
>> On Fri, 15 Mar 2024 at 11:33, Jacques Nadeau <[email protected]> wrote:
>>
>>> It depends on the implementation, but some implementations use power-of-two
>>> allocations or similar (not sure on the golang front). So one might start
>>> with space for 80 integers and then, once you get to 81, the allocation
>>> doubles to 160 integers. I know the Java library historically operated this
>>> way (albeit not exactly a power of two because of space related to colocated
>>> allocations for nullability). So trying to constrain memory with
>>> record-at-a-time writing/reallocation will likely turn out pretty poorly. I
>>> recommend you preallocate your batch based on initial size estimates, up to
>>> your max memory, then fill things in and adjust your estimation algorithm
>>> over time.
>>>
>>> On Thu, Mar 14, 2024, 12:25 PM Greg Lowe <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'm aiming to reply to the following thread. Not sure if this message
>>>> will appear in the right place.
>>>> https://lists.apache.org/thread/93kg641xk52lm5m11vwodbyc1hzvbnf3
>>>>
>>>> I've implemented a workaround for a similar use case. I thought I'd
>>>> share it, either so that someone can recommend a better solution using
>>>> the existing API, or to discuss additions to the API that could make
>>>> this easier.
>>>>
>>>> In my use case the limitation is the memory available when reading a
>>>> record batch. I'd like to keep the in-memory size of each record batch
>>>> within a maximum number of bytes. Note, I'm not concerned about the disk
>>>> size (which will be smaller due to LZ4 compression).
>>>>
>>>> So when appending values, I'd like to be able to specify a
>>>> maximum value, say 500MB, and then once that's exceeded write the
>>>> record batch to disk.
>>>>
>>>> The data types I need to support are float64, int64,
>>>> bool, listof(float64), listof(int64), listof(bool), and strings.
>>>>
>>>> In my use case, I'm writing to a builder in a row-wise fashion. My
>>>> current approach is, when I write each cell I increment a variable which
>>>> keeps track of the approximate used memory size in bytes. Luckily, for the
>>>> types I need to support, this is fairly simple to track approximately.
>>>>
>>>> e.g. a float64 is "+8", a list of float64 is "len(floats)*8+8".
>>>>
>>>> Is there a better way to do this using the existing API?
>>>>
>>>> Would it make sense for this to be supported natively by the API?
>>>>
>>>> I'm using the Go implementation. But I guess this applies equally to
>>>> the C++, and maybe other implementations too.
>>>>
>>>> Thanks for taking the time to read this.
>>>>
>>>> Cheers,
>>>> Greg
>>>>
>>>
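
For reference, the row-wise size tracking described in the original message could look roughly like the sketch below. The `byteBudget` type, its methods, and `simulate` are hypothetical names, not part of the Arrow Go API. The per-cell costs are the approximations from the message ("+8" for a float64, "len(floats)*8+8" for a list of float64), and the flush itself is only counted rather than actually written to disk.

```go
package main

import "fmt"

// byteBudget is a hypothetical helper (not part of the Arrow Go API) that
// keeps a running estimate of the in-memory size of the batch being built.
type byteBudget struct {
	used  int64 // approximate bytes appended since the last flush
	limit int64 // maximum bytes per record batch, e.g. 500 MB
}

// addFloat64 accounts for one float64 cell: "+8".
func (b *byteBudget) addFloat64() { b.used += 8 }

// addFloat64List accounts for a list of n float64 values: "n*8+8".
func (b *byteBudget) addFloat64List(n int) { b.used += int64(n)*8 + 8 }

// exceeded reports whether the estimate has passed the limit.
func (b *byteBudget) exceeded() bool { return b.used > b.limit }

// reset starts a fresh estimate after a batch has been flushed.
func (b *byteBudget) reset() { b.used = 0 }

// simulate appends rows of (float64, list of 3 float64) and counts how
// often a flush would be triggered. In real code the flush would build the
// record batch and write it out before resetting.
func simulate(rows int, limit int64) (flushes int, leftover int64) {
	b := byteBudget{limit: limit}
	for i := 0; i < rows; i++ {
		b.addFloat64()
		b.addFloat64List(3)
		if b.exceeded() {
			flushes++
			b.reset()
		}
	}
	return flushes, b.used
}

func main() {
	flushes, leftover := simulate(5, 64)
	fmt.Printf("flushes=%d leftover=%d bytes\n", flushes, leftover)
}
```

Because the check runs after each row is appended, a batch can overshoot the limit by up to one row, so the limit should be set a little below the real memory ceiling.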
