Hello,
I'm storing RecordBatch objects in a local cache to improve performance. I
want to keep track of the memory usage to stay within bounds. The arrays
stored in the batch are not nested.
The best way I have come up with to compute the size of a RecordBatch is:
    size_t arrowSize = 0;
    for (auto i = 0; i < arrowBatch->num_columns(); ++i) {
        auto column = arrowBatch->column_data(i);
        if (column->buffers[0])
            arrowSize += column->buffers[0]->size();
        if (column->buffers[1])
            arrowSize += column->buffers[1]->size();
    }
Does this look reasonable? I guess we are overestimating a bit due to
buffer alignment, but that should be fine.
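
One variant I am considering, as a rough sketch only: sum every non-null
buffer of a column instead of hard-coding indices 0 and 1, since string
columns carry three buffers (validity, offsets, data). FakeBuffer and
FakeArrayData below are hypothetical stand-ins that only mimic the shape of
arrow::Buffer and arrow::ArrayData so the snippet compiles without Arrow;
TotalBufferBytes is likewise a name made up for illustration.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-ins (NOT Arrow types): they copy only the members the
// sketch needs, i.e. a size() accessor and a vector of shared buffers.
struct FakeBuffer {
  int64_t bytes;
  int64_t size() const { return bytes; }
};

struct FakeArrayData {
  std::vector<std::shared_ptr<FakeBuffer>> buffers;
};

// Sum the sizes of all non-null buffers of one column. Works for any type
// exposing `buffers` as a range of smart pointers with a size() method.
template <typename ArrayDataLike>
int64_t TotalBufferBytes(const ArrayDataLike& data) {
  int64_t total = 0;
  for (const auto& buffer : data.buffers) {
    if (buffer) total += buffer->size();  // skip absent (null) buffers
  }
  return total;
}
```

With the real library I would expect to call this once per column on the
result of column_data(i) and add up the totals. It still ignores child data
and dictionaries, which is fine here only because the arrays are not nested.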
Thanks!
Rares