[
https://issues.apache.org/jira/browse/ARROW-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238413#comment-15238413
]
Micah Kornfield commented on ARROW-100:
---------------------------------------
I don't have much experience with flatbuffers, but I think it should be pretty
fast (the Google Benchmark library is part of the toolchain if you want to
write a benchmark to prove this). I can't think of a better way to estimate
its size, and I think keeping the implementation simple and concise makes sense
until we have numbers showing that it is a performance bottleneck. If it is
a bottleneck, we could probably mitigate it by keeping the serialized
version around for writing (I think the most common use-case for this method
will be to call GetRowBatchSize(), allocate the necessary shared memory, and
then push the buffers into that shared memory).
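
Roughly, that use-case looks like the following sketch (names like
AllocateSharedMemory and WriteRowBatch are hypothetical placeholders, not
existing Arrow APIs; only the size -> allocate -> push flow is the point):

    #include <cstdint>

    class RowBatch;                                           // Arrow's row batch (forward declaration)
    int64_t GetRowBatchSize(const RowBatch* batch);           // the method in question
    uint8_t* AllocateSharedMemory(int64_t size);              // hypothetical placeholder
    void WriteRowBatch(const RowBatch* batch, uint8_t* dst);  // hypothetical placeholder

    // The flow described above: size the batch, allocate, then push the buffers.
    void PushToSharedMemory(const RowBatch* batch) {
      int64_t size = GetRowBatchSize(batch);      // total footprint: metadata + buffers
      uint8_t* dst = AllocateSharedMemory(size);  // reserve exactly that much shared memory
      WriteRowBatch(batch, dst);                  // copy metadata and buffers into the region
    }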
> [C++] Computing RowBatch size
> -----------------------------
>
> Key: ARROW-100
> URL: https://issues.apache.org/jira/browse/ARROW-100
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++
> Reporter: Philipp Moritz
>
> Hi,
> thank you guys for this project; I'm really enjoying what I've seen so far!
> There is an unimplemented method for getting the total size of objects:
> int64_t GetRowBatchSize(const RowBatch* batch);
> Has somebody already started to implement it or thought about how to do it?
> It could be done by recursively adding up all the involved buffer sizes,
> building the metadata, and adding its size. Let me know if you want me to
> create a draft of the implementation.
> -- Philipp.
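
A minimal sketch of the recursive approach described in the issue might look
like this. It is not actual Arrow code: the Array/Buffer/RowBatch types below
are simplified stand-ins, and the metadata-size step is left as a comment
because it depends on how the flatbuffer metadata is built.

    #include <cstdint>
    #include <memory>
    #include <vector>

    struct Buffer { int64_t size; };

    struct Array {
      std::vector<std::shared_ptr<Buffer>> buffers;   // e.g. validity bitmap, offsets, values
      std::vector<std::shared_ptr<Array>> children;   // for nested types (list, struct, ...)
    };

    struct RowBatch {
      std::vector<std::shared_ptr<Array>> columns;
    };

    // Recursively sum the sizes of all buffers owned by an array and its children.
    int64_t ArraySize(const Array& arr) {
      int64_t total = 0;
      for (const auto& buf : arr.buffers) {
        if (buf) total += buf->size;
      }
      for (const auto& child : arr.children) {
        if (child) total += ArraySize(*child);
      }
      return total;
    }

    int64_t GetRowBatchSize(const RowBatch* batch) {
      int64_t total = 0;
      for (const auto& col : batch->columns) {
        if (col) total += ArraySize(*col);
      }
      // Plus the size of the serialized flatbuffer metadata, obtained by actually
      // building it (hypothetical helper; this is the part the comment above
      // suggests could be cached and reused for the subsequent write):
      // total += SerializeMetadata(*batch).size();
      return total;
    }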