[ https://issues.apache.org/jira/browse/ARROW-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238413#comment-15238413 ]
Micah Kornfield commented on ARROW-100:
---------------------------------------

I don't have much experience with Flatbuffers, but I think it should be pretty fast (the Google Benchmark library is part of the toolchain if you want to write a benchmark to prove this). I can't think of a better way to estimate the size, and I think keeping the implementation simple and concise makes sense until we have numbers proving it is a performance bottleneck. If it is a bottleneck, we could probably mitigate it by keeping the serialized version around for writing (I expect the most common use case for this method will be to call GetRowBatchSize(), allocate the necessary shared memory, then push the buffers to the shared memory).

> [C++] Computing RowBatch size
> -----------------------------
>
>                 Key: ARROW-100
>                 URL: https://issues.apache.org/jira/browse/ARROW-100
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Philipp Moritz
>
> Hi,
> thank you guys for this project, I'm really enjoying what I've seen so far!
> There is an unimplemented method for getting the total size of objects:
>
>     int64_t GetRowBatchSize(const RowBatch* batch);
>
> Has somebody already started to implement it or thought about how to do it?
> It could be done by recursively adding up all the involved buffer sizes,
> building the metadata and adding its size. Let me know if you want me to
> create a draft of the implementation.
>
> -- Philipp.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
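For illustration, here is a minimal sketch of the recursive buffer-summing approach Philipp describes. The `Buffer` and `Array` structs below are simplified hypothetical stand-ins, not the actual Arrow C++ classes, and the metadata size is passed in as an opaque value since building the Flatbuffers metadata is elided:

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical simplified stand-ins for Arrow's buffer and array types;
// the real Arrow C++ API differs.
struct Buffer {
  int64_t size;  // number of bytes in the buffer
};

struct Array {
  std::vector<std::shared_ptr<Buffer>> buffers;  // e.g. validity bitmap, values
  std::vector<std::shared_ptr<Array>> children;  // nested types (list, struct, ...)
};

// Recursively sum the sizes of all buffers reachable from an array.
int64_t TotalBufferSize(const Array& array) {
  int64_t total = 0;
  for (const auto& buffer : array.buffers) {
    if (buffer) total += buffer->size;
  }
  for (const auto& child : array.children) {
    if (child) total += TotalBufferSize(*child);
  }
  return total;
}

// Sketch of GetRowBatchSize: the sum of all column buffer sizes plus the
// size of the serialized metadata, which would come from building the
// Flatbuffers record-batch message (not shown here).
int64_t GetRowBatchSize(const std::vector<std::shared_ptr<Array>>& columns,
                        int64_t metadata_size) {
  int64_t total = metadata_size;
  for (const auto& column : columns) {
    if (column) total += TotalBufferSize(*column);
  }
  return total;
}
```

With this total in hand, a writer could allocate one shared-memory region of exactly that size and copy the metadata and buffers into it, matching the use case described in the comment above.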