Hi Simon,

Yes, I think it would potentially be a good addition to the BinaryBuilder. Other people might have opinions on this, so the best way forward would be to open a JIRA with a proposal for an API and send a PR (I imagine this should be a fairly small change, so most of the discussion could probably happen on the PR).

Thanks,
Micah

[1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/array/builder_binary.h#L505

On Wed, Jul 29, 2020 at 5:53 AM Simon Dumke <[email protected]> wrote:

> Hi Micah,
>
> very late on my part but still: Thanks for your reply! I've followed your
> suggestion and it is working as expected. I believe this functionality
> could be added to the BinaryBuilder. Would this be a sensible feature to
> add?
>
> Kind regards,
> Simon
>
> On 19.06.2020 at 05:39, Micah Kornfield wrote:
>
> Hi Simon,
> I don't think there is a public API for this in C++. You would have to
> presize a values buffer to the size expected for the compressed data, have
> the compressor output directly to that buffer while recording the necessary
> offsets. You could then construct the BinaryArray directly with these
> buffers (I would need to double check, but you might need to construct an
> intermediate ArrayData object).
>
> Hope this helps.
>
> Micah
>
>
>
> On Thursday, June 18, 2020, Simon Dumke <[email protected]> wrote:
>
>> Hi all,
>>
>> I would like to build RecordBatches with (among others) a BinaryArray
>> column containing compressed data. When filling the BinaryArray, I would
>> like to allow the compressor to immediately output into the Arrow Buffer
>> instead of allocating an output buffer and then copying the data into Arrow
>> Buffers.
>>
>> Is such an approach possible? And if so, how do I achieve this?
>>
>> I'd be thankful for any insights!
>>
>> Best regards,
>>
>> Simon
>>
>>
> --
> Simon Dumke
>
> Developer - CoDaC
> Department Operation
>
> Max Planck Institut for Plasmaphysics
> Wendelsteinstrasse 1
> 17491 Greifswald, Germany
>
> Phone: +49(0)3834 88 1215
>
>
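
For readers of the archive, the approach described in the 19.06.2020 reply (presize a values buffer, let the compressor write straight into it, record the offsets) could look roughly like the sketch below. It deliberately avoids the Arrow API and uses plain `std::vector`s, so it only illustrates the values-plus-offsets layout; wrapping the two buffers in Arrow `Buffer`s and building a `BinaryArray` from them is not shown. `ToyMaxCompressedLen`, `ToyCompress`, `BinaryColumn` and `BuildCompressedColumn` are hypothetical names, and the run-length encoder is only a stand-in for a real compressor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for a codec's "maximum compressed length" query.
// The toy run-length encoding below emits at most 2 bytes per input byte.
std::size_t ToyMaxCompressedLen(std::size_t input_len) { return 2 * input_len; }

// Hypothetical stand-in for a real compressor: run-length encodes `in` as
// (run length, byte) pairs, writing directly into `out`.
// Returns the number of bytes written.
std::size_t ToyCompress(const std::uint8_t* in, std::size_t in_len,
                        std::uint8_t* out) {
  std::size_t written = 0;
  std::size_t i = 0;
  while (i < in_len) {
    const std::uint8_t value = in[i];
    std::size_t run = 1;
    while (i + run < in_len && in[i + run] == value && run < 255) ++run;
    out[written++] = static_cast<std::uint8_t>(run);
    out[written++] = value;
    i += run;
  }
  return written;
}

// Two buffers in the shape of a variable-size binary column:
// concatenated value bytes plus n + 1 int32 offsets.
struct BinaryColumn {
  std::vector<std::uint8_t> values;
  std::vector<std::int32_t> offsets;
};

BinaryColumn BuildCompressedColumn(const std::vector<std::string>& inputs) {
  BinaryColumn col;

  // 1. Presize the values buffer to the worst-case total compressed size.
  std::size_t capacity = 0;
  for (const auto& s : inputs) capacity += ToyMaxCompressedLen(s.size());
  col.values.resize(capacity);

  // 2. Compress each input directly into the values buffer and record
  //    the end position of each element as the next offset.
  col.offsets.reserve(inputs.size() + 1);
  col.offsets.push_back(0);
  std::size_t pos = 0;
  for (const auto& s : inputs) {
    pos += ToyCompress(reinterpret_cast<const std::uint8_t*>(s.data()),
                       s.size(), col.values.data() + pos);
    assert(pos <= static_cast<std::size_t>(
                      std::numeric_limits<std::int32_t>::max()));
    col.offsets.push_back(static_cast<std::int32_t>(pos));
  }

  // 3. Shrink the values buffer to the bytes actually written.
  col.values.resize(pos);
  return col;
}
```

Calling `BuildCompressedColumn({"aaaa", "bc", ""})` yields one values buffer and four offsets, with no intermediate per-element output buffer and no copy after compression. The worst-case presizing trades some temporary memory for that.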
