isidentical commented on issue #3518: URL: https://github.com/apache/arrow-datafusion/issues/3518#issuecomment-1264450789
> reusing the null buffer from the input array (instead of rebuilding in the iterator)

I was looking into this one, but even in cases where the array is very sparse (in my testing I used 20% data / 80% nulls), there doesn't seem to be an observable difference between building the underlying data buffers ourselves while reusing the existing null buffer, and rebuilding everything.

The experiment is [in this branch](https://github.com/isidentical/arrow-datafusion/pull/3/files), and the results (in release mode) are:

- baseline: ~0.721s
- that branch: ~0.690s

So roughly a 4-5% speed-up, but I strongly suspect it is just noise. (There is also a chance that I completely misunderstood the concept 😄 so I am very open to input on the experiment code and on what else I can try.)
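For readers unfamiliar with the two strategies being compared, here is a minimal std-only sketch. It is not the code from the linked branch and does not use the Arrow API: `NullableArray`, `negate_rebuild`, and `negate_reuse` are hypothetical names, and the validity mask is a plain `Vec<bool>` rather than a packed bitmap. It only shows the shape of the difference: one path rebuilds values and validity slot by slot through `Option`, the other transforms the value buffer directly and copies the existing validity mask wholesale.

```rust
/// Hypothetical stand-in for a nullable primitive array (not an Arrow type).
#[derive(Debug, Clone, PartialEq)]
struct NullableArray {
    values: Vec<i64>,
    validity: Vec<bool>, // true = slot holds a value, false = null
}

impl NullableArray {
    /// Null slots get a placeholder value of 0.
    fn from_options(items: &[Option<i64>]) -> Self {
        NullableArray {
            values: items.iter().map(|v| v.unwrap_or(0)).collect(),
            validity: items.iter().map(|v| v.is_some()).collect(),
        }
    }

    /// Iterate as Option<i64>, consulting the validity mask for every slot.
    fn iter(&self) -> impl Iterator<Item = Option<i64>> + '_ {
        self.values
            .iter()
            .zip(self.validity.iter())
            .map(|(v, ok)| if *ok { Some(*v) } else { None })
    }
}

/// Strategy 1: rebuild both values and validity while iterating.
fn negate_rebuild(input: &NullableArray) -> NullableArray {
    let mut values = Vec::with_capacity(input.values.len());
    let mut validity = Vec::with_capacity(input.values.len());
    for item in input.iter() {
        match item {
            Some(v) => {
                values.push(v.wrapping_neg());
                validity.push(true);
            }
            None => {
                values.push(0);
                validity.push(false);
            }
        }
    }
    NullableArray { values, validity }
}

/// Strategy 2: transform the value buffer directly and reuse (here: copy)
/// the input's validity mask. Null slots are transformed too, which is
/// harmless here because their placeholder is 0.
fn negate_reuse(input: &NullableArray) -> NullableArray {
    NullableArray {
        values: input.values.iter().map(|v| v.wrapping_neg()).collect(),
        validity: input.validity.clone(),
    }
}

fn main() {
    // 20% data / 80% nulls, mirroring the ratio used in the experiment.
    let items: Vec<Option<i64>> = (0..10)
        .map(|i| if i % 5 == 0 { Some(i) } else { None })
        .collect();
    let input = NullableArray::from_options(&items);
    assert_eq!(negate_rebuild(&input), negate_reuse(&input));
}
```

Both functions produce the same array; any performance difference would come only from skipping the per-slot validity handling, which is consistent with the gap being small in the measurements above.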
