wgtmac commented on PR #17877: URL: https://github.com/apache/arrow/pull/17877#issuecomment-1379024270
> Adding more clarification here.
>
> The change proposed here is about the vector of values that is returned. Currently, we first come up with the location of the null values and then make a vector that has an empty space for the null values (reading spaced). When reading dense, we do not make space for the null values.
>
> For example:
> def_levels: [0, 1]
> values: [10]
>
> Reading spaced: [null, 10]
> Reading dense: [10]
>
> The change here is meaningful for nullable columns. The savings come when we have null values. The issue is that 1) it is inefficient to come up with the exact space of the nulls and move the values around to make space for null values and 2) Some readers may want to indeed read dense and so they have to move the null values out again.

Thanks for the explanation. I got your point. In that case, I assume the caller does not care about the null values and will not be able to sync values of different columns (because they may have different null slots), am I right? Is it better to add a new function to support this case?
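
To make the spaced/dense distinction and the column-sync concern concrete, here is a minimal Python sketch. It is not Arrow's actual implementation: the function names `read_spaced` and `read_dense`, the `max_def_level` parameter, and the second column's data are all hypothetical and chosen only for illustration. It assumes a flat nullable column where a definition level equal to `max_def_level` means "value present".

```python
from typing import List, Optional


def _check(def_levels: List[int], values: List[int], max_def_level: int) -> None:
    # The number of present slots must match the number of stored values.
    present = sum(1 for d in def_levels if d == max_def_level)
    if present != len(values):
        raise ValueError("def_levels and values are inconsistent")


def read_spaced(def_levels: List[int], values: List[int],
                max_def_level: int = 1) -> List[Optional[int]]:
    """Hypothetical spaced read: one output slot per row, None for nulls."""
    _check(def_levels, values, max_def_level)
    it = iter(values)
    return [next(it) if d == max_def_level else None for d in def_levels]


def read_dense(def_levels: List[int], values: List[int],
               max_def_level: int = 1) -> List[int]:
    """Hypothetical dense read: only the non-null values, no slots for nulls."""
    _check(def_levels, values, max_def_level)
    return list(values)


# Column A is the example from the quoted comment; column B is made up.
a_levels, a_values = [0, 1], [10]
b_levels, b_values = [1, 0], [20]

spaced_a = read_spaced(a_levels, a_values)  # row-aligned
spaced_b = read_spaced(b_levels, b_values)  # row-aligned
dense_a = read_dense(a_levels, a_values)
dense_b = read_dense(b_levels, b_values)

# Spaced output keeps row positions, so columns can be zipped row by row.
rows = list(zip(spaced_a, spaced_b))

# Dense output loses row positions: index 0 of dense_a is row 1,
# while index 0 of dense_b is row 0.
misaligned = list(zip(dense_a, dense_b))
```

In this sketch, zipping the dense outputs pairs 10 with 20 even though they come from different rows, which is the syncing problem raised above. A caller reading dense would need the definition levels (or a validity bitmap) alongside the values to recover row alignment.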
