pitrou commented on PR #43661: URL: https://github.com/apache/arrow/pull/43661#issuecomment-3520514776
> I would be happy to open a separate issue to look at `CastBinaryToBinaryOffsets`, however, I wasn't sure how to make this cast more efficient without changing the API significantly? A single cast of a slice is O(offset + length), since a new buffer for the string offsets needs to be created with the same shape as the original slice.

No API needs to be changed here. The output of casting is by definition the same *logical* length as the input, but it does not need to have the same physical allocation shape (i.e. offset).

The current code reuses the input offset (through `ZeroCopyCastExec`) mostly because it's easier. But it should be simple as well to produce the output with a different offset (not necessarily 0, because we would like to reuse the null bitmap and that implies we must keep `offset % 8` the same).
