theirix commented on PR #23211: URL: https://github.com/apache/datafusion/pull/23211#issuecomment-4848383484
@pepijnve thanks for bringing this up! When I added string view support to Arrow, it didn't land in a release yet for my concat PR. Now it could be nicely unified into a single call, and the duplicated code from `kernels.rs` should be removed. As you have shown, benchmarks are okay, so the use of `concat_elements_dyn` with a heavier `downcast_ref`, which uses runtime lookups, works fine - please double-check just in case. @alamb regarding the `FixedSizeBinary` case in DataFusion. I decided to cast them to Binaries, since coercion gave me errors like this: `DataFusion error: Error during planning: Cannot infer common string type for string concat operation FixedSizeBinary(2) || FixedSizeBinary(5)` for a query `SELECT arrow_cast(x'6361', 'FixedSizeBinary(2)') || arrow_cast(x'68656c6c6f', 'FixedSizeBinary(5)')`. Deducing a return type in ConcatFunc is simple, and these Arrow arrays can be easily concatenated. However, you have to announce a coerced type in `type_coercion/binary.rs` with Signature::Uniform, which should be the same as the source types. Then, during execution, we have a return type `FixedSizeBinary(2+5)`, but a coerced type `FixedSizeBinary(2)`. It required a larger refactor of the `BinaryTypeCoercer` with non-uniform input and output types - the thing I wanted to avoid. I'll be happy if we can moot this enhancement. If FSB is too stubborn to support, we can make a specific handling for it in a branch, keeping the rest of the code in `concat_elements_dyn` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
