adriangb commented on PR #16461: URL: https://github.com/apache/datafusion/pull/16461#issuecomment-3000255690
@kosiew I'm not sure I agree with the conclusions there. Why can't we use expressions to do the schema adapting during the scan? As @alamb pointed out in https://github.com/apache/datafusion/pull/16461#issuecomment-2997870791, it's entirely possible to feed a RecordBatch into an expression and get back a new array. So unless I'm missing something, I don't think these are correct:

> Expression rewriting is great for pushdown but batch-level adapters are needed for correct, shaped data.

> No effect on RecordBatch structure.

> Limited scope (only predicates and pruning).

> Possibly poorer performance due to repeated expression rewrites.

There are no more expression rewrites than there are SchemaAdapters created. Those aren't cached either and are created for each file.

I'll put together an example to show how predicate rewrites can be used to reshape data. But also FWIW that's exactly how ProjectionExec works.
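For readers following the thread, the idea of adapting a batch by evaluating one expression per output column can be sketched with a toy model. This is not the example promised in the comment and it does not use DataFusion or Arrow APIs: `Array`, `DataType`, `Batch`, `Expr`, `rewrite_for_file` and `adapt` are all hypothetical names, built on the standard library only, and the sketch assumes just two column types and one supported cast.

```rust
// Toy model only: these types merely imitate the shape of the idea and are
// NOT DataFusion or Arrow types.

/// A nullable column of 64-bit ints or of strings.
#[derive(Debug, Clone, PartialEq)]
enum Array {
    Int64(Vec<Option<i64>>),
    Utf8(Vec<Option<String>>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int64,
    Utf8,
}

/// A batch as read from one file: named columns of equal length.
struct Batch {
    columns: Vec<(String, Array)>,
    num_rows: usize,
}

/// An expression that is evaluated against a batch and yields one array.
enum Expr {
    Column(String),
    Cast(Box<Expr>, DataType),
    NullLiteral(DataType),
}

impl Expr {
    fn evaluate(&self, batch: &Batch) -> Result<Array, String> {
        match self {
            // Look the column up by name in the file's batch.
            Expr::Column(name) => batch
                .columns
                .iter()
                .find(|(n, _)| n == name)
                .map(|(_, a)| a.clone())
                .ok_or_else(|| format!("no column {}", name)),
            Expr::Cast(inner, to) => cast(inner.evaluate(batch)?, *to),
            // A column missing from the file becomes an all-null array.
            Expr::NullLiteral(dt) => Ok(match dt {
                DataType::Int64 => Array::Int64(vec![None; batch.num_rows]),
                DataType::Utf8 => Array::Utf8(vec![None; batch.num_rows]),
            }),
        }
    }
}

fn cast(array: Array, to: DataType) -> Result<Array, String> {
    match (array, to) {
        (Array::Int64(v), DataType::Int64) => Ok(Array::Int64(v)),
        (Array::Utf8(v), DataType::Utf8) => Ok(Array::Utf8(v)),
        (Array::Int64(v), DataType::Utf8) => Ok(Array::Utf8(
            v.into_iter().map(|x| x.map(|i| i.to_string())).collect(),
        )),
        (Array::Utf8(_), DataType::Int64) => {
            Err("Utf8 -> Int64 is not modelled in this sketch".to_string())
        }
    }
}

/// Built once per file: one expression per table column, written in terms
/// of the columns that the file actually contains.
fn rewrite_for_file(
    table_schema: &[(&str, DataType)],
    file_schema: &[(&str, DataType)],
) -> Vec<Expr> {
    table_schema
        .iter()
        .map(|(name, want)| match file_schema.iter().find(|(n, _)| n == name) {
            Some((_, have)) if have == want => Expr::Column(name.to_string()),
            Some(_) => Expr::Cast(Box::new(Expr::Column(name.to_string())), *want),
            None => Expr::NullLiteral(*want),
        })
        .collect()
}

/// Reshape a file batch into table-schema columns by evaluating each expression.
fn adapt(batch: &Batch, exprs: &[Expr]) -> Result<Vec<Array>, String> {
    exprs.iter().map(|e| e.evaluate(batch)).collect()
}
```

In this model, a file holding only an Int64 `id` column, adapted to a table schema of `id: Utf8, name: Utf8`, yields a string `id` column plus an all-null `name` column. The expressions are built once per file and then evaluated per batch, which mirrors the point that rewrites happen no more often than adapters are created.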