adriangb commented on issue #20135:
URL: https://github.com/apache/datafusion/issues/20135#issuecomment-4489764202
IIUC the plan we end up with is:
```text
FilterExec: projection=[other_col], filter=[file_row_index() > 3]
DataSourceExec: projection=[other_col]
```
Because by design this UDF can only be evaluated by `ParquetOpener`.
That makes sense, but it is unfortunate.
Incidentally, the (otherwise unrelated) changes in
https://github.com/apache/datafusion/pull/22144 /
https://github.com/apache/datafusion/pull/22237 happen to fix this: they allow
*any* filter to be pushed down into `ParquetOpener`, because the filter no
longer has to be evaluated as a row filter.
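
To make the point concrete, here is a toy model of why evaluating the filter inside the opener helps. It uses only the standard library and is not DataFusion code: `FileRow`, `open_file`, and `scan_with_pushed_filter` are hypothetical names. The assumption is that the per-file row index is only known while the file is being read, so a predicate that needs it has to run there.

```rust
// Toy model only: none of these names exist in DataFusion.

/// A row as seen inside the (hypothetical) opener: the projected value
/// plus its position within the file.
struct FileRow {
    row_index: u64,
    other_col: i64,
}

/// Stand-in for the opener. It is the only place where the per-file
/// row index is known.
fn open_file(values: &[i64]) -> impl Iterator<Item = FileRow> + '_ {
    values.iter().enumerate().map(|(i, &v)| FileRow {
        row_index: i as u64,
        other_col: v,
    })
}

/// Applies the predicate inside the opener, where `row_index` is available.
/// Only `other_col` leaves the scan, mirroring `projection=[other_col]`.
fn scan_with_pushed_filter(values: &[i64], pred: impl Fn(&FileRow) -> bool) -> Vec<i64> {
    open_file(values)
        .filter(|r| pred(r))
        .map(|r| r.other_col)
        .collect()
}

fn main() {
    let file = [10, 20, 30, 40, 50, 60];
    // Analogue of `file_row_index() > 3`.
    let out = scan_with_pushed_filter(&file, |r| r.row_index > 3);
    println!("{:?}", out); // prints [50, 60]
}
```

Once the scan has emitted only `other_col`, a filter sitting above it has nothing to compute the row index from, which is the situation in the plan above.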
Maybe this is a fundamental limitation of the UDF approach? The other
approach we explored (adding some sort of system column) requires, if I
remember correctly, invasive modifications to the concept of a schema itself.
That was the main con, but I'm open to re-exploring it.
Either way, neither of these seems to be a blocker for
https://github.com/apache/datafusion/pull/22026, and I still plan on merging
that once 54 is released @mbutrovich