comphead commented on PR #19716:
URL: https://github.com/apache/datafusion/pull/19716#issuecomment-3745284505
> > * apply casted stream on top of scan stream so we still can manage batch
> > to batch mapping (potentially it could affect filter pushdown)
>
> I'm a bit confused, isn't it pretty much the same thing? What we do in the
> Parquet opener is `stream.map(|maybe_batch| { let batch = maybe_batch?;
> projector.project(batch) })` which essentially builds a "casted" stream.
>
> It also doesn't really seem like something you should have to do, if you
> provide the casting rules you want via `PhysicalExprAdapter` the Parquet opener
> will take care of essentially what this PR is doing and apply it to the stream.

Unfortunately it is slightly more than just casting (applying default values,
unifying schemas, etc.). We are doing some RB->RB (RecordBatch to RecordBatch)
modification just after the scan. Hopefully we can do this better in the
future, as this part is expensive.
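
For readers following the thread, here is a rough sketch of the per-batch mapping being discussed: each batch from the scan goes through one adapter step that unifies it with the table schema and fills in defaults for missing columns, and errors pass through untouched. This is only an illustration under loud assumptions. `Batch`, `Projector`, `adapt_stream`, and `demo` are hypothetical names, a plain `std` iterator stands in for the async stream, and none of this is DataFusion's or Arrow's actual API.

```rust
// Toy stand-in for an Arrow RecordBatch (assumption): named columns of
// optional i64 values, all of the same length.
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    columns: Vec<(String, Vec<Option<i64>>)>,
}

impl Batch {
    // Row count is taken from the first column; an empty batch has 0 rows.
    fn num_rows(&self) -> usize {
        self.columns.first().map_or(0, |(_, values)| values.len())
    }
}

// Hypothetical per-batch adapter: reorders columns to match the table schema
// and fills columns that are missing from the file with a default value.
struct Projector {
    // (column name, default value used when the file lacks the column)
    table_schema: Vec<(String, Option<i64>)>,
}

impl Projector {
    fn project(&self, batch: Batch) -> Result<Batch, String> {
        let rows = batch.num_rows();
        let columns = self
            .table_schema
            .iter()
            .map(|(name, default)| {
                let values = batch
                    .columns
                    .iter()
                    .find(|(file_name, _)| file_name == name)
                    .map(|(_, values)| values.clone())
                    // Column absent in the file: synthesize it from the default.
                    .unwrap_or_else(|| vec![*default; rows]);
                (name.clone(), values)
            })
            .collect();
        Ok(Batch { columns })
    }
}

// Wraps the scan output so that every batch is adapted one-to-one, mirroring
// the `stream.map(|maybe_batch| { ... })` shape quoted above.
fn adapt_stream<'a, I>(
    scan: I,
    projector: &'a Projector,
) -> impl Iterator<Item = Result<Batch, String>> + 'a
where
    I: Iterator<Item = Result<Batch, String>> + 'a,
{
    scan.map(move |maybe_batch| {
        let batch = maybe_batch?; // scan errors are forwarded unchanged
        projector.project(batch)
    })
}

// Small usage example: one good batch missing column "b", then one scan error.
fn demo() -> Vec<Result<Batch, String>> {
    let projector = Projector {
        table_schema: vec![("a".to_string(), None), ("b".to_string(), Some(0))],
    };
    let scan = vec![
        Ok(Batch {
            columns: vec![("a".to_string(), vec![Some(1), Some(2)])],
        }),
        Err("decode error".to_string()),
    ];
    let out: Vec<_> = adapt_stream(scan.into_iter(), &projector).collect();
    out
}
```

Because the mapping stays batch to batch, the output has exactly one item per input item. The cost mentioned above shows up in `project`, which in this toy version copies every column on every batch.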
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]