alamb commented on code in PR #3470:
URL: https://github.com/apache/arrow-datafusion/pull/3470#discussion_r971230551
##########
datafusion/core/src/physical_plan/file_format/row_filter.rs:
##########
@@ -202,7 +202,18 @@ impl<'a> ExprRewriter for FilterCandidateBuilder<'a> {
fn mutate(&mut self, expr: Expr) -> Result<Expr> {
if let Expr::Column(Column { name, .. }) = &expr {
if self.file_schema.field_with_name(name).is_err() {
- return Ok(Expr::Literal(ScalarValue::Null));
+ // the column expr must be in the table schema
+ return match self.table_schema.field_with_name(name) {
Review Comment:
If there is a predicate with column references in it, in order to handle it
reasonably in all cases, I suspect we need to know the type of that column. I
definitely had issues in the past trying to use `ScalarValue::Null` in a
predicate rather than the properly typed null.
I would suggest passing down the complete schema to this reader, or
rewriting any references to projected columns to `NULL` higher up where we do
have the complete schema and type information
That being said, keeping this PR's change minimal and returning a generic
`DataType::Null` in the case where the column is present seems like a good way
to proceed -- it unblocks @liukun4515 but doesn't break any other cases.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]