alamb commented on code in PR #3470:
URL: https://github.com/apache/arrow-datafusion/pull/3470#discussion_r971230551


##########
datafusion/core/src/physical_plan/file_format/row_filter.rs:
##########
@@ -202,7 +202,18 @@ impl<'a> ExprRewriter for FilterCandidateBuilder<'a> {
     fn mutate(&mut self, expr: Expr) -> Result<Expr> {
         if let Expr::Column(Column { name, .. }) = &expr {
             if self.file_schema.field_with_name(name).is_err() {
-                return Ok(Expr::Literal(ScalarValue::Null));
+                // the column expr must be in the table schema
+                return match self.table_schema.field_with_name(name) {

Review Comment:
   If a predicate contains column references, I suspect we need to know the 
type of each referenced column in order to handle it reasonably in all cases. I 
definitely had issues in the past trying to use `ScalarValue::Null` in a 
predicate rather than a properly typed null. 
   
   I would suggest either passing the complete schema down to this reader, or 
rewriting any references to projected columns to `NULL` higher up, where we do 
have the complete schema and type information.
   
   That being said, keeping this PR's change minimal and returning a generic 
null of type `DataType::Null` when the column is present in the table schema 
seems like a good way to proceed -- it unblocks @liukun4515 but doesn't break 
any other cases. 
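   
   To make the typed-null idea concrete, here is a minimal, self-contained 
sketch of what such a rewrite could look like. The `DataType`, `Scalar`, `Expr` 
and `Schema` types and the `typed_null` / `rewrite_missing` helpers below are 
hypothetical stand-ins invented for illustration; they are not DataFusion's 
actual types or the code in this PR.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for illustration only; these are NOT DataFusion's
// `DataType`, `ScalarValue`, or `Expr` types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Null,
    Int32,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Null,                 // untyped null
    Int32(Option<i32>),   // `Int32(None)` is a null that still knows its type
    Utf8(Option<String>), // `Utf8(None)` likewise
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(Scalar),
    Eq(Box<Expr>, Box<Expr>),
}

// Toy schema: column name -> type.
type Schema = HashMap<String, DataType>;

// Build a null literal that carries the given type.
fn typed_null(data_type: &DataType) -> Scalar {
    match data_type {
        DataType::Int32 => Scalar::Int32(None),
        DataType::Utf8 => Scalar::Utf8(None),
        DataType::Null => Scalar::Null,
    }
}

// Replace columns that are missing from the file schema with a null typed
// according to the table schema. A column absent from both schemas is an error.
fn rewrite_missing(expr: Expr, file: &Schema, table: &Schema) -> Result<Expr, String> {
    match expr {
        Expr::Column(name) => {
            if file.contains_key(&name) {
                return Ok(Expr::Column(name));
            }
            match table.get(&name) {
                Some(data_type) => Ok(Expr::Literal(typed_null(data_type))),
                None => Err(format!("column {name} not found in table schema")),
            }
        }
        Expr::Eq(left, right) => Ok(Expr::Eq(
            Box::new(rewrite_missing(*left, file, table)?),
            Box::new(rewrite_missing(*right, file, table)?),
        )),
        literal @ Expr::Literal(_) => Ok(literal),
    }
}

fn main() {
    let table: Schema = HashMap::from([
        ("a".to_string(), DataType::Int32),
        ("b".to_string(), DataType::Utf8),
    ]);
    // The file on disk only contains column `a`.
    let file: Schema = HashMap::from([("a".to_string(), DataType::Int32)]);

    // Predicate: b = 'x'
    let predicate = Expr::Eq(
        Box::new(Expr::Column("b".to_string())),
        Box::new(Expr::Literal(Scalar::Utf8(Some("x".to_string())))),
    );
    let rewritten = rewrite_missing(predicate, &file, &table).unwrap();
    println!("{rewritten:?}");
}
```

   In this sketch both sides of the rewritten comparison keep a `Utf8` type, 
so any later type checking sees a consistent expression. Falling back to the 
untyped `Scalar::Null` would correspond to the minimal change discussed above.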
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

