jonahgao opened a new pull request, #8776:
URL: https://github.com/apache/arrow-datafusion/pull/8776

   ## Which issue does this PR close?
   Closes #8763.
   
   ## Rationale for this change
   In `ValuesExec::try_new`, we use the schema of `ValuesExec` and **null arrays** to construct a placeholder batch.
   
https://github.com/apache/arrow-datafusion/blob/dd4263f843e093c807d63edf73a571b1ba2669b5/datafusion/physical-plan/src/values.rs#L56-L64
   This placeholder batch is used to evaluate the physical expressions in `Values`.
   
   However, users can define a schema containing non-nullable fields, as in issue #8763, and such a schema conflicts with the null arrays.
   ```rust
   let field = Field::new("a", DataType::Int32, false);
   let schema = Schema::new(vec![field]);
   let df_schema = DFSchema::try_from(schema.clone()).unwrap();
   let values = vec![vec![Expr::Literal(ScalarValue::Int32(Some(1)))]];
   let values_plan = LogicalPlan::Values(Values {
       schema: df_schema.clone().into(),
       values: values.clone(),
   });
   ```
   In this case, the ArrowError `InvalidArgumentError("Column 'a' is declared as non-nullable but contains null values")` is raised.
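
   The conflict can be reproduced at the Arrow level without any DataFusion types. The following is a minimal sketch, not the code from `values.rs`: the helper name `null_placeholder` is hypothetical, and a one-row placeholder is an assumption made for illustration.

   ```rust
   use std::sync::Arc;

   use arrow::array::{new_null_array, ArrayRef};
   use arrow::datatypes::{DataType, Field, Schema};
   use arrow::error::ArrowError;
   use arrow::record_batch::RecordBatch;

   /// Hypothetical helper: builds a one-row batch made of null arrays,
   /// one per field of `schema` (row count of 1 is an assumption).
   fn null_placeholder(schema: Arc<Schema>) -> Result<RecordBatch, ArrowError> {
       let columns: Vec<ArrayRef> = schema
           .fields()
           .iter()
           .map(|field| new_null_array(field.data_type(), 1))
           .collect();
       // `try_new` validates the columns against the schema, including nullability.
       RecordBatch::try_new(schema, columns)
   }

   fn main() {
       // Same field definition as in the issue: `a` is Int32 and non-nullable.
       let schema = Arc::new(Schema::new(vec![Field::new("a", DataType::Int32, false)]));
       let err = null_placeholder(schema).unwrap_err();
       assert!(err.to_string().contains("non-nullable"));
       println!("{err}");
   }
   ```

   With a nullable field the same helper succeeds, so only the `nullable = false` declaration triggers the failure.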
   
   Since `ValuesExec` has no input, I think the schema for this placeholder batch can be empty, rather than the schema of `ValuesExec`.
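
   A placeholder with an empty schema could be built roughly as follows. This is a sketch under assumptions, not necessarily the exact change in this PR: the helper name `empty_placeholder` is hypothetical, and the row count of 1 is assumed so that expressions are evaluated against a single row.

   ```rust
   use std::sync::Arc;

   use arrow::datatypes::Schema;
   use arrow::error::ArrowError;
   use arrow::record_batch::{RecordBatch, RecordBatchOptions};

   /// Hypothetical helper: a batch with no columns and one row.
   fn empty_placeholder() -> Result<RecordBatch, ArrowError> {
       // With no columns, the row count cannot be inferred from the arrays,
       // so it has to be supplied explicitly through the options.
       let options = RecordBatchOptions::new().with_row_count(Some(1));
       RecordBatch::try_new_with_options(Arc::new(Schema::empty()), vec![], &options)
   }

   fn main() {
       let batch = empty_placeholder().expect("an empty schema has no nullability constraints");
       assert_eq!(batch.num_columns(), 0);
       assert_eq!(batch.num_rows(), 1);
   }
   ```

   Because the batch has no fields, there is nothing for a non-nullable declaration to conflict with.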
   
   ## What changes are included in this PR?
   Use an empty schema, rather than the schema of `ValuesExec`, to build the placeholder batch in `ValuesExec::try_new`.
   
   ## Are these changes tested?
   Yes
   
   ## Are there any user-facing changes?
   
   No

