Re: [I] Logical optimizer causes invalid query result with case expression [arrow-datafusion]

via GitHub Mon, 22 Jan 2024 06:55:28 -0800


gruuya commented on issue #8942:
URL: https://github.com/apache/arrow-datafusion/issues/8942#issuecomment-1904175896


   It seems like when the entering plan's innermost projection:
   ```sql
   Projection: ?table?.id, t, CASE WHEN ?table?.id = Int32(1) THEN Int32(10) ELSE t END AS t2
     Projection: ?table?.id, CASE WHEN ?table?.id = Int32(1) THEN Int32(10) ELSE t END AS t
       Projection: ?table?.id, Int32(NULL) AS t
         TableScan: ?table?
   ```
   is being rewritten, this check:
   
https://github.com/apache/arrow-datafusion/blob/2b218be67a6c412629530b812836a6cec76efc32/datafusion/optimizer/src/optimize_projections.rs#L867-L871
   concludes that the projection's schema and its input's schema (the bottom-most projection) are identical, and so it simply discards the projection (`proj` and its `exprs_used`) even though it performs a non-trivial computation on top.
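To illustrate why a schema comparison alone cannot tell that a projection is redundant, here is a minimal sketch using a toy model. The `Field`, `Expr`, `projection_schema` and `looks_redundant` definitions below are hypothetical stand-ins written for this example, not DataFusion's actual types or functions. The only point is that a recomputed column aliased back to `t` yields the same output schema as its input.

```rust
// Toy model only: none of these types are DataFusion's real ones.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: &'static str,
}

#[derive(Debug, Clone)]
enum Expr {
    // Plain pass-through of an input column.
    Column(String),
    // Stand-in for `CASE WHEN ... ELSE t END AS <alias>`.
    Computed { alias: String },
}

impl Expr {
    // Output field of the expression; every column is Int32 in this sketch.
    fn output_field(&self) -> Field {
        let name = match self {
            Expr::Column(name) => name.clone(),
            Expr::Computed { alias } => alias.clone(),
        };
        Field { name, data_type: "Int32" }
    }
}

fn projection_schema(exprs: &[Expr]) -> Vec<Field> {
    exprs.iter().map(Expr::output_field).collect()
}

// Mirrors the shape of the schema-only check: equal schemas => drop projection.
fn looks_redundant(exprs: &[Expr], input_schema: &[Field]) -> bool {
    projection_schema(exprs).as_slice() == input_schema
}

// Schema of the bottom-most projection: `id, Int32(NULL) AS t`.
fn bottom_schema() -> Vec<Field> {
    vec![
        Field { name: "id".to_string(), data_type: "Int32" },
        Field { name: "t".to_string(), data_type: "Int32" },
    ]
}

// Expressions of the projection above it: `id, CASE ... END AS t`.
fn middle_exprs() -> Vec<Expr> {
    vec![
        Expr::Column("id".to_string()),
        Expr::Computed { alias: "t".to_string() },
    ]
}

fn main() {
    // The schemas agree, so a schema-only check would discard the CASE.
    println!("{}", looks_redundant(&middle_exprs(), &bottom_schema()));
}
```

In this toy model the check reports the projection as redundant even though its second expression recomputes `t`, which is the same shape as the plan above.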
   
   Trying out a naive solution like
   ```diff
   @@ -867,7 +867,7 @@ fn rewrite_projection_given_requirements(
        return if let Some(input) =
            optimize_projections(&proj.input, config, &required_indices)?
        {
   -        if &projection_schema(&input, &exprs_used)? == input.schema() {
   +        if &projection_schema(&input, &exprs_used)? == input.schema() && exprs_used.iter().all(is_expr_trivial) {
                Ok(Some(input))
            } else {
                Projection::try_new(exprs_used, Arc::new(input))
   ```
   does solve this particular problem, but it then fails to eliminate unneeded projections in some other test cases (notably `test_infinite_source_partition_by`, which ends up with a bunch of interleaved projections).
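For reference, the effect of the extra guard in the diff can be sketched with a similar toy model. Everything here (`Expr`, `output_name`, `is_trivial`, `can_drop_projection`) is hypothetical and written only for this example. In particular, `is_trivial` is assumed to accept only plain column references, which may be narrower than what the real `is_expr_trivial` accepts.

```rust
// Toy model only: hypothetical names, not DataFusion's API.
#[derive(Debug, Clone)]
enum Expr {
    // Plain pass-through of an input column.
    Column(String),
    // Stand-in for `CASE WHEN ... ELSE t END AS <alias>`.
    Computed { alias: String },
}

fn output_name(expr: &Expr) -> &str {
    match expr {
        Expr::Column(name) => name.as_str(),
        Expr::Computed { alias } => alias.as_str(),
    }
}

// Assumption for this sketch: only bare column references count as trivial.
fn is_trivial(expr: &Expr) -> bool {
    matches!(expr, Expr::Column(_))
}

// Guarded check: output names must match the input AND nothing is recomputed.
fn can_drop_projection(exprs: &[Expr], input_names: &[&str]) -> bool {
    let names: Vec<&str> = exprs.iter().map(output_name).collect();
    names.as_slice() == input_names && exprs.iter().all(is_trivial)
}

// `id, CASE ... END AS t`: same output names as the input, but recomputes `t`.
fn case_exprs() -> Vec<Expr> {
    vec![
        Expr::Column("id".to_string()),
        Expr::Computed { alias: "t".to_string() },
    ]
}

// `id, t`: a pure pass-through projection.
fn passthrough_exprs() -> Vec<Expr> {
    vec![Expr::Column("id".to_string()), Expr::Column("t".to_string())]
}

fn main() {
    let input = ["id", "t"];
    // The CASE projection is kept; the pass-through one can still be dropped.
    println!("{}", can_drop_projection(&case_exprs(), &input));
    println!("{}", can_drop_projection(&passthrough_exprs(), &input));
}
```

This sketch only shows the intent of the guard. It does not reproduce the regressions seen in the other tests.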

