aaraujo opened a new pull request, #17626:
URL: https://github.com/apache/datafusion/pull/17626

   When aggregation operations produce unqualified column names in their output 
schema, subsequent operations (like binary expressions) may still reference the 
original qualified names. This fix adds fallback resolution that attempts to 
match qualified column references to unqualified column names when the exact 
qualified match is not found.
   
   This resolves errors like:
     'No field named table.column. Valid fields are column'
   
   that occur in expressions like `avg(table.column) / 1024`, where the 
aggregation produces an unqualified 'column' field but the division still 
references 'table.column'.
   
   Includes test case demonstrating the issue and fix.
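
As a rough illustration of the fallback described above, here is a minimal sketch using simplified stand-ins for a schema field and a column reference. It does not use DataFusion's actual `DFSchema` or `Column` types; `Field`, `ColumnRef`, `find_exact`, and `find_with_fallback` are hypothetical names, and the choice to fall back only to unqualified fields is an assumption of this sketch.

```rust
/// Simplified stand-in for a schema field: an optional table qualifier plus a name.
/// (Hypothetical type, not DataFusion's own field representation.)
#[derive(Debug, Clone, PartialEq)]
struct Field {
    qualifier: Option<String>,
    name: String,
}

/// Simplified stand-in for a column reference such as `metrics.value`.
struct ColumnRef {
    qualifier: Option<String>,
    name: String,
}

/// Exact lookup: both the qualifier and the name must match.
fn find_exact<'a>(fields: &'a [Field], col: &ColumnRef) -> Option<&'a Field> {
    fields
        .iter()
        .find(|f| f.qualifier == col.qualifier && f.name == col.name)
}

/// Lookup with fallback: try the exact match first; if the reference is
/// qualified and nothing matched, retry against unqualified fields by name only.
fn find_with_fallback<'a>(fields: &'a [Field], col: &ColumnRef) -> Option<&'a Field> {
    find_exact(fields, col).or_else(|| {
        if col.qualifier.is_some() {
            // Assumption of this sketch: only unqualified fields are fallback candidates.
            fields
                .iter()
                .find(|f| f.qualifier.is_none() && f.name == col.name)
        } else {
            None
        }
    })
}
```

With a schema containing only an unqualified `value` field, `find_exact` returns `None` for `metrics.value`, while `find_with_fallback` returns the `value` field.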
   
   ## Which issue does this PR close?
   
   This PR addresses a schema resolution issue discovered during integration 
testing. No existing issue was filed.
   
   ## Rationale for this change
   
   Currently, when an aggregation function produces an unqualified output 
schema (e.g., just "value" without a table qualifier), subsequent binary 
operations that reference the original qualified column name fail with a schema 
resolution error. This is a common pattern in SQL queries where aggregations 
are combined with arithmetic operations.
   
   For example:
   ```sql
   SELECT avg(metrics.value) / 1024 FROM metrics
   ```
   
   The aggregation produces an unqualified "value" field, but the division 
operation still carries the qualified reference "metrics.value", causing the 
query to fail.
   
   ## What changes are included in this PR?
   
   - Modified Expr::Column case in get_type() and nullable() methods in 
datafusion/expr/src/expr_schema.rs to add fallback resolution
   - When a qualified column reference is not found, attempts to resolve 
using just the column name without the qualifier
   - Added comprehensive test case test_qualified_column_after_aggregation 
that demonstrates the issue and validates the fix
   
   ## Are these changes tested?
   
   Yes, includes a new test case that:
   - Creates a schema simulating aggregation output (unqualified fields)
   - Tests resolution of qualified column references against this schema
   - Validates both direct column access and binary expressions
   - Verifies both data type and nullability resolution
   
   ## Are there any user-facing changes?
   
   This is a bug fix that makes previously failing queries work correctly. No 
breaking changes to existing functionality.
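
To make the tested behavior more concrete, the following is a hedged sketch of how type and nullability resolution for a binary expression might pass through a qualified-to-unqualified fallback. It is a toy model, not the code in `datafusion/expr/src/expr_schema.rs`: `SimpleType`, `TypedField`, `SimpleExpr`, and `resolve` are hypothetical names, and the division typing rule (Float64 if either side is Float64) is a simplifying assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum SimpleType {
    Int64,
    Float64,
}

/// Hypothetical schema field carrying a type and a nullability flag.
struct TypedField {
    qualifier: Option<&'static str>,
    name: &'static str,
    data_type: SimpleType,
    nullable: bool,
}

/// Hypothetical expression tree: just enough for `avg_output / 1024`.
enum SimpleExpr {
    Column { qualifier: Option<&'static str>, name: &'static str },
    Literal(i64),
    Divide(Box<SimpleExpr>, Box<SimpleExpr>),
}

/// Resolve a column: exact match first, then (for qualified references only)
/// fall back to an unqualified field with the same name.
fn resolve<'a>(
    schema: &'a [TypedField],
    qualifier: Option<&str>,
    name: &str,
) -> Result<&'a TypedField, String> {
    if let Some(f) = schema.iter().find(|f| f.qualifier == qualifier && f.name == name) {
        return Ok(f);
    }
    if qualifier.is_some() {
        if let Some(f) = schema.iter().find(|f| f.qualifier.is_none() && f.name == name) {
            return Ok(f);
        }
    }
    let wanted = match qualifier {
        Some(q) => format!("{}.{}", q, name),
        None => name.to_string(),
    };
    let valid: Vec<&str> = schema.iter().map(|f| f.name).collect();
    Err(format!("No field named {}. Valid fields are {}", wanted, valid.join(", ")))
}

fn get_type(expr: &SimpleExpr, schema: &[TypedField]) -> Result<SimpleType, String> {
    match expr {
        SimpleExpr::Column { qualifier, name } => Ok(resolve(schema, *qualifier, name)?.data_type),
        SimpleExpr::Literal(_) => Ok(SimpleType::Int64),
        SimpleExpr::Divide(l, r) => {
            let (lt, rt) = (get_type(l, schema)?, get_type(r, schema)?);
            // Simplified coercion rule, assumed for this sketch only.
            if lt == SimpleType::Float64 || rt == SimpleType::Float64 {
                Ok(SimpleType::Float64)
            } else {
                Ok(SimpleType::Int64)
            }
        }
    }
}

fn nullable(expr: &SimpleExpr, schema: &[TypedField]) -> Result<bool, String> {
    match expr {
        SimpleExpr::Column { qualifier, name } => Ok(resolve(schema, *qualifier, name)?.nullable),
        SimpleExpr::Literal(_) => Ok(false),
        // A division is treated as nullable if either operand is nullable.
        SimpleExpr::Divide(l, r) => Ok(nullable(l, schema)? || nullable(r, schema)?),
    }
}
```

In this model, a schema holding only an unqualified, nullable Float64 `value` field lets `metrics.value / 1024` resolve to a nullable Float64, whereas a reference to a name that is absent from the schema still produces the "No field named ..." error.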


