changsun20 commented on issue #14438:
URL: https://github.com/apache/datafusion/issues/14438#issuecomment-2709581340

   Hi @eliaperantoni,
   
   After investigating this issue, here are my initial thoughts on 
implementation:
   
   The most straightforward approach would be to add a new `span` property to 
the `Subquery` struct in `datafusion/expr/src/logical_plan/plan.rs`:
   
   ```rust
   pub struct Subquery {
       /// The subquery
       pub subquery: Arc<LogicalPlan>,
       /// The outer references used in the subquery
       pub outer_ref_columns: Vec<Expr>,
       pub span: Option<Span>,
   }
   ```
   
   The span would be extracted in the `parse_scalar_subquery` function within 
`datafusion/sql/src/expr/subquery.rs`:
   
   ```rust
   pub(super) fn parse_scalar_subquery(
       &self,
       subquery: Query,
       // other params...
   ) -> Result<Expr> {
       // other logic...
       // Placeholder: the exact way to obtain the span from `subquery` is not settled yet.
       let span =
           Span::try_from_sqlparser_span(subquery.some_way_to_index_the_span_from_the_query());
       Ok(Expr::ScalarSubquery(Subquery {
           subquery: Arc::new(sub_plan),
           outer_ref_columns,
           span,
       }))
   }
   ```
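   
   To make the extraction step more concrete, here is a minimal, self-contained 
sketch of converting a parser span into an optional span. `Location`, 
`ParserSpan`, `Span`, and `try_from_parser_span` are hypothetical stand-ins, not 
the real sqlparser or DataFusion types, and treating line 0 as "no location" is 
an assumption made only for illustration:
   
   ```rust
   /// Hypothetical stand-in for a parser source location (1-based line and column).
   #[derive(Debug, Clone, Copy, PartialEq, Eq)]
   pub struct Location {
       pub line: u64,
       pub column: u64,
   }

   /// Hypothetical stand-in for the span type produced by the SQL parser.
   #[derive(Debug, Clone, Copy, PartialEq, Eq)]
   pub struct ParserSpan {
       pub start: Location,
       pub end: Location,
   }

   /// Hypothetical stand-in for the span that would be stored on `Subquery`.
   #[derive(Debug, Clone, Copy, PartialEq, Eq)]
   pub struct Span {
       pub start: Location,
       pub end: Location,
   }

   impl Span {
       /// Returns `None` when the parser span carries no position information.
       /// Assumption: a line number of 0 marks an empty span.
       pub fn try_from_parser_span(span: ParserSpan) -> Option<Span> {
           if span.start.line == 0 || span.end.line == 0 {
               None
           } else {
               Some(Span {
                   start: span.start,
                   end: span.end,
               })
           }
       }
   }
   ```
   
   Returning `Option<Span>` keeps the `span` field optional, so plans that are 
not built from SQL text can simply carry `None`.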
   
   This span would then be accessible when generating error messages, allowing 
us to add diagnostic information in 
`datafusion/expr/src/logical_plan/invariants.rs`:
   
   ```rust
   // This is the original implementation
   if subquery.subquery.schema().fields().len() > 1 {
       return plan_err!(
           "Scalar subquery should only return one column, but found {}: {}",
           subquery.subquery.schema().fields().len(),
           subquery.subquery.schema().field_names().join(", ")
       );
   }
   ```
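   
   As a rough sketch of how the check could carry the span along with the error, 
assuming some diagnostic type that pairs a message with an optional span: 
`Span`, `Diagnostic`, `PlanError`, `check_scalar_subquery`, and the diagnostic 
wording below are all hypothetical stand-ins written for this example, not 
DataFusion's actual API:
   
   ```rust
   /// Hypothetical stand-in for a source span: (line, column) start and end.
   #[derive(Debug, Clone, Copy, PartialEq, Eq)]
   pub struct Span {
       pub start: (u64, u64),
       pub end: (u64, u64),
   }

   /// Hypothetical diagnostic: a user-facing message plus where it applies.
   #[derive(Debug, Clone, PartialEq, Eq)]
   pub struct Diagnostic {
       pub message: String,
       pub span: Option<Span>,
   }

   /// Hypothetical planning error that can carry a diagnostic.
   #[derive(Debug, Clone, PartialEq, Eq)]
   pub struct PlanError {
       pub message: String,
       pub diagnostic: Option<Diagnostic>,
   }

   /// Rejects scalar subqueries that return more than one column and
   /// attaches the subquery's span (if any) to the resulting error.
   pub fn check_scalar_subquery(
       field_names: &[&str],
       span: Option<Span>,
   ) -> Result<(), PlanError> {
       if field_names.len() > 1 {
           let message = format!(
               "Scalar subquery should only return one column, but found {}: {}",
               field_names.len(),
               field_names.join(", ")
           );
           let diagnostic = Diagnostic {
               message: "Too many columns in scalar subquery".to_string(),
               span,
           };
           return Err(PlanError {
               message,
               diagnostic: Some(diagnostic),
           });
       }
       Ok(())
   }
   ```
   
   The error text itself stays the same as in the original check; only the 
location information is added alongside it.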
   
   However, a drawback of this approach is that modifying the `Subquery` struct 
will impact 14 other functions in the codebase, so the change is more invasive 
than ideal and does not meet the "as little invasive as possible" requirement.
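   
   One possible way to contain that churn, sketched here under assumptions, 
would be to build `Subquery` through a constructor that defaults `span` to 
`None`, with a separate method to attach a span. Existing construction sites 
would still need a one-time switch to the constructor. `Plan`, `Span`, `new`, 
and `with_span` are hypothetical stand-ins for illustration, and 
`outer_ref_columns` is simplified to `Vec<String>`:
   
   ```rust
   use std::sync::Arc;

   /// Hypothetical stand-in for `LogicalPlan`.
   #[derive(Debug, PartialEq)]
   pub struct Plan(pub String);

   /// Hypothetical stand-in for a source span: (line, column) start and end.
   #[derive(Debug, Clone, Copy, PartialEq, Eq)]
   pub struct Span {
       pub start: (u64, u64),
       pub end: (u64, u64),
   }

   /// Simplified version of the proposed `Subquery` struct.
   #[derive(Debug, Clone, PartialEq)]
   pub struct Subquery {
       pub subquery: Arc<Plan>,
       pub outer_ref_columns: Vec<String>,
       pub span: Option<Span>,
   }

   impl Subquery {
       /// Builds a subquery without location information.
       pub fn new(subquery: Arc<Plan>, outer_ref_columns: Vec<String>) -> Self {
           Self {
               subquery,
               outer_ref_columns,
               span: None,
           }
       }

       /// Attaches a span; only code that has SQL text available needs this.
       pub fn with_span(mut self, span: Option<Span>) -> Self {
           self.span = span;
           self
       }
   }
   ```
   
   With this shape, only the SQL planner would call `with_span`, and adding 
further optional fields later would not require touching every call site again.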
   
   Please let me know if this is on the right track or if you have any 
suggestions. Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

