eliaperantoni commented on issue #14434: URL: https://github.com/apache/datafusion/issues/14434#issuecomment-2728578964
Thank you @changsun20 for the detailed analysis 🙏 #### 1. Warning Scope for = NULL Patterns I would say that, if's not _too_ complicated, we should go for the semantic approach. i.e. only raise a warning when `= NULL` is being used as a condition. #### 2. Non-Fatal Warning Plumbing This is a complicated one! 😃 You hit the nail on the head: the complicated part is bubbling up a warning without interrupting everything. I haven't thought too much about how to solve it. Perhaps one way would be to make a `type DatafusionResult<R> = Result<(R, Vec<DatafusionWarning>), DatafusionError>` so that a fatal error can still be bubbled up with `?`, and then a `WarningAccumulator` could help with the aggregation of warnings that arise in a function body. i.e. ```rust fn parse_query(sql: &str) -> DatafusionResult<LogicalPlan> { let mut warnings = WarningAccumulator::new(); // WarningAccumulator::extend(&mut self) takes a DatafusionResult<R> and // returns Result<R>, while storing any warnings in itself. let ctes = warnings.extend(parse_ctes(sql))?; let mut logical_plan = warnings.extend(parse_select(sql, ctes))?; warnings.extend(parse_where(&mut logical_plan, sql))?; // WarningAccumulator::into returns Vec<DatafusionWarning> DatafusionResult::ok_with_warnings(logical_plan, warnings.into()) } ``` But I believe that's a bit invasive because it requires replacing `Result` with `DatafusionResult` all the way up to the outermost function call. It also pollutes the code with a lot of `warnings.extend` calls. Though perhaps this would be the most robust solution in the long term. Alternatively, to start with we could add a field to `SqlToRel` where warnings are collected (i.e. just a `warnings: Vec<Diagnostic>` would work imo, no need for a new type), and then the user can get them out if they so wish. The only problem I see with this is that most (all?) methods of `SqlToRel` take an immutable reference. So you'd either have to use interior mutability, or change most receivers to `&mut self`. The latter is definitely a breaking change so I'd not encourage going that route. I think interior mutability with `RefCell` is fine. #### 3. Span Highlighting Strategy > Add Span to op: Eq and/or right: Value(Null) - most precise but risks breaking changes elsewhere This means changing the `sqlparser` dependency, that's the only component that can give you spans. Actually, someone already added spans to `Value` (https://github.com/apache/datafusion-sqlparser-rs/pull/1738). Perhaps you could bump Datafusion's depedency on `sqlparser` to get that? If that doesn't work, I would suggest perhaps going the easy route of highlighting the expression and saying "this expression is being compared with `NULL` and that's always falsey, perhaps you meant `IS NULL`" or something like that. It's not the best but it's the most we can do without more information from `sqlparser` :) I think trying to find the `IS NULL` by means of character operations is not worth the complexity and risks missing some edges cases. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org