changsun20 commented on issue #14434: URL: https://github.com/apache/datafusion/issues/14434#issuecomment-2726143005
Hi @eliaperantoni, Thank you raising this issue. As I explore potential implementations, I’d appreciate your insights on a few key points: **1. Warning Scope for `= NULL` Patterns** Could you help confirm the appropriate scope for detecting `= NULL` comparisons? - **Option 1**: Emit warnings universally whenever `= NULL` is parsed. - _Potential Risk_: Trigger warnings whenever "= NULL" is parsed. However, expressions like `UPDATE users SET password = NULL` would be legitimate usage. - Note: Since `Dml(Update)` isn't currently supported in DataFusion, this might not be an immediate issue but could become one. - **Option 2**: Restrict warnings to predicate contexts (WHERE/JOIN/HAVING/CASE WHEN clauses). - _Tradeoff_: This would be more precise but adds complexity. **2. Non-Fatal Warning Plumbing** You mentioned the need to implement plumbing and collect warnings via `Diagnostic::new_warning`. Could you please elaborate more on this? Can you briefly outline (or just give me a hint) how to propagate warnings without interrupting the execution flow? I recognize this may require deeper code context and may to too general to answer—feel free to defer discussion until I share a draft implementation! **3. Span Highlighting Strategy** For the query `SELECT * FROM users WHERE password = NULL`, the parsed structure shows: ```json BinaryOp { left: Identifier( Ident { value: "password", quote_style: None, span: Span(Location(1,27)..Location(1,35)), }, ), op: Eq, right: Value( Null, ), }, ``` I have come up with some possible approaches: - **Option 1**: Add `Span` to `op: Eq` and/or `right: Value(Null)` - most precise but risks breaking changes elsewhere - **Option 2**: Use the `password` identifier’s span - simplest approach, but highlights the column name rather than the actual issue ("= NULL"). - **Option 3**: Use the identifier's `Span` as a starting point and estimate the entire expression's span - avoids structural changes but may be inaccurate with variable whitespace What would be the best approach in your opinion? Thank you for your guidance! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org