changsun20 commented on issue #14434:
URL: https://github.com/apache/datafusion/issues/14434#issuecomment-2726143005
Hi @eliaperantoni,
Thank you raising this issue. As I explore potential implementations, I’d
appreciate your insights on a few key points:
**1. Warning Scope for `= NULL` Patterns**
Could you help confirm the appropriate scope for detecting `= NULL`
comparisons?
- **Option 1**: Emit warnings universally whenever `= NULL` is parsed.
- _Potential Risk_: Trigger warnings whenever "= NULL" is parsed.
However, expressions like `UPDATE users SET password = NULL` would be
legitimate usage.
- Note: Since `Dml(Update)` isn't currently supported in DataFusion,
this might not be an immediate issue but could become one.
- **Option 2**: Restrict warnings to predicate contexts
(WHERE/JOIN/HAVING/CASE WHEN clauses).
- _Tradeoff_: This would be more precise but adds complexity.
**2. Non-Fatal Warning Plumbing**
You mentioned the need to implement plumbing and collect warnings via
`Diagnostic::new_warning`. Could you please elaborate more on this? Can you
briefly outline (or just give me a hint) how to propagate warnings without
interrupting the execution flow? I recognize this may require deeper code
context and may to too general to answer—feel free to defer discussion until I
share a draft implementation!
**3. Span Highlighting Strategy**
For the query `SELECT * FROM users WHERE password = NULL`, the parsed
structure shows:
```json
BinaryOp {
left: Identifier(
Ident {
value: "password",
quote_style: None,
span: Span(Location(1,27)..Location(1,35)),
},
),
op: Eq,
right: Value(
Null,
),
},
```
I have come up with some possible approaches:
- **Option 1**: Add `Span` to `op: Eq` and/or `right: Value(Null)` - most
precise but risks breaking changes elsewhere
- **Option 2**: Use the `password` identifier’s span - simplest approach,
but highlights the column name rather than the actual issue ("= NULL").
- **Option 3**: Use the identifier's `Span` as a starting point and estimate
the entire expression's span - avoids structural changes but may be inaccurate
with variable whitespace
What would be the best approach in your opinion?
Thank you for your guidance!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]