github-actions[bot] commented on PR #61548: URL: https://github.com/apache/doris/pull/61548#issuecomment-4102285794
## Code Review Summary **PR: [refactor](variant) normalize nested search predicate field resolution** I've performed a thorough review of this PR against the code-review skill checklist. The changes centralize nested field path construction, normalize child predicates during parsing, apply consistent validation across standard and lucene modes, and add comprehensive tests. ### Critical Checkpoint Conclusions **Goal and correctness**: The PR's goal is to make fields inside `NESTED(path, ...)` relative to the nested path and reject invalid forms earlier with clear errors. The code accomplishes this correctly. Tests prove it — both positive (normalization works) and negative (absolute paths, bare queries, nested NESTED, non-top-level NESTED all rejected). **Modification scope**: Focused and minimal. The SearchDslParser.java changes extract duplicated field-path-building code into `buildFieldPath()` and `normalizeNestedFieldPath()`, replace 4 duplicated field-path-building blocks (2 per builder × 2 methods), and add a new `nestedQuery` dispatch in the Lucene mode's `collectTermsFromNotClause`. The BE change is trivially safe dead code removal. **Concurrency**: Both `QsAstBuilder` and `QsLuceneModeAstBuilder` are created as fresh local instances per parse call. The mutable `currentNestedPath` field is instance-scoped, never shared. Thread-safe. **Lifecycle / save-restore pattern**: `currentNestedPath` is saved/restored in a try/finally in `visitNestedQuery`. The save is technically redundant (because the guard rejects nested NESTED, so `previousNestedPath` is always null), but the pattern is defensive and correct. Consistent across both builders. **Parallel code paths**: Both `QsAstBuilder` (standard mode) and `QsLuceneModeAstBuilder` (lucene mode) receive identical changes: same `visitNestedQuery` structure, same bare-query rejection, same `normalizeNestedFieldPath` calls in `visitFieldQuery` and `visitFieldGroupQuery`. The Lucene mode also gets a necessary `nestedQuery` dispatch in `collectTermsFromNotClause` (line 2298-2299) that was previously missing. All four parse entry points (single-field standard, multi-field standard, single-field lucene, multi-field lucene) call `validateNestedTopLevelOnly` post-parse. **Error handling**: All new error paths throw `SearchDslSyntaxException` (the existing custom exception class), with descriptive messages including the problematic field/path. The change at line 2686 upgrades a `RuntimeException` to `SearchDslSyntaxException` for consistency, which is an improvement. **Behavioral change / compatibility**: This is a **behavioral change** — previously, users had to write absolute paths inside NESTED (e.g., `NESTED(data, data.msg:hello)`), and now they must write relative paths (`NESTED(data, msg:hello)`). The old form is now explicitly rejected. This is documented in the PR body. Since this is a Variant search DSL feature, the scope of impact is limited. **Test coverage**: Comprehensive. The PR adds/updates tests covering: - Standard mode: simple, AND, dotted paths - Lucene mode: simple, AND, descendant fields - Rejection cases: absolute paths, mixed absolute+relative, bare queries, nested NESTED, non-top-level NESTED (both modes) - Thrift serialization: verifies normalized field bindings propagate correctly to BE **BE dead code removal**: The removed loop in `variant_util.cpp` iterated over output columns but had an empty body for variant columns (only `continue` for non-variant). Genuinely dead code, safe to remove. **Configuration / incompatible changes**: No new config items. The behavioral change (relative vs absolute field paths in NESTED) is a breaking change for existing DSL queries using absolute paths inside NESTED. **No issues found.** The refactoring is clean, well-tested, and improves code organization by eliminating duplication and catching errors earlier. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
