eldenmoon opened a new pull request, #63697:
URL: https://github.com/apache/doris/pull/63697
Pick #63082 to branch-4.0.
Original PR: #63082
Original merge commit: f563b2eb382a6a42c359d977707f7731d8e0d0a0
### What problem does this PR solve?
Issue Number: None
Related PR: #63082
Problem Summary: Add a duplicate-path check for Variant JSON parsing,
disabled by default. When the check is enabled and several leaves
normalize to the same path, the parser keeps the first value, which avoids
inconsistent subcolumns during load and parsing. This pick adapts the
implementation to the vec/json parser and parse2column flow in branch-4.0.
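
The keep-first rule can be illustrated with a small Python sketch. This is
not the BE implementation: the names `JsonObject`, `iter_leaves`, and
`collect_leaves` are hypothetical, and the sketch assumes that "normalized"
means nested keys are joined with dots, so that `{"a": {"b": 1}}` and
`{"a.b": 2}` address the same leaf path.

```python
import json


class JsonObject(list):
    """Ordered (key, value) pairs; unlike dict, it preserves duplicate keys."""


def iter_leaves(node, prefix=""):
    """Yield (dotted_path, value) for every leaf, in document order.

    Arrays and scalars are treated as leaf values (a simplification).
    """
    if isinstance(node, JsonObject):
        for key, value in node:
            path = f"{prefix}.{key}" if prefix else key
            yield from iter_leaves(value, path)
    else:
        yield prefix, node


def collect_leaves(text, check_duplicates=False):
    """Return the leaf (path, value) pairs of a JSON document.

    With check_duplicates=True, only the first value seen for each
    normalized path is kept. With the default False, duplicates pass
    through unchanged, mirroring a check that is disabled by default.
    """
    root = json.loads(text, object_pairs_hook=JsonObject)
    seen = set()
    leaves = []
    for path, value in iter_leaves(root):
        if check_duplicates:
            if path in seen:
                continue  # a later duplicate is dropped; the first value wins
            seen.add(path)
        leaves.append((path, value))
    return leaves


doc = '{"a": {"b": 1}, "a.b": 2, "c": 3, "c": 4}'
print(collect_leaves(doc, check_duplicates=True))
print(collect_leaves(doc, check_duplicates=False))
```

With the check on, each path appears once with its first value; with the
check off, the same path can appear twice, which is the kind of input that
the PR describes as leading to inconsistent subcolumns.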
### Release note
Add the BE config variant_enable_duplicate_json_path_check. When enabled,
Variant JSON parsing keeps the first value for a duplicate leaf path. The
default value is false.
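
A minimal sketch of turning the check on, assuming the option is set in
be.conf with the usual `key = value` syntax like other BE configs. The
file location and the need for a BE restart are assumptions, not stated in
this PR.

```
# be.conf (assumed location); the default is false
variant_enable_duplicate_json_path_check = true
```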
### Check List (For Author)
- Test: Unit Test
- Unit Test: ./run-be-ut.sh --run --filter='JsonParserTest.*DuplicateJsonPath*:SchemaUtilTest.TestParseVariantColumnsDuplicateJsonPathCheck'
- Format: PATH=/mnt/disk1/claude-max/ldb_toolchain16/bin:$PATH ./build-support/clang-format.sh
- Static check: git diff --check origin/branch-4.0...HEAD
- Regression test: Not run (no local ASAN output cluster available in
the fresh branch-4.0 checkout)
- Behavior changed: Yes. When variant_enable_duplicate_json_path_check is
enabled, duplicate normalized Variant JSON leaf paths keep the first parsed
value. The default value is false.
- Does this need documentation: No
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]