github-actions[bot] commented on code in PR #63309:
URL: https://github.com/apache/doris/pull/63309#discussion_r3252159303


##########
be/src/util/jsonb_parser_simd.h:
##########
@@ -93,6 +93,29 @@ struct JsonbParser {
             simdjson::padded_string json_str {pch, len};
             simdjson::ondemand::document doc = simdjson_parser.iterate(json_str);
 
+            auto is_json_whitespace = [](char c) {
+                return c == ' ' || c == '\t' || c == '\n' || c == '\r';
+            };
+            const char* json_begin = json_str.data();
+            const char* json_end = json_str.data() + len;
+            while (json_begin < json_end && is_json_whitespace(*json_begin)) {
+                ++json_begin;
+            }
+            while (json_end > json_begin && is_json_whitespace(*(json_end - 1))) {
+                --json_end;
+            }
+
+            std::string_view raw_json;
+            simdjson::error_code raw_res = doc.raw_json().get(raw_json);

Review Comment:
   This `raw_json()` call consumes the entire document, and the subsequent 
`doc.rewind()` forces the code to parse and serialize the same JSON a second 
time. `JsonbParser::parse` is shared by `json_valid`, `jsonb_parse`, casts to 
JSONB, and other ingestion paths, so large object or array inputs now pay an 
extra full simdjson traversal just to detect trailing content. That pre-pass is 
redundant: the later code already consumes the root value and checks 
`doc.at_end()`, the top-level scalar getters use simdjson's root APIs, which 
disallow trailing content, and the new `is_null()` check covers the 
`not`/partial-null case. Please drop the whole-document `raw_json()` pre-pass 
and enforce the remaining scalar/null validation in the existing parse pass so 
that JSONB parsing stays single-pass.
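
As a rough illustration of the kind of scalar/null check that can run inside the existing pass, the sketch below validates a root literal by trimming JSON whitespace at both ends and comparing what remains. It uses only the standard library; `is_exact_root_literal` is a hypothetical helper name, not an existing Doris or simdjson function, and it assumes the parse pass has already determined that the root is a scalar literal such as `null`.

```cpp
#include <string_view>

// Same whitespace set as the lambda in the diff (the four JSON whitespace characters).
inline bool is_json_whitespace(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Hypothetical helper (an assumption for illustration, not Doris or simdjson code).
// Returns true only if `input` is exactly `literal` surrounded by optional JSON
// whitespace, so "not", "nul", and "null x" are all rejected.
inline bool is_exact_root_literal(std::string_view input, std::string_view literal) {
    // Trim leading whitespace.
    while (!input.empty() && is_json_whitespace(input.front())) {
        input.remove_prefix(1);
    }
    // Trim trailing whitespace.
    while (!input.empty() && is_json_whitespace(input.back())) {
        input.remove_suffix(1);
    }
    // Whatever is left must match the literal byte for byte.
    return input == literal;
}
```

Because this only runs for scalar roots, its cost is bounded by the surrounding whitespace plus the literal length, so large object and array inputs would not pay for a second traversal.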



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

