Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/24283 )
Change subject: IMPALA-14962: Query Profile Parser and Section Retrieval Interface ...................................................................... Patch Set 10: (19 comments) http://gerrit.cloudera.org:8080/#/c/24283/4/be/src/service/query-profile-parsing-tools.cc File be/src/service/query-profile-parsing-tools.cc: http://gerrit.cloudera.org:8080/#/c/24283/4/be/src/service/query-profile-parsing-tools.cc@272 PS4, Line 272: if (!parsed_.IsObject()) return; > Done Done http://gerrit.cloudera.org:8080/#/c/24283/4/be/src/service/query-profile-parsing-tools.cc@276 PS4, Line 276: > Done Done http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc File be/src/service/query-profile-parsing-tools.cc: http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@44 PS9, Line 44: #include "gutil/strings/strip.h" Nit: includes should be alphabetically ordered within their group, thus this import should be underneath gen-cpp/Types_types.h http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@121 PS9, Line 121: int64_t percentage_limit_bytes = process_mem_limit / 100 : * DEFAULT_PARSING_PROFILE_SIZE_LIMIT_PERCENTAGE; This calculation confused me at first. The function doc in query-profile-parsing-tools.h on the CreateQueryProfileToolExecutorForProfile function says, " default limit is chosen as: min(256 MB, 1% of process_mem_limit)". A clearer way of calculating would by to multiple process_mem_limit by 0.01. http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@127 PS9, Line 127: // Splits text into lines. : static vector<string_view> SplitLines(string_view text) { : vector<string_view> lines; : size_t line_start = 0; : while (line_start <= text.size()) { : const size_t line_end = text.find('\n', line_start); : if (line_end == string_view::npos) { : lines.emplace_back(text.substr(line_start)); : break; : } : lines.emplace_back(text.substr(line_start, line_end - line_start)); : line_start = line_end + 1; : } : return lines; : } Move this function into be/src/util/string-util.h,.cc http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@143 PS9, Line 143: static string_view TrimWhiteSpace(string_view value) { : while (!value.empty() : && isspace(static_cast<unsigned char>(value.front())) != 0) { : value.remove_prefix(1); : } : while (!value.empty() && isspace(static_cast<unsigned char>(value.back())) != 0) { : value.remove_suffix(1); : } : return value; : } Move this function into be/src/util/string-util.h,.cc http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@154 PS9, Line 154: static bool IsAllDigits(string_view text) { : return !text.empty() : && all_of(text.begin(), text.end(), : [](unsigned char c) { return isdigit(c) != 0; }); : } : Move this function into be/src/util/string-util.h,.cc. Also, should not need to check text.empty(). http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@178 PS9, Line 178: Value& existing = parent_obj[key]; Nit: insert newline to separate this line from the one above it. http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@187 PS9, Line 187: static Status ConvertJsonProfile( Add a comment explaining the purpose of this function. http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@189 PS9, Line 189: if (out == nullptr) return Status("output profile pointer cannot be null"); Add a DCHECK(out != nullptr) above this as any situation where out is nullptr is bad and we want to catch it as soon as possible. http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@243 PS9, Line 243: profile_text.data(), static_cast<rapidjson::SizeType>(profile_text.size())); : if (source_json.HasParseError() || !source_json.IsObject()) { : Separate this into two separate if statements so the returned error can be clearer. Include the error code and offset if HasParseError() is true. https://rapidjson.org/classrapidjson_1_1_generic_document.html#a36d19989c9221b27036675455516a974 http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@338 PS9, Line 338: Value QueryProfileToolAccessor::GetSummary(Document::AllocatorType& alloc) const { This function duplicates a lot of functionality in GetSummaryPtr(). http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@345 PS9, Line 345: return Value(rapidjson::kObjectType); If query profiles should always have a summary section, then add a DCHECK(false) << "profile is missing summary section for query " + query_id_"; Still keep the return of a blank Value object though so that a missing summary section does not cause a crash during production runtime. http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@372 PS9, Line 372: return Value(rapidjson::kObjectType); Is this case expected to happen? If not, add a DCHECK(false) above the return. http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@422 PS9, Line 422: all_of(normalized.begin(), normalized.end(), : [](unsigned char c) { return isdigit(c) != 0; }) Replace with call to IsAllDigits() function. http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@462 PS9, Line 462: return err; Is this case expected to happen? If not, add a DCHECK(false) above the return. http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@590 PS9, Line 590: const string key_owned(key); The creation of this string isn't necessary. http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@773 PS9, Line 773: return nullptr; If query profiles should always have a summary section, then add a DCHECK(false) << "profile is missing summary section for query " + query_id_; http://gerrit.cloudera.org:8080/#/c/24283/9/be/src/service/query-profile-parsing-tools.cc@1022 PS9, Line 1022: "node_id" Extract "node_id" into a constant. -- To view, visit http://gerrit.cloudera.org:8080/24283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4d9c9e735aa089bb243b07af421553002a465a88 Gerrit-Change-Number: 24283 Gerrit-PatchSet: 10 Gerrit-Owner: Gokul Kolady <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Gokul Kolady <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jason Fehr <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Mon, 18 May 2026 19:28:24 +0000 Gerrit-HasComments: Yes
