Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/24282 )
Change subject: IMPALA-14961: Query Profile Redaction ...................................................................... Patch Set 4: (14 comments) http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc File be/src/service/query-profile-redaction.cc: http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc@138 PS3, Line 138: } Pass entries by reference. http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc@148 PS3, Line 148: // Builds aliases from tokens, applies them to text, and tracks reverse mappings. Pass parameters by reference. http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc File be/src/service/query-profile-redaction.cc: http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@95 PS4, Line 95: "Per Host Min Memory Reservation", "Per Host Number of Fragment Instances"}; We could use one of these sections, maybe "Per Host Min Memory Reservation" ? Otherwise we're getting host names twice. http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@99 PS4, Line 99: set<string> host_tokens; We probably don't need an ordered set here so maybe use unordered_set. http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@115 PS4, Line 115: set<string> sorted_tokens(tokens.cbegin(), tokens.cend()); We don't need a temporary set here. We can simply iterate through the input tokens? http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@143 PS4, Line 143: vector<std::pair<string, string>>(alias_map.cbegin(), alias_map.cend()), Not sure why we need a temporary vector here ? We can pass the map by reference and iterate through it? Not sure why we need any ordering in this case? http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@154 PS4, Line 154: const auto reverse_aliases = BuildAliasReverseMap(aliases); Pass alias_to_original by reference to BuildAliasReverseMap to avoid temp copies. http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@164 PS4, Line 164: const string plan_text_copy(plan_text); avoid local copy here, either pass string by reference to this function or use regex_search constructor which accepts a char* parameter instead of a string. http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@328 PS4, Line 328: string redacted = profile_text; We're making 2 copies here and in the following line. We need one copy for redacted profile but for parsing you can directly pass char* to source_json.Parse(..) http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@334 PS4, Line 334: vector<string> sql_statements = Either pass sql_statements by reference to CollectInfoStringValuesByKeys or use move semantics to avoid copy. http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@336 PS4, Line 336: for (const string& sql_statement : sql_statements) { Do we ever have multiple "Sql Statement" in the profile? http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@339 PS4, Line 339: boost::algorithm::replace_all(redacted, sql_statement, REDACTED_SQL_STATEMENT); This seems unnecessary since we've already redacted above ? http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@377 PS4, Line 377: const string_view& text, const map<string, string>& alias_to_original) { We can use an unordered_map unless we really need ordered replacements in L379. http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@387 PS4, Line 387: const string profile_text_copy(profile_text); This seems redundant? We can directly pass profile_text.data() and profile_text.size() to source_json.Parse() ? -- To view, visit http://gerrit.cloudera.org:8080/24282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0c5b4911a64888f319f212155df6e08c1800b32 Gerrit-Change-Number: 24282 Gerrit-PatchSet: 4 Gerrit-Owner: Gokul Kolady <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Gokul Kolady <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jason Fehr <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Thu, 14 May 2026 15:27:21 +0000 Gerrit-HasComments: Yes
