Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24282 )

Change subject: IMPALA-14961: Query Profile Redaction
......................................................................


Patch Set 4:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc
File be/src/service/query-profile-redaction.cc:

http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc@138
PS3, Line 138: }
Pass entries by reference.


http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc@148
PS3, Line 148: // Builds aliases from tokens, applies them to text, and tracks 
reverse mappings.
Pass parameters by reference.


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc
File be/src/service/query-profile-redaction.cc:

http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@95
PS4, Line 95:       "Per Host Min Memory Reservation", "Per Host Number of 
Fragment Instances"};
We could use one of these sections, maybe "Per Host Min Memory Reservation" ?
Otherwise we're getting host names twice.


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@99
PS4, Line 99:   set<string> host_tokens;
We probably don't need an ordered set here so maybe use unordered_set.


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@115
PS4, Line 115:   set<string> sorted_tokens(tokens.cbegin(), tokens.cend());
We don't need a temporary set here. We can simply iterate through the input 
tokens?


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@143
PS4, Line 143:       vector<std::pair<string, string>>(alias_map.cbegin(), 
alias_map.cend()),
Not sure why we need a temporary vector here ? We can pass the map by reference 
and iterate through it? Not sure why we need any ordering in this case?


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@154
PS4, Line 154:   const auto reverse_aliases = BuildAliasReverseMap(aliases);
Pass alias_to_original by reference to BuildAliasReverseMap to avoid temp 
copies.


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@164
PS4, Line 164:   const string plan_text_copy(plan_text);
avoid local copy here, either pass string by reference to this function or use 
regex_search constructor which accepts a char* parameter instead of a string.


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@328
PS4, Line 328:   string redacted = profile_text;
We're making 2 copies here and in the following line. We need one copy for 
redacted profile but for parsing you can directly pass char* to 
source_json.Parse(..)


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@334
PS4, Line 334:   vector<string> sql_statements =
Either pass sql_statements by reference to CollectInfoStringValuesByKeys or use 
move semantics to avoid copy.


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@336
PS4, Line 336:   for (const string& sql_statement : sql_statements) {
Do we ever have multiple "Sql Statement" in the profile?


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@339
PS4, Line 339:     boost::algorithm::replace_all(redacted, sql_statement, 
REDACTED_SQL_STATEMENT);
This seems unnecessary since we've already redacted above ?


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@377
PS4, Line 377:     const string_view& text, const map<string, string>& 
alias_to_original) {
We can use an unordered_map unless we really need ordered replacements in L379.


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@387
PS4, Line 387:   const string profile_text_copy(profile_text);
This seems redundant? We can directly pass profile_text.data() and 
profile_text.size() to source_json.Parse() ?



--
To view, visit http://gerrit.cloudera.org:8080/24282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0c5b4911a64888f319f212155df6e08c1800b32
Gerrit-Change-Number: 24282
Gerrit-PatchSet: 4
Gerrit-Owner: Gokul Kolady <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Gokul Kolady <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Thu, 14 May 2026 15:27:21 +0000
Gerrit-HasComments: Yes

Reply via email to