Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24282 )

Change subject: IMPALA-14961: Query Profile Redaction
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc
File be/src/service/query-profile-redaction.cc:

http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@67
PS4, Line 67:   unordered_map<string, string> alias_to_original;
Since alias_to_original basically points to entries in original_to_alias we 
could use string_view to avoid deep copy.
 unordered_map<string, string> alias_to_original;

```
std::unordered_map<std::string_view, std::string_view> alias_to_original;
alias_to_original.reserve(original_to_alias.size());

for (const auto& entry : original_to_alias) {
  alias_to_original[entry.second] = entry.first;
}
```


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@80
PS4, Line 80:   set<string> unique;
use unordered_set<string_view> since ordering is not required and is more 
efficient for inserts.
Also, using string_view you avoid deep copies.
Using string_view we can also avoid creating a temp string  when we insert into 
'unique'.

```
 std::unordered_set<std::string_view> unique_views;

 for (std::sregex_iterator it(text.cbegin(), text.cend(), pattern), end;
       it != end; ++it) {
    ....
    unique_views.emplace(text.data() + match.position(group_index),
                         match.length(group_index));
  }

  vector<string> results;
  results.reserve(unique_views.size());
  for (const auto& sv : unique_views) {
    results.emplace_back(sv);
  }
```


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@263
PS4, Line 263:   set<string> table_set;
             :   set<string> column_set;
use unordered_set.

The regex match loops create many temporary strings we could use string_view 
instead. Also, stringstream can probably be avoided.


http://gerrit.cloudera.org:8080/#/c/24282/4/be/src/service/query-profile-redaction.cc@266
PS4, Line 266:   static const std::regex QUALIFIED_ID_RE(
This function is non trivial. Maybe add comments explaining what each of the 
regex patterns match?



--
To view, visit http://gerrit.cloudera.org:8080/24282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0c5b4911a64888f319f212155df6e08c1800b32
Gerrit-Change-Number: 24282
Gerrit-PatchSet: 4
Gerrit-Owner: Gokul Kolady <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Gokul Kolady <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Thu, 14 May 2026 19:17:28 +0000
Gerrit-HasComments: Yes

Reply via email to