Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24282 )

Change subject: IMPALA-14961: Query Profile Redaction
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc
File be/src/service/query-profile-redaction.cc:

http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc@50
PS3, Line 50: static string Trim(string_view s) {
This is called in only one place, so you could inline 
`boost::algorithm::trim_copy(string(s))` at the call site instead.


http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc@55
PS3, Line 55: static bool IsLikelyHostnameToken(string_view token) {
Instead of a regex search, I think we could extract hostnames from the 
following lines, which we know will be present in query profiles:

```
Per Host Min Memory Reservation: d1c0917.pcd.dfw.cloudera.com:27000(0) 
d1c0919.pcd.dfw.cloudera.com:27000(1.00 MB) 
d1c0916.pcd.dfw.cloudera.com:27000(140.94 MB)

Per Host Number of Fragment Instances: d1c0917.pcd.dfw.cloudera.com:27000(0) 
d1c0919.pcd.dfw.cloudera.com:27000(1) d1c0916.pcd.dfw.cloudera.com:27000(4)
```
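
A rough sketch of what I mean, assuming each entry on such a line has the 
form `host:port(value)` and that the value never contains a nested `)`. The 
name `ExtractHostsFromPerHostLine` is hypothetical and not part of the patch:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <string_view>

// Hypothetical helper (not in the patch): collects the host part of every
// "host:port(value)" entry that follows `label` in `line`.
// Assumes values contain no nested ')' and hosts are not bracketed IPv6.
std::set<std::string> ExtractHostsFromPerHostLine(
    std::string_view line, std::string_view label) {
  std::set<std::string> hosts;
  const size_t label_pos = line.find(label);
  if (label_pos == std::string_view::npos) return hosts;
  std::string_view rest = line.substr(label_pos + label.size());
  while (true) {
    // Skip whitespace between entries (including wrapped lines).
    const size_t start = rest.find_first_not_of(" \t\r\n");
    if (start == std::string_view::npos) break;
    rest.remove_prefix(start);
    // Everything before '(' is "host:port".
    const size_t open = rest.find('(');
    if (open == std::string_view::npos) break;
    const std::string_view host_port = rest.substr(0, open);
    // Drop the ":port" suffix; if there is no ':', keep the whole token.
    const std::string_view host = host_port.substr(0, host_port.rfind(':'));
    if (!host.empty()) hosts.emplace(host);
    // Jump past the closing ')' so values like "1.00 MB" are skipped whole.
    const size_t close = rest.find(')', open);
    if (close == std::string_view::npos) break;
    rest.remove_prefix(close + 1);
  }
  return hosts;
}
```

The collected hostnames could then be redacted by exact match, which avoids 
guessing whether an arbitrary token looks like a hostname.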


http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc@70
PS3, Line 70: string escaped = string(buffer.GetString(), buffer.GetSize());
            :   if (escaped.size() >= 2 && escaped.front() == '"' && escaped.back() == '"') {
            :     escaped = boost::algorithm::erase_first_copy(escaped, "\"");
            :     escaped = boost::algorithm::erase_last_copy(escaped, "\"");
            :   }
We're making 3 copies here (L70, L72 and L73). One copy is unavoidable, so 
we should aim for exactly one, maybe something like:

```
const char* p = buffer.GetString();
const size_t n = buffer.GetSize();
if (n >= 2 && p[0] == '"' && p[n - 1] == '"') {
  return string(p + 1, n - 2);
}
return string(p, n);
```
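
For checking the edge cases, the same idea can be wrapped as a standalone 
function. This is only a sketch: `StripEnclosingQuotes` is a hypothetical 
name, and in the patch the pointer and size would come straight from the 
buffer:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds the result with a single copy, dropping one
// pair of enclosing double quotes when both are present.
std::string StripEnclosingQuotes(const char* p, size_t n) {
  if (n >= 2 && p[0] == '"' && p[n - 1] == '"') {
    return std::string(p + 1, n - 2);
  }
  return std::string(p, n);
}
```

A lone `"` (n == 1) is returned unchanged, and `""` yields an empty string, 
which should match what the current code does.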



--
To view, visit http://gerrit.cloudera.org:8080/24282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0c5b4911a64888f319f212155df6e08c1800b32
Gerrit-Change-Number: 24282
Gerrit-PatchSet: 3
Gerrit-Owner: Gokul Kolady <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Gokul Kolady <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Wed, 13 May 2026 18:31:59 +0000
Gerrit-HasComments: Yes
