Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/24282 )
Change subject: IMPALA-14961: Query Profile Redaction
......................................................................

Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc
File be/src/service/query-profile-redaction.cc:

http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc@50
PS3, Line 50: static string Trim(string_view s) {
This is called in a single place so you could also embed `boost::algorithm::trim_copy(string(s))` in the code instead.

http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc@55
PS3, Line 55: static bool IsLikelyHostnameToken(string_view token) {
I think instead of regex search we could rely on extracting hostnames based on following sections (lines) which we know will be there in the query profiles?
```
Per Host Min Memory Reservation: d1c0917.pcd.dfw.cloudera.com:27000(0) d1c0919.pcd.dfw.cloudera.com:27000(1.00 MB) d1c0916.pcd.dfw.cloudera.com:27000(140.94 MB)
Per Host Number of Fragment Instances: d1c0917.pcd.dfw.cloudera.com:27000(0) d1c0919.pcd.dfw.cloudera.com:27000(1) d1c0916.pcd.dfw.cloudera.com:27000(4)
```

http://gerrit.cloudera.org:8080/#/c/24282/3/be/src/service/query-profile-redaction.cc@70
PS3, Line 70: string escaped = string(buffer.GetString(), buffer.GetSize());
:   if (escaped.size() >= 2 && escaped.front() == '"' && escaped.back() == '"') {
:     escaped = boost::algorithm::erase_first_copy(escaped, "\"");
:     escaped = boost::algorithm::erase_last_copy(escaped, "\"");
:   }
We're doing 3 copies here L70, L72 and L73.
One copy is unavoidable, so we should aim for just that one. Maybe something like:
```
const char* p = buffer.GetString();
const size_t n = buffer.GetSize();
if (n >= 2 && p[0] == '"' && p[n - 1] == '"') {
  return string(p + 1, n - 2);
}
return string(p, n);
```

--
To view, visit http://gerrit.cloudera.org:8080/24282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0c5b4911a64888f319f212155df6e08c1800b32
Gerrit-Change-Number: 24282
Gerrit-PatchSet: 3
Gerrit-Owner: Gokul Kolady <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Gokul Kolady <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Wed, 13 May 2026 18:31:59 +0000
Gerrit-HasComments: Yes
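
For illustration only, the Line 55 suggestion (reading hostnames from the "Per Host ..." lines rather than guessing with a regex) might look roughly like the sketch below. `ExtractPerHostHostnames` is a hypothetical name that does not appear in the patch, and the sketch assumes every entry has the form `host:port(value)` with no colon inside the host part (so bare IPv6 addresses are not handled).

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <string_view>

// Hypothetical helper, not part of the patch: collects the host part of every
// "host:port(value)" entry found on a single "Per Host ..." profile line.
std::set<std::string> ExtractPerHostHostnames(std::string_view line) {
  std::set<std::string> hosts;
  // Profile lines may be indented, so drop leading spaces first.
  line.remove_prefix(std::min(line.find_first_not_of(' '), line.size()));
  constexpr std::string_view kPrefix = "Per Host ";
  if (line.substr(0, kPrefix.size()) != kPrefix) return hosts;
  // The entries start after the ": " that terminates the section label.
  size_t pos = line.find(": ");
  if (pos == std::string_view::npos) return hosts;
  pos += 2;
  while (pos < line.size()) {
    // The value may contain spaces (e.g. "1.00 MB"), so scan to the closing
    // parenthesis instead of splitting on whitespace.
    const size_t colon = line.find(':', pos);
    if (colon == std::string_view::npos) break;
    const size_t close = line.find(')', colon);
    if (close == std::string_view::npos) break;
    hosts.insert(std::string(line.substr(pos, colon - pos)));
    // Move past ')' and any separating spaces; npos ends the loop.
    pos = line.find_first_not_of(' ', close + 1);
  }
  return hosts;
}
```

The resulting set could then drive exact-match replacement of hostnames elsewhere in the profile. Whether that covers every place a hostname can appear is something the patch author would need to confirm.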
