bharath v created IMPALA-6348:
---------------------------------

             Summary: Redact only the query string in runtime profile
                 Key: IMPALA-6348
                 URL: https://issues.apache.org/jira/browse/IMPALA-6348
             Project: IMPALA
          Issue Type: Bug
    Affects Versions: Impala 2.10.0, Impala 2.8.0, Impala 2.7.0, Impala 2.11.0
            Reporter: bharath v


Currently, the redactor is run on every info string in the run-time profile.

{noformat}
void RuntimeProfile::AddInfoStringInternal(
    const string& key, const string& value, bool append) {
  // Values may contain sensitive data, such as a query.
  const string& info = RedactCopy(value);  // <----- runs on every value
  lock_guard<SpinLock> l(info_strings_lock_);
  InfoStrings::iterator it = info_strings_.find(key);
  if (it == info_strings_.end()) {
    info_strings_.insert(make_pair(key, info));
    info_strings_display_order_.push_back(key);
  } else {
    if (append) {
      it->second += ", " + value;
    } else {
      it->second = info;
    }
  }
}
{noformat}

For example, suppose the user configures the following redaction rule, 
intending to redact all email addresses in the query string. As a side effect 
of the bug, the "User" and "Connected User" fields of the query profile are 
redacted as well.

{noformat}
{
  "version": 1,
  "rules": [
    {
      "description": "Email addresses",
      "search": "\\b([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\\-\\._]*[A-Za-z0-9])@([A-Za-z0-9\\.]|[A-Za-z\\.][A-Za-z0-9\\-\\.]*[A-Za-z0-9\\.])+\\b",
      "caseSensitive": true,
      "replace": "em...@redacted.host"
    }
  ]
}
{noformat}

{noformat}
Query (id=e24f32fa563e2c5d:9ddefb2300000000)
  Summary
    Session ID: 634deaf67308fdd0:781af1fe76464ca9
    Session Type: BEESWAX
    Start Time: 2017-12-13 13:34:31.984911000
    End Time: 2017-12-13 13:34:37.781489000
    Query Type: QUERY
    Query State: FINISHED
    Query Status: OK
    Impala Version: impalad version 2.10.0 RELEASE (build 871adff6d6e56b57de33059dec2d7fe38e2366bd)
    User: em...@redacted.host <================ not expected
    Connected User: em...@redacted.host <====== not expected
{noformat}

Expected fix: Redact only the query string. Do not redact any other info 
string in the run-time profile.
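
A minimal sketch of what the expected behavior could look like, assuming the 
redaction moves from the generic info-string setter to the call site that 
stores the query text. Every name here (RedactCopySketch, ProfileSketch, 
BuildSummary, the "Sql Statement" key) is hypothetical and not taken from the 
Impala code base, and the regex is a simplified stand-in for the email rule 
above, not the exact pattern.

```cpp
#include <map>
#include <regex>
#include <string>

// Simplified stand-in for the email rule in this report (assumption: not the
// exact regex from the redaction config).
static const std::regex kEmailRule(R"([A-Za-z0-9._-]+@[A-Za-z0-9.-]+)");

// Hypothetical helper: returns a copy of `value` with every match of the
// rule replaced by a fixed placeholder.
std::string RedactCopySketch(const std::string& value) {
  return std::regex_replace(value, kEmailRule, "<redacted-email>");
}

// Toy profile that stores info strings exactly as given. Unlike the snippet
// quoted in this report, the setter itself does no redaction.
struct ProfileSketch {
  std::map<std::string, std::string> info_strings;

  void AddInfoString(const std::string& key, const std::string& value) {
    info_strings[key] = value;
  }
};

// Hypothetical call site: only the query text is passed through the redactor;
// the user name is stored unchanged.
ProfileSketch BuildSummary(const std::string& user, const std::string& sql) {
  ProfileSketch profile;
  profile.AddInfoString("User", user);
  profile.AddInfoString("Sql Statement", RedactCopySketch(sql));
  return profile;
}
```

With this shape, a user name that happens to look like an email address is 
left intact in the profile, while an email literal inside the query text is 
still replaced.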



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
