Riza Suminto created IMPALA-9612:
------------------------------------

             Summary: Runtime filter wait longer than it should be
                 Key: IMPALA-9612
                 URL: https://issues.apache.org/jira/browse/IMPALA-9612
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
            Reporter: Riza Suminto
            Assignee: Riza Suminto


In one of my query profile, I found an info string like this:
  Runtime filters: All filters arrived. Waited 59s783ms. Maximum arrival delay: 
15s296ms.
If all runtime filters arrived within 15s296ms, then it should not wait until 
59s783ms to proceed.

Looking at runtime-filter.cc, It looks like there is a potential race condition 
in function RuntimeFilter::WaitForArrival().
bool RuntimeFilter::WaitForArrival(int32_t timeout_ms) const {
  unique_lock<mutex> l(arrival_mutex_);
  while (arrival_time_.Load() == 0) {
    int64_t ms_since_registration = MonotonicMillis() - registration_time_;
    int64_t ms_remaining = timeout_ms - ms_since_registration;
    if (ms_remaining <= 0) break;
    arrival_cv_.WaitFor(l, ms_remaining * MICROS_PER_MILLI);
  }
  return arrival_time_.Load() != 0;
}
Between checking arrival_time_.Load() and calling arrival_cv_.WaitFor(), 
arrival_cv_ might be already signaled by either RuntimeFilter::SetFilter() or 
RuntimeFilter::Cancel() because they do not acquire arrival_mutex_ first.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to