[jira] [Created] (IMPALA-6997) Query execution should notice UDF MemLimitExceeded errors more quickly

Joe McDonnell (JIRA) Tue, 08 May 2018 15:36:29 -0700

Joe McDonnell created IMPALA-6997:
-------------------------------------

             Summary: Query execution should notice UDF MemLimitExceeded errors more quickly
                 Key: IMPALA-6997
                 URL: https://issues.apache.org/jira/browse/IMPALA-6997
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 2.13.0
            Reporter: Joe McDonnell



When a UDF hits a memory limit, it calls RuntimeState::SetMemLimitExceeded() 
which sets the query status, but it has no way of returning status directly. It 
relies on the caller checking status periodically.

HdfsTableSink::Send() checks for errors by calling 
RuntimeState::CheckQueryState() once at the beginning. If it is evaluating a 
UDF and that UDF hits the memory limit, it will need to process the whole 
RowBatch before it aborts the query. This could be 1024 rows and each row may 
hit a memory limit in that UDF. Other locations that process UDFs may be 
processing considerably more rows.

There are two general approaches:
 # Code locations should check for status more frequently and thus abort faster 
after a RuntimeState::SetMemLimitExceeded() call.
 # RuntimeState::SetMemLimitExceeded() should be substantially cheaper, 
allowing the rows to be processed faster.
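The first approach can be sketched as follows. This is a minimal, self-contained illustration and not Impala code: QueryState, ProcessBatch, check_interval and eval_row are hypothetical names standing in for RuntimeState, a sink's per-batch loop, and the UDF evaluation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-in for the shared query state; not Impala's RuntimeState.
struct QueryState {
  bool ok = true;  // becomes false once an error has been recorded
  void SetMemLimitExceeded() { ok = false; }
};

// Evaluates eval_row for up to num_rows rows, re-checking the shared status
// every check_interval rows instead of only once per batch.
// Returns the number of rows that were actually evaluated.
std::size_t ProcessBatch(
    QueryState* state, std::size_t num_rows, std::size_t check_interval,
    const std::function<void(QueryState*, std::size_t)>& eval_row) {
  assert(check_interval > 0);
  std::size_t processed = 0;
  for (std::size_t i = 0; i < num_rows; ++i) {
    // Check at the start of every interval, including row 0.
    if (i % check_interval == 0 && !state->ok) break;
    eval_row(state, i);  // may record an error, but cannot return one
    ++processed;
  }
  return processed;
}
```

With check_interval equal to the batch size this behaves like a single check at the start of the batch. A smaller interval bounds how many rows are evaluated after the first failure, at the cost of more frequent status checks.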

RuntimeState::SetMemLimitExceeded() currently calls 
MemTracker::MemLimitExceeded() unconditionally. Only then does it check whether 
it should update query_status_ (i.e. query_status_ is currently ok), and it logs 
the error. This is wasteful: MemTracker::MemLimitExceeded() is not a 
cheap function, and the error is logged again for every row, flooding the log. 
RuntimeState::SetMemLimitExceeded() should check the status before calling 
MemTracker::MemLimitExceeded(). If query_status_ is already not ok, it can 
avoid the cost of the dump and logging.
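A rough sketch of that reordering is shown below. It is not the actual RuntimeState implementation: QueryStateSketch, BuildExpensiveReport and the counters are hypothetical, with the counters standing in for the cost of MemTracker::MemLimitExceeded() and for the log output.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical, simplified query state; names are illustrative only.
class QueryStateSketch {
 public:
  // Records a memory-limit error. Only the first call pays for the report
  // and the log line; later calls return as soon as they see the error.
  void SetMemLimitExceeded() {
    std::lock_guard<std::mutex> l(lock_);
    if (!status_ok_) return;  // already failed: skip the dump and logging
    error_msg_ = BuildExpensiveReport();
    status_ok_ = false;
    ++log_lines_;  // stands in for writing one line to the log
  }

  bool ok() const {
    std::lock_guard<std::mutex> l(lock_);
    return status_ok_;
  }
  int report_count() const { return report_count_; }
  int log_lines() const { return log_lines_; }

 private:
  // Stands in for the costly memory usage dump.
  std::string BuildExpensiveReport() {
    assert(status_ok_);  // only reached while the status is still ok
    ++report_count_;
    return "Memory limit exceeded";
  }

  mutable std::mutex lock_;
  bool status_ok_ = true;
  std::string error_msg_;
  int report_count_ = 0;
  int log_lines_ = 0;
};
```

In this sketch, calling SetMemLimitExceeded() once per row of a 1024-row batch builds the report and logs only once, so the remaining rows fail cheaply.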



