[ https://issues.apache.org/jira/browse/IMPALA-6997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joe McDonnell reassigned IMPALA-6997:
-------------------------------------

    Assignee: Joe McDonnell

> Query execution should notice UDF MemLimitExceeded errors more quickly
> ----------------------------------------------------------------------
>
>                 Key: IMPALA-6997
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6997
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>   Affects Versions: Impala 2.13.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Major
>
> When a UDF hits a memory limit, it calls RuntimeState::SetMemLimitExceeded(),
> which sets the query status, but the UDF has no way of returning a status
> directly. It relies on the caller checking the status periodically.
> HdfsTableSink::Send() checks for errors by calling
> RuntimeState::CheckQueryState() once at the beginning. If it is evaluating a
> UDF and that UDF hits the memory limit, it will process the whole RowBatch
> before it aborts the query. This could be 1024 rows, and each row may hit
> the memory limit in that UDF. Other locations that evaluate UDFs may
> process considerably more rows.
> There are two general approaches:
> # Code locations should check the status more frequently and thus abort
> faster after a RuntimeState::SetMemLimitExceeded() call.
> # RuntimeState::SetMemLimitExceeded() should be substantially cheaper,
> allowing the remaining rows to be processed faster.
> RuntimeState::SetMemLimitExceeded() currently calls
> MemTracker::MemLimitExceeded() unconditionally. It then checks whether it
> should update query_status_ (i.e. whether query_status_ is currently ok).
> Then it logs the error. This is wasteful, because
> MemTracker::MemLimitExceeded() is not a cheap function, and the log is
> flooded with one entry per row.
> RuntimeState::SetMemLimitExceeded() should check the status before running
> MemTracker::MemLimitExceeded(). If query_status_ is already not ok, it can
> avoid the cost of the dump and the logging.
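The two approaches described in the issue can be sketched in a small, standalone C++ model. This is only an illustration under stated assumptions, not Impala's actual code: ToyRuntimeState, Status, BuildExpensiveReport(), ProcessBatch() and the check_interval parameter are hypothetical stand-ins, and only the names SetMemLimitExceeded, CheckQueryState and query_status_ come from the issue text.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Toy stand-in for a status object; not Impala's real Status class.
struct Status {
  bool ok = true;
  std::string msg;
};

// Toy stand-in for RuntimeState (hypothetical, for illustration only).
class ToyRuntimeState {
 public:
  // Records a memory-limit error. The status is checked first, so the
  // expensive report is built at most once per query (approach 2).
  void SetMemLimitExceeded() {
    std::lock_guard<std::mutex> l(lock_);
    if (!query_status_.ok) return;  // already failed: skip dump and logging
    query_status_.ok = false;
    query_status_.msg = BuildExpensiveReport();
  }

  // Returns a copy of the current query status so callers can abort early.
  Status CheckQueryState() {
    std::lock_guard<std::mutex> l(lock_);
    return query_status_;
  }

  // Number of times the expensive report was built (for the demo only).
  int expensive_calls() const { return expensive_calls_; }

 private:
  // Hypothetical placeholder for the costly MemTracker::MemLimitExceeded().
  std::string BuildExpensiveReport() {
    ++expensive_calls_;
    return "Memory limit exceeded";
  }

  std::mutex lock_;
  Status query_status_;
  int expensive_calls_ = 0;
};

// Simulates evaluating a UDF that hits the memory limit on every row.
// The query status is re-checked every check_interval rows (approach 1).
// Returns the number of rows processed before the loop stopped.
int ProcessBatch(ToyRuntimeState* state, int num_rows, int check_interval) {
  assert(check_interval > 0);
  int processed = 0;
  for (int i = 0; i < num_rows; ++i) {
    state->SetMemLimitExceeded();  // the failing "UDF" call for this row
    ++processed;
    if (processed % check_interval == 0 && !state->CheckQueryState().ok) break;
  }
  return processed;
}
```

In this model, calling ProcessBatch(&state, 1024, 1024) walks the whole batch before noticing the error, while a smaller interval such as 16 stops after 16 rows. In both cases the expensive report is built only once, because the status is checked before it is generated. The right check interval for real code is a trade-off between abort latency and per-row overhead, which this sketch does not measure.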