Joe McDonnell created IMPALA-9714:
-------------------------------------
Summary: SimpleLogger does not respect limits when there are high
frequency appends
Key: IMPALA-9714
URL: https://issues.apache.org/jira/browse/IMPALA-9714
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 4.0
Reporter: Joe McDonnell
SimpleLogger provides a basic guarantee to limit disk space usage for logs. It
limits the number of entries in each log file, and it limits the total number of
log files. Adding tests for this showed that both limits can be exceeded when a
SimpleLogger has a high rate of appends.
The first issue is that SimpleLogger names its files with a prefix plus the
current time in milliseconds. When SimpleLogger reaches its limit of entries
for the current file, it flushes that file and calculates a new filename to
write new output. However, if appends are happening at a high rate, less than
one millisecond may have elapsed, in which case the new filename is the same as
the old filename. SimpleLogger then just keeps appending to the current file,
exceeding the per-file entry limit.
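The collision can be sketched as follows. This is only an illustration, not
SimpleLogger's actual code: MakeLogFileName() and RolloverCollides() are
hypothetical helpers, and the timestamp is passed in as a parameter so the
behavior is deterministic.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not Impala's API): builds a log file name from a
// prefix plus a millisecond timestamp, mirroring the scheme described above.
std::string MakeLogFileName(const std::string& prefix, int64_t now_ms) {
  return prefix + std::to_string(now_ms);
}

// Returns true when a rollover attempted at now_ms would compute the same
// name as the file opened at current_file_ms, i.e. no new file is created
// and appends continue into the current file.
bool RolloverCollides(const std::string& prefix, int64_t current_file_ms,
                      int64_t now_ms) {
  return MakeLogFileName(prefix, current_file_ms) ==
         MakeLogFileName(prefix, now_ms);
}
```

For example, RolloverCollides("log-", 1000, 1000) is true, while
RolloverCollides("log-", 1000, 1001) is false.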
The second issue has to do with how we enforce the limit on the number of
files. SimpleLogger relies on LoggingSupport::DeleteOldLogs() to enforce the
limit on the number of files. DeleteOldLogs() lists the files in the directory
matching the prefix pattern and inserts them into a map sorted by their mtime.
The mtime has a time_t type, which has a granularity of seconds. When there are
high frequency appends to a SimpleLogger, multiple files can be created per
second, causing collisions in this map. DeleteOldLogs() can only see one file
per distinct mtime, so it can't enforce the limit. This also means that it can
only delete at most one file per distinct mtime in each run.
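The effect of the mtime collisions can be sketched as follows, assuming the map
is a std::map keyed by time_t (the description above does not name the
container). FileEntry, VisibleWithMap() and VisibleWithMultimap() are
hypothetical names used only for illustration. The std::multimap variant is
shown for contrast and is not necessarily how this would be fixed in Impala.

```cpp
#include <cstddef>
#include <ctime>
#include <map>
#include <string>
#include <utility>
#include <vector>

// (mtime in seconds, file path). Hypothetical type for this sketch.
using FileEntry = std::pair<std::time_t, std::string>;

// Number of files visible after inserting into a std::map keyed by mtime.
// std::map::insert does nothing when the key already exists, so files that
// share an mtime collapse into a single entry.
std::size_t VisibleWithMap(const std::vector<FileEntry>& files) {
  std::map<std::time_t, std::string> by_mtime;
  for (const auto& f : files) by_mtime.insert(f);
  return by_mtime.size();
}

// Same input, but std::multimap keeps every file, including those with
// equal mtimes.
std::size_t VisibleWithMultimap(const std::vector<FileEntry>& files) {
  std::multimap<std::time_t, std::string> by_mtime;
  for (const auto& f : files) by_mtime.insert(f);
  return by_mtime.size();
}
```

With four files where three share the same mtime, VisibleWithMap() reports 2
while VisibleWithMultimap() reports 4.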
The first issue is offset somewhat by the second: the second issue makes
DeleteOldLogs() slower, which limits the number of records written per
millisecond.
The existing users of SimpleLogger do not seem to have these types of
high-frequency appends. Still, this argues for caution when setting the number
of log entries per file, since a small value can exacerbate these cases. This
mainly impacts writing unit tests for SimpleLogger.