Joe McDonnell created IMPALA-9714:
-------------------------------------

             Summary: SimpleLogger does not respect limits when there are high frequency appends
                 Key: IMPALA-9714
                 URL: https://issues.apache.org/jira/browse/IMPALA-9714
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 4.0
            Reporter: Joe McDonnell


SimpleLogger provides a basic guarantee to limit disk space usage for logs. It 
limits the number of items in each log file, and it limits the total number of 
log files. While adding tests for this, it became clear that both limits can be 
exceeded when a SimpleLogger has a high rate of appends.

The first issue is that SimpleLogger names its files with a prefix plus the 
current time in milliseconds. When SimpleLogger reaches its limit of entries 
for the current file, it flushes that file and calculates a new filename to 
write new output. However, if appends are happening at a high rate, one 
millisecond may not have elapsed, in which case the new filename is the same as 
the old filename. It will just keep appending to the current file.
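
One way to avoid the collision is to disambiguate names generated within the 
same millisecond. The following is a minimal sketch, not SimpleLogger's actual 
code: the LogFileNamer class, its NextName() method, and the ".N" suffix scheme 
are all hypothetical names chosen for illustration. The timestamp is passed in 
as a parameter so the behavior can be exercised without a real clock.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical helper (not part of Impala): builds log file names from a
// prefix plus a millisecond timestamp, and appends a sequence number when
// the timestamp has not advanced since the previous name was generated.
class LogFileNamer {
 public:
  explicit LogFileNamer(std::string prefix) : prefix_(std::move(prefix)) {
    assert(!prefix_.empty());
  }

  // Returns a name that differs from every name previously returned by this
  // object, even if now_ms repeats or moves backwards.
  std::string NextName(int64_t now_ms) {
    if (now_ms > last_ms_) {
      // The clock advanced: the bare timestamp is unique again.
      last_ms_ = now_ms;
      seq_ = 0;
    } else {
      // Same (or earlier) millisecond: reuse the last timestamp and bump
      // the sequence number instead of regenerating the old name.
      ++seq_;
    }
    std::string name = prefix_ + std::to_string(last_ms_);
    if (seq_ > 0) name += "." + std::to_string(seq_);
    return name;
  }

 private:
  std::string prefix_;
  int64_t last_ms_ = -1;
  int seq_ = 0;
};
```

With this scheme, two rollovers inside the same millisecond yield distinct 
names, so the entries-per-file limit would still trigger a new file. Whether a 
suffix like this is compatible with the prefix pattern used by DeleteOldLogs() 
is an assumption that would need to be checked.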

The second issue has to do with how we enforce the limit on the number of 
files. SimpleLogger relies on LoggingSupport::DeleteOldLogs() to enforce the 
limit on the number of files. DeleteOldLogs() lists the files in the directory 
matching the prefix pattern and inserts them into a map sorted by their mtime. 
The mtime has a time_t type, which has a granularity of seconds. When there are 
high frequency appends to a SimpleLogger, multiple files can be created per 
second, causing collisions in this map. DeleteOldLogs() can only see one file 
per distinct mtime, so it can't enforce the limit. This also means that it can 
only delete at most one file per distinct mtime in each run.
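
The collision can be reproduced in isolation. The sketch below assumes the map 
is a std::map keyed by time_t, which matches the description above but is not 
copied from the Impala source. CountVisibleWithMap() and FilesToDelete() are 
hypothetical names, and the std::multimap variant is just one possible way to 
keep every file visible.

```cpp
#include <cstddef>
#include <ctime>
#include <map>
#include <string>
#include <utility>
#include <vector>

using FileEntry = std::pair<std::time_t, std::string>;  // (mtime, path)

// Mirrors the failure mode: with a std::map keyed by mtime alone, files that
// share an mtime collapse into a single entry, so only one of them is seen.
std::size_t CountVisibleWithMap(const std::vector<FileEntry>& files) {
  std::map<std::time_t, std::string> by_mtime;
  for (const auto& f : files) by_mtime.emplace(f.first, f.second);
  return by_mtime.size();
}

// Illustrative alternative: a std::multimap keeps every file, so the oldest
// (size - max_files) entries can all be selected for deletion in one pass.
std::vector<std::string> FilesToDelete(const std::vector<FileEntry>& files,
                                       std::size_t max_files) {
  std::multimap<std::time_t, std::string> by_mtime;
  for (const auto& f : files) by_mtime.emplace(f.first, f.second);

  std::vector<std::string> doomed;
  if (by_mtime.size() <= max_files) return doomed;

  std::size_t excess = by_mtime.size() - max_files;
  for (auto it = by_mtime.begin(); excess > 0; ++it, --excess) {
    doomed.push_back(it->second);
  }
  return doomed;
}
```

Given four files where three share the same mtime, the std::map version sees 
only two entries, while the std::multimap version sees all four and can pick 
the two oldest to delete when the limit is two. Files with equal mtimes are 
ordered by insertion here, which is an arbitrary tie-break.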

The second issue offsets the first: it makes DeleteOldLogs() slower, which 
limits the number of records written per millisecond.

The existing users of SimpleLogger don't seem to have these types of 
high-frequency updates. Still, this argues for caution when setting the number 
of log entries per file: a small value can exacerbate these cases. This mainly 
impacts writing unit tests for SimpleLogger.



