[
https://issues.apache.org/jira/browse/IMPALA-9714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105912#comment-17105912
]
ASF subversion and git services commented on IMPALA-9714:
---------------------------------------------------------
Commit 0815a184fdfeb3293849a8441ba003d63a588dab in impala's branch
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0815a18 ]
IMPALA-9714: Fix edge cases in SimpleLogger and add test
SimpleLogger is used for several existing log types and a change
to use it for the data cache access trace is underway. Since this
is commonly used, it is useful to nail down specific semantics and
test them.
This fixes the following edge cases:
1. LoggingSupport::DeleteOldLogs() currently maintains a map from mtime
to the filename in order to decide which files need to be deleted.
This stops working when there are fast updates to the log, because
mtime has seconds resolution and DeleteOldLogs() is only able to
recognize a single file per mtime with the current map. This changes
the map to a set of pairs of mtime + filename. The behavior is
identical except that if there are multiple files with the same
mtime, they each get their own entry in the set. This allows
DeleteOldLogs() to more accurately enforce the maximum number of log files.
2. SimpleLogger::Init() now enforces the limit on the maximum number
of log files. This provides a clear semantic when dealing with
preexisting files from a previous incarnation of the same logger.
3. SimpleLogger will now create any intermediate directories when
creating the logging directory (i.e. existingdir/a/b/c works).
4. This moves enforcement of max_audit_event_log_files to use the
limits provided by SimpleLogger rather than a background thread
calling DeleteOldLogs() periodically.
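To illustrate edge case 1, here is a minimal sketch of why a set of
(mtime, filename) pairs can see files that a map keyed on mtime alone
cannot. It is not the code from the commit: LogFile, FilesToDelete() and
VisibleWithMtimeMap() are hypothetical names made up for this example,
and the real DeleteOldLogs() reads mtimes from the filesystem rather than
from a vector.

```cpp
#include <cstddef>
#include <ctime>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a directory entry: mtime (seconds) plus file name.
struct LogFile {
  std::time_t mtime;
  std::string name;
};

// Returns the names of the oldest files in excess of max_files. Entries are
// keyed on (mtime, name), so files sharing an mtime each keep their own entry.
std::vector<std::string> FilesToDelete(const std::vector<LogFile>& files,
                                       std::size_t max_files) {
  std::set<std::pair<std::time_t, std::string>> by_age;
  for (const LogFile& f : files) by_age.emplace(f.mtime, f.name);

  std::vector<std::string> to_delete;
  if (by_age.size() <= max_files) return to_delete;
  std::size_t excess = by_age.size() - max_files;
  // The set iterates in ascending (mtime, name) order, i.e. oldest first.
  for (auto it = by_age.begin(); excess > 0; ++it, --excess) {
    to_delete.push_back(it->second);
  }
  return to_delete;
}

// For contrast: how many files remain visible when the key is mtime alone.
// Files with the same mtime overwrite each other in the map.
std::size_t VisibleWithMtimeMap(const std::vector<LogFile>& files) {
  std::map<std::time_t, std::string> by_mtime;
  for (const LogFile& f : files) by_mtime[f.mtime] = f.name;
  return by_mtime.size();
}
```

With three files at mtime 100, one file at mtime 101 and a limit of two
files, the mtime-keyed map holds only two entries and so finds nothing to
delete, while the set holds all four and selects the two oldest.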
This also introduces SimpleLogger::GetLogFiles(), which is a static
function to get the log files given a directory and prefix. This
is necessary for testing, but it also will be useful for code that
wants to process logs from SimpleLogger.
Testing:
- Added a new simple-logger-test that codifies the expected behavior
- Ran core tests
Change-Id: Idd092a65b31d34f40a660cab7b5e0695a3627c78
Reviewed-on: http://gerrit.cloudera.org:8080/15861
Reviewed-by: Thomas Tauber-Marshall <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> SimpleLogger does not respect limits when there are high frequency appends
> --------------------------------------------------------------------------
>
> Key: IMPALA-9714
> URL: https://issues.apache.org/jira/browse/IMPALA-9714
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.0
> Reporter: Joe McDonnell
> Priority: Major
>
> SimpleLogger provides a basic guarantee to limit disk space usage for logs.
> It limits the number of items in each log file, and it limits the total
> number of log files. Tests added for this show that both limits can be
> exceeded when a SimpleLogger has a high rate of appends.
> The first issue is that SimpleLogger names its files with a prefix plus the
> current time in milliseconds. When SimpleLogger reaches its limit of entries
> for the current file, it flushes that file and calculates a new filename to
> write new output. However, if appends are happening at a high rate, one
> millisecond may not have elapsed, in which case the new filename is the same
> as the old filename. It will just keep appending to the current file.
> The second issue has to do with how we enforce the limit on the number of
> files. SimpleLogger relies on LoggingSupport::DeleteOldLogs() to enforce the
> limit on the number of files. DeleteOldLogs() lists the files in the
> directory matching the prefix pattern and inserts them into a map sorted by
> their mtime. The mtime has a time_t type, which has a granularity of seconds.
> When there are high frequency appends to a SimpleLogger, multiple files can
> be created per second, causing collisions in this map. DeleteOldLogs() can
> only see one file per distinct mtime, so it can't enforce the limit. This
> also means that it can only delete at most one file per distinct mtime in
> each run.
> The first issue is offset by the second issue. The second issue makes
> DeleteOldLogs() slower, which limits the number of records written per
> millisecond.
> It doesn't seem like the existing users of SimpleLogger have these types of
> high frequency updates. It argues for caution when setting the number of log
> entries per file. A small value for log entries per file can exacerbate these
> cases. This mainly impacts writing unit tests for SimpleLogger.