[
https://issues.apache.org/jira/browse/RANGER-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826562#comment-15826562
]
Ramesh Mani edited comment on RANGER-1310 at 1/17/17 6:32 PM:
--------------------------------------------------------------
[~bosco], Yes, you're right!
The plan is to create a new AuditProvider (AuditFileCacheProvider) with a file
spooler (we will use functionality similar to AuditFileSpool to do this) to
stash the audit events to local disk. This AuditFileCacheProvider logs audits
into the local file synchronously. This AuditProvider will also take an
AsyncQueue as a consumer, which periodically picks those audit events up from
the local file. Once an audit event is sent to the AsyncQueue and gets
flushed, the current functionality of the AsyncQueue (async_queue ->
summary_queue -> multidestination -> batch_queue -> hdfs destination / solr /
kafka / log4j..), which sends it to multiple destinations, will take care of
the rest of the message propagation. When an audit event is sent to the
AsyncQueue it gets flushed, so it reaches the destination immediately. In the
case of the HDFS destination, hflush() is called so we drain the pipe.
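A minimal sketch of this flow, assuming illustrative class and method names
(this is not the actual Ranger implementation):
{code:java}
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the proposed provider: synchronous spooling to a local file,
// with a consumer that periodically replays the spooled events into the
// async queue that feeds the existing destination pipeline.
public class AuditFileCacheSketch {

    private final Path spoolFile;
    private final BlockingQueue<String> asyncQueue = new LinkedBlockingQueue<>();

    public AuditFileCacheSketch(Path spoolFile) {
        this.spoolFile = spoolFile;
    }

    // Synchronous path: the caller blocks until the event is on local disk.
    // Per the proposal, a failure here fails the authorization request.
    public synchronized void log(String auditEvent) throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(spoolFile,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            w.write(auditEvent);
            w.newLine();
        }
    }

    // Consumer path: run every few minutes. A real implementation would
    // truncate the spool file only after the destinations confirm the flush.
    public synchronized void drainToAsyncQueue() throws IOException {
        if (!Files.exists(spoolFile)) {
            return;
        }
        for (String event : Files.readAllLines(spoolFile)) {
            asyncQueue.offer(event);
        }
        Files.delete(spoolFile);
    }
}
{code}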
Earlier, in scenarios where the AuditBatchQueue memory buffer got destroyed by
a restart of the component, all the logs in that memory buffer were lost, and
it also resulted in partial records in the HDFS filesystem (because of the
streaming happening when HDFS gets restarted). This also sometimes resulted in
dangling, unreferenced 0-byte files in HDFS and in unclosed files.
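hflush() here is the standard HDFS client API (FSDataOutputStream.hflush());
a minimal sketch of how it drains the write pipeline so a record survives a
writer restart (the path and payload below are made up for illustration):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Writes one audit record and drains the HDFS write pipeline with hflush(),
// making the record visible to readers even if the writer dies before close().
public class HdfsAuditFlushSketch {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        try (FSDataOutputStream out = fs.create(new Path("/ranger/audit/sample.log"))) {
            out.writeBytes("{\"access\":\"read\",\"user\":\"hive\",\"result\":1}\n");
            out.hflush(); // drain the pipe: data reaches all datanodes in the pipeline
        }
    }
}
{code}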
With the current proposal we avoid these issues, since we immediately flush
the pipe to the destination. The proposal is to flush frequently (at a 5 to 10
minute interval). When the destination is down and the memory gets destroyed,
we have the local spool file from which to resend and flush the events. Also,
if there is any issue in spooling the records to the local file, authorization
of the request will fail with an error message indicating that the spooling
issue needs to be corrected. Please let me know if you have any concerns about
this approach.
> Ranger Audit framework enhancement to provide an option to allow audit
> records to be spooled to local disk first before sending it to destinations
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: RANGER-1310
> URL: https://issues.apache.org/jira/browse/RANGER-1310
> Project: Ranger
> Issue Type: Bug
> Reporter: Ramesh Mani
>
> Ranger Audit framework enhancement to provide an option to allow audit
> records to be spooled to local disk first before sending them to destinations.
> xasecure.audit.provider.filecache.is.enabled = true ==> This enables the
> AuditFileCacheProvider functionality to log the audits locally in a
> file.
> xasecure.audit.provider.filecache.filespool.file.rollover.sec = \{rollover
> time - default is 1 day\} ==> this provides the time after which the audit
> records are sent from the local file to the destination and the pipe is
> flushed.
> xasecure.audit.provider.filecache.filespool.dir=/var/log/hadoop/hdfs/audit/spool
> ==> provides the directory where the audit FileSpool cache is kept.
> This helps in avoiding missing / partial audit records in the HDFS
> destination, which may happen randomly due to restarts of the respective
> plugin components.
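> For illustration, the combined settings would look like this in the plugin's
> audit properties file (the rollover value shown is just the stated 1-day
> default expressed in seconds):
> {code}
> xasecure.audit.provider.filecache.is.enabled=true
> xasecure.audit.provider.filecache.filespool.file.rollover.sec=86400
> xasecure.audit.provider.filecache.filespool.dir=/var/log/hadoop/hdfs/audit/spool
> {code}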
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)