[
https://issues.apache.org/jira/browse/RANGER-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831042#comment-15831042
]
Ramesh Mani commented on RANGER-1310:
-------------------------------------
Thanks [~bosco] for your detailed explanation.
The AuditFileCacheProvider I referred to here is the same as the FileQueue you
mentioned.
Here is what I was thinking:
1) We will have one FileQueue, which will first store the audits in a file
using a FileSpooler. This FileQueue will be synchronous and will replace the
AsyncBatchQueue. There will be only one FileQueue for all the destinations.
2) The FileSpooler in the FileQueue will periodically take files that are
closed (each closed file is a batch here) and send them to the AsyncBatchQueue.
This is the existing AsyncBatchQueue, which sends to multiple destinations and
keeps its existing spooling / backing for each destination.
3) If summary is enabled, the FileSpooler in the FileQueue will send to the
AsyncSummaryBatchQueue, which also already exists, and from there the summary
will be sent to the multiple destinations. Summarization is done per file.
4) The flow rate in this case would be the same across destinations (based on
the time period the FileQueue uses to close and open an audit file). E.g. Solr
will get data every 5 minutes if the file rollover time is 5 minutes. HDFS will
also get the data at the same rate, and it is flushed to the hdfs cache.
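To make the flow in points 1, 2 and 4 concrete, here is a minimal sketch. It is
not the Ranger code: the class FileQueueSketch and its methods are hypothetical
names, in-memory lists stand in for the spool files on disk, and a plain
Consumer stands in for the AsyncBatchQueue with its destinations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of the proposed flow; not the Ranger implementation. */
class FileQueueSketch {
    // Assumption: each inner list stands in for one closed spool file on disk.
    private final List<List<String>> closedFiles = new ArrayList<>();
    // Assumption: this list stands in for the audit file currently open for writing.
    private List<String> openFile = new ArrayList<>();
    // Assumption: each consumer stands in for one destination behind the AsyncBatchQueue.
    private final List<Consumer<List<String>>> destinations = new ArrayList<>();

    void addDestination(Consumer<List<String>> destination) {
        destinations.add(destination);
    }

    /** Point 1: the audit is written synchronously into the open file. */
    synchronized void log(String auditEvent) {
        openFile.add(auditEvent);
    }

    /** Rollover: close the current file (if it has data) and open a new one. */
    synchronized void rollover() {
        if (!openFile.isEmpty()) {
            closedFiles.add(openFile);
            openFile = new ArrayList<>();
        }
    }

    /** Point 2: each closed file is handed over as one batch to every destination. */
    synchronized int drainClosedFiles() {
        int sent = 0;
        for (List<String> batch : closedFiles) {
            for (Consumer<List<String>> destination : destinations) {
                destination.accept(batch);
            }
            sent++;
        }
        closedFiles.clear();
        return sent;
    }
}
```

In this sketch nothing reaches a destination until rollover() has closed a file,
which is why every destination would see data at the same rate (point 4).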
Regarding Point A "Would we have one FileQueue per Destination or each
Destination choose the reliability level. E.g. Only HDFSDestination needs
reliability"
When you say reliability requirement, do you mean that each destination should
have its own FileQueue so it can send to its destination at a different rate?
Or that one destination (say hdfs) will use the FileQueue while another uses
the existing auditing process without the FileQueue, based on the reliability
requirement? Or an altogether new destination with High Availability, like
KAFKA, which will serve the audits to HDFS / SOLR etc.?
Regarding the data loss, what I found is:
1) In the case of the HDFS Plugin sending audits to HDFS, when the NameNode
gets restarted, the existing reference to an open file in hdfs is lost. HDFS
periodically flushes data, but in some cases, when this has not happened yet,
we see a 0-byte dangling file. Surely this is an issue with closing the file.
Also, when the NameNode is restarted, the data in the memory buffer of the
AsyncBatchQueue is lost.
2) In the case of, say, the HiveServer2 Plugin sending audits to HDFS, when
HiveServer2 is restarted the data in the AsyncBatchQueue memory queue is lost.
In the case of the NameNode getting restarted while a stream of audits is going
into an hdfs file, I see hdfs files getting closed with partial data, i.e. the
audit framework had sent the data to HDFS and it was getting committed, but due
to the abrupt NameNode restart partial (truncated) records are present.
Thanks.
> Ranger Audit framework enhancement to provide an option to allow audit
> records to be spooled to local disk first before sending it to destinations
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: RANGER-1310
> URL: https://issues.apache.org/jira/browse/RANGER-1310
> Project: Ranger
> Issue Type: Bug
> Reporter: Ramesh Mani
> Assignee: Ramesh Mani
>
> Ranger Audit framework enhancement to provide an option to allow audit
> records to be spooled to local disk first before sending it to destinations.
> xasecure.audit.provider.filecache.is.enabled = true ==> This will enable
> this functionality of AuditFileCacheProvider to log the audits locally in a
> file.
> xasecure.audit.provider.filecache.filespool.file.rollover.sec = {rollover
> time - default is 1 day} ==> this provides time to send the audit records
> from local to destination and flush the pipe.
> xasecure.audit.provider.filecache.filespool.dir=/var/log/hadoop/hdfs/audit/spool
> ==> provides the directory where the Audit FileSpool cache is present.
> This helps in avoiding missing / partial audit records in the hdfs
> destination which may happen randomly due to restart of respective plugin
> components.
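For illustration, the three properties quoted above could be read along these
lines. This is only a sketch: FileCacheConfigSketch is a hypothetical class, not
part of Ranger, and the fallbacks of "false" for the enabled flag and no default
for the spool directory are assumptions. Only the 1 day rollover default
(86400 seconds) comes from the description.

```java
import java.util.Properties;

/** Hypothetical holder for the file cache settings; not a Ranger class. */
class FileCacheConfigSketch {
    static final String PREFIX = "xasecure.audit.provider.filecache";

    final boolean enabled;
    final long rolloverSec;
    final String spoolDir;

    FileCacheConfigSketch(Properties props) {
        // Assumption: the feature stays off when the property is absent.
        enabled = Boolean.parseBoolean(
                props.getProperty(PREFIX + ".is.enabled", "false"));
        // Default of 1 day, as stated in the description, expressed in seconds.
        rolloverSec = Long.parseLong(
                props.getProperty(PREFIX + ".filespool.file.rollover.sec", "86400"));
        // Assumption: no default directory; null when the property is not set.
        spoolDir = props.getProperty(PREFIX + ".filespool.dir");
    }
}
```

With a 5 minute rollover as in the Solr example, the rollover property would be
set to 300.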