[
https://issues.apache.org/jira/browse/RANGER-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983247#comment-15983247
]
Yan commented on RANGER-1501:
-----------------------------
[~coheig]
1) This JIRA is about HDFS data not actually being flushed when the abstract
*AuditDestination.flush()* is called, so that HDFS users can view the audit
records. The problem you are experiencing seems to be that
AuditDestination.flush() is not being called at all.
2) PrintWriter.flush() should also flush the underlying wrapped HDFS stream.
But the point is that HDFS has three flush()-related mechanisms: flush(),
hflush(), and hsync(). The plain flush() does not push the data all the way to
the HDFS DataNodes. See https://issues.apache.org/jira/browse/HADOOP-6313 for
details. On the other hand, "real flushing" all the way to the DataNodes on
every logJSON call may have performance impacts.
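For reference, here is a minimal sketch of the three mechanisms against the
Hadoop client API (the file path and the PrintWriter wrapping are illustrative
only, not Ranger's actual audit writer):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.PrintWriter;

public class HdfsFlushDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FSDataOutputStream out = fs.create(new Path("/tmp/audit-demo.log"));
        PrintWriter writer = new PrintWriter(out);

        writer.println("{\"audit\":\"record\"}");

        // 1) flush(): only pushes the client-side buffers down the wrapped
        //    stream; other HDFS readers may still not see the data.
        writer.flush();

        // 2) hflush(): flushes the data out to the DataNode pipeline, making
        //    it visible to new readers, though not yet durable on disk.
        out.hflush();

        // 3) hsync(): like hflush(), but also asks the DataNodes to sync the
        //    data to disk, so it survives DataNode restarts.
        out.hsync();

        writer.close();
    }
}
{code}

Calling hflush() after each write is the usual visibility/performance
trade-off; hsync() adds durability at a further cost per call.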
> Audit Flush to HDFS does not actually cause the audit logs to be flushed to
> HDFS
> ---------------------------------------------------------------------------------
>
> Key: RANGER-1501
> URL: https://issues.apache.org/jira/browse/RANGER-1501
> Project: Ranger
> Issue Type: Bug
> Components: audit
> Affects Versions: 0.7.0
> Reporter: Yan
> Assignee: Yan
> Fix For: 1.0.0
>
> Attachments:
> 0001-RANGER-1501-Audit-Flush-to-HDFS-does-not-actually-ca.patch
>
>
> The reason is that an HDFS file stream's flush() call does not really flush
> the data all the way to disk, nor does it even make the data visible to HDFS
> users. See the HDFS semantics of flush/sync at
> https://issues.apache.org/jira/browse/HADOOP-6313.
> Consequently, the audit logs on HDFS won't be visible or durable from an HDFS
> client until the log file is closed. This will, among other issues, increase
> the chances of losing audit logs in case of a system failure.