[ https://issues.apache.org/jira/browse/RANGER-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983247#comment-15983247 ]

Yan commented on RANGER-1501:
-----------------------------

[~coheig] 
1) This JIRA is about HDFS data not actually being flushed when the abstract 
*AuditDestination.flush()* is called, so that HDFS users can view the audit 
records. The problem you are experiencing seems to be that 
AuditDestination.flush() is not being called at all.
2) PrintWriter.flush() should also flush the underlying wrapped HDFS streams. 
But the point is that HDFS has three flush()-related mechanisms (flush(), 
hflush(), and hsync()), and the plain flush() does not push the data all the 
way to the HDFS DataNodes. See 
https://issues.apache.org/jira/browse/HADOOP-6313 for details. On the other 
hand, "real flushing" all the way to the DataNodes on every logJSON call may 
have performance impacts.
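
For reference, here is a minimal sketch (not the Ranger patch itself) of the 
three mechanisms on FSDataOutputStream; the file path and class name are made 
up for illustration:

{code:java}
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsFlushDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        try (FSDataOutputStream out = fs.create(new Path("/tmp/audit-demo.log"))) {
            out.write("audit record\n".getBytes(StandardCharsets.UTF_8));

            // 1) flush(): only drains client-side buffers; the bytes may
            //    still be invisible to other HDFS readers.
            out.flush();

            // 2) hflush(): pushes the bytes to the DataNodes so that new
            //    readers can see them, but does not force them to disk.
            out.hflush();

            // 3) hsync(): like hflush(), and additionally asks the DataNodes
            //    to sync the bytes to disk; durable, but the most expensive.
            out.hsync();
        }
    }
}
{code}

In that light, hflush() would be the middle ground for an audit destination: 
records become visible to HDFS readers without paying the per-record 
disk-sync cost of hsync().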

> Audit Flush to HDFS does not actually cause the audit logs to be flushed to 
> HDFS 
> ---------------------------------------------------------------------------------
>
>                 Key: RANGER-1501
>                 URL: https://issues.apache.org/jira/browse/RANGER-1501
>             Project: Ranger
>          Issue Type: Bug
>          Components: audit
>    Affects Versions: 0.7.0
>            Reporter: Yan
>            Assignee: Yan
>             Fix For: 1.0.0
>
>         Attachments: 
> 0001-RANGER-1501-Audit-Flush-to-HDFS-does-not-actually-ca.patch
>
>
> The reason is that the HDFS file stream's flush() call does not really flush 
> the data all the way to disk, nor even make the data visible to HDFS users. 
> See the HDFS semantics of flush/sync at 
> https://issues.apache.org/jira/browse/HADOOP-6313.
> Consequently, the audit logs on HDFS won't be visible/durable to HDFS clients 
> until the log file is closed. This will, among other issues, increase the 
> chances of losing audit logs in case of a system failure.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
