[ 
https://issues.apache.org/jira/browse/HDFS-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419413#comment-13419413
 ] 

Marcelo Vanzin commented on HDFS-3680:
--------------------------------------

Thanks for the comments everyone. Good to know FSNamesystem is a singleton, so 
no need to worry about that issue.

As for queuing / blocking, I understand the concerns, but I don't see how 
they're any different than today. To do something like this today, you'd do one 
of the following:

(i) Process logs post-facto, by tailing the HDFS log file or something along 
those lines.

This would be the "completely off process" model, not affecting the NN 
operation.

(ii) Use a custom log appender that parses log messages inside the NN.

This is almost the same as what my patch does; except it's tied to the log 
system implementation.

Both cases suffer from turning a log message into something expected to be a 
"stable" interface; the second approach (which is doable today, just to make 
that clear) adds on top of that all the concerns you guys listed.
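
To illustrate why approach (ii) turns the message text into an interface, here is a minimal sketch of the parsing a custom appender would need to do. It assumes the audit message is a tab-separated list of key=value pairs; that layout, the class name AuditLineParser and the field names in the usage note are assumptions for illustration, not a documented contract.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: splits an audit message into fields. */
final class AuditLineParser {
    private AuditLineParser() {}

    /**
     * Parses a message assumed to look like "key=value<TAB>key=value...".
     * Tokens without a key are skipped, which is exactly the kind of silent
     * breakage that appears if the message format ever changes.
     */
    static Map<String, String> parse(String message) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String token : message.split("\t")) {
            int eq = token.indexOf('=');
            if (eq <= 0) {
                continue; // not a key=value token
            }
            // Split on the first '=' only, so values may contain '='.
            fields.put(token.substring(0, eq), token.substring(eq + 1));
        }
        return fields;
    }
}
```

For example, AuditLineParser.parse("allowed=true\tugi=alice\tcmd=open") yields three fields; renaming or reordering anything in the message would change what the consumer sees without any compile-time signal.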

Does anyone know how the different log systems behave when using file loggers, 
which I guess would be the vast majority of cases for this code? Do they do 
queuing, do they block waiting for the message to be written, what happens when 
they flush buffers, what if the log file is on NFS, etc? Lots of the concerns 
raised here are similar to those questions.

I agree that implementations of this interface can do all sorts of bad things, 
but I don't see how that's any worse than today. The only way around it would 
be to forgo using a log system at all for audit logging and force writing to 
files as the only option, with your own custom code that avoids as many of the 
issues discussed here as possible.

The code could definitely force queuing on this code path; since not everybody 
may need that (the current log approach being the example), I'm wary of turning 
that into a requirement.
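
For those who do want queuing, it could be layered on as an optional wrapper rather than a requirement. The following is a minimal sketch under assumptions: the AccessLogger interface, its logAccess signature and the QueueingAccessLogger class are hypothetical names, not what the patch defines.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical interface; name and signature are assumptions for this sketch.
interface AccessLogger {
    void logAccess(boolean succeeded, String user, String cmd, String src);
}

/** Hands events to a background thread through a bounded queue. */
final class QueueingAccessLogger implements AccessLogger {
    private static final class Event {
        final boolean succeeded;
        final String user, cmd, src;
        Event(boolean succeeded, String user, String cmd, String src) {
            this.succeeded = succeeded;
            this.user = user;
            this.cmd = cmd;
            this.src = src;
        }
    }

    // Sentinel that tells the worker thread to stop.
    private static final Event POISON = new Event(false, null, null, null);

    private final BlockingQueue<Event> queue;
    private final AccessLogger delegate;
    private final AtomicLong dropped = new AtomicLong();
    private final Thread worker;

    QueueingAccessLogger(AccessLogger delegate, int capacity) {
        this.delegate = delegate;
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.worker = new Thread(this::drain, "audit-queue");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    @Override
    public void logAccess(boolean succeeded, String user, String cmd, String src) {
        // offer() never blocks the caller; a full queue drops the event.
        if (!queue.offer(new Event(succeeded, user, cmd, src))) {
            dropped.incrementAndGet();
        }
    }

    long droppedCount() {
        return dropped.get();
    }

    private void drain() {
        try {
            while (true) {
                Event e = queue.take();
                if (e == POISON) {
                    return;
                }
                delegate.logAccess(e.succeeded, e.user, e.cmd, e.src);
            }
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }

    /** Delivers everything already queued, then stops the worker. */
    void close() {
        try {
            queue.put(POISON);
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Dropping on a full queue is only one possible policy; blocking the caller instead keeps every event but reintroduces the stalls people are worried about. Which one is right depends on the deployment, which is part of why making it mandatory seems risky.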

So, with those out of the way, a few comments about other things:
. audit logging under the namesystem lock: that can be hacked around. One ugly 
way would be to store the audit data in a thread local, and flush it in the 
unlock() methods.
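
A minimal sketch of that thread-local idea, assuming a reentrant read/write lock guards the namesystem. The class DeferredAuditNamesystem, its method names and the string-based audit event are hypothetical and only loosely modeled on FSNamesystem.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

/** Buffers audit events per thread and emits them after the lock is released. */
final class DeferredAuditNamesystem {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final ThreadLocal<List<String>> pending =
            ThreadLocal.withInitial(ArrayList::new);
    private final Consumer<String> sink; // stand-in for the real audit logger

    DeferredAuditNamesystem(Consumer<String> sink) {
        this.sink = sink;
    }

    void writeLock() {
        lock.writeLock().lock();
    }

    void writeUnlock() {
        lock.writeLock().unlock();
        flushIfUnlocked();
    }

    /** Called while the lock is held; only records the event. */
    void audit(String event) {
        pending.get().add(event);
    }

    boolean holdsLock() {
        return lock.getWriteHoldCount() > 0 || lock.getReadHoldCount() > 0;
    }

    private void flushIfUnlocked() {
        // The lock is reentrant, so flush only once this thread holds nothing.
        if (holdsLock()) {
            return;
        }
        List<String> buffered = pending.get();
        if (buffered.isEmpty()) {
            return;
        }
        List<String> toEmit = new ArrayList<>(buffered);
        buffered.clear();
        for (String event : toEmit) {
            sink.accept(event); // potentially slow work, now outside the lock
        }
    }
}
```

The ugliness is visible here: every unlock path has to remember to flush, and an event is lost if a thread records it and never goes through an unlock method.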

. using the interface for the existing log: that can be easily done; I left 
that part alone only to avoid changing the existing behavior. I could use the 
"AUDITLOG access logger" as the default one, which would be very easy to do. A 
custom access logger would replace it (or we could make the config option a 
list, thus allowing the use of both again).
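
A minimal sketch of the list-valued option, under assumptions: the AccessLogger interface, the "default" token and the class names below are hypothetical, and the real config key and loading mechanism would be whatever the patch ends up using.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical interface; name and signature are assumptions for this sketch.
interface AccessLogger {
    void logAccess(boolean succeeded, String user, String cmd, String src);
}

/** Stand-in for the existing log-based audit path. */
final class DefaultAuditLogger implements AccessLogger {
    @Override
    public void logAccess(boolean succeeded, String user, String cmd, String src) {
        System.out.println("allowed=" + succeeded + "\tugi=" + user
                + "\tcmd=" + cmd + "\tsrc=" + src);
    }
}

/** Stand-in for a custom implementation, e.g. one writing to a database. */
final class CountingAccessLogger implements AccessLogger {
    static final AtomicInteger CALLS = new AtomicInteger();

    public CountingAccessLogger() {}

    @Override
    public void logAccess(boolean succeeded, String user, String cmd, String src) {
        CALLS.incrementAndGet();
    }
}

final class AccessLoggers {
    static final String DEFAULT_TOKEN = "default"; // assumed reserved name

    private AccessLoggers() {}

    /** Builds loggers from a comma-separated list of class names. */
    static List<AccessLogger> fromConfig(String value) {
        List<AccessLogger> loggers = new ArrayList<>();
        if (value != null) {
            for (String name : value.split(",")) {
                name = name.trim();
                if (name.isEmpty()) {
                    continue;
                }
                loggers.add(DEFAULT_TOKEN.equals(name)
                        ? new DefaultAuditLogger()
                        : instantiate(name));
            }
        }
        if (loggers.isEmpty()) {
            // Nothing configured: keep today's behavior.
            loggers.add(new DefaultAuditLogger());
        }
        return loggers;
    }

    private static AccessLogger instantiate(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(AccessLogger.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "Cannot create access logger " + className, e);
        }
    }
}
```

With this shape, an empty setting preserves the current behavior, a single class name replaces the default, and listing "default" alongside a custom class runs both.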
                
> Allows customized audit logging in HDFS FSNamesystem
> ----------------------------------------------------
>
>                 Key: HDFS-3680
>                 URL: https://issues.apache.org/jira/browse/HDFS-3680
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Marcelo Vanzin
>            Assignee: Marcelo Vanzin
>            Priority: Minor
>         Attachments: accesslogger-v1.patch, accesslogger-v2.patch
>
>
> Currently, FSNamesystem writes audit logs to a logger; that makes it easy to 
> get audit logs in some log file. But it makes it kinda tricky to store audit 
> logs in any other way (let's say a database), because it would require the 
> code to implement a log appender (and thus know what logging system is 
> actually being used underneath the façade), and parse the textual log message 
> generated by FSNamesystem.
> I'm attaching a patch that introduces a cleaner interface for this use case.


