[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

ASF GitHub Bot (Jira) Fri, 29 Oct 2021 01:20:16 -0700


     [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=671868&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671868
 ]


ASF GitHub Bot logged work on HDFS-16266:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Oct/21 08:19
            Start Date: 29/Oct/21 08:19
    Worklog Time Spent: 10m 
      Work Description: tasanuma commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-954546028


   Thanks for updating it, @tomscut.
   I tried it with my RBF cluster, which has a client server (1.1.1.1), a DFS 
Router (2.2.2.2), and a NameNode. When the client sends a request to the 
Router, the NameNode logs the following.
   ```
   INFO FSNamesystem.audit: allowed=true   ugi=tasanuma ip=/2.2.2.2       cmd=listStatus  src=/user/tasanuma      dst=null        perm=null       proto=rpc       callerContext=CLI,clientIp:1.1.1.1,clientPort:33070
   ```
   In this case, `clientIp:1.1.1.1` is the IP of the client server, but 
`clientPort:33070` is the port of the DFS Router (2.2.2.2), not that of the 
client server. This would be confusing for users.
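
To make the mismatch concrete, here is a minimal Python sketch that parses the audit line quoted above and flags requests that arrived through an intermediary. The helper names (`parse_audit_line`, `parse_caller_context`, `summarize_origin`) are hypothetical and not part of Hadoop. The sketch assumes fields are whitespace-separated `key=value` pairs, so paths containing spaces would need extra handling.

```python
# Sample line copied from the comment above, with whitespace normalized.
AUDIT_LINE = (
    "INFO FSNamesystem.audit: allowed=true ugi=tasanuma ip=/2.2.2.2 "
    "cmd=listStatus src=/user/tasanuma dst=null perm=null "
    "proto=rpc callerContext=CLI,clientIp:1.1.1.1,clientPort:33070"
)


def parse_audit_line(line):
    """Split an audit line into key=value fields (hypothetical helper)."""
    _, _, body = line.partition("FSNamesystem.audit:")
    fields = {}
    for token in body.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def parse_caller_context(context):
    """Collect key:value items from callerContext; bare items like CLI are skipped."""
    items = {}
    for part in context.split(","):
        key, sep, value = part.partition(":")
        if sep:
            items[key] = value
    return items


def summarize_origin(line):
    """Report the RPC peer IP, the callerContext client IP/port, and whether they differ."""
    fields = parse_audit_line(line)
    ctx = parse_caller_context(fields.get("callerContext", ""))
    rpc_peer_ip = fields.get("ip", "").lstrip("/")
    client_ip = ctx.get("clientIp")
    port = ctx.get("clientPort")
    return {
        "rpc_peer_ip": rpc_peer_ip,
        "client_ip": client_ip,
        "client_port": int(port) if port is not None else None,
        # True when the request was forwarded, e.g. by a DFS Router.
        "forwarded": client_ip is not None and client_ip != rpc_peer_ip,
    }


summary = summarize_origin(AUDIT_LINE)
print(summary)
```

When `forwarded` is true, the behavior reported in this comment means `client_port` belongs to the forwarding host (`rpc_peer_ip`), so pairing it with `client_ip` would point at the wrong process.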


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 671868)
    Time Spent: 5h 50m  (was: 5h 40m)

> Add remote port information to HDFS audit log
> ---------------------------------------------
>
>                 Key: HDFS-16266
>                 URL: https://issues.apache.org/jira/browse/HDFS-16266
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task that triggers a sudden flood of 
> requests. The queueTime and processingTime of the NameNode then rise 
> sharply, leaving a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, the audit logs record the IP and UGI but 
> no port information, so it is sometimes difficult to locate the specific 
> process. Therefore, I propose adding the port information to the audit log 
> so that we can easily track the upstream process.
> Some other projects, such as HBase and Alluxio, already include port 
> information in their audit logs. I think HDFS audit logs should include it 
> as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
