[ https://issues.apache.org/jira/browse/HDFS-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinayakumar B updated HDFS-8468:
--------------------------------
    Attachment: HDFS-8468-HDFS-7285.patch

Attached the patch.

The idea is to carry {{ECSchema}} and {{stripeCellSize}} in {{LocatedBlocks}} 
instead of {{HdfsFileStatus}} at read time.
The fetched {{LocatedBlocks}} can then be re-used inside {{DFSInputStream}} 
unless a refresh is required.

So, on the NN side, only the 'open' command will be logged in the audit log 
per file, as before (again, unless a refresh is required).
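
A rough sketch of the difference, to make the RPC counting concrete. This is 
not code from the patch: every class and method name below is a hypothetical, 
simplified stand-in for the real NameNode, {{LocatedBlocks}} and client code, 
and the schema name, cell size and audit strings are illustrative values only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT the real Hadoop classes.
class SingleRpcOpenSketch {

    /** Stand-in for LocatedBlocks, extended to carry the EC details. */
    static final class LocatedBlocksSketch {
        final String ecSchemaName;  // null for a non-striped file
        final int stripeCellSize;   // 0 for a non-striped file

        LocatedBlocksSketch(String ecSchemaName, int stripeCellSize) {
            this.ecSchemaName = ecSchemaName;
            this.stripeCellSize = stripeCellSize;
        }

        boolean isStriped() {
            return ecSchemaName != null;
        }
    }

    /** Stand-in NameNode that records one audit entry per RPC it serves. */
    static final class NameNodeSketch {
        final List<String> auditLog = new ArrayList<>();

        // Metadata RPC, used by the two-RPC flow only to learn the schema.
        String getFileInfo(String path) {
            auditLog.add("cmd=getfileinfo src=" + path);
            return "RS-6-3"; // illustrative schema name
        }

        // Block-location RPC; the reply also carries schema and cell size.
        LocatedBlocksSketch getBlockLocations(String path) {
            auditLog.add("cmd=open src=" + path);
            return new LocatedBlocksSketch("RS-6-3", 64 * 1024);
        }
    }

    /** Two-RPC flow: one call for the schema, one for the block locations. */
    static int auditEntriesWithTwoRpcs(String path) {
        NameNodeSketch nn = new NameNodeSketch();
        nn.getFileInfo(path);
        nn.getBlockLocations(path);
        return nn.auditLog.size();
    }

    /** One-RPC flow: the schema is read from the block-location reply. */
    static int auditEntriesWithOneRpc(String path) {
        NameNodeSketch nn = new NameNodeSketch();
        LocatedBlocksSketch blocks = nn.getBlockLocations(path);
        // The cached reply answers "is it striped?" without another RPC.
        if (blocks.isStriped() && blocks.stripeCellSize <= 0) {
            throw new IllegalStateException("striped file needs a cell size");
        }
        return nn.auditLog.size();
    }

    public static void main(String[] args) {
        System.out.println("two RPCs: " + auditEntriesWithTwoRpcs("/ec/file"));
        System.out.println("one RPC:  " + auditEntriesWithOneRpc("/ec/file"));
    }
}
```

In this sketch the two-RPC flow leaves two audit entries per open and the 
one-RPC flow leaves one, which is the behaviour the existing audit log tests 
expect.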

The main test cases for this fix are the existing 'TestAuditLogs' tests, 
which are currently failing in the branch.

> 2 RPC calls for every file read in DFSClient#open(..) resulting in double 
> Audit log entries
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8468
>                 URL: https://issues.apache.org/jira/browse/HDFS-8468
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>         Attachments: HDFS-8468-HDFS-7285.patch
>
>
> In the HDFS-7285 branch, 2 RPCs are made to the Namenode to determine 
> whether a file is striped and to get the schema for the file.
> This results in double audit log entries for every file read, for both 
> striped and non-striped files.
> This will have a major impact on the size of the audit logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
