[ 
https://issues.apache.org/jira/browse/HDFS-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5620:
--------------------------------

    Summary: NameNode: implement Global ACL Set as a memory optimization.  
(was: NameNode: implement Global ACL Set as a space optimization.)

Hi, Haohui.  Thanks for pointing out HDFS-5793.  It looks to me like that patch 
was focused on serialization optimization.  The scope I intended for HDFS-5620 
was to de-duplicate the {{AclFeature}} instances in memory.  For example, 
assuming 10 inodes that all have an ACL with the exact same 10 entries, the 
current HDFS-4685 codebase results in 10 instances of {{AclFeature}} and 100 
instances of {{AclEntry}}.  I'd prefer to reduce that to 1 instance of 
{{AclFeature}} and 10 instances of {{AclEntry}}, with the 1 {{AclFeature}} 
shared by all 10 inodes.  {{AclEntry}} instances are immutable, so it's safe to 
share them.  My use of the word "space" in the jira title might have been 
misleading, so I've changed it to "memory".
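
As a rough sketch of the sharing described above (not the attached patch), 
the following uses a plain canonicalizing map.  The class names 
SketchAclEntry, SketchAclFeature, GlobalAclSet and GlobalAclSetDemo are 
hypothetical stand-ins, not the real HDFS classes, and the user names and 
permissions are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for an immutable ACL entry; not the real HDFS class.
final class SketchAclEntry {
  final String type;   // e.g. "user" or "group"
  final String name;
  final String perms;  // e.g. "rw-"

  SketchAclEntry(String type, String name, String perms) {
    this.type = type;
    this.name = name;
    this.perms = perms;
  }

  @Override public boolean equals(Object o) {
    if (!(o instanceof SketchAclEntry)) return false;
    SketchAclEntry e = (SketchAclEntry) o;
    return type.equals(e.type) && name.equals(e.name) && perms.equals(e.perms);
  }

  @Override public int hashCode() {
    return Objects.hash(type, name, perms);
  }
}

// Hypothetical stand-in for the per-inode feature holding the entry list.
final class SketchAclFeature {
  final List<SketchAclEntry> entries;

  SketchAclFeature(List<SketchAclEntry> entries) {
    this.entries = Collections.unmodifiableList(new ArrayList<>(entries));
  }

  @Override public boolean equals(Object o) {
    return o instanceof SketchAclFeature
        && entries.equals(((SketchAclFeature) o).entries);
  }

  @Override public int hashCode() {
    return entries.hashCode();
  }
}

// Canonicalizing map: equal features collapse to one shared instance.
final class GlobalAclSet {
  private final Map<SketchAclFeature, SketchAclFeature> canonical = new HashMap<>();

  synchronized SketchAclFeature intern(SketchAclFeature feature) {
    SketchAclFeature existing = canonical.putIfAbsent(feature, feature);
    return existing != null ? existing : feature;
  }

  synchronized int size() {
    return canonical.size();
  }
}

final class GlobalAclSetDemo {
  // Builds one feature per inode, each with the same 10 entries.
  // Pass null for aclSet to skip de-duplication.
  static List<SketchAclFeature> buildInodeFeatures(int inodes, GlobalAclSet aclSet) {
    List<SketchAclFeature> perInode = new ArrayList<>();
    for (int i = 0; i < inodes; i++) {
      List<SketchAclEntry> entries = new ArrayList<>();
      for (int j = 0; j < 10; j++) {
        entries.add(new SketchAclEntry("user", "user" + j, "rw-"));
      }
      SketchAclFeature feature = new SketchAclFeature(entries);
      perInode.add(aclSet == null ? feature : aclSet.intern(feature));
    }
    return perInode;
  }

  // Counts distinct objects by identity, not by equals().
  static int distinctInstances(List<SketchAclFeature> features) {
    Set<SketchAclFeature> seen =
        Collections.newSetFromMap(new IdentityHashMap<SketchAclFeature, Boolean>());
    seen.addAll(features);
    return seen.size();
  }

  public static void main(String[] args) {
    System.out.println(distinctInstances(buildInodeFeatures(10, null)));
    System.out.println(distinctInstances(buildInodeFeatures(10, new GlobalAclSet())));
  }
}
```

Without the set, the 10 inodes hold 10 distinct feature objects; with it, 
they all hold the same one, and the duplicate entry lists become garbage.  
This sketch keeps strong references forever, so a real implementation would 
also need some way to release ACLs that no inode uses any more.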

It seems like the serialization optimizations in HDFS-5793 aren't directly 
applicable to this jira, but let me know if I'm missing something.  As for 
ACL serialization, I agree that we don't need to optimize that on the feature 
branch.  The current code does a full serialization of all entries for each 
inode in fsimage or OP_SET_ACL in edits.  We expect proportionally far fewer 
inodes to have an ACL (unlike {{PermissionStatus}}, which is present on 
every inode), so we're likely to see less performance improvement and storage 
reduction from optimizing ACL serialization.  However, if we want to do it 
later, then it looks like HDFS-5793 is a great approach.  I expect the ACL 
entries could share the same string table for user and group names.
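
To illustrate what sharing a string table for user and group names could 
mean, here is a minimal sketch.  It is not the HDFS-5793 format; NameTable 
and NameTableDemo are hypothetical names, and the sample names are made up.  
Each distinct name is stored once and entries would refer to it by a small 
integer id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical string table: each distinct name is stored once and
// referenced by a small integer id.
final class NameTable {
  private final Map<String, Integer> ids = new HashMap<>();
  private final List<String> names = new ArrayList<>();

  // Returns the existing id for a name, or assigns the next free one.
  int idOf(String name) {
    Integer id = ids.get(name);
    if (id == null) {
      id = names.size();
      ids.put(name, id);
      names.add(name);
    }
    return id;
  }

  String nameOf(int id) {
    return names.get(id);
  }

  int size() {
    return names.size();
  }
}

final class NameTableDemo {
  public static void main(String[] args) {
    NameTable table = new NameTable();
    // A user name and a group name share the same table.
    int user = table.idOf("alice");
    int group = table.idOf("hadoop");
    // A repeated name maps back to the id it was first given.
    System.out.println(table.idOf("alice") == user);
    System.out.println(table.nameOf(group));
    System.out.println(table.size());
  }
}
```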

FWIW, I expect HDFS-5620 won't take much work.  I'm attaching a prototype patch 
using the Guava interner.  Quick manual tests using jmap -histo:live show that 
the instances are de-duplicating as I would expect.  I need to put it through 
more testing though.

> NameNode: implement Global ACL Set as a memory optimization.
> ------------------------------------------------------------
>
>                 Key: HDFS-5620
>                 URL: https://issues.apache.org/jira/browse/HDFS-5620
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: HDFS ACLs (HDFS-4685)
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HDFS-5620.1.patch
>
>
> The {{AclManager}} can maintain a Global ACL Set to store all distinct ACLs 
> in use by the file system.  All inodes that have the same ACL entries can 
> share the same ACL instance.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
