Lei (Eddy) Xu commented on HDFS-7878:

[~chris.douglas] Thanks a lot for working on it. 

I'd prefer to use {{FileStatus}} with a {{file id}} instead of a new 
{{FileHandler}} or {{InodeId}}.  It is more familiar to users, and people who 
want to use the {{FileId}} alone to save memory (e.g., in a cache) can still 
call {{FileStatus#getFileId()}}.  I think that would make it easier for 
downstream projects to adopt this API. Additionally, {{InodeId}} looks 
implementation-specific to me, which could make this API less useful to, or 
harder to support natively on, other backends (e.g., Azure or S3). One 
additional point is that {{stat(2)}} returns the inode number 
({{stat.st_ino}}) as well, so it should not be too surprising for the end user.
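
To illustrate the cache use case, here is a rough sketch in plain Java with no 
Hadoop dependency. {{StatusWithId}} and {{FileIdCache}} are hypothetical names 
made up for this example, not existing classes, and the sketch assumes that 
overwriting a file yields a new file id at the same path.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a FileStatus that also carries a file id.
final class StatusWithId {
    private final String path;
    private final long fileId;

    StatusWithId(String path, long fileId) {
        this.path = path;
        this.fileId = fileId;
    }

    String getPath() {
        return path;
    }

    long getFileId() {
        return fileId;
    }
}

// Hypothetical cache that stores only the id as the key, not the full status.
final class FileIdCache {
    private final Map<Long, String> entries = new HashMap<>();

    void put(StatusWithId status, String value) {
        entries.put(status.getFileId(), value);
    }

    // Returns null when nothing is cached for this file id.
    String get(StatusWithId status) {
        return entries.get(status.getFileId());
    }
}
```

If the data is cached under the id of the original file, a later lookup with 
the status of an overwritten file at the same path (assumed to have a 
different id) misses instead of returning stale data.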

It might also be worthwhile to take this chance to finally make 
{{FileStatus}} serializable with Protobuf (HDFS-6984) from Hadoop 3 onward.


> API - expose an unique file identifier
> --------------------------------------
>                 Key: HDFS-7878
>                 URL: https://issues.apache.org/jira/browse/HDFS-7878
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, 
> HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, 
> HDFS-7878.06.patch, HDFS-7878.patch
> See HDFS-487.
> Even though that is resolved as duplicate, the ID is actually not exposed by 
> the JIRA it supposedly duplicates.
> INode ID for the file should be easy to expose; alternatively ID could be 
> derived from block IDs, to account for appends...
> This is useful e.g. for cache key by file, to make sure cache stays correct 
> when file is overwritten.

