[
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Douglas updated HDFS-7878:
--------------------------------
Attachment: HDFS-7878.19.patch
Updated patch with a simple unit test for the {{/.reserved/.inode/fileId}}
behavior. These special paths should be documented. Filed HDFS-12729.
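
As a rough illustration of the reserved-path form above, the sketch below builds and parses a path of that shape. The class and method names are hypothetical and are not the HDFS client API; only the {{/.reserved/.inode/}} prefix is taken from this comment.

```java
// Hypothetical helper for illustration only; not part of the HDFS client API.
final class ReservedInodePaths {
  // Prefix as written in the comment above.
  static final String PREFIX = "/.reserved/.inode/";

  private ReservedInodePaths() {}

  /** Build the reserved path that addresses a file by its inode ID. */
  static String forFileId(long fileId) {
    if (fileId < 0) {
      throw new IllegalArgumentException("negative fileId: " + fileId);
    }
    return PREFIX + fileId;
  }

  /** Return the inode ID if the path has the reserved form, or -1 otherwise. */
  static long parseFileId(String path) {
    if (path == null || !path.startsWith(PREFIX)) {
      return -1L;
    }
    String tail = path.substring(PREFIX.length());
    if (tail.isEmpty()) {
      return -1L;
    }
    // Accept only plain ASCII digits, so signs and nested paths are rejected.
    for (int i = 0; i < tail.length(); i++) {
      char c = tail.charAt(i);
      if (c < '0' || c > '9') {
        return -1L;
      }
    }
    try {
      return Long.parseLong(tail);
    } catch (NumberFormatException e) {
      return -1L; // too many digits to fit in a long
    }
  }
}
```

A caller holding only a file ID could use {{forFileId}} to address the file regardless of its current name, which is the behavior the new unit test exercises.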
bq. How do commented out lines ensure wire compatibility? It would make sense
if these were obsolete fields and we didn't want to reuse obsolete number in
case older messages get misinterpreted, but then we should be reusing.
Nevertheless, it appears we're not doing that in the latest patch anymore.
Sorry, that was too cursory. I'll summarize some of the discussion from
HDFS-6984. The idea was that the {{FileStatus}} format should match
{{HdfsFileStatus}}. When the {{PathHandle}} is part of the payload, it could be
deserialized as an opaque blob in the {{FileStatus}} schema or with the
attributes of an {{HdfsPathHandle}} when the type is known. If HDFS were to
embed other information in the {{PathHandle}}, it could be interpreted by an
intermediary without dropping fields.
With the {{open(PathHandle)}} pattern, we're more-or-less asserting that the
caller is the only one who can do the translation. So if a process wants to
pass or preserve a handle, then it passes the {{PathHandle}}; it's insufficient
to serialize the {{FileStatus}} in PB on one end, pick it up on the other, and
construct a {{PathHandle}}. The unit test used to verify that round trip, but it
is no longer part of the contract.
bq. In testCrossSerializationProto and testJavaSerialization we're removing
assertions that the PathHandle to what should be the same file should be
identical. Isn't that still true, and should be?
The assertions verified that the {{PathHandle}} payload in {{FileStatus}} is
preserved. Since we're now making the {{PathHandle}} itself serializable across
processes, rather than relying on {{FileStatus}} PB serialization, the unit test
only verifies {{PathHandle}} serialization.
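
A minimal sketch of the kind of round trip such a test checks, assuming a handle is an opaque, serializable token. {{OpaqueHandle}} and {{HandleRoundTrip}} are hypothetical stand-ins written for this example; they are not the classes in the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical stand-in for a PathHandle: an opaque, serializable token.
final class OpaqueHandle implements Serializable {
  private static final long serialVersionUID = 1L;

  // Contents are meaningful only to the file system that issued the handle.
  private final byte[] payload;

  OpaqueHandle(byte[] payload) {
    this.payload = payload.clone();
  }

  byte[] toByteArray() {
    return payload.clone();
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof OpaqueHandle
        && Arrays.equals(payload, ((OpaqueHandle) o).payload);
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(payload);
  }
}

// Hypothetical helper: pass the handle itself between processes,
// rather than rebuilding it from a serialized status object.
final class HandleRoundTrip {
  private HandleRoundTrip() {}

  static byte[] serialize(OpaqueHandle handle) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
      out.writeObject(handle);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bos.toByteArray();
  }

  static OpaqueHandle deserialize(byte[] wire) {
    try (ObjectInputStream in =
             new ObjectInputStream(new ByteArrayInputStream(wire))) {
      return (OpaqueHandle) in.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

The receiving side only needs the handle bytes; it never reconstructs the handle from other metadata, which matches the narrower contract described above.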
> API - expose an unique file identifier
> --------------------------------------
>
> Key: HDFS-7878
> URL: https://issues.apache.org/jira/browse/HDFS-7878
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Labels: BB2015-05-TBR
> Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch,
> HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch,
> HDFS-7878.06.patch, HDFS-7878.07.patch, HDFS-7878.08.patch,
> HDFS-7878.09.patch, HDFS-7878.10.patch, HDFS-7878.11.patch,
> HDFS-7878.12.patch, HDFS-7878.13.patch, HDFS-7878.14.patch,
> HDFS-7878.15.patch, HDFS-7878.16.patch, HDFS-7878.17.patch,
> HDFS-7878.18.patch, HDFS-7878.19.patch, HDFS-7878.patch
>
>
> See HDFS-487.
> Even though that is resolved as duplicate, the ID is actually not exposed by
> the JIRA it supposedly duplicates.
> INode ID for the file should be easy to expose; alternatively ID could be
> derived from block IDs, to account for appends...
> This is useful e.g. for cache key by file, to make sure cache stays correct
> when file is overwritten.