[ 
https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292968#comment-15292968
 ] 

Thomas Demoor commented on HDFS-9806:
-------------------------------------

Thanks [~chris.douglas] for the architecture doc. Very interesting feature.
 
First, as we read the document, the external (PROVIDED) storage is the source 
of truth, so any changes there should be propagated to HDFS, and any 
inconsistencies that arise should be resolved in favour of the external store. 
With that in mind, we have some questions, mostly relating to the following two 
paragraphs in section 3.4:
{quote}
Periodically, and/or when a particular directory or file is accessed on the 
Namenode, the Namenode queries the PROVIDED store to validate its cache. If the 
ID changed since its last update, the Namenode updates the corresponding 
metadata and block information.
The Datanode is also responsible for verifying the nonce when servicing read 
requests. Without this check, it may return data that does not match the record 
in the Namenode (e.g., if another file is renamed onto the same path in the 
external store).
{quote}
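To check that we read the second paragraph correctly, here is a minimal Java sketch of the Datanode-side check as we understand it. Everything in it is our assumption for illustration: the class and method names are hypothetical, the in-memory map stands in for the external store, and the nonce is modelled as an opaque string (for example an ETag or file ID). None of it is taken from HDFS or the design doc.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch; these names are not HDFS APIs. */
class NonceCheckSketch {

    /** Stand-in for the external store: maps a path to its current nonce. */
    static final class ExternalStore {
        private final Map<String, String> nonces = new HashMap<>();

        void put(String path, String nonce) {
            nonces.put(path, nonce);
        }

        /** HEAD-like probe: returns the current nonce, or null if the path is gone. */
        String head(String path) {
            return nonces.get(path);
        }
    }

    /**
     * Returns true only if the store's current nonce equals the nonce the
     * Namenode recorded. A missing file or a missing record counts as a mismatch.
     */
    static boolean nonceMatches(ExternalStore store, String path, String recordedNonce) {
        return recordedNonce != null && Objects.equals(store.head(path), recordedNonce);
    }
}
```

Under this reading, a read is served only when nonceMatches returns true. If another file is renamed onto the same path, the nonce changes and the Datanode rejects the read instead of returning data that no longer matches the Namenode's record.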
Questions:
# If the Namenode is accessing the PROVIDED storage to update its mapping, 
shouldn’t it also update the nonce data at the same time and instruct the 
Datanode to refresh too? Or is the intention for the Namenode to update only 
the directory information and not the actual nonce data for the files? (If so, 
how could the Namenode apply heuristics to detect “promoting output to a parent 
directory”?)
# How should this work in the face of Storage Policies? For example, if we have 
a StoragePolicy of {SSD, DISK, PROVIDED}, it seems to us that it would make 
sense for the Namenode to use a HEAD request (or equivalent) to see if the data 
is still valid. If so, it can direct the client to the Datanode holding the 
file on SSD. Otherwise, the data needs to be refreshed across all three 
Datanodes. As the Namenode currently manages replication requests, it seems 
that it would make sense for it to trigger the requests to refresh the data 
from the PROVIDED storage system.
# When you say “Periodically and/or when a particular directory or file is 
accessed on the Namenode”, do you mean this is something to be configured, or 
just that it hasn’t been decided whether both are required? We think the 
periodic check is required, since it is the only way to clean up directory 
listings containing files that have been removed from the PROVIDED storage. On 
access, it makes sense to always make a HEAD request (or equivalent) to make 
sure the cached entry isn’t stale.
# Finally, do you anticipate changes to the wire protocol between the Namenode 
and Datanode?
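
To make questions 2 and 3 concrete, the following Java sketch shows the behaviour we have in mind: an on-access check driven by a HEAD-like probe, plus a periodic sweep that removes entries no longer present in the external store. This is only our assumption of how it could work. The class, enum and method names are hypothetical, the nonce is an opaque string, and the probe result and the external listing are passed in as plain values rather than fetched from a real store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

/** Hypothetical sketch of Namenode-side cache validation; not an HDFS API. */
class ProvidedCacheSketch {

    enum Action { SERVE_LOCAL_REPLICA, REFRESH_FROM_PROVIDED, REMOVE_FROM_NAMESPACE }

    /** Nonce last recorded for each path (sorted so the sweep result is deterministic). */
    private final Map<String, String> cached = new TreeMap<>();

    void record(String path, String nonce) {
        cached.put(path, nonce);
    }

    /**
     * On-access check. currentNonce is the result of a HEAD-like probe against
     * the external store, or null if the file no longer exists there.
     */
    Action onAccess(String path, String currentNonce) {
        if (currentNonce == null) {
            cached.remove(path);
            return Action.REMOVE_FROM_NAMESPACE;
        }
        if (Objects.equals(cached.get(path), currentNonce)) {
            // Still valid: the client can read the SSD or DISK replica.
            return Action.SERVE_LOCAL_REPLICA;
        }
        // Changed or unknown: record the new nonce and refresh all replicas.
        cached.put(path, currentNonce);
        return Action.REFRESH_FROM_PROVIDED;
    }

    /**
     * Periodic sweep. Drops every cached path that is absent from the external
     * listing (path to nonce) and returns the dropped paths in sorted order.
     */
    List<String> sweep(Map<String, String> externalListing) {
        List<String> removed = new ArrayList<>();
        for (String path : new ArrayList<>(cached.keySet())) {
            if (!externalListing.containsKey(path)) {
                cached.remove(path);
                removed.add(path);
            }
        }
        return removed;
    }
}
```

In this sketch the on-access check alone never notices a deleted file that nobody reads again, which is why we think the periodic sweep is needed as well.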


> Allow HDFS block replicas to be provided by an external storage system
> ----------------------------------------------------------------------
>
>                 Key: HDFS-9806
>                 URL: https://issues.apache.org/jira/browse/HDFS-9806
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Chris Douglas
>         Attachments: HDFS-9806-design.001.pdf
>
>
> In addition to heterogeneous media, many applications work with heterogeneous 
> storage systems. The guarantees and semantics provided by these systems are 
> often similar, but not identical to those of 
> [HDFS|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html].
>  Any client accessing multiple storage systems is responsible for reasoning 
> about each system independently, and must propagate and renew credentials for 
> each store.
> Remote stores could be mounted under HDFS. Block locations could be mapped to 
> immutable file regions, opaque IDs, or other tokens that represent a 
> consistent view of the data. While correctness for arbitrary operations 
> requires careful coordination between stores, in practice we can provide 
> workable semantics with weaker guarantees.



