[
https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292968#comment-15292968
]
Thomas Demoor commented on HDFS-9806:
-------------------------------------
Thanks [~chris.douglas] for the architecture doc. Very interesting feature.
First, as we read the document, the external (PROVIDED) storage is the source
of truth: any change there should be reflected in HDFS, and any inconsistency
that arises is resolved in favour of the external store. With that in mind, we
have some questions, mostly about the following two paragraphs in
section 3.4:
{quote}
Periodically, and/or when a particular directory or file is accessed on the
Namenode, the Namenode queries the PROVIDED store to validate its cache. If the
ID changed since its last update, the Namenode updates the corresponding
metadata and block information.
The Datanode is also responsible for verifying the nonce when servicing read
requests. Without this check, it may return data that does not match the record
in the Namenode (e.g., if another file is renamed onto the same path in the
external store).
{quote}
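To check our reading of the second paragraph, here is a minimal sketch of the
Datanode-side check as we understand it. It is written in Python for brevity
and is not HDFS code: `ProvidedBlock`, `InMemoryStore`, `serve_read` and
`StaleReplicaError` are hypothetical names, and the nonce is assumed to be an
opaque string such as an ETag.

```python
from dataclasses import dataclass


class StaleReplicaError(Exception):
    """Raised when the external object no longer matches the Namenode's record."""


@dataclass(frozen=True)
class ProvidedBlock:
    path: str   # location of the file region in the external store
    nonce: str  # ID the Namenode recorded at its last refresh (assumed opaque)


class InMemoryStore:
    """Hypothetical stand-in for the external store: path -> (nonce, data)."""

    def __init__(self):
        self._objects = {}

    def put(self, path, nonce, data):
        self._objects[path] = (nonce, data)

    def head(self, path):
        # Metadata-only lookup, analogous to an HTTP HEAD request.
        return self._objects[path][0]

    def read(self, path):
        return self._objects[path][1]


def serve_read(store, block):
    """Return the block's bytes only if the store's current nonce matches."""
    current = store.head(block.path)
    if current != block.nonce:
        raise StaleReplicaError(
            f"{block.path}: expected {block.nonce}, found {current}"
        )
    return store.read(block.path)
```

A real store would need the comparison and the read to be atomic (for example
a conditional GET); otherwise the object can change between the two calls. The
sketch ignores that.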
Questions:
# If the Namenode is accessing the PROVIDED storage to update its mapping,
shouldn’t it also update the nonce data at the same time and instruct the
Datanode to refresh too? Or is the intention for the Namenode to update only
the directory information and not the nonce data for the files? (If so, how
could the Namenode apply heuristics to detect “promoting output to a parent
directory”?)
# How should this work in the face of Storage Policies? For example, if we have
a StoragePolicy of {SSD, DISK, PROVIDED}, it seems to us that it would make
sense for the Namenode to use a HEAD request (or equivalent) to check whether
the data is still valid. If it is, the Namenode tells the client to talk to the
Datanode with the file on SSD. Otherwise, the data needs to be refreshed across
all three Datanodes. As the Namenode currently manages replication requests, it
would make sense for it to also trigger the requests that refresh the data from
the PROVIDED storage system.
# When you say “Periodically, and/or when a particular directory or file is
accessed on the Namenode”, do you mean this is something to be configured, or
just that it hasn’t been decided whether both are required? We think the
periodic check is required, since it is the only way to clean up directory
listings containing files that have been removed from the PROVIDED storage. On
access, it makes sense to always make a HEAD request (or equivalent) to make
sure the entry isn’t stale.
# Finally, do you anticipate changes to the wire protocol between the Namenode
and Datanode?
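A rough sketch of the behaviour we have in mind for questions 2 and 3, in
Python for brevity and with hypothetical names throughout
(`validate_on_access`, `periodic_sweep`). The two dictionaries stand in for the
Namenode's cached nonces and the external store's current metadata; neither is
an HDFS structure.

```python
def validate_on_access(cache, external, path):
    """Compare the cached nonce with the store's current one (HEAD-style lookup).

    Returns "serve_local" if the cached replicas are still valid, "refresh" if
    every replica must be re-fetched, or "deleted" if the file is gone.
    """
    current = external.get(path)  # None models a missing object
    if current is None:
        cache.pop(path, None)
        return "deleted"
    if cache.get(path) == current:
        return "serve_local"  # e.g. direct the client to the SSD replica
    cache[path] = current     # record the new nonce before refreshing replicas
    return "refresh"


def periodic_sweep(cache, external):
    """Drop cached entries whose files no longer exist in the external store."""
    removed = sorted(p for p in cache if p not in external)
    for p in removed:
        del cache[p]
    return removed
```

The on-access check alone never visits files nobody reads, which is why we
think the sweep is needed as well.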
> Allow HDFS block replicas to be provided by an external storage system
> ----------------------------------------------------------------------
>
> Key: HDFS-9806
> URL: https://issues.apache.org/jira/browse/HDFS-9806
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Chris Douglas
> Attachments: HDFS-9806-design.001.pdf
>
>
> In addition to heterogeneous media, many applications work with heterogeneous
> storage systems. The guarantees and semantics provided by these systems are
> often similar, but not identical, to those of
> [HDFS|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html].
> Any client accessing multiple storage systems is responsible for reasoning
> about each system independently, and must propagate and renew credentials for
> each store.
> Remote stores could be mounted under HDFS. Block locations could be mapped to
> immutable file regions, opaque IDs, or other tokens that represent a
> consistent view of the data. While correctness for arbitrary operations
> requires careful coordination between stores, in practice we can provide
> workable semantics with weaker guarantees.