[
https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314609#comment-15314609
]
Zhe Zhang commented on HDFS-9806:
---------------------------------
Thanks for posting the design and PoC [~chris.douglas]! It's really exciting to
see this work moving forward.
A few questions / comments about the current design doc:
# Having a {{PROVIDED}} storage type is an interesting idea. There are a few
tricky issues though. How should we update the over-replication logic to work
with caching? If the replication factor is 1, and a {{PROVIDED}} block is cached
by a DN, the NN will try to remove the excess replica, right? If we specify a
replication factor > 1, the NN will always try to create DN-local replicas,
which is probably not what we want for opportunistic caching. How should we
specify a preference for caching on SSD vs. HDD? How about {{Mover}} and {{Balancer}}?
# bq. blocks in the PROVIDED storage type are not included by any Datanode as
part of its block report.
So does a DN still report connectivity to the {{PROVIDED}} store to the NN at
each BR? I guess an alternative is for the NN itself to periodically check the
connectivity?
# Per Section 3.4, I think the NN also needs to have a "PROVIDED store client"
anyway, right?
bq. Data and metadata in the external store can change out-of-band (e.g., daily
log data uploaded).
This would be a tricky case to handle. How are directories persisted in the
external store? Consider the case below:
#* An empty HDFS cluster is built on WASB (only {{/}})
#* {{mkdir /data}} is run through HDFS. The metadata should be persisted in
WASB in some form, right?
#* {{/data/log1.txt}} is uploaded by some other WASB client (not the HDFS on
top of it)
#* {{ls /data}} is run through HDFS. I guess the HDFS NN can check the WASB
data structure for {{/data}} and pick up the update
#* How about when another directory {{/jobs}} is created through another WASB
client? Are we assuming HDFS has created a data structure in WASB to track the
root dir {{/}}?
# I think more details could be added to Section 2 for clarification. In
particular, per the above comment, is this work mainly intended for "using a
big external store to back a single, smaller HDFS"? Or is the "out-of-band
update" use case above also important? Would it be better to have a phase 1 for
the single-HDFS use case (no other updates to the external store)?
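To make question 1 concrete, here is a toy Java sketch of one possible rule: only DN-local replicas count against the replication factor, and the {{PROVIDED}} copy is treated as durable backing storage rather than a replica, so a single cached copy under replication=1 is not "excess". The {{PROVIDED}} enum value and the counting rule below are assumptions about the design, not existing HDFS code:

```java
import java.util.*;

// Hypothetical sketch: an excess-replica check adjusted so that a DN-local
// cached copy of a PROVIDED block is not treated as over-replication.
public class ProvidedReplicaCount {
    enum StorageType { DISK, SSD, PROVIDED }

    // Count only DN-local replicas toward the replication factor; the
    // PROVIDED copy is backing storage, not a replica.
    static int countLocalReplicas(List<StorageType> replicas) {
        int n = 0;
        for (StorageType t : replicas) {
            if (t != StorageType.PROVIDED) n++;
        }
        return n;
    }

    static boolean isOverReplicated(List<StorageType> replicas, int replication) {
        return countLocalReplicas(replicas) > replication;
    }

    public static void main(String[] args) {
        // Block backed by the external store plus one cached DN replica,
        // replication factor 1: under this rule the cached copy survives.
        List<StorageType> replicas =
            Arrays.asList(StorageType.PROVIDED, StorageType.DISK);
        System.out.println(isOverReplicated(replicas, 1)); // prints false
    }
}
```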
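Similarly, for the out-of-band scenario in comment 3, one option is for the NN to merge its cached namespace with a fresh listing from the external store on access. A toy sketch of that reconciliation (all class, field, and method names here are hypothetical, not HDFS or WASB APIs):

```java
import java.util.*;

// Toy sketch of the reconciliation question in comment 3: on "ls", the NN
// merges its cached view of a directory with a fresh listing from the
// external store, picking up files created out-of-band.
public class OutOfBandListing {
    // Children the NN already knows about, per directory (hypothetical).
    static Map<String, Set<String>> nnNamespace = new HashMap<>();
    // The external store's current contents, per directory (hypothetical).
    static Map<String, Set<String>> externalStore = new HashMap<>();

    // "ls" through HDFS: union of the NN view and the external view,
    // refreshing the NN cache as a side effect.
    static Set<String> list(String dir) {
        Set<String> merged = new TreeSet<>(nnNamespace.getOrDefault(dir, Set.of()));
        merged.addAll(externalStore.getOrDefault(dir, Set.of()));
        nnNamespace.put(dir, merged);
        return merged;
    }

    public static void main(String[] args) {
        // mkdir /data through HDFS: NN records it and persists it externally.
        nnNamespace.put("/data", new TreeSet<>());
        externalStore.put("/data", new TreeSet<>());
        // /data/log1.txt uploaded by some other WASB client, out-of-band.
        externalStore.get("/data").add("log1.txt");
        // ls /data through HDFS now sees the out-of-band file.
        System.out.println(list("/data")); // prints [log1.txt]
    }
}
```

This still leaves the root-directory question open: a {{/jobs}} created out-of-band is only discovered if the NN also re-lists {{/}} against the external store.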
> Allow HDFS block replicas to be provided by an external storage system
> ----------------------------------------------------------------------
>
> Key: HDFS-9806
> URL: https://issues.apache.org/jira/browse/HDFS-9806
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Chris Douglas
> Attachments: HDFS-9806-design.001.pdf
>
>
> In addition to heterogeneous media, many applications work with heterogeneous
> storage systems. The guarantees and semantics provided by these systems are
> often similar, but not identical to those of
> [HDFS|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html].
> Any client accessing multiple storage systems is responsible for reasoning
> about each system independently, and must propagate and renew credentials for
> each store.
> Remote stores could be mounted under HDFS. Block locations could be mapped to
> immutable file regions, opaque IDs, or other tokens that represent a
> consistent view of the data. While correctness for arbitrary operations
> requires careful coordination between stores, in practice we can provide
> workable semantics with weaker guarantees.