[ https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15322378#comment-15322378 ]
Ewan Higgs commented on HDFS-9806:
----------------------------------
Hi all. I worked with [~PieterReuse] and [~thodemoor] to get the PoC working
against S3 through s3a. It required a few minor changes, which we can submit as
a PR to the Microsoft project on GitHub if you’re interested.
While using the proof of concept, we came up with more questions:
1. It makes sense to us for there to be a series of commands to attach, detach,
and rescan provided storage from the command line. This way, administrators can
attach provided storage to the system without having to restart any nodes. We
propose {{hdfs providedstorage \[-mount,-unmount,-rescan\]}}. We could also
name the mount and unmount options {{attach}} and {{detach}}.
{{hdfs providedstorage -c
org.apache.hadoop.hdfs.server.common.S3AFileRegionFormat -mount s3a://bucket
/s3a-bucket}}
This would end up using a lot of the concepts and code from fs2img, but instead
of dumping an fsimage to disk it would send a series of updates to the NN.
{{-rescan}} would be similar but needs only the mount point as an argument.
{{-unmount}} would also need only the mount point, but would additionally
require that the {{StoragePolicy}} be set to {{PROVIDED}} (i.e. no DISK or MEM
replication to unwind).
Alternatively, we could just assume that a NN managing provided storage will be
preoccupied with managing the provided store and would be federated with other
NNs to mount the provided storage. This uses existing functionality to keep a
separation of concerns (storage tiering vs. managing where remote storage
systems are attached to the system).
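To make the command shape proposed in item 1 concrete, here is a minimal sketch
of how its arguments could be parsed. None of this is existing Hadoop code: the
class {{ProvidedStorageArgs}}, its fields, and the choice to treat {{-c}} as
optional are assumptions for illustration. Only the option names ({{-c}},
{{-mount}}, {{-unmount}}, {{-rescan}}) and their arguments come from the
proposal above.

```java
/** Hypothetical parser for the proposed `hdfs providedstorage` arguments. */
final class ProvidedStorageArgs {
  enum Action { MOUNT, UNMOUNT, RESCAN }

  final Action action;
  final String formatClass; // value of -c; may be null (assumed optional here)
  final String remoteUri;   // only set for -mount
  final String mountPoint;  // set for every action

  private ProvidedStorageArgs(Action action, String formatClass,
                              String remoteUri, String mountPoint) {
    this.action = action;
    this.formatClass = formatClass;
    this.remoteUri = remoteUri;
    this.mountPoint = mountPoint;
  }

  static ProvidedStorageArgs parse(String... argv) {
    Action action = null;
    String formatClass = null;
    String remoteUri = null;
    String mountPoint = null;
    for (int i = 0; i < argv.length; i++) {
      String flag = argv[i];
      if (flag.equals("-c")) {
        formatClass = value(argv, ++i);
        continue;
      }
      if (action != null) {
        // Only one of -mount, -unmount, -rescan per invocation (assumption).
        throw new IllegalArgumentException("unexpected argument: " + flag);
      }
      switch (flag) {
        case "-mount":   // -mount <remote URI> <mount point>
          action = Action.MOUNT;
          remoteUri = value(argv, ++i);
          mountPoint = value(argv, ++i);
          break;
        case "-unmount": // -unmount <mount point>
          action = Action.UNMOUNT;
          mountPoint = value(argv, ++i);
          break;
        case "-rescan":  // -rescan <mount point>
          action = Action.RESCAN;
          mountPoint = value(argv, ++i);
          break;
        default:
          throw new IllegalArgumentException("unknown option: " + flag);
      }
    }
    if (action == null) {
      throw new IllegalArgumentException(
          "one of -mount, -unmount, -rescan is required");
    }
    return new ProvidedStorageArgs(action, formatClass, remoteUri, mountPoint);
  }

  /** Returns argv[i], or fails if the command line ended too early. */
  private static String value(String[] argv, int i) {
    if (i >= argv.length) {
      throw new IllegalArgumentException(
          "missing argument after " + argv[argv.length - 1]);
    }
    return argv[i];
  }
}
```

With this sketch, the example invocation above parses to a {{MOUNT}} action
with remote URI {{s3a://bucket}} and mount point {{/s3a-bucket}}, while
{{-rescan}} and {{-unmount}} carry only the mount point.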
2. Following on from @zhz’s questions, the {{PROVIDED}} blocks are not stored with the
{{INodeFile}} (instead, the {{BlockManager}} inserts the locations in
{{createBlockLocation}}), so it seems that anything trying to manage the
replicas would need to have special logic to permit n-1 replicas on the system
if one element of the storage policy is {{PROVIDED}}. This would require
special logic just for the {{PROVIDED}} storage type, which seems like a burden
to maintain. Is this a correct understanding of the proposal? Is there a way to
disentangle the block types so that there doesn’t need to be a special case
along the lines of ‘if this is a PROVIDED block, then do some special
replication logic which no one in HDFS has ever done’?
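To illustrate the kind of special case item 2 is worried about, here is a
minimal sketch of the n-1 rule as we understand it. It is not taken from the
PoC: the class {{ProvidedReplicaCount}}, the method name, and the local
{{StorageKind}} enum (a stand-in for the real storage type enum) are all
hypothetical.

```java
import java.util.Set;

/** Hypothetical illustration of the n-1 replica rule for PROVIDED storage. */
final class ProvidedReplicaCount {
  /** Stand-in for HDFS storage types; not the real Hadoop enum. */
  enum StorageKind { DISK, SSD, ARCHIVE, RAM_DISK, PROVIDED }

  /**
   * Number of replicas expected on DataNode-local media for a block with the
   * given replication factor and storage policy types.
   */
  static int expectedLocalReplicas(int replication,
                                   Set<StorageKind> policyTypes) {
    if (replication < 1) {
      throw new IllegalArgumentException("replication must be >= 1");
    }
    // The special case: one replica is accounted for by the external store,
    // so only n-1 replicas should be expected on the cluster itself.
    if (policyTypes.contains(StorageKind.PROVIDED)) {
      return replication - 1;
    }
    return replication;
  }
}
```

Every component that counts or schedules replicas would need an equivalent of
the {{PROVIDED}} branch, which is the maintenance burden described above.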
3. If we want to attach multiple provided storage locations within a single
NN, would this mean multiple {{ProvidedStorageMap}} objects in the
{{BlockManager}}, or do you think this should be a mapping in
{{ProvidedStorageMap}} to multiple {{ProvidedStorageMap.BlockProvided}} subtype
objects?
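As a rough picture of the second option in item 3, a single map could hold one
provider object per mount point. This is only a sketch under our own
assumptions: {{MultiMountProvidedStorage}} and its {{BlockProvider}} interface
are hypothetical names, and keying by mount point is a guess at how the mapping
might be organized rather than how {{ProvidedStorageMap}} works today.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical single map holding one provider per mounted remote store. */
final class MultiMountProvidedStorage {
  /** Stand-in for a per-store BlockProvided subtype; not a real Hadoop type. */
  interface BlockProvider {
    String remoteUri();
  }

  // Keyed by mount point, e.g. "/s3a-bucket" (assumed key choice).
  private final Map<String, BlockProvider> byMountPoint = new TreeMap<>();

  /** Registers a provider; fails if the mount point is already in use. */
  void attach(String mountPoint, BlockProvider provider) {
    if (byMountPoint.putIfAbsent(mountPoint, provider) != null) {
      throw new IllegalStateException("already mounted: " + mountPoint);
    }
  }

  /** Removes and returns the provider, or null if nothing was mounted. */
  BlockProvider detach(String mountPoint) {
    return byMountPoint.remove(mountPoint);
  }

  /** Returns the provider for a mount point, or null if none is attached. */
  BlockProvider get(String mountPoint) {
    return byMountPoint.get(mountPoint);
  }

  int mountCount() {
    return byMountPoint.size();
  }
}
```

The first option would instead keep several such single-store objects side by
side in the {{BlockManager}}, with the lookup across them living there.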
> Allow HDFS block replicas to be provided by an external storage system
> ----------------------------------------------------------------------
>
> Key: HDFS-9806
> URL: https://issues.apache.org/jira/browse/HDFS-9806
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Chris Douglas
> Attachments: HDFS-9806-design.001.pdf
>
>
> In addition to heterogeneous media, many applications work with heterogeneous
> storage systems. The guarantees and semantics provided by these systems are
> often similar, but not identical to those of
> [HDFS|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html].
> Any client accessing multiple storage systems is responsible for reasoning
> about each system independently, and must propagate and renew credentials for
> each store.
> Remote stores could be mounted under HDFS. Block locations could be mapped to
> immutable file regions, opaque IDs, or other tokens that represent a
> consistent view of the data. While correctness for arbitrary operations
> requires careful coordination between stores, in practice we can provide
> workable semantics with weaker guarantees.