[
https://issues.apache.org/jira/browse/HDFS-10706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Douglas updated HDFS-10706:
---------------------------------
Attachment: HDFS-10706.001.patch
Attached the tool from the proof of concept. If committed to the HDFS-9806
branch, it would serve as a base for refinement.
The current tool generates a specific version of an FSImage using its own
writer. Since HDFS-9835, perhaps it could be refactored to use that code
instead. If mounts can be created dynamically in the NN, then this tool could
become a thin wrapper around that logic, if it is still useful to maintain it.
The tool walks an existing namespace using the {{TreeWalk}} classes. A walk is
a traversal yielding a hierarchical sequence of paths, such that every node
visited has an ancestor already present in the image (to guard against
orphaned inodes). For example, {{FSTreeWalk}} walks a {{FileSystem}} from a
given root, but any mapping into HDFS would be satisfactory. Traversal can be
made concurrent by calling {{fork()}} on the iterator.
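As a rough illustration of the ordering property described above, here is a
minimal sketch of a parent-first (pre-order) walk over an in-memory tree. It is
not the {{TreeWalk}} API from the patch: the class {{ParentFirstWalk}}, the
{{walkAll}} helper, and the map-of-children input are all hypothetical, and
{{fork()}} is not modelled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical sketch; NOT the TreeWalk classes from the patch. */
class ParentFirstWalk implements Iterator<String> {
  // Assumed input: each directory path maps to the paths of its children.
  private final Map<String, List<String>> children;
  // Paths whose parent has already been yielded but which are not yet visited.
  private final Deque<String> pending = new ArrayDeque<>();

  ParentFirstWalk(Map<String, List<String>> children, String root) {
    this.children = children;
    pending.push(root);
  }

  @Override
  public boolean hasNext() {
    return !pending.isEmpty();
  }

  @Override
  public String next() {
    if (pending.isEmpty()) {
      throw new NoSuchElementException();
    }
    String path = pending.pop();
    List<String> kids = children.getOrDefault(path, Collections.<String>emptyList());
    // A child is queued only once its parent has been yielded, so no path is
    // ever emitted before its ancestors. Push in reverse to keep listed order.
    for (int i = kids.size() - 1; i >= 0; i--) {
      pending.push(kids.get(i));
    }
    return path;
  }

  /** Convenience helper: collect the whole walk into a list. */
  static List<String> walkAll(Map<String, List<String>> children, String root) {
    List<String> out = new ArrayList<>();
    Iterator<String> it = new ParentFirstWalk(children, root);
    while (it.hasNext()) {
      out.add(it.next());
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, List<String>> tree = new HashMap<>();
    tree.put("/", Arrays.asList("/a", "/b"));
    tree.put("/a", Arrays.asList("/a/x"));
    for (String p : walkAll(tree, "/")) {
      System.out.println(p);
    }
  }
}
```

Because a path enters the pending stack only after its parent has been
returned, a writer consuming this sequence always sees the parent before the
child, which is the property the walk is meant to guarantee.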
Each node accepted by a filter over the iterator is partitioned into blocks,
assigned ownership and replication, and written to the FSImage. User-defined
functions (UDFs) can intercept this flow to modify the mappings. Each block in
the image is assigned a {{BlockAlias}} entry in a configurable block map. The
patch includes a CSV format for testing; implementations using Azure and AWS
services exist for the PoC.
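The partitioning step might look roughly like the sketch below, which splits a
file of known length into fixed-size blocks and renders one alias line per
block. Everything here is an assumption made for illustration: the
{{BlockAliasSketch}} class, the {{Alias}} fields, and the CSV column order
(block id, remote path, offset, length) are hypothetical and do not reflect the
actual {{BlockAlias}} interface or the CSV layout in the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; NOT the BlockAlias interface from the patch. */
class BlockAliasSketch {

  /** Assumed alias shape: a block id mapped to a byte range of a remote file. */
  static final class Alias {
    final long blockId;
    final String path;
    final long offset;
    final long length;

    Alias(long blockId, String path, long offset, long length) {
      this.blockId = blockId;
      this.path = path;
      this.offset = offset;
      this.length = length;
    }

    /** Assumed column order; paths containing commas are not escaped here. */
    String toCsv() {
      return blockId + "," + path + "," + offset + "," + length;
    }
  }

  /** Split [0, fileLength) into consecutive ranges of at most blockSize bytes. */
  static List<Alias> partition(String path, long fileLength, long blockSize,
                               long firstBlockId) {
    if (blockSize <= 0) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    List<Alias> out = new ArrayList<>();
    long id = firstBlockId;
    for (long off = 0; off < fileLength; off += blockSize) {
      // The final block may be shorter than blockSize.
      long len = Math.min(blockSize, fileLength - off);
      out.add(new Alias(id++, path, off, len));
    }
    return out;
  }

  public static void main(String[] args) {
    // Placeholder path and sizes, chosen only to show a short final block.
    for (Alias a : partition("remote://store/data.bin", 300L, 128L, 1000L)) {
      System.out.println(a.toCsv());
    }
  }
}
```

In this sketch an empty file produces no alias entries; whether the real tool
handles zero-length files that way is not stated in the description above.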
The patch includes some core interfaces (particularly {{BlockAlias}} and some
{{\*Resolver}} classes) so that it compiles, but these will be designed in
other JIRAs.
> Add tool generating FSImage from external store
> -----------------------------------------------
>
> Key: HDFS-10706
> URL: https://issues.apache.org/jira/browse/HDFS-10706
> Project: Hadoop HDFS
> Issue Type: Task
> Components: namenode, tools
> Reporter: Chris Douglas
> Attachments: HDFS-10706.001.patch
>
>
> To experiment with provided storage, this provides a tool to map an external
> namespace to an FSImage/NN storage. By loading it in a NN, one can access the
> remote FS using HDFS.