[
https://issues.apache.org/jira/browse/HDFS-11686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Elek, Marton updated HDFS-11686:
--------------------------------
Description:
Once a container is closed we need to copy the container to the correct pool or
re-encode the container to use erasure coding. The copyContainer allows users
to get the container as a tarball from the remote machine.
The copyContainer is a basic step for moving raw container data from one
datanode to another. It could be used by higher-level components such as
the SCM, which ensures that the replication rules are satisfied.
By default, CopyContainer works in a pull model: the destination datanode
reads the raw data from one or more source datanodes where the container
exists.
The source provides a binary representation of the container over a common
interface which has two methods:
# prepare(containerName)
# copyData(String containerName, OutputStream destination)
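The two methods above could be sketched in Java roughly as follows. This is
only an illustration: the names ContainerCopySource, InMemoryCopySource and
addClosedContainer are hypothetical and not taken from the Ozone code base,
and the in-memory map merely stands in for real container storage on a
datanode.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface name; only the two method shapes come from the description.
interface ContainerCopySource {
  // Called after the container is closed so the source can get ready for copies.
  void prepare(String containerName) throws IOException;

  // Writes the binary representation of the container to the destination stream.
  void copyData(String containerName, OutputStream destination) throws IOException;
}

// Assumption: an in-memory map stands in for closed containers on disk.
class InMemoryCopySource implements ContainerCopySource {
  private final Map<String, byte[]> closedContainers = new ConcurrentHashMap<>();

  void addClosedContainer(String name, byte[] data) {
    closedContainers.put(name, data.clone());
  }

  @Override
  public void prepare(String containerName) {
    // Simple on-demand variant: nothing is pre-created here.
  }

  @Override
  public void copyData(String containerName, OutputStream destination) throws IOException {
    byte[] data = closedContainers.get(containerName);
    if (data == null) {
      throw new IOException("Unknown container: " + containerName);
    }
    destination.write(data);
    destination.flush();
  }
}

class CopyContainerSketch {
  public static void main(String[] args) throws IOException {
    InMemoryCopySource source = new InMemoryCopySource();
    source.addClosedContainer("container-1", "raw-bytes".getBytes(StandardCharsets.UTF_8));
    source.prepare("container-1");

    // The destination side pulls the data into a stream of its choice.
    ByteArrayOutputStream destination = new ByteArrayOutputStream();
    source.copyData("container-1", destination);
    System.out.println(destination.size() + " bytes copied");
  }
}
```

Passing an OutputStream keeps the source independent of the transport, so the
same implementation could write to a file, a socket, or an HTTP response body.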
The prepare phase is called right after the closing event, and the
implementation could prepare for the copy by pre-creating a compressed tar
file from the container data. As a first step we can provide a simple
implementation which creates the tar files on demand.
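The difference between pre-creating the archive in the prepare phase and
building it on demand might look like the sketch below. It rests on several
assumptions: the class and method names are hypothetical, container data is
kept in memory, and GZIP from the JDK is used as a stand-in for the compressed
tar file, because the JDK has no built-in tar writer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.zip.GZIPOutputStream;

// Hypothetical source that can either pre-create the archive or build it on demand.
class PreparingCopySource {
  private final Map<String, byte[]> rawContainers = new ConcurrentHashMap<>();
  private final Map<String, byte[]> preparedArchives = new ConcurrentHashMap<>();

  void addClosedContainer(String name, byte[] raw) {
    rawContainers.put(name, raw.clone());
  }

  boolean isPrepared(String name) {
    return preparedArchives.containsKey(name);
  }

  // Prepare phase: build the compressed archive once, right after closing.
  void prepare(String containerName) throws IOException {
    preparedArchives.put(containerName, compress(rawOf(containerName)));
  }

  // Copy phase: reuse the prepared archive, or fall back to on-demand creation.
  void copyData(String containerName, OutputStream destination) throws IOException {
    byte[] archive = preparedArchives.get(containerName);
    if (archive == null) {
      archive = compress(rawOf(containerName));
    }
    destination.write(archive);
    destination.flush();
  }

  private byte[] rawOf(String name) throws IOException {
    byte[] raw = rawContainers.get(name);
    if (raw == null) {
      throw new IOException("Unknown container: " + name);
    }
    return raw;
  }

  // GZIP is a stand-in here; the description calls for a compressed tar file.
  private static byte[] compress(byte[] raw) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
      gzip.write(raw);
    }
    return buffer.toByteArray();
  }
}
```

With this shape, the simple first implementation is just the fallback branch
of copyData, and pre-creation can be added later without changing callers.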
The destination datanode should retry the copy if the container on the source
node is not yet prepared.
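A minimal sketch of that retry behaviour on the destination side is shown
below. The exception type, the CopyAttempt interface, and the
exponential backoff policy are all assumptions made for illustration; the
description does not specify how "not yet prepared" is signalled or how long
to wait between attempts.

```java
import java.io.IOException;

// Hypothetical signal that the source has not finished its prepare phase.
class ContainerNotPreparedException extends IOException {
  ContainerNotPreparedException(String message) {
    super(message);
  }
}

// One pull attempt against a source datanode (hypothetical abstraction).
interface CopyAttempt {
  void run() throws IOException;
}

class RetryingCopy {
  // Retries only while the source reports "not prepared"; other errors propagate.
  // Returns the number of attempts that were needed.
  static int copyWithRetry(CopyAttempt attempt, int maxAttempts, long initialBackoffMillis)
      throws IOException, InterruptedException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    long backoff = initialBackoffMillis;
    for (int i = 1; ; i++) {
      try {
        attempt.run();
        return i;
      } catch (ContainerNotPreparedException e) {
        if (i >= maxAttempts) {
          throw e; // give up after the last allowed attempt
        }
        Thread.sleep(backoff);
        backoff *= 2; // assumed policy: double the wait after each failure
      }
    }
  }
}
```

Since the pull model allows reading from more than one source datanode, a
caller could also switch to a different source between attempts instead of
only waiting.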
The raw container data is provided over HTTP. The HTTP endpoint should be
separated from the existing KeyValue
was: Once a container is closed we need to copy the container to the correct
pool or re-encode the container to use erasure coding. The copyContainer
allows users to get the container as a tarball from the remote machine.
> Ozone: Support CopyContainer
> ----------------------------
>
> Key: HDFS-11686
> URL: https://issues.apache.org/jira/browse/HDFS-11686
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Anu Engineer
> Assignee: Anu Engineer
> Priority: Major
> Labels: OzonePostMerge