[jira] [Commented] (HDFS-10419) Building HDFS on top of new storage layer (HDSL)

Sanjay Radia (JIRA) Fri, 16 Mar 2018 16:32:23 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403116#comment-16403116
 ]


Sanjay Radia commented on HDFS-10419:
-------------------------------------

In the "[VOTE] Merging branch HDFS-7240 to trunk" thread [~andrew.wang] asked:
{quote}*Sanjay says*:
 - NN on top HDSL where the NN uses the new block layer (Both Daryn and Owen 
 acknowledge the benefit of the new block layer). We have two choices here

 ** a) Evolve NN so that it can interact with both old and new block layer,

 ** b) Fork and create new NN that works only with new block layer, the old 
 NN will continue to work with old block layer.

There are trade-offs but clearly the 2nd option has least impact on the old 
HDFS code.

*Andrew asks*: Are you proposing that we pursue the 2nd option to integrate 
HDSL with HDFS?
{quote}
Originally I would have preferred (a), but Owen made a strong case for (b) in 
my discussions with him last week. I believe the choice between (a) and (b) 
will depend strongly on what we want to do. For example, if we do milestone-1, 
get the 2x scalability, and decide to stop there, then clearly we should go 
with option (a): it will require little refactoring, and one can run old and 
new HDFS side-by-side. If you are planning to follow up milestone-1 with, say, 
caching the working set of the namespace, then forking the NN code (i.e., 
option b) might be better, and the new NN will have to keep pulling over 
features and bug fixes from the old NN. Konstantine has proposed other 
alternatives, and we would evaluate (a) or (b) for his alternative. I am not 
locked into any particular path or how we would do it.

 

> Building HDFS on top of new storage layer (HDSL)
> ------------------------------------------------
>
>                 Key: HDFS-10419
>                 URL: https://issues.apache.org/jira/browse/HDFS-10419
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>            Priority: Major
>         Attachments: Evolving NN using new block-container layer.pdf
>
>
> In HDFS-7240, Ozone defines storage containers to store both the data and the 
> metadata. The storage container layer provides an object storage interface 
> and aims to manage data/metadata in a distributed manner. More details about 
> storage containers can be found in the design doc in HDFS-7240.
> HDFS can adopt the storage containers to store and manage blocks. The general 
> idea is:
> # Each block can be treated as an object and the block ID is the object's key.
> # Blocks will still be stored in DataNodes but as objects in storage 
> containers.
> # The block management work can be separated out of the NameNode and will be 
> handled by the storage container layer in a more distributed way. The 
> NameNode will only manage the namespace (i.e., files and directories).
> # For each file, the NameNode only needs to record a list of block IDs which 
> are used as keys to obtain real data from storage containers.
> # A new DFSClient implementation talks to both NameNode and the storage 
> container layer to read/write.
> HDFS, especially the NameNode, can get much better scalability from this 
> design. Currently the NameNode's heaviest workload comes from the block 
> management, which includes maintaining the block-DataNode mapping, receiving 
> full/incremental block reports, tracking block states (under-, over-, or 
> mis-replicated), and joining every write pipeline protocol to guarantee 
> data consistency. This work brings a high memory footprint and makes the 
> NameNode suffer from GC. HDFS-5477 already proposes to convert BlockManager 
> into a service. If we can build HDFS on top of the storage container layer, we not 
> only separate out the BlockManager from the NameNode, but also replace it 
> with a new distributed management scheme.
> The storage container work is currently in progress in HDFS-7240, and the 
> work proposed here is still in an experimental/exploratory stage. We can do 
> this experiment in a feature branch so that interested people can be 
> involved.
> A design doc will be uploaded later explaining more details.
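
The numbered design points in the description above can be illustrated with a minimal in-memory sketch. It is not code from HDFS, Ozone, or the attached design doc: the names BlockObjectSketch, ContainerLayer, Namespace, and Client are hypothetical, and replication, DataNode placement, and the real container protocol are all left out. It only shows the division of labor the description proposes: the namespace side records a list of block IDs per file, the container layer stores block bytes keyed by block ID, and the client talks to both.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BlockObjectSketch {

    /** Hypothetical stand-in for the storage container layer: block ID -> block bytes. */
    static class ContainerLayer {
        private final Map<Long, byte[]> objects = new HashMap<>();

        void putBlock(long blockId, byte[] data) {
            objects.put(blockId, data.clone());
        }

        byte[] getBlock(long blockId) {
            byte[] data = objects.get(blockId);
            if (data == null) {
                throw new IllegalArgumentException("unknown block " + blockId);
            }
            return data.clone();
        }
    }

    /** Hypothetical namespace-only NameNode: file path -> ordered list of block IDs. */
    static class Namespace {
        private final Map<String, List<Long>> files = new HashMap<>();
        private long nextBlockId = 1;

        void createFile(String path) {
            files.putIfAbsent(path, new ArrayList<>());
        }

        long allocateBlock(String path) {
            long id = nextBlockId++;
            files.computeIfAbsent(path, p -> new ArrayList<>()).add(id);
            return id;
        }

        List<Long> blockIds(String path) {
            List<Long> ids = files.get(path);
            if (ids == null) {
                throw new IllegalArgumentException("no such file " + path);
            }
            return new ArrayList<>(ids);
        }
    }

    /** Hypothetical client that uses the namespace for metadata and the container layer for data. */
    static class Client {
        private final Namespace nn;
        private final ContainerLayer containers;
        private final int blockSize;

        Client(Namespace nn, ContainerLayer containers, int blockSize) {
            this.nn = nn;
            this.containers = containers;
            this.blockSize = blockSize;
        }

        void write(String path, byte[] data) {
            nn.createFile(path);
            // Split the data into fixed-size blocks; each block becomes one object.
            for (int off = 0; off < data.length; off += blockSize) {
                int end = Math.min(off + blockSize, data.length);
                long id = nn.allocateBlock(path);
                containers.putBlock(id, Arrays.copyOfRange(data, off, end));
            }
        }

        byte[] read(String path) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            // The namespace only supplies block IDs; the bytes come from the container layer.
            for (long id : nn.blockIds(path)) {
                byte[] block = containers.getBlock(id);
                out.write(block, 0, block.length);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) {
        Namespace nn = new Namespace();
        ContainerLayer containers = new ContainerLayer();
        Client client = new Client(nn, containers, 4);

        client.write("/demo.txt", "hello world".getBytes(StandardCharsets.UTF_8));
        System.out.println("block IDs: " + nn.blockIds("/demo.txt"));
        System.out.println(new String(client.read("/demo.txt"), StandardCharsets.UTF_8));
    }
}
```

In this sketch the Namespace class never sees block contents or locations, which is the property the description relies on for NameNode scalability; everything about where and how blocks are stored would live behind the container layer.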



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
