[
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225532#comment-16225532
]
Jitendra Nath Pandey commented on HDFS-7240:
--------------------------------------------
[~shv] Thank you for taking the time to review Ozone. I appreciate your
comments and questions.
{quote}
There are two main limitations in HDFS
a) The throughput of Namespace operations. Which is limited by the number of
RPCs the NameNode can handle
b) The number of objects (files + blocks) the system can maintain. Which is
limited by the memory size of the NameNode.
{quote}
I agree completely. We believe Ozone attempts to address both of these issues
for HDFS.
Let us look at the number-of-objects problem. Ozone directly addresses the
scalability of the number of blocks by introducing storage containers, each of
which can hold multiple blocks. Earlier efforts on this were complicated by the
fact that the block manager and the namespace are intertwined in the HDFS
NameNode. There have been efforts in the past to separate the block manager
from the namespace, e.g. HDFS-5477. Ozone addresses this problem by cleanly
separating the block layer. Separating the block layer also addresses
file/directory scalability because it moves the block map out of the NameNode.
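To make the object-count argument concrete, here is a minimal Java sketch of
the idea. It is not Ozone's actual code or API: the class name ContainerSketch,
the fixed blocksPerContainer value, the rule that consecutive block IDs share a
container, and the datanode names are all assumptions made for illustration.
The only point is that the central map is keyed by container rather than by
block, so its size grows with the number of containers.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; these are not Ozone's real classes. */
public class ContainerSketch {
    // Central map keyed by container ID; value = datanodes holding replicas.
    private final Map<Long, List<String>> containerLocations = new HashMap<>();
    private final long blocksPerContainer;

    ContainerSketch(long blocksPerContainer) {
        if (blocksPerContainer <= 0) {
            throw new IllegalArgumentException("blocksPerContainer must be positive");
        }
        this.blocksPerContainer = blocksPerContainer;
    }

    // Assumed placement rule: consecutive block IDs land in the same container.
    long containerIdFor(long blockId) {
        return Math.floorDiv(blockId, blocksPerContainer);
    }

    // Registering a block only creates a map entry if its container is new.
    void addBlock(long blockId, List<String> datanodes) {
        containerLocations.putIfAbsent(containerIdFor(blockId), datanodes);
    }

    // A block is located through its container, not through a per-block entry.
    List<String> locate(long blockId) {
        return containerLocations.get(containerIdFor(blockId));
    }

    // Number of entries the central map has to keep in memory.
    int trackedEntries() {
        return containerLocations.size();
    }

    public static void main(String[] args) {
        ContainerSketch sketch = new ContainerSketch(1000);
        for (long blockId = 0; blockId < 10_000; blockId++) {
            sketch.addBlock(blockId, Arrays.asList("dn1", "dn2", "dn3"));
        }
        System.out.println("blocks added:    10000");
        System.out.println("entries tracked: " + sketch.trackedEntries());
    }
}
```

With the assumed 1,000 blocks per container, the 10,000 blocks added in main
are tracked with 10 map entries instead of 10,000.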
A separate block layer relieves the NameNode of handling block reports, IBRs,
heartbeats, the replication monitor, etc., and thus reduces contention on the
FSNamesystem lock and significantly reduces GC pressure on the NameNode.
These improvements will greatly help the RPC performance of the NameNode.
bq. Ozone is probably just the first step in rebuilding HDFS under a new
architecture. With the next steps presumably being HDFS-10419 and HDFS-11118.
The design doc for the new architecture has never been published.
We do believe that the NameNode can leverage Ozone's storage container
layer; however, that is also a big effort. We would like to first have the
block layer stabilized in Ozone before taking that up. That said, we would
certainly support any community effort on that, and in fact it was brought up
in the last BoF session at the summit.
Big data is evolving rapidly. We see our customers needing scalable file
systems, object stores (like S3), and block stores (for Docker and VMs). Ozone
improves HDFS in two ways: it addresses the throughput and scale issues of
HDFS, and it enriches HDFS with newer capabilities.
bq. Ozone is a big enough system to deserve its own project.
I took a quick look at the core code in Ozone, and the cloc command reports
22,511 lines of functionality changes in Java.
This patch also brings in web framework code like Angular.js, which brings in
a bunch of CSS and JS files that contribute to the size of the patch; the
rest are test and documentation changes.
I hope this addresses your concerns.
> Object store in HDFS
> --------------------
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Jitendra Nath Pandey
> Assignee: Jitendra Nath Pandey
> Attachments: HDFS-7240.001.patch, HDFS-7240.002.patch,
> HDFS-7240.003.patch, HDFS-7240.003.patch, HDFS-7240.004.patch,
> Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS.
> As part of the federation work (HDFS-1052) we separated block storage as a
> generic storage layer. Using the Block Pool abstraction, new kinds of
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.