[
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615639#comment-14615639
]
Colin Patrick McCabe commented on HDFS-7240:
--------------------------------------------
bq. 3) We will partition objects using hash partitioning and range
partitioning. The document already talks about hash partitioning, I will add
more details for range partition support. The small objects will also be stored
in the container. In the document I mentioned leveldbjni, but we are also
looking at rocksDB for container implementation to store objects in the
container.
Interesting, thanks for posting more details about this.
Maybe this has been discussed on another JIRA (I apologize if so), but does
this mean that the admin will have to choose between hash and range
partitioning for a particular bucket? This seems suboptimal to me... we will
have to support both approaches, which is more complex, and admins will be left
with a difficult choice.
It seems better to just make everything range-partitioned. Although this is
more complex than simple hash partitioning, it provides "performance
compatibility" with s3 and other object stores. s3 provides a fast
(sub-linear) way of getting all the keys between some A and B. It will be
very difficult to position ozone as s3-compatible if operations that are quick
in s3, such as listing all the keys between A and B, degenerate into full-bucket
scans in ozone (and recursive listings built on top of them into O(num_keys^2)).
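To make the cost difference concrete, here is a minimal, purely illustrative
sketch (the class and method names are hypothetical, not anything in ozone or
s3): a sorted index answers "all keys between A and B" in roughly
O(log N + K) for K results, while a hash index can only answer the same query
by scanning every key.
{code:java}
import java.util.*;

// Illustrative only: compares a range-partition-friendly sorted index with a
// hash index for the query "all keys between A and B".
class KeyRangeSketch {
  // Sorted index: a range query is a tree descent plus the matching keys,
  // which is the sub-linear behavior range partitioning preserves.
  static List<String> listRangeSorted(NavigableMap<String, byte[]> index,
                                      String from, String to) {
    return new ArrayList<>(index.subMap(from, true, to, false).keySet());
  }

  // Hash index: the only way to answer the same query is a full scan of all
  // N keys, which is what makes listing slow under hash partitioning.
  static List<String> listRangeHashed(Map<String, byte[]> index,
                                      String from, String to) {
    List<String> result = new ArrayList<>();
    for (String key : index.keySet()) {
      if (key.compareTo(from) >= 0 && key.compareTo(to) < 0) {
        result.add(key);
      }
    }
    Collections.sort(result);
    return result;
  }
}
{code}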
For example, in (all of) Hadoop's s3 filesystem implementations, listStatus
uses this quick listing of keys between A and B. When someone does "listStatus
/a/b/c", we can ask s3 for all the keys between /a/b/c/ and /a/b/c0 (0 is the
ASCII value right after slash). Of course, s3 does not really have
directories, but we can treat the keys in this range as being in the directory
/a/b/c for the purposes of s3a or s3n. If we just had hash partitioning, each
such listing would require a scan of every key in the bucket, and a recursive
listing would be O(N^2) where N is the number of keys. That would be
infeasible for any large bucket.
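A tiny sketch of that listStatus trick (hypothetical code, not the actual
s3a/s3n implementation, and it skips collapsing sub-prefixes into child
directory entries): the prefix /a/b/c/ becomes a single range query, so the
cost is proportional to the number of matching keys rather than the size of
the bucket.
{code:java}
import java.util.*;

// Sketch of the listStatus trick described above -- not the real s3a/s3n
// code, just the idea: a directory listing becomes one key-range query.
class FakeDirListing {
  static List<String> listStatus(NavigableMap<String, byte[]> keys, String dir) {
    String from = dir.endsWith("/") ? dir : dir + "/";      // e.g. "/a/b/c/"
    String to = from.substring(0, from.length() - 1) + '0'; // '/' + 1 == '0'
    // One sub-linear range scan instead of scanning the whole bucket.
    return new ArrayList<>(keys.subMap(from, true, to, false).keySet());
  }

  public static void main(String[] args) {
    NavigableMap<String, byte[]> keys = new TreeMap<>();
    keys.put("/a/b/c/file1", new byte[0]);
    keys.put("/a/b/c/file2", new byte[0]);
    keys.put("/a/b/d/other", new byte[0]);
    System.out.println(listStatus(keys, "/a/b/c")); // [/a/b/c/file1, /a/b/c/file2]
  }
}
{code}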
> Object store in HDFS
> --------------------
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Jitendra Nath Pandey
> Assignee: Jitendra Nath Pandey
> Attachments: Ozone-architecture-v1.pdf
>
>
> This jira proposes to add object store capabilities into HDFS.
> As part of the federation work (HDFS-1052) we separated block storage as a
> generic storage layer. Using the Block Pool abstraction, new kinds of
> namespaces can be built on top of the storage layer, i.e., the datanodes.
> In this JIRA I will explore building an object store using the datanode
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)