[ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504691#comment-14504691
 ] 

Jitendra Nath Pandey commented on HDFS-7240:
--------------------------------------------

[~clamb], thanks for the detailed review and feedback.
Some of the answers are below; for the others I will post an updated document 
with the details you have pointed out.

bq. Is the 1KB key size limit a hard limit or just a design/implementation 
target
 It is a design target. Amazon's S3 limits keys to 1KB, and I doubt many use 
cases need more than that. I see the point about allowing graceful degradation 
instead of a hard limit, but at this point in the project I would prefer 
stricter limits that can be relaxed later, rather than setting user 
expectations too high to begin with.
bq. Caching to reduce network traffic
I agree that a good caching layer will significantly help performance. The 
Ozone handler seems like a natural place for caching; however, a thick client 
can do its own caching without overloading the datanodes. The focus of phase 1 
is to get the semantics right and put the basic architecture in place. We plan 
to tackle performance improvements in a later phase of the project.
bq. Security mechanisms
Frankly, I haven't thought about anything other than Kerberos. I agree that we 
should evaluate it against what other popular object stores use.
bq. Hot spots in hash partitioning.
It is possible with a pathological sequence of keys, but in practice hash 
partitioning has been used successfully to avoid hot spots, e.g. 
hash-partitioned indexes in databases. We would need to pick hash functions 
with good distribution properties.
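Just to illustrate the idea (the hash and the partition count below are 
placeholders, not part of the design):
{code:java}
import java.nio.charset.StandardCharsets;

/** Illustrative only: maps an object key to one of N partitions. */
public class KeyPartitioner {
  private final int numPartitions;

  public KeyPartitioner(int numPartitions) {
    this.numPartitions = numPartitions;
  }

  /** A hash with good distribution keeps partitions balanced even when
      keys share long common prefixes (e.g. "logs/2015/04/..."). */
  public int partitionFor(String key) {
    byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
    return (mixHash(bytes) & Integer.MAX_VALUE) % numPartitions;
  }

  // Simple stand-in mixer; a real choice would be murmur3 or similar.
  private static int mixHash(byte[] data) {
    int h = 0x9747b28c;
    for (byte b : data) {
      h ^= b & 0xff;
      h *= 0x5bd1e995;
      h ^= h >>> 15;
    }
    return h;
  }
}
{code}
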
bq. Secondary indexing consistency
The secondary index need not be strictly consistent with the bucket. That means 
a listing operation with a prefix or key range may not reflect the latest state 
of the bucket. We will have a more concrete proposal in the second phase of the 
project.
bq. Storage volume GET for admin
  I believe allowing users to see all storage volume names is not a security 
concern. However, it is possible to conceive of a use case where an admin would 
want to restrict that. We can probably support both modes.

bq.  "no guarantees on partially written objects"
    The object will not be visible until it is completely written. Also, no 
recovery is planned in the first phase if a write fails. In the future, we 
would like to support multi-part uploads.

bq. Re-using block management implementation for container management.
    We intend to reuse the DatanodeProtocol that the datanode uses to talk to 
the namenode. I will add more details to the document and to the corresponding 
jira.

bq. storage container prototype using leveldbjni
  We will add a lot more details on this in its own jira. The idea is to use 
leveldbjni in the storage container on the datanodes. We plan to prototype a 
storage container that stores objects as individual files within the container; 
however, that needs an index within the container to map a key to a file. We 
will use leveldbjni for that index.
  Another possible prototype is to put the entire object in leveldbjni itself. 
It will take some experimentation to zero in on the right approach. We will 
also try to make the storage container implementation pluggable, to make it 
easy to try different implementations.
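As a rough sketch of what that index could look like (the class name and the 
value layout are just illustrative; the real container format is still open):
{code:java}
import static org.fusesource.leveldbjni.JniDBFactory.asString;
import static org.fusesource.leveldbjni.JniDBFactory.bytes;
import static org.fusesource.leveldbjni.JniDBFactory.factory;

import java.io.File;
import java.io.IOException;

import org.iq80.leveldb.DB;
import org.iq80.leveldb.Options;

/** Illustrative only: a container-local key -> data-file index on leveldbjni. */
public class ContainerIndex implements AutoCloseable {
  private final DB db;

  public ContainerIndex(File indexDir) throws IOException {
    Options options = new Options();
    options.createIfMissing(true);
    db = factory.open(indexDir, options);
  }

  /** Record the file (relative to the container root) that holds the object. */
  public void put(String key, String dataFile) {
    db.put(bytes(key), bytes(dataFile));
  }

  /** Return the data file for a key, or null if the key is not present. */
  public String lookup(String key) {
    byte[] value = db.get(bytes(key));
    return value == null ? null : asString(value);
  }

  public void delete(String key) {
    db.delete(bytes(key));
  }

  @Override
  public void close() throws IOException {
    db.close();
  }
}
{code}
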
bq. How are quotas enabled and set? ....who enforces them
  All the Ozone APIs are implemented in the Ozone handler, and quotas will also 
be enforced there. I will update the document with the APIs.
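A minimal sketch of the kind of check the handler could perform on a put; the 
names and quota semantics here are hypothetical since the APIs are not 
finalized:
{code:java}
/** Illustrative only: the kind of check the handler could make before a put. */
public class VolumeQuotaCheck {

  public static class QuotaExceededException extends Exception {
    public QuotaExceededException(String message) {
      super(message);
    }
  }

  /**
   * @param quotaBytes   quota configured on the storage volume (<= 0 means unlimited)
   * @param usedBytes    bytes already accounted against the volume
   * @param requestBytes size of the incoming object
   */
  public static void checkPut(long quotaBytes, long usedBytes, long requestBytes)
      throws QuotaExceededException {
    if (quotaBytes > 0 && usedBytes + requestBytes > quotaBytes) {
      throw new QuotaExceededException("Put of " + requestBytes
          + " bytes exceeds the volume quota of " + quotaBytes
          + " bytes (" + usedBytes + " bytes already used)");
    }
  }
}
{code}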

> Object store in HDFS
> --------------------
>
>                 Key: HDFS-7240
>                 URL: https://issues.apache.org/jira/browse/HDFS-7240
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
>         Attachments: Ozone-architecture-v1.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
