[
https://issues.apache.org/jira/browse/HDFS-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099438#comment-16099438
]
Weiwei Yang commented on HDFS-11920:
------------------------------------
Hi [~vagarychen]
Thanks for the patch, it looks good to me overall. I have a few comments; please
let me know if they make sense to you.
1. *DistributedStorageHandler*
line 410: I am wondering why it builds the containerKey as
"/volume/bucket/blockID"; why not simply use {{BlockID}} here? This seems to be
the key that is written to container.db in container metadata.
2. *ChunkOutputStream*
I am wondering whether it really needs to know about an ozone object key, see
line 56. Right now it writes a chunk file named
{{ozoneKeyName_stream_streamId_chunk_n}}; why not
{{blockId_stream_streamId_chunk_n}} instead? I think we can remove this
variable from this class.
line 168: it writes the full length of {{b}} to the output stream, but the
position only advances by 1, which seems incorrect.
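To illustrate the invariant in question, here is a minimal sketch (the class
name and in-memory buffer are hypothetical stand-ins, not the actual
ChunkOutputStream): a bulk write of n bytes should advance the stream position
by n, while the single-byte write advances it by 1.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stream used only to show correct position accounting.
public class PositionTrackingStream extends OutputStream {
  private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
  private long position = 0;

  @Override
  public void write(int b) throws IOException {
    buffer.write(b);
    position += 1; // single byte written: advance by one
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    buffer.write(b, off, len);
    position += len; // bulk write: advance by len, not by 1
  }

  public long getPosition() {
    return position;
  }
}
```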
3. *TestMultipleContainerReadWrite*
In {{TestWriteRead}}, can we check that the number of chunk files for the key
actually matches the desired number of splits?
4. It looks like the chunk group input/output stream maintains a list of
streams and reads/writes in a linear manner. Can we optimize this to do
parallel reads/writes, since the chunks are independent? That is, have a thread
fetch a certain length of content from each chunk, then merge the results
together afterwards. It doesn't have to be done in this patch, but I think that
might be a good improvement.
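The parallel-read idea could be sketched roughly as below. This is an
illustrative sketch only: the {{readChunk}} helper is a hypothetical stand-in
for the real per-chunk input streams. Independent chunks are fetched
concurrently via an ExecutorService, then merged in submission order so the
byte order matches a sequential read.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelChunkRead {
  // Hypothetical stand-in for reading one chunk; real code would read
  // from the corresponding chunk input stream instead.
  static byte[] readChunk(int chunkIndex) {
    byte[] data = new byte[4];
    Arrays.fill(data, (byte) chunkIndex);
    return data;
  }

  public static byte[] readAll(int numChunks) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      // Submit one fetch task per chunk; they run concurrently.
      List<Future<byte[]>> futures = new ArrayList<>();
      for (int i = 0; i < numChunks; i++) {
        final int idx = i;
        futures.add(pool.submit(() -> readChunk(idx)));
      }
      // Merge in submission order to preserve the original byte order.
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      for (Future<byte[]> f : futures) {
        out.write(f.get());
      }
      return out.toByteArray();
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    byte[] merged = readAll(3);
    System.out.println(merged.length); // prints 12
  }
}
```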
Thanks
> Ozone : add key partition
> -------------------------
>
> Key: HDFS-11920
> URL: https://issues.apache.org/jira/browse/HDFS-11920
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Chen Liang
> Assignee: Chen Liang
> Attachments: HDFS-11920-HDFS-7240.001.patch,
> HDFS-11920-HDFS-7240.002.patch, HDFS-11920-HDFS-7240.003.patch,
> HDFS-11920-HDFS-7240.004.patch
>
>
> Currently, each key corresponds to one single SCM block, and putKey/getKey
> writes/reads this single SCM block. This works fine for keys with reasonably
> small data sizes. However, if the data is too large (e.g., it does not even
> fit into a single container), then we need to be able to partition the key
> data into multiple blocks, each in one container. This JIRA changes the
> key-related classes to support this.