[
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174502#comment-16174502
]
Weiwei Yang commented on HDFS-12506:
------------------------------------
Hi [~linyiqun]
Thanks for the comments
bq. Why here is store.peekAround(0, dbVolumeRootKey);? Actually we should find
the right key of dbVolumeRootKey(/#vol/#) and use store.peekAround(1,
dbVolumeRootKey), right?
I think peeking at either offset 0 or 1 can work, but 0 is slightly easier, so
I used that approach. Let me review this part of the code again to see if I can
make it simpler. In the past we had more checks because the key order was
different from what it is now.
bq. In addition, the following failed UT seem related.
This Jenkins job was testing an incorrect patch. The first patch I uploaded
missed the changes to {{OzoneConsts}}, which is why the tests were failing. I
deleted that patch and re-uploaded it. Let's wait for a new Jenkins job result
on the real v1 patch.
Thanks
> Ozone: ListBucket is too slow
> -----------------------------
>
> Key: HDFS-12506
> URL: https://issues.apache.org/jira/browse/HDFS-12506
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Blocker
> Labels: ozoneMerge, performance
> Attachments: HDFS-12506-HDFS-7240.001.patch
>
>
> Generated 3 million keys in Ozone, then ran the {{listBucket}} command to get
> a list of buckets under a volume:
> {code}
> bin/hdfs oz -listBucket http://15oz1.fyre.ibm.com:9864/vol-0-15143 -user wwei
> {code}
> This call took over *15 seconds* to finish. The problem was caused by the
> inflexible structure of the KSM DB. Right now {{ksm.db}} stores keys like the
> following:
> {code}
> /v1/b1
> /v1/b1/k1
> /v1/b1/k2
> /v1/b1/k3
> /v1/b2
> /v1/b2/k1
> /v1/b2/k2
> /v1/b2/k3
> /v1/b3
> /v1/b4
> {code}
> Keys are sorted in natural order, so when we list buckets under a volume, e.g.
> /v1, we need to seek to the /v1 point and then iterate and filter keys, which
> ends up scanning all keys under volume /v1. The problem with this design
> is that we have no efficient way to locate all buckets without scanning
> the keys.
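
To make the cost described above concrete, here is a minimal sketch that models the sorted key space with a plain {{java.util.TreeMap}} instead of the real KSM DB. The class {{BucketScanSketch}}, its {{Result}} holder, and the methods {{sampleDb}} and {{listBuckets}} are hypothetical names used only for illustration; they are not part of Ozone or of the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative only: a TreeMap stands in for the sorted ksm.db key space.
class BucketScanSketch {

    // Holds the buckets found and how many entries were visited to find them.
    static class Result {
        final List<String> buckets = new ArrayList<>();
        int visited = 0;
    }

    // The ten sample keys from the issue description.
    static TreeMap<String, String> sampleDb() {
        TreeMap<String, String> db = new TreeMap<>();
        String[] keys = {
            "/v1/b1", "/v1/b1/k1", "/v1/b1/k2", "/v1/b1/k3",
            "/v1/b2", "/v1/b2/k1", "/v1/b2/k2", "/v1/b2/k3",
            "/v1/b3", "/v1/b4"
        };
        for (String k : keys) {
            db.put(k, "");
        }
        return db;
    }

    // Seek to the volume prefix, then iterate and filter: bucket entries are
    // the ones with no further '/' after the volume prefix. Every key entry
    // under the volume is visited along the way.
    static Result listBuckets(TreeMap<String, String> db, String volume) {
        String prefix = volume + "/";
        Result result = new Result();
        for (String key : db.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break; // left the volume's key range
            }
            result.visited++;
            if (key.indexOf('/', prefix.length()) < 0) {
                result.buckets.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = listBuckets(sampleDb(), "/v1");
        System.out.println(r.buckets + " visited=" + r.visited);
    }
}
```

With the sample keys, the scan returns the four buckets only after visiting all ten entries under /v1, so the work grows with the number of keys rather than the number of buckets.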