[
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173408#comment-16173408
]
Xiaoyu Yao commented on HDFS-12506:
-----------------------------------
Thanks [~cheersyang] for reporting this. An easy fix might be to assign a
different prefix to the keys of the volume and bucket objects themselves.
For example, the volume in your example would be keyed as
/#v1
The bucket in your example would be keyed as
/#v1/#b1
A regular key would be keyed as it is today, without the special prefix:
/v1/b1/k1
This way, listing only volumes or buckets is not affected by how many
objects they contain. With some minor changes in the KSM MetadataManager,
we should be able to handle this with better performance. Let me know your
thoughts.
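To make the idea concrete, here is a minimal sketch in Python. It is only an
illustration, not KSM MetadataManager code: a sorted list stands in for the
ordered {{ksm.db}} store, and all function names are hypothetical. It compares
how many entries a bucket listing reads under the current layout and under the
prefixed layout.

```python
import bisect


def build_flat(volume, buckets, keys_per_bucket):
    """Current layout: bucket entries and key entries share one keyspace."""
    entries = []
    for b in buckets:
        entries.append(f"/{volume}/{b}")
        entries.extend(f"/{volume}/{b}/k{i}" for i in range(1, keys_per_bucket + 1))
    return sorted(entries)


def build_prefixed(volume, buckets, keys_per_bucket):
    """Proposed layout: volume and bucket entries carry a '#' prefix."""
    entries = [f"/#{volume}"]
    for b in buckets:
        entries.append(f"/#{volume}/#{b}")
        entries.extend(f"/{volume}/{b}/k{i}" for i in range(1, keys_per_bucket + 1))
    return sorted(entries)


def scan_prefix(entries, prefix):
    """Seek to `prefix` in the sorted list and collect every key starting with it."""
    i = bisect.bisect_left(entries, prefix)
    hits = []
    while i < len(entries) and entries[i].startswith(prefix):
        hits.append(entries[i])
        i += 1
    return hits


def list_buckets_flat(entries, volume):
    """Return (bucket names, entries read); must read every key under the volume."""
    prefix = f"/{volume}/"
    scanned = scan_prefix(entries, prefix)
    # Bucket entries are the ones with no further '/' after the volume prefix.
    buckets = [k[len(prefix):] for k in scanned if "/" not in k[len(prefix):]]
    return buckets, len(scanned)


def list_buckets_prefixed(entries, volume):
    """Return (bucket names, entries read); reads only the bucket entries."""
    prefix = f"/#{volume}/#"
    scanned = scan_prefix(entries, prefix)
    return [k[len(prefix):] for k in scanned], len(scanned)
```

With 4 buckets of 3 keys each, list_buckets_flat reads 16 entries to return 4
buckets, while list_buckets_prefixed reads only the 4 bucket entries. Because
'#' sorts before letters and digits in byte order, the prefixed volume and
bucket entries sit together ahead of the regular keys.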
> Ozone: ListBucket is too slow
> -----------------------------
>
> Key: HDFS-12506
> URL: https://issues.apache.org/jira/browse/HDFS-12506
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Weiwei Yang
> Priority: Blocker
> Labels: ozoneMerge
>
> Generated 3 million keys in Ozone and ran the {{listBucket}} command to get a
> list of buckets under a volume:
> {code}
> bin/hdfs oz -listBucket http://15oz1.fyre.ibm.com:9864/vol-0-15143 -user wwei
> {code}
> This call took over *15 seconds* to finish. The problem is caused by the
> inflexible structure of the KSM DB. Right now {{ksm.db}} stores keys like the
> following:
> {code}
> /v1/b1
> /v1/b1/k1
> /v1/b1/k2
> /v1/b1/k3
> /v1/b2
> /v1/b2/k1
> /v1/b2/k2
> /v1/b2/k3
> /v1/b3
> /v1/b4
> {code}
> Keys are sorted in natural order, so when we list the buckets under a volume,
> e.g. /v1, we need to seek to the /v1 point and then iterate and filter keys.
> This ends up scanning all keys under volume /v1. The problem with this design
> is that we have no efficient way to locate all buckets without scanning
> the keys.