rakeshadr commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r581604059
##########
File path:
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
##########
@@ -828,10 +846,273 @@ public OzoneKey next() {
* @param prevKey
* @return {@code List<OzoneKey>}
*/
- private List<OzoneKey> getNextListOfKeys(String prevKey) throws
+ List<OzoneKey> getNextListOfKeys(String prevKey) throws
IOException {
return proxy.listKeys(volumeName, name, keyPrefix, prevKey,
listCacheSize);
}
}
+
+
+ /**
+ * An Iterator to iterate over {@link OzoneKey} list.
+ *               buck-1
+ *                  |
+ *                  a
+ *                  |
+ *     ---------------------------
+ *     |            |            |
+ *     b1           b2           b3
+ *   -----      --------     ----------
+ *   |   |      |   |  |     |   |    |
+ *   c1  c2     d1  d2 d3    e1  e2   e3
+ *                  |        |
+ *               d21.txt  e11.txt
+ *
+ * Will do Depth-First-Traversal and visit node in this fashion:
+ *
+ * a -> b1 -> c1 -> c2 -> b2 -> d1 -> d2 -> d21.txt -> d3 -> b3 -> e1 ->
+ * e11.txt -> e2 -> e3
+ *
+ * Note: there is no order guarantee.
+ */
+ private class KeyIteratorV1 extends KeyIterator{
+
+ private Stack<String> stack;
+ private List<OzoneKey> pendingItemsToBeBatched;
+ private boolean addedKeyPrefix;
+
+ /**
+ * Creates an Iterator to iterate over all keys after prevKey in the bucket.
+ * If prevKey is null it iterates from the first key in the bucket.
+ * The returned keys match key prefix.
+ *
+ * @param keyPrefix
+ * @param prevKey
+ */
+ KeyIteratorV1(String keyPrefix, String prevKey) throws IOException {
+ super(keyPrefix, prevKey);
+ addedKeyPrefix = true;
+ }
+
+ @Override
+ List<OzoneKey> getNextListOfKeys(String prevKey) throws IOException {
+ if (stack == null) {
+ stack = new Stack();
+ pendingItemsToBeBatched = new ArrayList<>();
+ }
+
+ // normalize paths
+ if (!addedKeyPrefix) {
+ prevKey = OmUtils.normalizeKey(prevKey, true);
+ String keyPrefixName = "";
+ if (StringUtils.isNotBlank(getKeyPrefix())) {
+ keyPrefixName = OmUtils.normalizeKey(getKeyPrefix(), true);
+ }
+ setKeyprefix(keyPrefixName);
+ }
+
+ // Get immediate children
+ List<OzoneKey> keysResultList = new ArrayList<>();
+ listChildrenKeys(getKeyPrefix(), prevKey, keysResultList);
+
+ // TODO: Back and Forth seek all the files & dirs, starting from
+ // startKey till keyPrefix.
+
+ return keysResultList;
+ }
+
+ /**
+ * List children under the given keyPrefix path. This doesn't guarantee
+ * order. Internally, it does recursive #listStatus calls to list all the
+ * sub-keysResultList.
Review comment:
Thanks a lot @linyiqun for the reviews; your comments are a great help in
moving this feature forward quickly.
**[FileSystem#listStatus](https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1890)**
does not guarantee sorted order. The final status list that `ozone#listStatus`
returns to the user does not maintain sorted order either.
We sort the file and dir lists
[separately in OM KeyManagerImpl](https://github.com/apache/ozone/blob/HDDS-2939/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java#L2353);
this is done for batch-wise key iteration. OM maintains only a seek order to
avoid infinite looping: first it seeks files (in FileTable) and then dirs
(in DirTable). If the `startKey` is a directory, all the files have already
been fetched in a previous iteration, so it can stop seeking after fetching
all the dirs from DirTable.
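
To make the seek order concrete, here is a minimal sketch, not the actual KeyManagerImpl code. It assumes two in-memory sorted sets standing in for FileTable and DirTable of a single parent directory; `SeekOrderSketch`, `listBatch` and `listAll` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class SeekOrderSketch {

  /**
   * Returns up to batchSize children that follow startKey in seek order:
   * all files first, then all dirs. Each table is sorted on its own.
   */
  static List<String> listBatch(NavigableSet<String> files,
      NavigableSet<String> dirs, String startKey, int batchSize) {
    List<String> batch = new ArrayList<>();
    // A directory startKey means the file phase finished in an earlier batch.
    boolean startIsDir = startKey != null && dirs.contains(startKey);

    if (!startIsDir) {
      // File phase: resume strictly after startKey, or from the beginning.
      NavigableSet<String> remainingFiles =
          startKey == null ? files : files.tailSet(startKey, false);
      for (String file : remainingFiles) {
        if (batch.size() == batchSize) {
          return batch;
        }
        batch.add(file);
      }
    }

    // Dir phase: resume after startKey only if startKey itself is a dir.
    NavigableSet<String> remainingDirs =
        startIsDir ? dirs.tailSet(startKey, false) : dirs;
    for (String dir : remainingDirs) {
      if (batch.size() == batchSize) {
        return batch;
      }
      batch.add(dir);
    }
    return batch;
  }

  /** Drains every batch, using the last returned key as the next startKey. */
  static List<String> listAll(NavigableSet<String> files,
      NavigableSet<String> dirs, int batchSize) {
    List<String> all = new ArrayList<>();
    String startKey = null;
    while (true) {
      List<String> batch = listBatch(files, dirs, startKey, batchSize);
      if (batch.isEmpty()) {
        break;
      }
      all.addAll(batch);
      startKey = batch.get(batch.size() - 1);
    }
    return all;
  }

  public static void main(String[] args) {
    NavigableSet<String> files = new TreeSet<>(List.of("z.txt", "m.txt"));
    NavigableSet<String> dirs = new TreeSet<>(List.of("b", "a"));
    // prints [m.txt, z.txt, a, b]
    System.out.println(listAll(files, dirs, 3));
  }
}
```

Each table is sorted individually, yet the combined result is not globally sorted (`a` and `b` come after `z.txt`), which is why the javadoc promises no ordering.
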
`Ozone#listKeys` does not expect sorted order either, and the existing javadoc
says nothing specific about ordering. Users may still see a sorted list with
the V0 code because RocksDB maintains key order, so I thought it best to
mention the ordering explicitly in the V1 code base.
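
For reference, the depth-first visit order described in the `KeyIteratorV1` javadoc can be reproduced with a small stack-based sketch. This is an illustration under assumptions, not the PR's implementation: the tree is a hard-coded in-memory map instead of OM list calls, `ArrayDeque` is used in place of `java.util.Stack`, and `DfsOrderSketch` and `visitOrder` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

public class DfsOrderSketch {

  /** Visits root, then each child subtree in turn (pre-order DFS). */
  static List<String> visitOrder(Map<String, List<String>> children,
      String root) {
    List<String> visited = new ArrayList<>();
    Deque<String> stack = new ArrayDeque<>();
    stack.push(root);
    while (!stack.isEmpty()) {
      String node = stack.pop();
      visited.add(node);
      List<String> kids = children.getOrDefault(node, List.of());
      // Push in reverse so that the first child is popped first.
      for (int i = kids.size() - 1; i >= 0; i--) {
        stack.push(kids.get(i));
      }
    }
    return visited;
  }

  public static void main(String[] args) {
    // The example tree from the javadoc, rooted at "a".
    Map<String, List<String>> tree = Map.of(
        "a", List.of("b1", "b2", "b3"),
        "b1", List.of("c1", "c2"),
        "b2", List.of("d1", "d2", "d3"),
        "d2", List.of("d21.txt"),
        "b3", List.of("e1", "e2", "e3"),
        "e1", List.of("e11.txt"));
    System.out.println(String.join(" -> ", visitOrder(tree, "a")));
  }
}
```

The sketch matches the javadoc order only because each child list is given in a fixed order; when the children come back from OM as files first and then dirs, the order among siblings can differ.
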