[ 
https://issues.apache.org/jira/browse/HDFS-11926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16040118#comment-16040118
 ] 

Weiwei Yang commented on HDFS-11926:
------------------------------------

Thanks [~anu] and [~xyao] for the comments. I just updated the v4 patch to 
incorporate the following changes based on your comments.

[~anu]'s comments
bq. if count is 0 or less than zero, I think we should just throw an 
IllegalArgumentException.

Fixed. Also updated the Javadoc.
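
To illustrate the check, here is a minimal sketch, not the code from the patch. 
The class name {{CountCheckSketch}} and the method {{requirePositiveCount}} are 
hypothetical and exist only for this example.

```java
final class CountCheckSketch {
    /**
     * Hypothetical validator (name is an assumption, not from the patch).
     * Rejects a count that is zero or negative.
     *
     * @param count maximum number of entries the caller wants back
     * @return the same count, if it is positive
     * @throws IllegalArgumentException if count is zero or negative
     */
    static int requirePositiveCount(int count) {
        if (count <= 0) {
            throw new IllegalArgumentException(
                "Invalid count " + count + ", count must be greater than 0");
        }
        return count;
    }
}
```

Failing fast here keeps a bad argument from turning into a silent empty result.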

bq. We do a DbIter.seekToFirst – shouldn't we do that if and only if the 
startKey == null ? It looks like it is a frivolous operation if startKey 
argument is not null.

Fixed. It now calls seekToFirst only when startKey is null. Thanks for pointing 
this out.

bq. Even though I did suggest that we should throw if cannot find the startKey, 
We also need a plan to handle the situation where someone is iterating a bucket 
with concurrent deletes going on ...

Well, for this common helper, I think it is fine to throw an exception when 
startKey is not found. As for the scenario you mentioned, I am not sure: don't 
we have a read/write lock in the KSM metadata manager to avoid such races? I 
agree to open another jira to investigate this further.
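
The seek and not-found behaviour discussed above can be sketched roughly as 
follows. This is only an illustration under stated assumptions, not the patch 
code: a {{java.util.TreeMap}} stands in for the sorted levelDB store, the names 
{{RangeScanSketch}} and {{getRange}} are hypothetical, the start key is assumed 
to be inclusive, and {{NoSuchElementException}} is an arbitrary choice of 
exception type.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NoSuchElementException;

final class RangeScanSketch {
    /**
     * Returns up to count entries in key order.
     * Assumption: the sorted map is a stand-in for levelDB's key ordering.
     */
    static List<Map.Entry<String, String>> getRange(
            NavigableMap<String, String> store, String startKey, int count) {
        if (count <= 0) {
            throw new IllegalArgumentException(
                "Invalid count " + count + ", count must be greater than 0");
        }
        NavigableMap<String, String> view;
        if (startKey == null) {
            // Analogous to seekToFirst: start from the smallest key.
            view = store;
        } else {
            // Analogous to seeking directly to startKey; no seekToFirst needed.
            if (!store.containsKey(startKey)) {
                // Assumed exception type; the thread only says "an exception".
                throw new NoSuchElementException(
                    "startKey not found: " + startKey);
            }
            // Assumption: startKey itself is included in the result.
            view = store.tailMap(startKey, true);
        }
        List<Map.Entry<String, String>> result = new ArrayList<>();
        for (Map.Entry<String, String> e : view.entrySet()) {
            if (result.size() >= count) {
                break;
            }
            // Copy the entry so the result does not alias the live map.
            result.add(new AbstractMap.SimpleImmutableEntry<>(
                e.getKey(), e.getValue()));
        }
        return result;
    }
}
```

With keys a, b, c in the map, {{getRange(store, null, 2)}} starts at a, while 
{{getRange(store, "b", 5)}} starts at b and never touches the first key.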

bq. One more minor suggestion, Would it be possible to log the time taken to 
execute this function...

That's a very good suggestion. I have added a debug log in the code to print 
the time consumed by this function.
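
A debug-level timing log of this kind might look like the sketch below. It is 
an assumption-laden example rather than the patch code: it uses 
{{java.util.logging}} at {{FINE}} level as a stand-in for whatever logger the 
patch uses, and the names {{TimedCallSketch}} and {{timed}} are hypothetical.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

final class TimedCallSketch {
    private static final Logger LOG =
        Logger.getLogger(TimedCallSketch.class.getName());

    /**
     * Runs body and, if debug (FINE) logging is enabled, logs how long it took.
     * Hypothetical helper; the real patch may log inline instead.
     */
    static <T> T timed(String name, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            // Guard so the message is only built when debug logging is on.
            if (LOG.isLoggable(Level.FINE)) {
                long elapsedMs =
                    TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                LOG.log(Level.FINE, "{0} took {1} ms",
                    new Object[] {name, elapsedMs});
            }
        }
    }
}
```

Using {{System.nanoTime()}} rather than wall-clock time keeps the measurement 
unaffected by system clock adjustments.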

[~xyao]'s comments
bq. FilteredKeys.java

This class is removed; {{KeyManagerImpl#listKey}} now calls the common helper 
to avoid duplicate code.

bq. Race between iterator and modification.

There may or may not be a race. I am not sure; I could not find an answer in 
the levelDB docs (poor docs :(). Anyway, I have tested the snapshot approach 
and it doesn't seem to be expensive. I set up some tests with different data 
sizes in levelDB, from 10 entries through 10,000 to 10,000,000 entries, with 
data sizes from a few KB to over 180 MB. The time to take a snapshot is trivial 
(around 1 ms). I did not read the levelDB implementation, but if the iterator 
reads from the memory table and the snapshot is created the same way, there is 
probably no big difference between the two approaches. However, using a 
snapshot may be safer.

Thanks

> Ozone: Implement a common helper to return a range of KVs in levelDB
> --------------------------------------------------------------------
>
>                 Key: HDFS-11926
>                 URL: https://issues.apache.org/jira/browse/HDFS-11926
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Blocker
>         Attachments: HDFS-11926-HDFS-7240.001.patch, 
> HDFS-11926-HDFS-7240.002.patch, HDFS-11926-HDFS-7240.003.patch, 
> HDFS-11926-HDFS-7240.004.patch
>
>
> There are quite a few *LIST* operations that need to get a range of keys or 
> values from levelDB and filter entries by key prefix. 
> # HDFS-11782 listKeys
> # HDFS-11779 listBuckets
> # HDFS-11773 listVolumes
> # HDFS-11679 listContainers
> We need to implement a common utility for them.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
