[
https://issues.apache.org/jira/browse/HDFS-11926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16040159#comment-16040159
]
Anu Engineer commented on HDFS-11926:
-------------------------------------
[~cheersyang] [~xyao] and I spent quite some time today trying to pinpoint
whether the semantics of RocksDB and LevelDB are the same. Unfortunately, the
LevelDB documentation seems to indicate that we cannot rely on what RocksDB
guarantees.
https://github.com/google/leveldb/blob/master/doc/index.md#concurrency
{quote}
Concurrency
A database may only be opened by one process at a time. The leveldb
implementation acquires a lock from the operating system to prevent misuse.
Within a single process, the same leveldb::DB object may be safely shared by
multiple concurrent threads. I.e., different threads may write into or fetch
iterators or call Get on the same database without any external synchronization
(the leveldb implementation will automatically do the required
synchronization). However other objects (like Iterator and WriteBatch) may
require external synchronization. If two threads share such an object, they
must protect access to it using their own locking protocol. More details are
available in the public header files.
{quote}
However, we found some discussion in Google Groups which seemed to indicate
that what [~xyao] said in his comments is true. So for all we know it *might*
work without snapshots. The downside is that we would have to reproduce the
race condition to test it, and your comments seem to indicate that the cost of
a snapshot is very low. Our strict interpretation of the above documentation
was that if you use the same object, then iteration is going to be thread safe
between 2 threads. However, we are not able to understand how this behaves if
there are concurrent readers and writers using different objects. That is, it
does not give us any information on whether 2 calls in a thread would see the
same state (based on a version number) or not.
There are 2 options for us: one is to read the source of LevelDB to understand
how this code behaves, and the other is to switch over to RocksDB, which
explicitly documents this behavior.
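To make the concern concrete, here is a minimal sketch of a range scan that
reads from a point-in-time view, so a concurrent write cannot change what the
scan returns. It is only an illustration under stated assumptions: the class
RangeScanSketch and the methods put, snapshot and getRangeKVs are hypothetical
names, and an in-memory TreeMap stands in for the store. It does not use the
LevelDB or RocksDB API, and the copy made in snapshot() is O(n), unlike a real
database snapshot.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical stand-in for a sorted KV store; this is NOT the LevelDB API. */
final class RangeScanSketch {

  /** Sorted in-memory map standing in for the database contents. */
  private final NavigableMap<String, String> live = new TreeMap<>();

  /** Writer path: updates the live state. */
  synchronized void put(String key, String value) {
    live.put(key, value);
  }

  /**
   * Returns an immutable point-in-time copy of the live state.
   * Assumption: a real store would hand out a cheap snapshot handle instead.
   */
  synchronized NavigableMap<String, String> snapshot() {
    return Collections.unmodifiableNavigableMap(new TreeMap<>(live));
  }

  /**
   * Returns up to count entries from view, starting at startKey (inclusive,
   * or from the first key when startKey is null), keeping only keys that
   * begin with prefix (no filtering when prefix is null).
   */
  static List<Map.Entry<String, String>> getRangeKVs(
      NavigableMap<String, String> view, String startKey, int count,
      String prefix) {
    List<Map.Entry<String, String>> result = new ArrayList<>();
    NavigableMap<String, String> tail =
        startKey == null ? view : view.tailMap(startKey, true);
    for (Map.Entry<String, String> e : tail.entrySet()) {
      if (result.size() >= count) {
        break;
      }
      if (prefix == null || e.getKey().startsWith(prefix)) {
        // Copy the entry so the result does not reference the view.
        result.add(
            new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
      }
    }
    return result;
  }
}
```

In this sketch, every call that scans the same snapshot sees the same state,
which is the guarantee we could not confirm from the LevelDB documentation for
plain iterators.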
> Ozone: Implement a common helper to return a range of KVs in levelDB
> --------------------------------------------------------------------
>
> Key: HDFS-11926
> URL: https://issues.apache.org/jira/browse/HDFS-11926
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Blocker
> Attachments: HDFS-11926-HDFS-7240.001.patch,
> HDFS-11926-HDFS-7240.002.patch, HDFS-11926-HDFS-7240.003.patch,
> HDFS-11926-HDFS-7240.004.patch
>
>
> There are quite a few *LIST* operations that need to get a range of keys or
> values from levelDB and filter entries by key prefix.
> # HDFS-11782 listKeys
> # HDFS-11779 listBuckets
> # HDFS-11773 listVolumes
> # HDFS-11679 listContainers
> We need to implement a common utility for them.