[ 
https://issues.apache.org/jira/browse/HDDS-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17160023#comment-17160023
 ] 

Ethan Rose commented on HDDS-3976:
----------------------------------

Implementing seekLast() correctly and efficiently with a filter requires 
seeking to the end of the RocksIterator and stepping back over mismatched keys 
using its prev() method. Because KeyValueBlockIterator uses a MetaStoreIterator 
as a wrapper over the RocksIterator, and MetaStoreIterator does not support 
prev(), this method cannot be implemented in the current state of the code. 
Options:
 # Add support for prev() to MetaStoreIterator.
 # Remove KeyValueBlockIterator#seekToLast since it is only used in tests.
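
For reference, a rough sketch of what the first option would enable, assuming 
the wrapper exposed prev(). BidirectionalIterator and ListBackedIterator are 
hypothetical stand-ins written for this example (they are not the 
MetaStoreIterator or KeyValueBlockIterator APIs); only the method names 
seekToLast(), prev(), isValid() and key() are borrowed from RocksIterator. 
The key names and the filter are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class SeekLastSketch {

  /** Hypothetical stand-in for the subset of iterator operations needed. */
  interface BidirectionalIterator {
    void seekToLast();
    void prev();
    boolean isValid();
    String key();
  }

  /** In-memory iterator over already-sorted keys, for illustration only. */
  static final class ListBackedIterator implements BidirectionalIterator {
    private final List<String> keys;
    private int pos = -1;

    ListBackedIterator(List<String> sortedKeys) {
      this.keys = new ArrayList<>(sortedKeys);
    }

    @Override
    public void seekToLast() {
      pos = keys.size() - 1;
    }

    @Override
    public void prev() {
      pos--;
    }

    @Override
    public boolean isValid() {
      return pos >= 0 && pos < keys.size();
    }

    @Override
    public String key() {
      return keys.get(pos);
    }
  }

  /**
   * Seeks to the end, then steps backwards over keys rejected by the filter.
   * Returns the last matching key, or null if no key matches.
   */
  static String seekLast(BidirectionalIterator it, Predicate<String> filter) {
    it.seekToLast();
    while (it.isValid() && !filter.test(it.key())) {
      it.prev();
    }
    return it.isValid() ? it.key() : null;
  }

  public static void main(String[] args) {
    // The physically last key ("z-meta") does not match the filter.
    BidirectionalIterator it =
        new ListBackedIterator(Arrays.asList("a1", "a2", "z-meta"));
    System.out.println(seekLast(it, k -> k.startsWith("a")));
  }
}
```

Without prev(), the only alternative is a forward scan of the whole key set 
that remembers the last match, which is why the second option is also listed.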

> KeyValueBlockIterator#nextBlock skips valid blocks
> --------------------------------------------------
>
>                 Key: HDDS-3976
>                 URL: https://issues.apache.org/jira/browse/HDDS-3976
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Ethan Rose
>            Assignee: Ethan Rose
>            Priority: Major
>
> HDDS-3854 fixed a bug in KeyValueBlockIterator#hasNext, but introduced 
> another one in KeyValueBlockIterator#nextBlock, which depends on the behavior 
> of that method. When the first key encountered does not pass the filter, the 
> internal nextBlock field is never initialized. Then a call to nextBlock() 
> results in a call to hasNext(), which returns true, which recursively calls 
> nextBlock(), again calling hasNext(), etc., until the end of the set is 
> reached and an exception is thrown. This skips all valid keys that may occur 
> past the first invalid key.
> Additionally, the current implementation of KeyValueBlockIterator#seekLast 
> depends on the internal RocksDB iterator's seekLast() method, which will skip 
> to the last key in the DB regardless of whether it matches the filter or not. 
> This could be different from the last key according to the filter.
> This bug was identified while working on HDDS-3869, which adds a strong 
> typing layer before objects are serialized into RocksDB for the datanode. Due 
> to RocksDB internals, this changes the database layout so that all prefixed 
> keys are returned at the beginning of the key set, instead of at the end. 
> Because the original layout returned all prefixed keys at the end of the key 
> set, the behavior described above could not occur, and the bug was not 
> evident in any of the original unit tests.
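
The hasNext()/nextBlock() interaction in the quoted description can be avoided 
with a look-ahead buffer that hasNext() fills by skipping rejected keys. The 
sketch below is a minimal illustration under stated assumptions, not the 
actual KeyValueBlockIterator code: FilteredIteratorSketch is a hypothetical 
class, keys are plain non-null strings, and the "#" prefix and the filter are 
invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

public class FilteredIteratorSketch implements Iterator<String> {
  private final Iterator<String> source;
  private final Predicate<String> filter;
  // Buffered next matching key; null means "not looked up yet or exhausted".
  private String buffered;

  public FilteredIteratorSketch(Iterator<String> source,
      Predicate<String> filter) {
    this.source = source;
    this.filter = filter;
  }

  @Override
  public boolean hasNext() {
    // Advance past rejected keys until a match is buffered or input ends.
    while (buffered == null && source.hasNext()) {
      String candidate = source.next();
      if (filter.test(candidate)) {
        buffered = candidate;
      }
    }
    return buffered != null;
  }

  @Override
  public String next() {
    // hasNext() guarantees the buffer is filled whenever it returns true,
    // so no recursion is needed here.
    if (!hasNext()) {
      throw new NoSuchElementException("No more matching keys");
    }
    String result = buffered;
    buffered = null;
    return result;
  }

  public static void main(String[] args) {
    // Rejected keys come first, as in the layout the description mentions.
    List<String> keys = Arrays.asList("#x#1", "#x#2", "1", "2");
    Iterator<String> it =
        new FilteredIteratorSketch(keys.iterator(), k -> !k.startsWith("#"));
    List<String> seen = new ArrayList<>();
    while (it.hasNext()) {
      seen.add(it.next());
    }
    System.out.println(seen);
  }
}
```

Because hasNext() only returns true once a matching key is buffered, leading 
keys that fail the filter no longer cause later valid keys to be skipped.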



