dlg99 commented on issue #3759:
URL: https://github.com/apache/bookkeeper/issues/3759#issuecomment-1421382854

   @gaozhangmin Thank you, this looks like an interesting problem. 
   
   +1 to @hangc0276's questions.
   
   > For a non-existent entry, it searched a range of ledgers from the current 
ledger id to Long.MAX_VALUE
   
   I think there is a misunderstanding.
   
   ```java
       private long getLastEntryInLedgerInternal(long ledgerId) throws IOException {
           LongPairWrapper maxEntryId = LongPairWrapper.get(ledgerId, Long.MAX_VALUE);
   
           // Search the last entry in storage
           Entry<byte[], byte[]> entry = locationsDb.getFloor(maxEntryId.array);
   ```
   
   `LongPairWrapper.get(ledgerId, Long.MAX_VALUE)` builds a single key of the form 
(ledgerId, entryId). It is not a (ledgerId, last ledger id) range.
   
   `locationsDb.getFloor` uses that key as an upper bound and returns the largest 
key within the bound. If the returned key has the same ledgerId, it holds the last 
entryId; otherwise no entry was found. 
   This is implemented in KeyValueStorageRocksDB:
   ```java
       public Entry<byte[], byte[]> getFloor(byte[] key) throws IOException {
           try (Slice upperBound = new Slice(key);
                    ReadOptions option = new ReadOptions(optionCache).setIterateUpperBound(upperBound);
                    RocksIterator iterator = db.newIterator(option)) {
               iterator.seekToLast();
               if (iterator.isValid()) {
                   return new EntryWrapper(iterator.key(), iterator.value());
               }
           }
           return null;
       }
   ```
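
   To make the floor lookup concrete without RocksDB, here is a minimal sketch that models the index as a `TreeMap` ordered by unsigned byte comparison. The class `FloorLookupSketch`, its methods, and the 16-byte big-endian key layout are assumptions made for illustration; this is not BookKeeper code.

   ```java
   import java.nio.ByteBuffer;
   import java.util.Arrays;
   import java.util.Comparator;
   import java.util.Map;
   import java.util.TreeMap;

   // Hypothetical stand-in for the locations index, used only to show the lookup logic.
   class FloorLookupSketch {
       // Order keys byte by byte, treating bytes as unsigned.
       private static final Comparator<byte[]> BYTEWISE = Arrays::compareUnsigned;
       private final TreeMap<byte[], Long> index = new TreeMap<>(BYTEWISE);

       // Assumed layout: ledgerId followed by entryId, both big-endian.
       static byte[] key(long ledgerId, long entryId) {
           return ByteBuffer.allocate(16).putLong(ledgerId).putLong(entryId).array();
       }

       void put(long ledgerId, long entryId, long location) {
           index.put(key(ledgerId, entryId), location);
       }

       // Returns the last entryId stored for ledgerId, or -1 if the ledger has no entries.
       long lastEntryId(long ledgerId) {
           // (ledgerId, Long.MAX_VALUE) is the largest possible key for this ledger.
           Map.Entry<byte[], Long> floor = index.floorEntry(key(ledgerId, Long.MAX_VALUE));
           if (floor == null) {
               return -1L; // nothing at or below the bound
           }
           ByteBuffer found = ByteBuffer.wrap(floor.getKey());
           long foundLedgerId = found.getLong();
           long foundEntryId = found.getLong();
           // A key from a smaller ledger means this ledger has no entries.
           return foundLedgerId == ledgerId ? foundEntryId : -1L;
       }
   }
   ```

   With entries (5, 0), (5, 7) and (9, 3) stored, `lastEntryId(5)` lands on (5, 7), while `lastEntryId(7)` also lands on (5, 7) but rejects it because the ledgerId differs. In both cases it is a single bounded lookup, not a scan over a range of ledgers.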
   
   Now the real question is what's going on with `seekToLast`.
   I think the problem is similar to 
https://github.com/facebook/rocksdb/issues/261
   ```
   this post on our discussion group might be insightful: 
https://www.facebook.com/groups/rocksdb.dev/permalink/604723469626171/. 
   Since deletes are not actually deletes, but rather tombstones (which get 
cleaned up after a compaction), 
   your iterator might have to skip a bunch of deletes to get to the first key 
in the database.
   ```
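
   The effect described in that quote can be modelled with a toy sketch: deletes leave markers behind, and a seek to the last live key has to step over them until a compaction removes them. The class `TombstoneSketch` and everything in it are hypothetical; it imitates the behaviour only and says nothing about how RocksDB actually stores data.

   ```java
   import java.util.Map;
   import java.util.Optional;
   import java.util.TreeMap;

   // Toy model of tombstone-based deletes; an illustration, not RocksDB internals.
   class TombstoneSketch {
       // An empty Optional plays the role of a tombstone.
       private final TreeMap<Long, Optional<String>> data = new TreeMap<>();
       long skippedOnLastSeek = 0;

       void put(long key, String value) {
           data.put(key, Optional.of(value));
       }

       // A delete does not remove the key; it overwrites it with a tombstone.
       void delete(long key) {
           data.put(key, Optional.empty());
       }

       // Walks backwards from the largest key, counting the tombstones it has to skip.
       Long seekToLast() {
           skippedOnLastSeek = 0;
           for (Map.Entry<Long, Optional<String>> e : data.descendingMap().entrySet()) {
               if (e.getValue().isPresent()) {
                   return e.getKey();
               }
               skippedOnLastSeek++;
           }
           return null; // no live key left
       }

       // Compaction drops the tombstones for good.
       void compact() {
           data.values().removeIf(v -> !v.isPresent());
       }
   }
   ```

   In this model, writing keys 0..9 and then deleting 5..9 makes `seekToLast()` skip five tombstones before it reaches key 4; after `compact()` the same call skips none. The cost of the seek grows with the number of uncompacted deletes near the end of the range, which is the suspected cause here.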
   
   This leads us to https://github.com/apache/bookkeeper/pull/2686 (reverted in 
https://github.com/apache/bookkeeper/pull/3144).
   There should be a way to check the number of tombstones or get metrics from 
RocksDB (sorry, I have never dealt with this), and there should be a way to 
trigger a compaction on RocksDB. Or maybe the search in your case was slowed 
down by a compaction that was running at the time.

