Hang,

On Tue, 15 Mar 2022 at 02:47, Hang Chen
<chenh...@apache.org> wrote:
>
> Hi BookKeeper Community,
>
>   For BookKeeper 4.14.0+, I have noticed that index deletion
> sometimes takes around 60 seconds, which causes the CPU to spike to
> 100%:
> ```
> [2022-02-28T07:25:42.531Z] INFO db-storage-cleanup-10-1
> EntryLocationIndex:191 Deleting indexes for ledgers: [3385184,
> 3385239, 3385159, 3385142, 3385124, 3385193, 3384879, 3385165,
> 3385916]
> [2022-02-28T07:26:34.089Z] INFO db-storage-cleanup-10-1
> EntryLocationIndex:266 Deleted indexes for 201065 entries from 9
> ledgers in 51.557 seconds
> [2022-02-28T07:40:42.534Z] INFO db-storage-cleanup-10-1
> EntryLocationIndex:191 Deleting indexes for ledgers: [3385379,
> 3385367, 3385718, 3385365, 3385412, 3385167, 3385357, 3386141]
> [2022-02-28T07:41:47.867Z] INFO db-storage-cleanup-10-1
> EntryLocationIndex:266 Deleted indexes for 134590 entries from 8
> ledgers in 65.332 seconds
> ```
>
> RocksDB compaction is a heavy operation, and checkpoints are
> triggered at high frequency, which keeps the db-storage-cleanup thread
> under constant high load and the CPU at 100%.
>
> This change was introduced by
> https://github.com/apache/bookkeeper/pull/2686. The motivation for that
> PR is:
>
> > After deleting many ledgers, seeking to the end of the RocksDB metadata can 
> > take a long time and trigger timeouts upstream. Address this by improving 
> > the seek logic as well as compacting out tombstones in situations where 
> > we've just deleted many entries. This affects the entry location index and 
> > the ledger metadata index.
>
> For RocksDB, CompactRange is a heavy operation, so we'd better
> avoid calling it manually. Since RocksDB 7.0, the `compactRange`
> API has been removed:
> https://github.com/facebook/rocksdb/pull/9444
>
> IMO, we'd better remove the manual compactRange call added by that PR
> and increase `max_background_jobs` to accelerate auto compaction.
>
> Would you please give me more ideas?
I don't have much experience with RocksDB.

Did you make a prototype?

Sharing some results from a prototype would help a lot.

I am not sure, but maybe we can add an option to enable/disable manual
compaction and another to tune max_background_jobs.
This way we can roll back in case of problems with your proposal.
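
To make the idea concrete, here is a rough sketch of what I mean. It is
not BookKeeper code: the two setting names and the IndexStore interface
are made up for illustration, and the default of 2 background jobs is my
assumption about the RocksDB default. The only point is that manual
compaction stays behind a flag that defaults to today's behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Sketch only: the setting names and IndexStore are hypothetical and do not
// exist in BookKeeper or RocksDB.
final class CompactionToggleSketch {

    // Hypothetical configuration keys.
    static final String MANUAL_COMPACTION_ENABLED = "dbStorage_manualCompactionEnabled";
    static final String MAX_BACKGROUND_JOBS = "dbStorage_maxBackgroundJobs";

    /** Stand-in for the RocksDB-backed entry location index. */
    interface IndexStore {
        void deleteLedgerIndexes(long ledgerId);
        void compact();
    }

    /** Parsed settings; the defaults keep the current behaviour (manual compaction on). */
    static final class Settings {
        final boolean manualCompactionEnabled;
        final int maxBackgroundJobs;

        Settings(Properties props) {
            manualCompactionEnabled = Boolean.parseBoolean(
                    props.getProperty(MANUAL_COMPACTION_ENABLED, "true"));
            // Assumed default of 2; a real change would pass this value to RocksDB.
            maxBackgroundJobs = Integer.parseInt(
                    props.getProperty(MAX_BACKGROUND_JOBS, "2"));
            if (maxBackgroundJobs < 1) {
                throw new IllegalArgumentException(MAX_BACKGROUND_JOBS + " must be >= 1");
            }
        }
    }

    /** Deletes the indexes, then compacts only when the flag allows it. */
    static void deleteIndexes(IndexStore store, List<Long> ledgerIds, Settings settings) {
        for (long ledgerId : ledgerIds) {
            store.deleteLedgerIndexes(ledgerId);
        }
        if (settings.manualCompactionEnabled) {
            store.compact();
        }
    }

    /** In-memory store that only records the calls it receives. */
    static final class RecordingStore implements IndexStore {
        final List<String> calls = new ArrayList<>();

        @Override
        public void deleteLedgerIndexes(long ledgerId) {
            calls.add("delete:" + ledgerId);
        }

        @Override
        public void compact() {
            calls.add("compact");
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(MANUAL_COMPACTION_ENABLED, "false");
        props.setProperty(MAX_BACKGROUND_JOBS, "4");
        Settings settings = new Settings(props);

        RecordingStore store = new RecordingStore();
        deleteIndexes(store, List.of(3385184L, 3385239L), settings);
        System.out.println(store.calls + " maxBackgroundJobs=" + settings.maxBackgroundJobs);
    }
}
```

With the flag set to false only the deletes are recorded; removing that
line from the properties brings the compaction call back, which is the
rollback path I have in mind.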

Enrico

>
> Thanks,
> Hang
