hangc0276 opened a new pull request, #3940:
URL: https://github.com/apache/bookkeeper/pull/3940

   ### Motivation
When the bookie triggers compaction, with either the transactional compactor or the EntryLogCompactor, the write throughput becomes jittery.
   <img width="853" alt="image" src="https://user-images.githubusercontent.com/5436568/235288456-482bc1c2-3511-48a3-a698-f70bf93c20dd.png">
   <img width="847" alt="image" src="https://user-images.githubusercontent.com/5436568/235288468-d51fe344-ed10-4ef9-ad7f-62fa83577a7b.png">
   <img width="843" alt="image" src="https://user-images.githubusercontent.com/5436568/235288476-e723b87a-e600-457e-af44-5121d3a4204b.png">
   
   I took a deep look at the logs and found that while entry log 491 was being compacted by the transactional compactor, the write cache was flushed repeatedly and new entry log files were rolled.
   ```
   2023-04-26T06:14:16,356+0000 [GarbageCollectorThread-6-1] INFO  org.apache.bookkeeper.bookie.TransactionalEntryLogCompactor - Compacting entry log 491 with usage 8.796982355448682E-6.
   2023-04-26T06:14:16,356+0000 [GarbageCollectorThread-6-1] INFO  org.apache.bookkeeper.bookie.EntryLoggerAllocator - Created new entry log file /pulsar/data/bookkeeper/ledgers-0/current/5d3.log.compacting for logId 1491.
   2023-04-26T06:14:17,198+0000 [bookie-io-8-2] INFO  org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage - Write cache is full, triggering flush
   2023-04-26T06:14:19,030+0000 [bookie-io-8-2] INFO  org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage - Write cache is full, triggering flush
   2023-04-26T06:14:19,806+0000 [db-storage-4-1] INFO  org.apache.bookkeeper.bookie.EntryLogManagerBase - Creating a new entry log file : createNewLog = false, reachEntryLogLimit = true
   2023-04-26T06:14:19,806+0000 [db-storage-4-1] INFO  org.apache.bookkeeper.bookie.EntryLogManagerBase - Flushing entry logger 1488 back to filesystem, pending for syncing entry loggers : [BufferedChannel{logId=1488, logFile=/pulsar/data/bookkeeper/ledgers-0/current/5d0.log, ledgerIdAssigned=-1}].
   2023-04-26T06:14:19,807+0000 [pool-4-thread-1] INFO  org.apache.bookkeeper.bookie.EntryLoggerAllocator - Created new entry log file /pulsar/data/bookkeeper/ledgers-0/current/5d4.log for logId 1492.
   2023-04-26T06:14:19,968+0000 [db-storage-4-1] INFO  org.apache.bookkeeper.bookie.EntryLogManagerForSingleEntryLog - Synced entry logger 1488 to disk.
   2023-04-26T06:14:20,971+0000 [bookie-io-8-2] INFO  org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage - Write cache is full, triggering flush
   2023-04-26T06:14:22,999+0000 [bookie-io-8-2] INFO  org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage - Write cache is full, triggering flush
   2023-04-26T06:14:24,960+0000 [bookie-io-8-2] INFO  org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage - Write cache is full, triggering flush
   2023-04-26T06:14:27,109+0000 [bookie-io-8-1] INFO  org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage - Write cache is full, triggering flush
   2023-04-26T06:14:29,204+0000 [bookie-io-8-1] INFO  org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage - Write cache is full, triggering flush
   2023-04-26T06:14:32,569+0000 [bookie-io-8-1] INFO  org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage - Write cache is full, triggering flush
   2023-04-26T06:14:32,610+0000 [BookieJournal-3181] INFO  org.apache.bookkeeper.bookie.JournalChannel - Opening journal /pulsar/data/bookkeeper/journal-0/current/187bb72bb20.txn
   2023-04-26T06:14:35,383+0000 [GarbageCollectorThread-6-1] INFO  org.apache.bookkeeper.bookie.DefaultEntryLogger - Flushed compaction log file /pulsar/data/bookkeeper/ledgers-0/current/5d3.log.compacting with logId 1491.
   2023-04-26T06:14:36,265+0000 [GarbageCollectorThread-6-1] INFO  org.apache.bookkeeper.bookie.EntryLogManagerBase - Creating a new entry log file : createNewLog = false, reachEntryLogLimit = true
   2023-04-26T06:14:36,266+0000 [GarbageCollectorThread-6-1] INFO  org.apache.bookkeeper.bookie.EntryLogManagerBase - Flushing entry logger 1490 back to filesystem, pending for syncing entry loggers : [BufferedChannel{logId=1490, logFile=/pulsar/data/bookkeeper/ledgers-0/current/5d2.log, ledgerIdAssigned=-1}].
   2023-04-26T06:14:36,266+0000 [pool-4-thread-1] INFO  org.apache.bookkeeper.bookie.EntryLoggerAllocator - Created new entry log file /pulsar/data/bookkeeper/ledgers-0/current/5d5.log for logId 1493.
   2023-04-26T06:14:36,362+0000 [GarbageCollectorThread-6-1] INFO  org.apache.bookkeeper.bookie.EntryLogManagerForSingleEntryLog - Synced entry logger 1490 to disk.
   2023-04-26T06:14:36,720+0000 [bookie-io-8-1] INFO  org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage - Write cache is full, triggering flush
   2023-04-26T06:14:36,970+0000 [GarbageCollectorThread-6-1] INFO  org.apache.bookkeeper.bookie.Journal - garbage collected journal 187bb72bb1f.txn
   2023-04-26T06:14:37,498+0000 [GarbageCollectorThread-6-1] INFO  org.apache.bookkeeper.bookie.GarbageCollectorThread - Removing entry log metadata for 491
   2023-04-26T06:14:37,498+0000 [GarbageCollectorThread-6-1] INFO  org.apache.bookkeeper.bookie.TransactionalEntryLogCompactor - Compacted entry log : 491.
   ```
   
   Let's take a look at the compaction steps first before analyzing the root 
cause.
   
   #### Transaction Compactor
   - Step1: Create a new entry log file with `compacting` suffix
   - Step2: Open the original entry log file and read the ledgers one by one, 
filter out those deleted ledgers, and write the remaining ledgers into the 
**newly created entry log file**
   - Step3: Flush the newly created entry log file
   - **Step4: Flush the current ledger storage (Trigger a new checkpoint)** 
https://github.com/apache/bookkeeper/blob/c765aea600fde1a8ac7cb8a53a1157a372026894/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/SingleDirectoryDbLedgerStorage.java#L947
   - Step5: Update the entries' new locations in the newly created entry log file in RocksDB
   - Step6: Remove the `compacting` suffix
   - Step7: Delete the original entry log file and remove it from the 
entryLogMetaMap
   
   #### EntryLog Compactor
   - Step1: Open the original entry log file and read the ledgers one by one, 
filter out those deleted ledgers, and write the remaining ledgers into the 
**current writing entry log file**
   - Step2: Flush the current writing entry log file to ensure those written 
ledgers are flushed into the disk
   - **Step3: Flush the current ledger storage (Trigger a new checkpoint)** 
https://github.com/apache/bookkeeper/blob/c765aea600fde1a8ac7cb8a53a1157a372026894/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/SingleDirectoryDbLedgerStorage.java#L947
   - Step4: Update the entries' new locations in the current entry log file in RocksDB
   - Step5: Delete the original entry log file and remove it from the 
entryLogMetaMap
   
   Step4 of the Transaction Compactor and Step3 of the EntryLog Compactor trigger a new checkpoint and force a write cache swap, no matter how much data the write cache holds.
   
   However, looking back at the whole compaction flow, the `Flush the current ledger storage` step presumably exists to ensure the remaining data has been flushed into the entry log file before the entries' locations are updated in RocksDB. But both the `Transaction Compactor` and the `EntryLog Compactor` already trigger flush operations that guarantee the remaining data is flushed into the entry log file. So I think the `Flush the current ledger storage` step is unnecessary, and it only hurts write throughput.
   
   
   ### Changes
   Remove the flush operation from `SingleDirectoryDbLedgerStorage#updateEntriesLocations`.
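   
   A simplified sketch of the change (the method name mirrors the real one, but the surrounding types here are stand-ins, not BookKeeper's API):
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   
   // Stand-in for SingleDirectoryDbLedgerStorage; the ops list records
   // which operations ran, so the before/after behavior is visible.
   class UpdateLocationsSketch {
       final List<String> ops = new ArrayList<>();
   
       void flushLedgerStorage() { ops.add("checkpoint"); }
   
       void updateLocationsIndex(List<long[]> locations) { ops.add("rocksdb-update"); }
   
       void updateEntriesLocations(List<long[]> locations) {
           // Before this PR:
           // flushLedgerStorage();   // forced a checkpoint + write cache swap
           updateLocationsIndex(locations);
       }
   }
   ```
   
   The entries being relocated were already flushed by the compactor itself, so dropping the checkpoint here does not weaken durability of the compacted data.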
   

