reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369856263
 
 
   As far as I understand, EntryLogger is the abstraction layer for all 
operations on entrylog files: how entries are written (how active/rotated 
entrylog files are organized), how entries are read (how files opened for 
reading are organized), the flush/checkpoint path, and reporting the state of 
entrylog files for garbage collection/compaction. 
   
   I also tried to explain in detail why EntryLogger is the right place for 
this abstraction in this comment: 
   https://github.com/apache/bookkeeper/issues/570#issuecomment-368597767
   
   EntryLogger is the right place to deal with multiple entrylogs. It is more 
organic and causes less code churn, since ideally the other components of the 
bookie don't need to be modified much.
   
   But yes, regarding sub-task4, it is a valid point to question the need for 
SortedLedgerStorage if we are going to have an entrylog per ledger. I need 
some perf numbers to validate the write/read scenarios in the entrylog per 
ledger case using InterleavedLedgerStorage vs SortedLedgerStorage.
   
   **sub-task4** - introduce parallel (have just one Runnable per ledger) 
EntryMemTable flusher for SortedLedgerStorage which can be used in the case of 
entrylogperledger
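
   As a rough illustration of "one Runnable per ledger", here is a minimal 
sketch. It is not the BookKeeper implementation: `MemEntry` and 
`PerLedgerFlushSketch` are hypothetical names, and the "write" is only a 
placeholder for appending to that ledger's entrylog. It assumes the memtable 
snapshot is iterated in per-ledger entry order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a memtable entry; not the BookKeeper type.
final class MemEntry {
    final long ledgerId;
    final long entryId;

    MemEntry(long ledgerId, long entryId) {
        this.ledgerId = ledgerId;
        this.entryId = entryId;
    }
}

final class PerLedgerFlushSketch {
    // Groups the snapshot by ledger and flushes each group in its own Runnable.
    // Returns, per ledger, the entry ids in the order they were "written".
    static Map<Long, List<Long>> flush(List<MemEntry> snapshot, int threads) {
        Map<Long, List<MemEntry>> byLedger = new TreeMap<>();
        for (MemEntry e : snapshot) {
            byLedger.computeIfAbsent(e.ledgerId, k -> new ArrayList<>()).add(e);
        }

        Map<Long, List<Long>> written = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Map.Entry<Long, List<MemEntry>> group : byLedger.entrySet()) {
                // Exactly one task per ledger, so entries of a ledger stay ordered.
                Runnable task = () -> {
                    List<Long> ids = new ArrayList<>();
                    for (MemEntry e : group.getValue()) {
                        ids.add(e.entryId); // placeholder for the entrylog append
                    }
                    written.put(group.getKey(), ids);
                };
                pending.add(pool.submit(task));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any failure
            }
        } catch (InterruptedException | ExecutionException ex) {
            throw new IllegalStateException("per-ledger flush failed", ex);
        } finally {
            pool.shutdown();
        }
        return written;
    }
}
```

   Different ledgers can be flushed concurrently, while a single ledger is 
never split across tasks, so its entries are written in order.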
   
   > @reddycharan is there any way you can sub-divide your subtask-5?
   
   Yes, I think it should be possible. In the first task I could introduce the 
EntryLogManager interface and EntryLogManagerForSingleEntryLog, and in the 
final task I could introduce EntryLogManagerForEntryLogPerLedger.
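
   To make the split concrete, here is a minimal sketch of how such an 
interface could separate the two policies. The method names and the 
`LogHandle` type are assumptions made for illustration only; they are not 
taken from the pull request or from the BookKeeper code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical handle for an active entrylog file; only the id matters here.
final class LogHandle {
    final long logId;

    LogHandle(long logId) {
        this.logId = logId;
    }
}

// Hypothetical interface: decides which active entrylog a ledger writes to.
interface EntryLogManagerSketch {
    LogHandle currentLogForLedger(long ledgerId);

    int activeLogCount();
}

// Single-entrylog policy: every ledger shares the same active log.
final class SingleEntryLogManagerSketch implements EntryLogManagerSketch {
    private final LogHandle active = new LogHandle(0L);

    @Override
    public LogHandle currentLogForLedger(long ledgerId) {
        return active;
    }

    @Override
    public int activeLogCount() {
        return 1;
    }
}

// Entrylog-per-ledger policy: each ledger lazily gets its own active log.
final class PerLedgerEntryLogManagerSketch implements EntryLogManagerSketch {
    private final Map<Long, LogHandle> logs = new HashMap<>();
    private long nextLogId = 0L;

    @Override
    public synchronized LogHandle currentLogForLedger(long ledgerId) {
        LogHandle handle = logs.get(ledgerId);
        if (handle == null) {
            handle = new LogHandle(nextLogId++);
            logs.put(ledgerId, handle);
        }
        return handle;
    }

    @Override
    public synchronized int activeLogCount() {
        return logs.size();
    }
}
```

   With this shape, the caller only asks the manager for the log to write 
to, so switching between the two policies would not require changes in the 
caller.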
   
   So the sub-tasks would be:
    
   **sub-task1** - removal of unnecessary synchronization in 
InterleavedLedgerStorage methods, as described in 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAO2yDyZ946fp2S_qR2iL178hPiXgrnFGb%3DpvkyK4ReSYAtNLBw%40mail.gmail.com%3E
   
   **sub-task2** - move the complete logic of flushIntervalInBytes from 
EntryLogger to BufferedChannel
   
   **sub-task3** - make changes to the SyncThread/checkpoint logic so that, in 
the case of entrylogperledger, a checkpoint happens every flushInterval rather 
than when the active entrylog is created/rolled over.
   
   **sub-task4** - introduce a parallel (just one Runnable per ledger) 
EntryMemTable flusher for SortedLedgerStorage which can be used in the case of 
entrylogperledger (evaluate the need for SortedLedgerStorage by doing perf 
comparisons).
   
   **sub-task5** - introduce EntryLogManager interface and 
EntryLogManagerForSingleEntryLog
   
   **sub-task6** - introduce EntryLogManagerForEntryLogPerLedger
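
   For sub-task2, the idea of letting the channel itself own the 
flush-by-bytes logic could look roughly like the sketch below. It is only an 
illustration: `IntervalFlushingBufferSketch` is a hypothetical class, an 
in-memory sink stands in for the file, and treating a value of 0 or less as 
"disabled" is an assumption of this sketch.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical buffered writer that owns the flush-interval bookkeeping,
// so callers no longer have to count bytes themselves.
final class IntervalFlushingBufferSketch {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the file
    private final long flushIntervalInBytes;
    private long unflushedBytes = 0L;
    private int flushCount = 0;

    IntervalFlushingBufferSketch(long flushIntervalInBytes) {
        this.flushIntervalInBytes = flushIntervalInBytes;
    }

    synchronized void write(byte[] data) {
        buffer.write(data, 0, data.length);
        unflushedBytes += data.length;
        // Assumption: an interval of 0 or less disables interval-based flushing.
        if (flushIntervalInBytes > 0 && unflushedBytes >= flushIntervalInBytes) {
            flush();
        }
    }

    synchronized void flush() {
        byte[] pending = buffer.toByteArray();
        sink.write(pending, 0, pending.length);
        buffer.reset();
        unflushedBytes = 0L;
        flushCount++;
    }

    synchronized int flushCount() {
        return flushCount;
    }

    synchronized long unflushedBytes() {
        return unflushedBytes;
    }

    synchronized int flushedSize() {
        return sink.size();
    }
}
```

   Keeping the counter next to the buffer means each active entrylog would 
track its own unflushed bytes, which matters once there is more than one 
active entrylog.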
   
   Thanks, everyone, for the feedback. I will proceed with this plan and start 
creating new pull requests for the sub-tasks. I'll leave the existing pull 
request (https://github.com/apache/bookkeeper/pull/1201) as it is, since it 
might be helpful to refer to for the full picture.
