[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638896#comment-13638896
 ] 

Ivan Kelly edited comment on BOOKKEEPER-564 at 4/23/13 9:01 AM:
----------------------------------------------------------------

{quote}
In this case, when a new LedgerStorage implementation comes in, it has to 
re-define the checkpointing algorithm again. IMHO, instead of this, can we think 
of an approach where we decouple the checkpointing algorithm from Interleaved 
storage? The Bookie can own this checkpointing logic and control it. With this 
approach the Bookie will have more control over checkpointing irrespective of 
the plugged-in ledger storage. How does it sound? Sijie Guo, are you also 
thinking along similar lines?{quote}
If we want the LedgerStorage to control when checkpointing should occur, then 
LedgerStorage has to run the checkpoint itself. Otherwise you have coupled the 
LedgerStorage to the Bookie.SyncThread. There's no problem with breaking the 
sync thread out into a separate class, which multiple LedgerStorage 
implementations can use, but it should be owned by the LedgerStorage.

{quote}
1) you moved LogMark to ledger storage, which makes journal constructor 
"Journal(conf, logmark)" behavior unclear,{quote}
This should be better. The journal should just be constructed with 
Journal(conf). LastSyncedLogMark should only come into play for 
Journal#replay(JournalScanner) which should become Journal#replay(LogMark from, 
JournalScanner).
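
As a rough illustration of that signature, here is a minimal sketch. The `LogMark`, `JournalScanner` and `Journal` types below are simplified, hypothetical stand-ins rather than the real BookKeeper classes: the journal is an in-memory list, configuration is omitted, and replaying entries at or after the given mark is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for illustration; not the real BookKeeper classes.
final class LogMark implements Comparable<LogMark> {
    final long logFileId;
    final long offset;

    LogMark(long logFileId, long offset) {
        this.logFileId = logFileId;
        this.offset = offset;
    }

    @Override
    public int compareTo(LogMark other) {
        // Order by journal file first, then by offset within the file.
        int byFile = Long.compare(logFileId, other.logFileId);
        return byFile != 0 ? byFile : Long.compare(offset, other.offset);
    }
}

interface JournalScanner {
    void process(LogMark position, byte[] entry);
}

final class Journal {
    private final List<LogMark> positions = new ArrayList<>();
    private final List<byte[]> entries = new ArrayList<>();

    // Constructed without a log mark (configuration omitted in this sketch).
    Journal() {}

    void append(LogMark position, byte[] entry) {
        positions.add(position);
        entries.add(entry);
    }

    // The caller supplies the mark; the journal itself stores none.
    // Assumption: entries at or after 'from' are replayed.
    void replay(LogMark from, JournalScanner scanner) {
        for (int i = 0; i < entries.size(); i++) {
            if (positions.get(i).compareTo(from) >= 0) {
                scanner.process(positions.get(i), entries.get(i));
            }
        }
    }
}
```

With this shape, whoever owns the mark (the ledger storage, in this proposal) passes it in at replay time, and the journal needs no knowledge of how or when the mark was stored.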

{quote}
sync thread (checkpointing) logic should be maintained by Bookie itself{quote}
I strongly disagree with this because...

{quote}
as the sync(checkpointing) logic is part of bookie not ledger storage
{quote}
...all the logic to do the checkpoint is in the LedgerStorage. The decision to 
make the checkpoint is taken from within the ledger storage. So this is false. 
The logic is part of ledger storage.

{quote}
it should be common across different ledger storage implementations.{quote}
It can be broken out into a different class which can be shared by different 
implementations. It should be owned by the ledger storage though.

{quote}
 1) making LogMark a part of the journal would make the Journal's replay 
behaviour clearer.
{quote}
The log mark is dependent on the ledger storage and only means anything in the 
context of the ledger storage. It should only be stored when a checkpoint has 
occurred. This means that the ledger storage is what decides which log mark to 
store. If the journal is storing the mark, the ledger storage is triggering 
behaviour on the journal. Again, this is another piece that could be broken out 
into a separate class to be used by multiple ledger storage implementations, 
but it should remain owned by the ledger storage.
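
A minimal sketch of that ownership, under stated assumptions: `LogMarkStore` and `SketchLedgerStorage` are hypothetical names, the log mark is reduced to a single `long` journal offset for brevity, and "flushing" and "persisting" are only modelled in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that any ledger storage implementation could reuse.
final class LogMarkStore {
    private long lastMark = 0L;

    // A real implementation would write the mark to disk; this sketch keeps it in memory.
    void persist(long mark) {
        lastMark = mark;
    }

    long read() {
        return lastMark;
    }
}

// Hypothetical storage that owns both the checkpoint decision and the mark.
final class SketchLedgerStorage {
    private final LogMarkStore markStore = new LogMarkStore(); // owned by the storage
    private final List<byte[]> unflushed = new ArrayList<>();
    private long latestJournalOffset = 0L;
    private int flushedCount = 0;

    // The caller reports the journal offset at which this entry was logged.
    void addEntry(byte[] entry, long journalOffset) {
        unflushed.add(entry);
        latestJournalOffset = journalOffset;
    }

    // The storage decides when to call this; the mark is stored only after the flush.
    void checkpoint() {
        long mark = latestJournalOffset; // capture before flushing
        flushedCount += unflushed.size(); // stand-in for flushing entries to disk
        unflushed.clear();
        markStore.persist(mark);
    }

    long lastCheckpointedMark() {
        return markStore.read();
    }

    int flushedCount() {
        return flushedCount;
    }
}
```

Because nothing here depends on a bookie or a journal instance, a benchmark could drive `addEntry` and `checkpoint` directly and see the same checkpoint behaviour.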

To reiterate, these changes are needed to make it possible to benchmark the 
ledger storage in a way that the ledger storage behaves the same as it does 
when running under a bookie.
                
> Better checkpoint mechanism
> ---------------------------
>
>                 Key: BOOKKEEPER-564
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-564
>             Project: Bookkeeper
>          Issue Type: Improvement
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>            Assignee: Sijie Guo
>             Fix For: 4.3.0
>
>         Attachments: 0001-BOOKKEEPER-564-Better-checkpoint-mechanism.patch, 
> 0001-BOOKKEEPER-564-Better-checkpoint-mechanism.patch, 
> 0002-BOOKKEEPER-564-Better-checkpoint-mechanism.patch, BOOKKEEPER-564.patch, 
> BOOKKEEPER-564.patch
>
>
> Currently, SyncThread makes checkpoints too frequently, which hurts 
> performance. Data being written to an entry log file might be blocked by a 
> sync of the same file, which prevents the bookie from achieving higher 
> throughput. We could schedule a checkpoint only when rotating an entry log 
> file, so that new incoming entries are written to the newer entry log file 
> while the old entry log file is synced.
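
The rotation-triggered checkpoint described in the issue can be sketched roughly as follows. This is only an illustration under assumptions: `RotatingEntryLog` is a hypothetical class, log files are modelled as size counters, and "syncing" a file just records its id.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of an entry logger that checkpoints only on rotation.
final class RotatingEntryLog {
    private final long maxLogSize;
    private final List<Long> syncedLogIds = new ArrayList<>();
    private long currentLogId = 0L;
    private long currentLogSize = 0L;

    RotatingEntryLog(long maxLogSize) {
        this.maxLogSize = maxLogSize;
    }

    void addEntry(byte[] entry) {
        // Rotate when the entry would overflow a non-empty current log.
        if (currentLogSize > 0 && currentLogSize + entry.length > maxLogSize) {
            long oldLogId = currentLogId;
            currentLogId++;
            currentLogSize = 0L;
            // New writes now go to the new log, so syncing the old one
            // does not block them.
            checkpoint(oldLogId);
        }
        currentLogSize += entry.length;
    }

    // Stand-in for syncing the old log file and storing the log mark.
    private void checkpoint(long logId) {
        syncedLogIds.add(logId);
    }

    long currentLogId() {
        return currentLogId;
    }

    List<Long> syncedLogIds() {
        return syncedLogIds;
    }
}
```

In this model no checkpoint happens while entries still fit in the current log; the only sync is of a log that no longer receives writes.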
