[
https://issues.apache.org/jira/browse/BOOKKEEPER-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638896#comment-13638896
]
Ivan Kelly edited comment on BOOKKEEPER-564 at 4/23/13 9:01 AM:
----------------------------------------------------------------
{quote}
In this case, when a new LedgerStorage implementation comes in, it would again
have to redefine the checkpointing algorithm. IMHO, instead of this, can we
think of an approach where we decouple the checkpointing algorithm from
Interleaved storage? Bookie can own this checkpointing logic and control it.
With this approach, Bookie will have more control over the checkpointing
irrespective of the plugged-in ledger storage. How does it sound? Sijie Guo,
are you also thinking along similar lines?{quote}
If we want the LedgerStorage to control when checkpointing should occur, then
LedgerStorage has to run the checkpoint itself. Otherwise you have coupled the
LedgerStorage to the Bookie.SyncThread. There's no problem with breaking the
sync thread out into a separate class, which multiple LedgerStorage
implementations can use, but it should be owned by the LedgerStorage.
{quote}
1) you moved LogMark to ledger storage, which makes journal constructor
"Journal(conf, logmark)" behavior unclear,{quote}
This should be better. The journal should just be constructed with
Journal(conf). LastSyncedLogMark should only come into play for
Journal#replay(JournalScanner) which should become Journal#replay(LogMark from,
JournalScanner).
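As a rough illustration of the API shape I mean (a sketch only: apart from the
names Journal, LogMark, JournalScanner and replay, everything here is a
simplified, hypothetical stand-in and not the actual BookKeeper code; the
in-memory lists, the String entries and the "replay entries at or after the
mark" rule are assumptions made for the example):

```java
import java.util.ArrayList;
import java.util.List;

public class JournalReplaySketch {

    /** Simplified stand-in for LogMark: a (journal file id, offset) position. */
    static final class LogMark implements Comparable<LogMark> {
        final long logFileId;
        final long offset;

        LogMark(long logFileId, long offset) {
            this.logFileId = logFileId;
            this.offset = offset;
        }

        @Override
        public int compareTo(LogMark other) {
            int byFile = Long.compare(logFileId, other.logFileId);
            return byFile != 0 ? byFile : Long.compare(offset, other.offset);
        }
    }

    /** Simplified stand-in for JournalScanner: called once per replayed entry. */
    interface JournalScanner {
        void process(LogMark position, String entry);
    }

    /** In-memory stand-in for the journal. It holds no log mark of its own. */
    static final class Journal {
        private final List<LogMark> positions = new ArrayList<>();
        private final List<String> entries = new ArrayList<>();

        // Stands in for Journal(conf): no log mark is passed at construction.
        Journal() { }

        void append(LogMark position, String entry) {
            positions.add(position);
            entries.add(entry);
        }

        // The caller says where to start; assumed here to mean "at or after from".
        void replay(LogMark from, JournalScanner scanner) {
            for (int i = 0; i < positions.size(); i++) {
                if (positions.get(i).compareTo(from) >= 0) {
                    scanner.process(positions.get(i), entries.get(i));
                }
            }
        }
    }

    /** Appends three entries, then replays from the second one's position. */
    static List<String> demo() {
        Journal journal = new Journal();
        journal.append(new LogMark(1, 0), "a");
        journal.append(new LogMark(1, 100), "b");
        journal.append(new LogMark(2, 0), "c");

        List<String> replayed = new ArrayList<>();
        journal.replay(new LogMark(1, 100), (position, entry) -> replayed.add(entry));
        return replayed;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the shape is that the journal needs no knowledge of the last
synced mark; whoever owns the mark passes it in at replay time.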
{quote}
sync thread (checkpointing) logic should be maintained by Bookie itself{quote}
I strongly disagree with this because...
{quote}
as the sync(checkpointing) logic is part of bookie not ledger storage
{quote}
...all the logic to do the checkpoint is in the LedgerStorage. The decision to
make the checkpoint is taken from within the ledger storage. So this is false.
The logic is part of ledger storage.
{quote}
it should be common across different ledger storage implementations.{quote}
It can be broken out into a different class which can be shared by different
implementations. It should be owned by the ledger storage though.
{quote}
1) making LogMark a part of the journal would make the Journal clearer on the
replaying behaviour.
{quote}
The log mark is dependent on the ledger storage and only means anything in the
context of the ledger storage. It should only be stored when a checkpoint has
occurred. This means that the ledger storage is what decides which log mark to
store. If the journal is storing the mark, the ledger storage is triggering
behaviour on the journal. Again, this is another piece that could be broken out
into a separate class to be used by multiple ledger storage implementations,
but it should remain owned by the ledger storage.
To reiterate, these changes are needed to make it possible to benchmark the
ledger storage so that it behaves the same as it does when running under a
bookie.
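To make the ownership argument concrete, here is a minimal sketch, assuming a
storage that checkpoints only when it rotates its entry log (the trigger
proposed in the issue description). All class, method and field names below
(SketchLedgerStorage, LogMarkStore, rotateThreshold, and so on) are
hypothetical and exist only for this example. The storage decides when to
checkpoint and which mark to store, and the only thing it needs from outside
is a way to ask for the current journal position:

```java
import java.util.function.Supplier;

public class StorageOwnedCheckpointSketch {

    /** Simplified stand-in for LogMark: a (journal file id, offset) position. */
    static final class LogMark {
        final long logFileId;
        final long offset;

        LogMark(long logFileId, long offset) {
            this.logFileId = logFileId;
            this.offset = offset;
        }
    }

    /** Hypothetical helper that could be shared by several storage implementations. */
    static final class LogMarkStore {
        private LogMark lastPersisted = new LogMark(0, 0);

        void persist(LogMark mark) {
            lastPersisted = mark; // a real version would write the mark to disk
        }

        LogMark read() {
            return lastPersisted;
        }
    }

    /** Hypothetical storage that owns both the checkpoint decision and the mark. */
    static final class SketchLedgerStorage {
        private final LogMarkStore markStore = new LogMarkStore();
        private final Supplier<LogMark> journalPosition;
        private final long rotateThreshold;
        private long bytesInCurrentLog = 0;

        SketchLedgerStorage(Supplier<LogMark> journalPosition, long rotateThreshold) {
            this.journalPosition = journalPosition;
            this.rotateThreshold = rotateThreshold;
        }

        void addEntry(int sizeBytes) {
            bytesInCurrentLog += sizeBytes;
            if (bytesInCurrentLog >= rotateThreshold) {
                bytesInCurrentLog = 0; // stands in for rotating to a new entry log
                checkpoint();
            }
        }

        private void checkpoint() {
            // Capture the journal position first, then flush, then store the mark.
            LogMark mark = journalPosition.get();
            // (flushing the old entry log and index would happen here)
            markStore.persist(mark);
        }

        LogMark lastCheckpoint() {
            return markStore.read();
        }
    }

    /** Drives the storage with a fake journal position, as a benchmark might. */
    static long demo() {
        long[] journalOffset = {0};
        SketchLedgerStorage storage =
                new SketchLedgerStorage(() -> new LogMark(1, journalOffset[0]), 100);
        for (int i = 0; i < 4; i++) {
            journalOffset[0] += 40; // the journal write precedes the storage write
            storage.addEntry(40);
        }
        return storage.lastCheckpoint().offset;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because nothing here depends on a Bookie or a SyncThread, a benchmark can
drive the storage directly and still get the same checkpointing behaviour.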
> Better checkpoint mechanism
> ---------------------------
>
> Key: BOOKKEEPER-564
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-564
> Project: Bookkeeper
> Issue Type: Improvement
> Components: bookkeeper-server
> Reporter: Sijie Guo
> Assignee: Sijie Guo
> Fix For: 4.3.0
>
> Attachments: 0001-BOOKKEEPER-564-Better-checkpoint-mechanism.patch,
> 0001-BOOKKEEPER-564-Better-checkpoint-mechanism.patch,
> 0002-BOOKKEEPER-564-Better-checkpoint-mechanism.patch, BOOKKEEPER-564.patch,
> BOOKKEEPER-564.patch
>
>
> Currently, SyncThread makes checkpoints too frequently, which affects
> performance. Data being written to an entry logger file might be blocked by
> syncing the same entry logger file, which prevents the bookie from achieving
> higher throughput. We could schedule a checkpoint only when rotating an entry
> log file, so new incoming entries would be written to the newer entry log
> file and the old entry log file could be synced.