[
https://issues.apache.org/jira/browse/BOOKKEEPER-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636201#comment-13636201
]
Ivan Kelly commented on BOOKKEEPER-564:
---------------------------------------
{quote}
bq. We don't. We always construct a full bookie, because it's impossible to
test the ledger storage without the bookie. Another place I came across this
was with the bkvhbase benchmark. I had to implement my own SyncThread.
you could use ledger storage add/read/flush independently. it is a
full-functioned module. You could use it as an independent module in other
place if you like for different purpose. I don't understand how is bad as you
said.
{quote}
It's not a complete module. We're going to be introducing a new ledger storage
to trunk hopefully soon. This is going to need to be tested and benchmarked
extensively. If the behaviour of the ledger storage is dependent on the
behaviour of the sync thread, which it is, then it's going to make this job
much more awkward. We will have to reimplement the sync thread. But of course,
the sync thread will have to match the behaviour of the existing sync thread.
We can't use the existing sync thread, because it's coupled with the journal,
and we can't benchmark using the entire bookie, because the journal throughput
will interfere with the throughput of the ledger storage (i.e. it will throttle
it, as the journal should be the bottleneck). _This is the core reason why I
want to avoid this coupling_.
{quote}
bq. It's a command rather than a guide. And how the ledger storage behaves is
dependent on the sync thread. This is coupling.
if its behavior is part to a bookie's behavior, I don't think it is coupling.
so I don't know what behaviors that is just belonged to ledger storage and not
belonged to bookie.
again, why I think it is a guide rather than a command. in checkpointer
interface, ledger storage just tell a better pointer to the implementation. the
really execution is decided by checkpointer itself (bookie), it could use the
guide offered by ledger storage, or could use a different checkpoint based on
other condition (for example, CheckPoint.MAX), so the control part is up to
checkpointer.
{quote}
With Checkpointer, an #onRotateEntryLog event in InterleavedLedgerStorage
triggers a call to #startCheckpoint, which pushes a sync request onto the
request queue, which in turn causes #checkpoint() to be called in the
InterleavedLedgerStorage. This isn't a guide. A guide would be something that
whoever triggers the checkpoint queries and, based on the answer, decides
whether to checkpoint or not. LedgerStorage#isFlushRequired() was a guide, but
Checkpointer is not. Checkpointer uses a push mechanism; a guide would use a
pull mechanism.
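To make the push/pull distinction concrete, here is a minimal Java sketch. It does not reproduce the actual BookKeeper classes: every type and method below (PullStorage, PushLedgerStorage, syncOnce, and so on) is a hypothetical stand-in that only mirrors the terms used above.

```java
// Illustrative only: all names are hypothetical, not BookKeeper's real API.
final class PushVsPullSketch {

    /** Pull ("guide"): the storage answers a question; the caller decides. */
    interface PullStorage {
        boolean isFlushRequired();
        void flush();
    }

    /** Push ("command"): the storage holds a callback and invokes it itself. */
    interface Checkpointer {
        void startCheckpoint();
    }

    /** Pull-style storage: rotating an entry log only records state. */
    static final class PullLedgerStorage implements PullStorage {
        private boolean dirty;
        int flushCount;

        void onRotateEntryLog() { dirty = true; }

        @Override public boolean isFlushRequired() { return dirty; }

        @Override public void flush() { dirty = false; flushCount++; }
    }

    /** Push-style storage: rotating an entry log triggers the checkpoint. */
    static final class PushLedgerStorage {
        private final Checkpointer checkpointer;

        PushLedgerStorage(Checkpointer checkpointer) {
            this.checkpointer = checkpointer;
        }

        void onRotateEntryLog() { checkpointer.startCheckpoint(); }
    }

    /**
     * One iteration of a pull-style sync loop. The caller consults the
     * storage, but keeps the final say through callerAllowsFlush.
     * Returns true if a flush was performed.
     */
    static boolean syncOnce(PullStorage storage, boolean callerAllowsFlush) {
        if (callerAllowsFlush && storage.isFlushRequired()) {
            storage.flush();
            return true;
        }
        return false;
    }
}
```

In the pull variant nothing happens on rotation until some caller asks and decides; in the push variant the rotation itself sets the checkpoint in motion, so the decision has already been made inside the storage.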
But there's a deeper issue here. The decision to checkpoint is taken inside of
the ledger storage. This suggests that the actual action to checkpoint should
take place there too.
{quote}
one more point that using the CheckPointer interface, you could still implement
a Periodical sync thread, if you don't like using the checkpoint offered by
ledger storage. but it would allow us using the optimization.
{quote}
This doesn't require the checkpoint interface. It can be achieved by simply
exposing #flush(). I don't think we should do this until there's a strong
requirement for it though.
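As a rough illustration of that alternative, a periodic sync thread needs nothing from the storage beyond a flush method. The sketch below assumes a hypothetical FlushableStorage interface and a hypothetical PeriodicSyncer class; neither is taken from the BookKeeper code base. The scheduling itself uses the standard java.util.concurrent API.

```java
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative only: FlushableStorage and PeriodicSyncer are hypothetical names.
final class PeriodicSyncSketch {

    /** The only thing the syncer needs the storage to expose. */
    interface FlushableStorage {
        void flush() throws IOException;
    }

    static final class PeriodicSyncer implements AutoCloseable {
        private final FlushableStorage storage;
        private final ScheduledExecutorService executor =
                Executors.newSingleThreadScheduledExecutor();

        PeriodicSyncer(FlushableStorage storage) {
            this.storage = storage;
        }

        /** A single sync step; an I/O failure is reported, not rethrown. */
        void runOnce() {
            try {
                storage.flush();
            } catch (IOException e) {
                System.err.println("flush failed: " + e);
            }
        }

        /** Flush every intervalMs milliseconds until close() is called. */
        void start(long intervalMs) {
            executor.scheduleAtFixedRate(
                    this::runOnce, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
        }

        @Override
        public void close() {
            executor.shutdownNow();
        }
    }
}
```

Here the decision of when to sync lives entirely in the syncer, and the storage has no knowledge of who calls flush or how often.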
> Better checkpoint mechanism
> ---------------------------
>
> Key: BOOKKEEPER-564
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-564
> Project: Bookkeeper
> Issue Type: Improvement
> Components: bookkeeper-server
> Reporter: Sijie Guo
> Assignee: Sijie Guo
> Fix For: 4.3.0
>
> Attachments: 0001-BOOKKEEPER-564-Better-checkpoint-mechanism.patch,
> 0001-BOOKKEEPER-564-Better-checkpoint-mechanism.patch,
> 0002-BOOKKEEPER-564-Better-checkpoint-mechanism.patch, BOOKKEEPER-564.patch,
> BOOKKEEPER-564.patch
>
>
> Currently, SyncThread checkpoints too frequently, which hurts performance.
> Writes to an entry log file can be blocked while that same file is being
> synced, which limits the throughput a bookie can achieve. We could schedule a
> checkpoint only when rotating an entry log file, so that new incoming entries
> are written to the newer entry log file while the old entry log file is
> synced.