[GitHub] sijie commented on issue #570: Multiple active entrylogs

GitBox Tue, 27 Feb 2018 12:40:34 -0800

sijie commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369018351

> Where has this been discussed? I can't find anything on the dev@ list.

The discussion has been started since May 2017 and it has been
mentioned/discussed at different forms in different email threads, different
community meeting or direct meetings.

http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAAFz1KPYb4f%2BrUfP2Bhvy%2BigksBLsuXE-QMw6U%2B%3DBDiRXQ19LA%40mail.gmail.com%3E

http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201705.mbox/%3CCAAFz1KNr4uB8RMiXK97gkiqfXCVEoygfjeR2rJKcddvF9ZM1yg%40mail.gmail.com%3E

> There's a big difference. Making the change in the entrylogger itself,
means that the implementation gets pulled into SortedLedgerStorage, and that
means he has to deal with the EntryMemTable, which should have exactly nothing
to do with the implementation.

In order to fully leverage "per ledger" characteristics, parallel flushing
is a necessary change. But it doesn't mean the change in EntryMemTable is only
dedicated to per ledger entry log manager. It can be useful for single entry
log manager. I requested Charan to not enable parallel flushing feature for
single entry log manager until we have confident on that change. And that's why
there is boolean flags around that piece.

This change here is actually taking `SortedLedgerStorage` in a right
direction: 1) MemTable does cache and sorting, you can have different flushing
policy (sequential vs parallel) 2) EntryLogManager manages entry log files,
when to create them, how to rotate them and how to flush/checkpoint them. 3)
EntryLogger becomes the entrypoint to append entries, where the complexity of
managing files are handed over to entry log manager.

This change takes the effort to allow us provide a better modularized
implementation of SortedLedgerStorage, which it is the default storage
implementation for most of bookkeeper users.

> Shipping is a function of many things, not just getting code into branch.

I completely agreed with you. and this change is not "getting code into
branch" if you have followed the discussion since May last year.

> 3000 LOC is never easy to review. 300-400 LOC is the max that can be
reviewed competently in an hour, any more, people start skimming. I wrote an
email about this in october[1], so I won't go through it again.

I agreed with you - smaller patches are much easier to review and get
changes in. However this is a very subjective thing that varies on different
things and to different people. sometimes reviewers won't see a clear picture
if a change is small and they will get confused; reviewers will be overwhelmed
if they don't follow closely on the discussion if the change is big. so both
cases can exist and this process can not really be quantified. What we have to
do is to follow what we called guidelines or practices and encourage people to
send changes in small batches.

However there can still be exceptions. for example, code changes has been
made before (like changes have been made in branches and run on production. for
example, the merges come from branches). In this case, it doesn't make any
sense for the people cherry-picking the change to redo the changes by breaking
down into small patches.

I can't speak for all the cases. But for the case here, this feature has
been discussed since May last year, and the change has been reviewed/discussed
multiple times internally by Salesforce folks and externally by me during this
long discussion and review recycle. so I am fine with reviewing the change in
its current form.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] sijie commented on issue #570: Multiple active entrylogs

Reply via email to