ivankelly commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369188860
 
 
   > The discussion has been started since May 2017 and it has been 
mentioned/discussed at different forms in different email threads, 
   
   Ah, I didn't go back as far as July. I don't see any concensus on design or 
a design document though. In fact, the discussions highlighted aren't even 
about per ledger ledger storage, but a question about synchronization. In fact, 
I see a lot of the issues I am raising being risen there.
   
   > different community meeting 
   
   There's nothing in the notes about a decision on a design. That's not to say 
that it wasn't discussed, but there's no record of a design 
discussion/decision, so it may as well not have happened. The Apache Board 
highlighted exactly this issue a year ago (feedback on 2017-02-27 report).
   
   > direct meetings.
   
   As far as the community is concerned, what happens in direct meetings may as 
well not have happened.
   
   > 1) MemTable does cache and sorting, you can have different flushing policy 
(sequential vs parallel) 
   
   If this feature is useful in itself, then break it out into a separate 
change.
   
   >  2) EntryLogManager manages entry log files, when to create them, how to 
rotate them and how to flush/checkpoint them. 3) EntryLogger becomes the 
entrypoint to append entries, where the complexity of managing files are handed 
over to entry log manager.
   
   EntryLogger creates and owns the entry log manager, so Entrylogger is 
effectively managing both when viewed from a higher level.
   
   > sometimes reviewers won't see a clear picture if a change is small and 
they will get confused; 
   
   This is where a design doc is useful. Or push a wip branch, and then break 
out small pieces.
   
   > However there can still be exceptions. for example, code changes has been 
made before (like changes have been made in branches and run on production. for 
example, the merges come from branches). In this case, it doesn't make any 
sense for the people cherry-picking the change to redo the changes by breaking 
down into small patches.
   
   This change modifies a lot of internals which are used by all running 
bookies. 
   - EntryLogger
   - EntryMemTable
   - InterleavedLedgerStorage
   - SortedLedgerStorage
   - BufferedChannel
   
   A problematic change in any of these means lost data. It's a high risk 
change, or rather these are all high risk changes, so each one needs to be 
justified and examined closely.
   
   Contrast this to a similar change, bringing rocks db storage into master 
(https://github.com/apache/bookkeeper/commit/bba1c6f5d0). It's actually a 
similar size change. But it doesn't touch any existing class, so it's low risk.
   
   > I can't speak for all the cases. But for the case here, this feature has 
been discussed since May last year, and the change has been reviewed/discussed 
multiple times internally by Salesforce folks and externally by me during this 
long discussion and review recycle. so I am fine with reviewing the change in 
its current form.
   
   To summarize, I'm -1 on the changes in their current form. This doesn't mean 
they can't go in obviously, just that you'll need two +1 from elsewhere.
   
   My primary concern is the modification of core classes in an uberpatch. A 
secondary concern is how per ledger ledger storage behaves in the class 
hierarchy (i.e. it shouldn't touch interleaved).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to