Mike Percy has posted comments on this change.

Change subject: KUDU-1601. Delete ancient UNDO delta blocks in the background
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4363/5//COMMIT_MSG
Commit Message:

PS5, Line 15: The task was written to operate at the server level because the
            : UpdateStats() part of the task spends a budgeted amount of time 
opening
            : (initializing) undo delta blocks in order to read the stats from 
their
            : headers.
> Maybe I'm missing something obvious, but I don't see how the explanation (s
> What is it about UpdateStats() doing I/O that forces the op to be 
> server-level, or makes it more desirable?

The idea is to attempt to budget the amount of time that can be spent per MM 
cycle on initializing delta files. If we have # of these tasks == # tablets 
then it's not really obvious how to do that without building a special facility 
for it. Doing it at the server level makes it easy because it's effectively a 
singleton, so we can put that logic in UpdateStats().

> Would it be possible to make opening of eager UNDO delta blocks an MM op in 
> its own right? Doing I/O in UpdateStats() (regardless of at what level) seems 
> weird; it has the potential to stall other cheap MM operations (such as log 
> GC) if the I/O is blocked for a really long time.

That seems problematic because it's not really clear how to schedule it. In 
general we want to sort-of "always" be opening deltas. Once we have all deltas 
open, it's pretty cheap to make decisions about GCing data (and we want to do 
it with high priority because it's cheap and important to do).

> It also puts the I/O on the scheduler thread instead of on the MM operations 
> thread(s).

I do agree that putting the scheduler at the mercy of disk latency is not 
ideal. OTOH, if the disk(s) are really so burdened, all of the MM ops will be 
slow anyway.

> If you don't want to introduce two new ops, perhaps they can be "combined" 
> such that Perform() either opens some UNDO blocks, or does ancient history 
> GC, and we'll decide which approach to take during UpdateStats()?

I suppose another way to do this could be to schedule the job with very high 
priority but not more than some fraction of the time (I've already added logic 
to do this, and defaulted it to 50% of the time as the max). Then Perform() 
would basically consist of: InitializeAncientBlocks(tablets, budget_deadline); 
DeleteAncientBlocks(tablets);

Such an approach doesn't try to solve the related but separate problem of 
eventually opening *all* delta files, not just ancient ones... but maybe it 
could be made to in a follow-up change.


http://gerrit.cloudera.org:8080/#/c/4363/5/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

Line 2357:   // The catalog manager has only one tablet peer.
> It's possible for sys_catalog_ to not yet be set, in which case this functi
Thanks for the catch.


-- 
To view, visit http://gerrit.cloudera.org:8080/4363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0309bf7acfb6d018860c80f354012c3500da5c68
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

Reply via email to