Todd Lipcon has posted comments on this change.

Change subject: KUDU-236 (part 1). Implement tablet history GC
......................................................................


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3076/14/src/kudu/tablet/compaction.cc
File src/kudu/tablet/compaction.cc:

Line 791:                             vector<rowid_t>* gced_input_rows) {
I'm not really convinced by this vector here -- it's a bit expensive 
memory-wise, and it seems like we should be able to "recalculate" on the second 
pass whether a row was GCed on the first pass, rather than carrying this 
O(n) memory usage through.

Flushes in particular can get to be 10-20GB worth of rows, and in a time series 
workload the rows themselves can be pretty small -- so we could potentially 
have many millions of GCed input rows in the worst case. Given that I think 
there's a reasonable way to maintain O(1) memory usage, I'd like to explore 
that option. (If it turns out to be crazy complex, this isn't a bad second 
choice.)
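
A rough sketch of the "recalculate on the second pass" idea, to make the memory 
argument concrete. Everything here is hypothetical and is not the actual 
compaction.cc code: InputRow, IsGced, CountSurvivors and ForEachSurvivor are 
made-up names, and the sketch assumes the GC decision is a pure function of the 
row and a fixed cutoff timestamp, which may not hold in the real code. Under 
that assumption, the second pass can re-evaluate the same predicate and keep 
only a single counter instead of a vector<rowid_t>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a compaction input row (not Kudu's real type).
struct InputRow {
  int64_t delete_ts;  // Timestamp of the row's delete, or -1 if still live.
};

// Assumed GC rule: a row is dropped if it was deleted before the cutoff.
// It is a pure function, so both passes reach the same answer for a row.
inline bool IsGced(const InputRow& row, int64_t cutoff_ts) {
  return row.delete_ts >= 0 && row.delete_ts < cutoff_ts;
}

// First pass: stand-in for writing surviving rows; here it only counts them.
inline std::size_t CountSurvivors(const std::vector<InputRow>& input,
                                  int64_t cutoff_ts) {
  std::size_t n = 0;
  for (const InputRow& row : input) {
    if (!IsGced(row, cutoff_ts)) ++n;
  }
  return n;
}

// Second pass: re-evaluates IsGced instead of consulting a saved list of
// GCed row ids. The only extra state is one counter, i.e. O(1) memory.
// visit(input_idx, output_idx) is called once per surviving row.
template <typename Visitor>
void ForEachSurvivor(const std::vector<InputRow>& input, int64_t cutoff_ts,
                     Visitor&& visit) {
  std::size_t next_output_idx = 0;
  for (std::size_t i = 0; i < input.size(); ++i) {
    if (IsGced(input[i], cutoff_ts)) continue;  // Recalculated, not looked up.
    visit(i, next_output_idx++);
  }
}

// Sanity check that the two passes agree on how many rows survive.
inline void CheckPassesAgree(const std::vector<InputRow>& input,
                             int64_t cutoff_ts) {
  std::size_t visited = 0;
  ForEachSurvivor(input, cutoff_ts,
                  [&](std::size_t, std::size_t) { ++visited; });
  assert(visited == CountSurvivors(input, cutoff_ts));
}
```

The trade-off is that the predicate is evaluated twice per row, so this only 
pays off if re-evaluating it is cheap compared to holding millions of row ids 
in memory.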


-- 
To view, visit http://gerrit.cloudera.org:8080/3076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If9833a863f118eb82be80ea56204d0d9141611c2
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes
