Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 19:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6968/19/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

PS19, Line 210:   // Returns 0 if the log is shut down.
> hrm, this might be misleading, as the WAL might still be taking space (e.g.
Mmm after looking at it again I decided it's better to listen to Alexey and 
cache the size so we can return it if the log is closed but not yet deleted. 
Even if a closed log implies the log will be deleted "soon", it's better to be 
correct while the log still exists, and to be prepared for potential changes to 
the tablet lifecycle in the future.


http://gerrit.cloudera.org:8080/#/c/6968/19/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

PS19, Line 364: OnDiskDataSize
> maybe OnDiskBaseDataSize()?
Done


PS19, Line 348:   // A rowset's space-occupying components are as follows:
              :   // - cfile set
              :   //   - base data
              :   //   - bloom file
              :   //   - ad hoc index
              :   // - delta files
              :   //   - UNDO deltas
              :   //   - REDO deltas
              : 
              :   // Return the size on-disk of this rowset, in bytes.
              :   uint64_t OnDiskSize() const OVERRIDE;
              : 
              :   // Return the on-disk size of this rowset's cfile set, in 
bytes.
              :   uint64_t CFileSetOnDiskSize() const;
              : 
              :   // Return the size on-disk of the base data (no deltas) in 
this rowset's cfile set, in bytes.
              :   uint64_t OnDiskDataSize() const OVERRIDE;
              : 
              :   // Return the size on-disk of this rowset's REDO deltas, in 
bytes.
              :   uint64_t RedoDeltaOnDiskSize() const;
              : 
              :   // Return the size on-disk of this rowset's cfile set's base 
data and this rowset's REDO deltas.
              :   uint64_t OnDiskDataSizeNoUndos() const OVERRIDE;
              : 
              :   size_t DeltaMemStoreSize() const OVERRIDE;
> this is getting kind of messy. would it be too much work to clean this up a
I don't think it's worth it, for the following reasons:
1. There's more subdivisions than what's listed there that are needed. We also 
need the cfile set base data size, for example. The result would be one field 
for each piece of each component and many fields.
2. It wouldn't really remove the need for the above functions. Each of them has 
a specific use, so if there were a summary struct then either it would need to 
be reprocessed in a new definition of these functions or the calls to the 
functions would be replaced by the functions' logic, which would mix the disk 
space logic with, e.g. the compaction selection logic.
3. The functions are structured to do the least locking to get what they need. 
If one function returned a summary, for each 2 or 3 bits actually needed we 
would acquire the locks to gather all the information.


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-HasComments: Yes

Reply via email to