David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/8860 )
Change subject: design-docs: improve cfile.md
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8860/1/docs/design-docs/cfile.md
File docs/design-docs/cfile.md:

http://gerrit.cloudera.org:8080/#/c/8860/1/docs/design-docs/cfile.md@18
PS1, Line 18: compound primary key, then the associated ad-hoc
            : index will be stored in a CFile
Maybe: "in a separate CFile", to make the distinction from when it's stored inline with the data.


http://gerrit.cloudera.org:8080/#/c/8860/1/docs/design-docs/cfile.md@23
PS1, Line 23: A CFile is
            : written to a single BlockManager Block (not to be confused with the data blocks,
            : which are internal to CFiles, discussed below).
Maybe it'd be clearer to mention that a CFile is logically a unit with a header, middle section, and footer, but that it's then split into blocks, and the mapping of blocks to actual files is determined by the block manager.


http://gerrit.cloudera.org:8080/#/c/8860/1/docs/design-docs/cfile.md@64
PS1, Line 64: TODO(dan): point out when/what version the v1 -> v2 transition was made.
Maybe add a version history with a super short blurb and commit hashes?


http://gerrit.cloudera.org:8080/#/c/8860/1/docs/design-docs/cfile.md@76
PS1, Line 76: How big is a data block in bytes and row count, typically?
            : - How do we decide when a data block is full (by data size, by # of values, ...)?
Does this necessarily have to be in this file, if there are no limits to these things? Maybe it'd be more helpful to list the most relevant flags?


http://gerrit.cloudera.org:8080/#/c/8860/1/docs/design-docs/cfile.md@170
PS1, Line 170: TODO(dan): What's the branching factor of the B-Tree?
Again, should this be here?


http://gerrit.cloudera.org:8080/#/c/8860/1/docs/design-docs/cfile.md@212
PS1, Line 212: TODO(dan): are the index blocks written out contiguously, or are they
             : interspersed with data blocks?
Interspersed, iirc.


http://gerrit.cloudera.org:8080/#/c/8860/1/docs/design-docs/cfile.md@217
PS1, Line 217: The info about the different types of indexes is super important,
             : conceptually. It should probably be part of the intro paragraphs.
+1. Also, we should hint at their uses.


http://gerrit.cloudera.org:8080/#/c/8860/1/docs/design-docs/cfile.md@225
PS1, Line 225: reponsible
Not your fault, but typo.


-- 
To view, visit http://gerrit.cloudera.org:8080/8860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I770028bba3f7a49c96f32893c285221c84be39ce
Gerrit-Change-Number: 8860
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Burkert <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Fri, 22 Dec 2017 20:43:26 +0000
Gerrit-HasComments: Yes
