Dan Burkert has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8970 )
Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory uses of 'kudu fs dump' and 'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is more flexible, easier to use, and can show more information. Output is formatted using the DataTable abstraction, which gives it good default pretty-printing, with options to output in CSV and JSON for scripts. Results can easily be filtered to a specific table, tablet, column, rowset, or block using flags.

The tool can output many different fields: table, table-id, tablet-id, partition, rowset-id, block-id, block-kind, column, column-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size, cfile-incompatible-features, cfile-compatible-features, cfile-min-key, cfile-max-key, and cfile-delta-stats. More fields should be straightforward to add.

The tool transparently joins information from tablet superblocks with CFile footers, only materializing the metadata necessary to satisfy the requested fields and filters.
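The idea of materializing only the metadata that the requested fields and filters need can be pictured roughly as follows. This is a minimal Python sketch, not the tool's actual C++ implementation: `SUPERBLOCK_FIELDS`, `CFILE_FIELDS`, `list_rows`, and `fake_read_footer` are hypothetical names, the field split is only an assumption, and the sample data is made up.

```python
# Hypothetical sketch: join superblock rows with CFile footers lazily.
# The split of fields between the two sources is an assumption.
SUPERBLOCK_FIELDS = {"tablet-id", "rowset-id", "block-id", "column"}
CFILE_FIELDS = {"cfile-encoding", "cfile-num-values"}


def list_rows(blocks, columns, read_footer, filters=None):
    """Yield one tuple per block, reading a CFile footer only if needed.

    blocks:      iterable of dicts holding superblock-level fields
    columns:     ordered list of requested field names
    read_footer: callable(block_id) -> dict of CFile-level fields
    filters:     optional dict of field name -> required value
    """
    filters = filters or {}
    wanted = set(columns) | set(filters)
    unknown = wanted - SUPERBLOCK_FIELDS - CFILE_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # Footers are needed only if a requested or filtered field lives there.
    need_footer = bool(wanted & CFILE_FIELDS)
    for block in blocks:
        # Apply the cheap superblock-level filters first.
        if any(block.get(k) != v for k, v in filters.items()
               if k in SUPERBLOCK_FIELDS):
            continue
        row = dict(block)
        if need_footer:
            row.update(read_footer(block["block-id"]))
            if any(row.get(k) != v for k, v in filters.items()
                   if k in CFILE_FIELDS):
                continue
        yield tuple(row.get(c) for c in columns)


# Made-up sample data, for illustration only.
BLOCKS = [
    {"tablet-id": "t1", "rowset-id": 0, "block-id": 100, "column": "k1"},
    {"tablet-id": "t1", "rowset-id": 1, "block-id": 101, "column": "k1"},
    {"tablet-id": "t2", "rowset-id": 0, "block-id": 102, "column": "v1"},
]
FOOTER_READS = []


def fake_read_footer(block_id):
    FOOTER_READS.append(block_id)  # record which footers were opened
    return {"cfile-encoding": "BIT_SHUFFLE", "cfile-num-values": 10 * block_id}


# Superblock-only query: no footer is opened.
cheap = list(list_rows(BLOCKS, ["rowset-id", "block-id"], fake_read_footer,
                       filters={"tablet-id": "t1"}))
reads_after_cheap = list(FOOTER_READS)

# Query touching a CFile field: footers are read, but only for tablet t1.
full = list(list_rows(BLOCKS, ["block-id", "cfile-num-values"],
                      fake_read_footer, filters={"tablet-id": "t1"}))
print(cheap)
print(full)
```

In this sketch the superblock-level filters run before any footer is opened, so a query restricted to one tablet never touches the blocks of another.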
Examples:

To get our bearings, let's look at what tablets are stored on a local tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"
 table                                         | table-id                         | tablet-id                        | partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```

The 'foo' table looks interesting; let's drill down into its tablet and see what rowsets and blocks it has, along with some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa
 rowset-id | column | column-id | block-kind  | block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```

We can immediately see that this tablet has two rowsets, each of which has 8 column blocks, a bloom block, and an ad-hoc index block. Let's drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17
 block-id       | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 782.6K
```

And we can immediately see each CFile's on-disk encoding and compression, the number of cells, and the CFile/block size.
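For quick ad-hoc analysis, the pretty-printed listings above can also be post-processed. The CSV and JSON output options are the better fit for scripts, but their flag is not shown in this message, so the following is only a minimal Python sketch that parses the default pretty format. It assumes cells are separated by `|` and that the rule under the header consists solely of `-` and `+`; `parse_pretty_table` is a hypothetical helper, and the sample is a subset of the rowset listing above.

```python
from collections import Counter


def parse_pretty_table(text):
    """Parse a '|'-separated pretty-printed table into a list of dicts.

    Assumes the first non-blank line is the header and that separator
    lines contain only '-' and '+' characters.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = [h.strip() for h in lines[0].split("|")]
    rows = []
    for line in lines[1:]:
        if set(line.strip()) <= set("-+"):
            continue  # skip the rule under the header
        cells = [c.strip() for c in line.split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


# Subset of the rowset listing shown above.
SAMPLE = """\
 rowset-id | column | column-id | block-kind  | block-id
-----------+--------+-----------+-------------+----------------
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
"""

rows = parse_pretty_table(SAMPLE)
# Count blocks per (rowset, kind) pair.
kinds = Counter((r["rowset-id"], r["block-kind"]) for r in rows)
for (rowset, kind), count in sorted(kinds.items()):
    print(rowset, kind, count)
```

All values come back as strings, and empty cells (such as the column name of a bloom block) become empty strings.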
Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Reviewed-on: http://gerrit.cloudera.org:8080/8911
Reviewed-by: Adar Dembo <a...@cloudera.com>
Tested-by: Kudu Jenkins
Reviewed-on: http://gerrit.cloudera.org:8080/8970
Reviewed-by: Todd Lipcon <t...@apache.org>
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/gutil/strings/join.h
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
8 files changed, 585 insertions(+), 22 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Todd Lipcon: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/8970
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.5.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8970
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>