Dan Burkert has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8970 )

Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory usages of 'kudu fs dump' and
'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is
more flexible, easier to use, and can show more information.

Output is formatted using the DataTable abstraction, which gives it
good default pretty-printing, with options to output in CSV and JSON for
scripts. Results can easily be filtered to a specific table, tablet, column,
rowset, or block using flags.
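
As a rough illustration of the scripting use case, the sketch below tallies tablets per table from CSV text. It does not invoke the tool: the sample rows are copied from the tablet listing in the first example, and the CSV layout (a header row, comma-separated, no quoting) is an assumption rather than something this message specifies. The helper names `sample_csv` and `count_tablets_per_table` are hypothetical.

```bash
#!/bin/sh
# Hypothetical CSV for the columns "table, tablet-id". The layout (header
# row, plain commas, no quoting) is assumed, not taken from the tool.
sample_csv() {
  cat <<'EOF'
table,tablet-id
loadgen_auto_06c8038c02da40048397e4f6ad1662c3,2a631714f2d243ff92bf525630baa1ec
loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1,36827286a00049bc8b242243c6728157
loadgen_auto_06c8038c02da40048397e4f6ad1662c3,3880b30ccebd4ede867febd9c7d5580f
foo,c3ce418c72ab4fea8548387f236dd1fa
EOF
}

# Skip the header line, count rows per table name, and sort by name.
count_tablets_per_table() {
  awk -F',' 'NR > 1 { n[$1]++ } END { for (t in n) print t, n[t] }' | sort
}

sample_csv | count_tablets_per_table
```

A field such as partition contains commas of its own, so a script consuming it would need a real CSV parser or the JSON output instead of a plain field split.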

The tool can output many different fields: table, table-id, tablet-id,
partition, rowset-id, block-id, block-kind, column, column-id,
cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values,
cfile-size, cfile-incompatible-features, cfile-compatible-features,
cfile-min-key, cfile-max-key, and cfile-delta-stats. More fields should
be straightforward to add.

The tool transparently joins information from tablet superblocks with
CFile footers, only materializing the metadata necessary to satisfy the
requested fields and filters.

Examples:

To get our bearings, let's look at what tablets are stored on a local
tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"

                     table                     |             table-id             |            tablet-id             |                        partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```

The 'foo' table looks interesting; let's drill down into its tablet, and
see what rowsets and blocks it has, and some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa

 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```
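
A listing like the one above can also be summarized mechanically. The sketch below, which works on a few rows copied from that output rather than calling the tool, counts blocks per rowset and block kind by splitting the pretty-printed table on '|'. The helper names `sample_listing` and `tally_block_kinds` are hypothetical, and splitting on '|' assumes no field value contains that character.

```bash
#!/bin/sh
# A subset of rows copied from the pretty-printed listing above.
sample_listing() {
  cat <<'EOF'
 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         |        |           | bloom       | 90680632611572
EOF
}

# Skip the header and separator lines, strip the padding from the rowset-id
# and block-kind fields, and count each (rowset, kind) pair.
tally_block_kinds() {
  awk -F'|' '
    NR > 2 {
      rowset = $1; kind = $4
      gsub(/ /, "", rowset); gsub(/ /, "", kind)
      n[rowset " " kind]++
    }
    END { for (k in n) print k, n[k] }
  ' | sort
}

sample_listing | tally_block_kinds
```

For anything beyond a quick check, the CSV or JSON output is likely a sturdier input than the pretty-printed table.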

We can immediately see that this tablet has two rowsets, each of which
has 8 column blocks, a bloom block, and an ad-hoc index block. Let's
drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17

    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
```

And we can immediately see the CFile's on-disk encoding and compression,
the number of cells, and the CFile/block size.

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Reviewed-on: http://gerrit.cloudera.org:8080/8911
Reviewed-by: Adar Dembo <a...@cloudera.com>
Tested-by: Kudu Jenkins
Reviewed-on: http://gerrit.cloudera.org:8080/8970
Reviewed-by: Todd Lipcon <t...@apache.org>
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/gutil/strings/join.h
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
8 files changed, 585 insertions(+), 22 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Todd Lipcon: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/8970
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.5.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8970
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
