Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins, Adar Dembo, Todd 
Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8911

to look at the new patch set (#3).

Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory usages of 'kudu fs dump' and
'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is
more flexible, easier to use, and can show more information.

Output is formatted using the DataTable abstraction, which gives it
good default pretty-printing, with options to output in CSV and JSON for
scripts. Results can easily be filtered to a specific table, tablet, column,
rowset, or block using flags.

The tool can output many different fields: table, table-id, tablet-id,
partition, rowset-id, block-id, block-kind, column, column-id,
cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values,
cfile-size, cfile-incompatible-features, and cfile-compatible-features.
More fields should be straightforward to add.

The tool transparently joins information from tablet superblocks with
CFile footers, only materializing the metadata necessary to satisfy the
requested fields and filters.
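
As a rough illustration of that lazy join (a Python sketch only; the tool
itself is C++, and `list_rows`, `read_footer`, and the row dictionaries here
are hypothetical names, not the patch's API), footers are read only when a
cfile-* field is actually requested:

```python
# Illustrative sketch only: not the actual implementation in tool_action_fs.cc.
# Assumption: cfile-* fields come from CFile footers, everything else from
# tablet superblocks.
CFILE_FIELDS = frozenset({
    "cfile-data-type", "cfile-encoding", "cfile-compression",
    "cfile-num-values", "cfile-size",
    "cfile-incompatible-features", "cfile-compatible-features",
})


def list_rows(block_rows, read_footer, columns):
    """Yield one dict per block, restricted to the requested columns.

    block_rows:  iterable of dicts built from superblock metadata; each must
                 contain a "block-id" key (hypothetical shape).
    read_footer: callable mapping a block ID to a dict of cfile-* fields
                 (hypothetical; stands in for opening the CFile footer).
    columns:     list of requested field names.
    """
    # Decide once, up front, whether any footer needs to be opened at all.
    need_footer = any(c in CFILE_FIELDS for c in columns)
    for row in block_rows:
        merged = dict(row)
        if need_footer:
            merged.update(read_footer(row["block-id"]))
        # Fields that do not apply to a block are emitted as empty strings.
        yield {c: merged.get(c, "") for c in columns}
```

With this shape, a request for only block-id and column never calls
`read_footer`, which is the property the paragraph above describes.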

Examples:

To get our bearings, let's look at what tablets are stored on a local
tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"

                     table                     |             table-id             |            tablet-id             |                        partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```

The 'foo' table looks interesting; let's drill down into its tablet, and
see what rowsets and blocks it has, and some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa

 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```

We can immediately see that this tablet has two rowsets, each of which
has 8 column blocks, a bloom block, and an ad-hoc index block. Let's
drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17

    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
```

And we can immediately see the CFile's on-disk encoding and compression,
the number of cells, and the CFile/block size.
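
For quick ad-hoc analysis, the pretty-printed tables shown in these examples
can also be post-processed directly. The following is a minimal Python sketch,
assuming the layout seen above (a header line, a dashed separator line, then
'|'-separated rows); `parse_pretty_table` is a hypothetical helper, not part
of this patch. The CSV and JSON output modes are likely the better choice for
real scripts, since a cell containing '|' would confuse this parser.

```python
from collections import Counter


def parse_pretty_table(text):
    """Parse a '|'-separated pretty-printed table into a list of dicts.

    Assumes the first non-blank line is the header and the second is the
    dashed separator, as in the example output above.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = [h.strip() for h in lines[0].split("|")]
    rows = []
    for ln in lines[2:]:  # lines[1] is the '-----+-----' separator
        cells = [c.strip() for c in ln.split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


# A few rows copied from the rowset/block listing earlier in this message.
sample = """
 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         |        |           | bloom       | 90680632611560
 1         | k1     | 10        | column      | 90680632611564
"""

rows = parse_pretty_table(sample)
# Count blocks per (rowset, kind) pair.
kinds = Counter((r["rowset-id"], r["block-kind"]) for r in rows)
for (rowset, kind), count in sorted(kinds.items()):
    print(f"rowset {rowset}: {count} {kind} block(s)")
```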

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
---
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
5 files changed, 512 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/8911/3
--
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Burkert <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Will Berkeley <[email protected]>
