Andrew Wong created KUDU-3229:
---------------------------------
Summary: Tooling to examine and operate on log blocks
Key: KUDU-3229
URL: https://issues.apache.org/jira/browse/KUDU-3229
Project: Kudu
Issue Type: Improvement
Components: fs, ops-tooling
Reporter: Andrew Wong
It's somewhat troublesome to examine the contents of a log block container
today. Tooling exists in the form of {{kudu pbc dump}} for metadata and
{{hexdump}} for data, but it'd be nice to have more specialized tooling for
examining containers to understand things like:
* What blocks are in this container? When was each block last updated? You can
piece this together from the {{kudu pbc dump}} output for the metadata, but
having something more tabular would be nice.
* Does each block actually contain any data? If not, which blocks are empty?
* If a block is a CFile block, does it have a valid header?
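
As a rough illustration of the last two checks, here is a minimal Python sketch, not a proposal for the actual tool. It assumes the (block_id, offset, length) records have already been extracted by hand from the metadata dump, and that a CFile block starts with the 8-byte magic "kuducfl2" quoted in this issue. The function names and the record layout are hypothetical, and anything past the magic is not validated.

```python
from typing import Iterable, List, Tuple

# Magic string quoted in this issue; assumed to sit at the very start of a CFile block.
CFILE_MAGIC = b"kuducfl2"


def classify_block(data: bytes) -> str:
    """Return a coarse status for the raw bytes of one block."""
    if len(data) == 0:
        return "empty"
    if not any(data):
        return "zeroed"   # every byte is \x00, e.g. a punched hole
    if data.startswith(CFILE_MAGIC):
        return "cfile"
    return "unknown"      # non-zero data without the expected magic


def check_container(
    data_path: str, records: Iterable[Tuple[str, int, int]]
) -> List[Tuple[str, int, int, str]]:
    """Classify each block of a container data file.

    `records` holds hypothetical (block_id, offset, length) tuples taken
    from the container's metadata.
    """
    rows = []
    with open(data_path, "rb") as f:
        for block_id, offset, length in records:
            f.seek(offset)
            data = f.read(length)
            if len(data) < length:
                status = "truncated"  # metadata points past the end of the file
            else:
                status = classify_block(data)
            rows.append((block_id, offset, length, status))
    return rows


def print_table(rows: List[Tuple[str, int, int, str]]) -> None:
    """Print the results as a simple fixed-width table."""
    print(f"{'block_id':<20} {'offset':>12} {'length':>12}  status")
    for block_id, offset, length, status in rows:
        print(f"{block_id:<20} {offset:>12} {length:>12}  {status}")
```

Calling print_table(check_container(path, records)) gives one row per block. A "zeroed" status on a block that the metadata still considers live would be the kind of thing worth flagging.
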
Some of the information I'd like to get at falls outside the purview of the log
block manager itself, and requires information like what kind of blocks we're
dealing with. But the underlying struggle I'd like to address is: given a
container, can we be more rigorous about our checks that the data is OK, and
flag blocks that appear broken?
The context for this was a case (on Kudu version 1.5.x) in which some form of
corruption occurred, and we were left with containers that appeared to have
holes punched out of them, resulting in messages complaining about bad CFile
header magic values of "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" (vs
the expected "kuducfl2"). The log block metadata and tablet metadata both had
records of many blocks, but the corresponding locations in the data files were
all zeroes. It's unclear how this happened, and even the process of examining
the containers and the blocks therein was not well documented.
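
For the "all zeroes" symptom specifically, a scan that needs no metadata at all can be sketched as below. This is only an assumption-laden illustration: zero_extents is a hypothetical helper, the 4096-byte chunk size is a guess at a typical filesystem block size, and only whole zero chunks are detected. A zeroed extent is not by itself evidence of corruption (the block manager may legitimately reclaim space from deleted blocks), so the output would still need to be cross-referenced against the blocks the metadata considers live.

```python
from typing import BinaryIO, Iterator, Tuple


def zero_extents(f: BinaryIO, chunk_size: int = 4096) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) for each maximal run of all-zero chunks in f.

    Detection is chunk-aligned: zeros inside a partly non-zero chunk are
    not reported.
    """
    zero_chunk = bytes(chunk_size)
    run_start = None  # offset where the current zero run began, if any
    offset = 0
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        if chunk == zero_chunk[:len(chunk)]:
            if run_start is None:
                run_start = offset
        elif run_start is not None:
            # A non-zero chunk ends the current run.
            yield run_start, offset - run_start
            run_start = None
        offset += len(chunk)
    if run_start is not None:
        # The file ended while still inside a zero run.
        yield run_start, offset - run_start
```

Opening a container's data file in binary mode and iterating over zero_extents(f) lists the zeroed ranges, which can then be compared with the offsets and lengths recorded in the metadata.
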