Hi Hongyin,

That's an amazing and interesting project. I wonder whether you could package tsfile-cli with some tsfile-skills, just like lark-cli and lark-skills (https://github.com/larksuite/cli), and publish it as a plugin that we can install directly in agents like Claude Code and Codex. If so, we could let those agents help us do data analysis on TsFile files.

Best regards,
----------------------
Yuan Tian

On Thu, Jun 4, 2026 at 12:35 PM 张洪胤 <[email protected]> wrote:

> Hi all,
>
>
> About me
> --------
> I'm Zhang Hongyin, Apache IoTDB PMC. I've been working with the TsFile
> format and wanted an easier way to inspect .tsfile files from the command
> line, which led to the proposal below. (This is my first contribution to
> TsFile -- happy to adjust anything to match the project's conventions.)
>
>
> Motivation
> ----------
> TsFile can be inspected programmatically and with the existing print/sketch
> utilities, but there is no single, pipe-friendly command that lets users
> explore a .tsfile from the shell the way parquet-cli / pqrs do for Parquet.
> I put together "tsfile-cli", a single C++ binary (under cpp/tools/) that
> provides the common read-only verbs plus a simple CSV/TSV import. It is built
> entirely on the existing public storage::TsFileReader and
> storage::TsFileTableWriter APIs and does not modify the storage engine.
>
>
> What it does
> ------------
> Read / inspect (data goes to stdout, diagnostics to stderr, so it composes
> with awk/jq/sort):
>   ls        list devices (tree model) or tables (table model)
>   schema    per-series datatype / encoding / compression
>   meta      file-level summary (model, counts, time range, size)
>   stats     per-series count / min / max / first / last / sum (from
>             statistics)
>   head,cat  preview / stream rows, with column projection and time-range
>             filters
>   sample    deterministic reservoir sample
> Output formats: csv | tsv | json | table (TTY-adaptive). Exit codes 0/1/2/3.
>
>
> Write / import:
>   write     import CSV/TSV rows into a new table-model .tsfile, using an
>             explicit --columns "name:TYPE:tag|field" schema (no type
>             inference), with stdin support and silent-on-success (Unix
>             style).
>
> Scope and non-goals (first iteration)
> -------------------------------------
> - Read commands cover both tree and table models.
> - "write" targets the table model with CSV/TSV input only. Tree-model
>   writes, JSON input, type inference, and tsfile->tsfile transforms
>   (convert / merge / rewrite) are deliberately left as follow-ups.
> - Includes unit and in-process end-to-end tests (argument parsing,
>   formatters, statistics, CSV/TSV parsing, and a write->read round-trip),
>   plus a README under cpp/tools/.
>
> PR: https://github.com/apache/tsfile/pull/829
>
> Feedback I'd especially appreciate:
> 1. Whether a unified "tsfile-cli <command>" dispatcher is a direction the
>    project wants.
> 2. The verb surface and option naming -- anything missing or non-idiomatic?
> 3. The write command's scope and the explicit --columns schema approach.
>
> Thanks for taking a look!
>
> Best regards,
> Zhang Hongyin
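
For readers following the --columns discussion (feedback point 3), here is a minimal sketch of how an explicit "name:TYPE:tag|field" schema string might be parsed. It is not taken from the PR: ColumnSpec, split, and parse_columns are hypothetical names, the TYPE token is kept as an opaque string rather than mapped to a TsFile data type, and separating multiple columns with commas is an assumption, since the proposal only shows the shape of a single entry.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical representation of one "name:TYPE:tag|field" entry.
struct ColumnSpec {
    std::string name;
    std::string type;  // kept as text; mapping to real TsFile types is out of scope
    bool is_tag;       // true for "tag", false for "field"
};

// Split `s` on `delim`, keeping empty pieces so malformed entries can be rejected.
static std::vector<std::string> split(const std::string& s, char delim) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
        std::size_t pos = s.find(delim, start);
        if (pos == std::string::npos) {
            parts.push_back(s.substr(start));
            return parts;
        }
        parts.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
}

// Parse a schema string such as "device:STRING:tag,temp:DOUBLE:field".
// Assumption: entries are comma-separated. Throws std::invalid_argument
// on any malformed entry instead of guessing (no type inference).
std::vector<ColumnSpec> parse_columns(const std::string& spec) {
    std::vector<ColumnSpec> columns;
    for (const std::string& entry : split(spec, ',')) {
        std::vector<std::string> f = split(entry, ':');
        if (f.size() != 3 || f[0].empty() || f[1].empty()) {
            throw std::invalid_argument(
                "expected name:TYPE:tag|field, got '" + entry + "'");
        }
        if (f[2] != "tag" && f[2] != "field") {
            throw std::invalid_argument(
                "role must be 'tag' or 'field' in '" + entry + "'");
        }
        columns.push_back(ColumnSpec{f[0], f[1], f[2] == "tag"});
    }
    return columns;
}
```

Calling parse_columns("device:STRING:tag,temp:DOUBLE:field") yields two entries, the first marked as a tag and the second as a field. The type tokens here are placeholders. Failing loudly on a malformed entry matches the "explicit schema, no type inference" stance in the proposal, but the actual validation rules in tsfile-cli may differ.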
