[
https://issues.apache.org/jira/browse/CASSANDRA-16513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318912#comment-17318912
]
Paulo Motta commented on CASSANDRA-16513:
-----------------------------------------
Hi [~Si_d], thanks for your interest in working on this ticket. I updated the
ticket description to reflect the previous discussion.
Currently you can query a [virtual
table|https://cassandra.apache.org/doc/latest/new/virtualtables.html] with
[cqlsh|https://cassandra.apache.org/doc/latest/tools/cqlsh.html]:
{code:java}
cqlsh:system_views> SELECT * FROM sstable_tasks;
 keyspace_name | table_name | task_id                              | kind       | progress | total    | unit
---------------+------------+--------------------------------------+------------+----------+----------+-------
         basic |      wide2 | c3909740-cdf7-11e9-a8ed-0f03de2d9ae1 | compaction | 60418761 | 70882110 | bytes
         basic |      wide2 | c7556770-cdf7-11e9-a8ed-0f03de2d9ae1 | compaction |  2995623 | 40314679 | bytes
{code}
However, before virtual tables existed, operators could query this
information more easily with
[nodetool|https://cassandra.apache.org/doc/latest/tools/nodetool/nodetool.html]:
{code:java}
nodetool compactionstats
pending tasks: 5
compaction type   keyspace    table       completed   total       unit    progress
Compaction        Keyspace1   Standard1   282310680   302170540   bytes   93.43%
Compaction        Keyspace1   Standard1   58457931    307520780   bytes   19.01%
Active compaction remaining time : 0h00m16s
{code}
The benefit of the first approach is that it's very easy for developers to add
new virtual tables to Cassandra: the change is needed only on the server
side, and the client can simply query the virtual table with cqlsh.
The benefit of the second approach is that operators are already used to
discovering and querying system information via nodetool, and the information
comes nicely formatted for human consumption, potentially in different formats.
The downside is that every new piece of information added on the server
requires a new client nodetool command to be added.
The idea here is to add a new tool with the best of both worlds: allow
developers to easily add new information that operators can query via virtual
tables, while providing a simple way for operators to query this data and
export it to different formats (such as JSON or YAML) via a CLI.
The good thing about this task is that we can do it very incrementally. We can
start by implementing "admintool show sstable_tasks" to display the
information above, and then add more virtual tables one at a time. All
virtual tables should expose their information in tabular, JSON and YAML format
(depending on the format parameter), and in the future each virtual table can
optionally implement its own data formatter to pretty-print its data in
other formats.
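To make the format parameter concrete, here is a rough Python sketch of how a formatter could be picked by name. It is only an illustration: the names `FORMATTERS`, `to_json`, `to_yaml` and `export` are hypothetical, the rows are hard-coded from the cqlsh sample above rather than fetched from a node, and the tabular formatter is left out for brevity.

```python
import json


def to_json(rows):
    """Serialize rows (a list of flat dicts) as a JSON array."""
    return json.dumps(rows, indent=2)


def to_yaml(rows):
    """Serialize rows as a YAML sequence of flat mappings.

    Scalars are written with json.dumps, which produces valid YAML
    scalars, so this sketch needs no third-party YAML library.
    """
    if not rows:
        return "[]"
    lines = []
    for row in rows:
        for i, (key, value) in enumerate(row.items()):
            prefix = "- " if i == 0 else "  "
            lines.append(f"{prefix}{key}: {json.dumps(value)}")
    return "\n".join(lines)


# Hypothetical registry: a virtual table could later register its own
# formatter here under a new format name.
FORMATTERS = {"json": to_json, "yaml": to_yaml}


def export(rows, fmt):
    """Render rows with the formatter registered under fmt."""
    try:
        formatter = FORMATTERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt!r}") from None
    return formatter(rows)


# Sample data copied from the sstable_tasks output above (subset of columns).
sample_rows = [
    {"keyspace_name": "basic", "table_name": "wide2", "kind": "compaction",
     "progress": 60418761, "total": 70882110, "unit": "bytes"},
]

if __name__ == "__main__":
    print(export(sample_rows, "yaml"))
```

Keeping the formatters in a plain mapping means that adding a new output format, or a per-table pretty-printer, would not require touching the code that fetches the rows.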
I think a good way to start is to create a small tool that just fetches and
displays the contents of the "system_views.sstable_tasks" table in tabular format.
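As a starting point for the tabular output, a minimal Python sketch could look like the following. The function name `render_table` is hypothetical, and fetching the rows from a node is deliberately out of scope: the sample data is copied from the cqlsh output above.

```python
def render_table(columns, rows):
    """Render rows (a list of dicts) as a cqlsh-like text table.

    Headers are left-aligned and cell values are right-aligned, each
    column being as wide as its longest header or value.
    """
    cells = [[str(row[col]) for col in columns] for row in rows]
    widths = [
        max([len(col)] + [len(line[i]) for line in cells])
        for i, col in enumerate(columns)
    ]
    header = " | ".join(col.ljust(w) for col, w in zip(columns, widths))
    separator = "-+-".join("-" * w for w in widths)
    body = [
        " | ".join(cell.rjust(w) for cell, w in zip(line, widths))
        for line in cells
    ]
    return "\n".join([header, separator, *body])


# Sample data copied from the sstable_tasks output above (subset of columns).
table_columns = ["keyspace_name", "table_name", "progress"]
table_rows = [
    {"keyspace_name": "basic", "table_name": "wide2", "progress": 60418761},
    {"keyspace_name": "basic", "table_name": "wide2", "progress": 2995623},
]

if __name__ == "__main__":
    print(render_table(table_columns, table_rows))
```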
> Add tool to display or export the contents of a virtual table
> -------------------------------------------------------------
>
> Key: CASSANDRA-16513
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16513
> Project: Cassandra
> Issue Type: Improvement
> Components: Observability/Metrics, Tool/nodetool
> Reporter: Paulo Motta
> Priority: Normal
> Labels: gsoc2021, mentor
>
> Several virtual tables were recently added, but they're currently only
> accessible via cqlsh or programmatically. While this is valuable for many use
> cases, operators are accustomed to the convenience of querying system
> metrics with a simple nodetool command.
> In addition to that, a relatively common request is to provide nodetool
> output in different formats (JSON, YAML and even XML) (CASSANDRA-5977,
> CASSANDRA-12035, CASSANDRA-12486, CASSANDRA-12698, CASSANDRA-12503). However
> this requires lots of manual labor as each nodetool subcommand needs to be
> adapted to support new output formats.
> I propose adding a new CLI tool that will consistently print to the standard
> output the contents of a virtual table. By default the command will print the
> output in a tabular format similar to cqlsh, but a "--format" parameter can
> be specified to modify the output to some other format like JSON or YAML.
> It should be possible to limit the number of rows displayed and to filter
> the output to rows with specific keys (e.g. keyspace or table).
> The command should be flexible and provide simple hooks for registration and
> customization of new virtual tables.
> My vision is that this is a path towards deprecating JMX and toward CQL for
> management, as we move information currently available through JMX to virtual
> tables (as CASSANDRA-14457 did with compactionstats) and easily expose them
> in this new tool as more virtual tables are added. Eventually we can also add
> setters when we start supporting writeable virtual tables.
> I propose calling this tool admintool (naming bikeshedding welcome), for
> example:
> {noformat}
> admintool help
> admintool <subcommand> <entity>
> Available subcommands and entities are:
> subcommands:
> - show
> - set (future)
> entities:
> - caches
> - internode_inbound
> - internode_outbound
> - settings
> - sstable_tasks
> - system_properties
> - thread_pools
> admintool show clients --format yaml
> ...
> admintool show internode_outbound --format json
> ...
> admintool show sstable_tasks --filter keyspace=my_ks --filter table=my_table
> ...
> {noformat}
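The command-line shape proposed in the ticket description could be prototyped along the following lines. This is a rough Python sketch under several assumptions: the parser layout, the `--limit` flag name, the `table` default for `--format`, and the `FILTER_ALIASES` mapping from `keyspace`/`table` to the `keyspace_name`/`table_name` columns are all hypothetical and not part of the ticket.

```python
import argparse

# Hypothetical aliases: the ticket's examples filter on "keyspace" and
# "table", while the virtual table columns are keyspace_name and table_name.
FILTER_ALIASES = {"keyspace": "keyspace_name", "table": "table_name"}


def parse_filter(text):
    """Turn a key=value string into a (column, value) pair."""
    key, sep, value = text.partition("=")
    if not sep or not key:
        raise argparse.ArgumentTypeError(f"expected key=value, got {text!r}")
    return FILTER_ALIASES.get(key, key), value


def build_parser():
    """Build a parser for: admintool show <entity> [options]."""
    parser = argparse.ArgumentParser(prog="admintool")
    sub = parser.add_subparsers(dest="subcommand", required=True)
    show = sub.add_parser("show", help="display the contents of a virtual table")
    show.add_argument("entity", help="virtual table name, e.g. sstable_tasks")
    show.add_argument("--format", choices=["table", "json", "yaml"],
                      default="table")
    show.add_argument("--filter", action="append", default=[],
                      type=parse_filter, metavar="KEY=VALUE")
    show.add_argument("--limit", type=int, default=None)
    return parser


def select_rows(rows, filters, limit=None):
    """Keep rows matching every (column, value) filter, then apply the limit."""
    matched = [
        row for row in rows
        if all(str(row.get(col)) == value for col, value in filters)
    ]
    return matched if limit is None else matched[:limit]


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Filtering is done on the client side here purely to keep the sketch self-contained; a real tool might push the restriction into the query instead.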