[
https://issues.apache.org/jira/browse/CASSANDRA-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046175#comment-14046175
]
Sylvain Lebresne commented on CASSANDRA-7464:
---------------------------------------------
bq. I'm curious what a better tool's output would look like?
That's a good question, thanks for asking it :)
Honestly, I haven't thought about it yet. But imo what we'd want is something
that:
# is reasonably easily human readable (but it can still be json so it's also
easy to handle with tools), even if that means something verbose (we're not
targeting performance).
# contains all the information the sstable stores so we can reverse it (but it
doesn't have to be the json representation that is closest to the actual
underlying sstable format)
# can be generated without needing to load the entire sstable in memory.
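The third point can be sketched as follows. This is only a minimal illustration in Python, not a proposal for the actual tool: {{write_entries}} and {{fake_sstable_entries}} are hypothetical names, and the generator stands in for whatever would really scan the sstable. The point is that the writer holds one entry at a time while the output is still a single valid JSON array.

```python
import io
import json
from typing import Iterable, Iterator, TextIO


def write_entries(entries: Iterable[dict], out: TextIO) -> int:
    """Write entries as one JSON array, holding only one entry in memory at a time."""
    count = 0
    out.write("[\n")
    for entry in entries:
        if count:
            out.write(",\n")  # separator goes before every entry except the first
        out.write(json.dumps(entry, indent=2))
        count += 1
    out.write("\n]\n")
    return count


def fake_sstable_entries() -> Iterator[dict]:
    # Hypothetical stand-in for a real sstable scanner: it yields entries lazily.
    yield {"type": "partition", "partition_key": [{"name": "pk1", "value": 3}]}
    yield {"type": "row", "columns": [{"name": "ck", "value": 42}]}


buf = io.StringIO()
written = write_entries(fake_sstable_entries(), buf)
```

Because the input is any iterable, the same writer would work unchanged over a lazy scanner of a large file.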
For instance (and that's just meant to illustrate what I have in mind, I
haven't thought it through), I could imagine something along the lines of:
{noformat}
[
  {
    'type' : 'partition',
    'partition_key' : [
      { 'name' : 'pk1', 'value' : 3 },
      { 'name' : 'pk2', 'value' : 'foo' }
    ],
    'deletion_info' : { 'deletion_time' : 32423423, 'tstamp' : 324234234 }
  },
  {
    'type' : 'static_block',
    'columns' : [
      { 'name' : 'static_col', 'value' : 'foo', 'tstamp' : 32423423 }
    ]
  },
  {
    'type' : 'range_tombstone',
    'start' : [
      { 'name' : 'ck', 'value' : 10 }
    ],
    'end' : [
      { 'name' : 'ck', 'value' : 50 }
    ],
    'deletion_info' : { 'deletion_time' : 32423423, 'tstamp' : 324234234 }
  },
  {
    'type' : 'row',
    'columns' : [
      { 'name' : 'ck', 'value' : 42 },
      { 'name' : 'v1', 'value' : [ 'foo', 'bar' ], 'tstamp' : 213893242 },
      { 'name' : 'v2', 'value' : { 'field1' : 3, 'field2' : 'foo' }, 'tstamp' : 213893242, 'ttl' : 2133 }
    ],
    'tombstones' : [
      { 'name' : 'v3', 'deletion_time' : 214124124, 'tstamp' : 322342342 }
    ]
  },
  ...
]
{noformat}
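As a rough idea of how a tool could consume such output, here is a small Python sketch. It assumes (my reading of the example, not something settled) that a 'partition' entry opens a partition and that every following entry belongs to it until the next 'partition' entry. {{group_by_partition}} is a hypothetical helper, and the input is assumed to be already parsed into dicts.

```python
from typing import Iterable, Iterator


def group_by_partition(entries: Iterable[dict]) -> Iterator[dict]:
    """Group a flat entry stream into one dict per partition, lazily."""
    current = None
    for entry in entries:
        if entry["type"] == "partition":
            if current is not None:
                yield current  # the previous partition is complete
            current = {"header": entry, "entries": []}
        elif current is None:
            raise ValueError("entry found before any partition header")
        else:
            current["entries"].append(entry)
    if current is not None:
        yield current


# Made-up sample stream, using the entry types from the example above.
sample = [
    {"type": "partition", "partition_key": [{"name": "pk1", "value": 3}]},
    {"type": "static_block", "columns": []},
    {"type": "row", "columns": [{"name": "ck", "value": 42}]},
    {"type": "partition", "partition_key": [{"name": "pk1", "value": 4}]},
    {"type": "row", "columns": [{"name": "ck", "value": 7}]},
]
groups = list(group_by_partition(sample))
```

Only one partition's entries are buffered at a time, so this fits the "no need to load everything in memory" constraint as long as individual partitions are reasonably sized.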
Also, while we're at it, it would be nice if such a new tool were able to do
stuff like "show me partition X for this sstable" (which would be done without
scanning the whole sstable, obviously).
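The general idea behind such a lookup can be sketched in Python. To be clear, this uses a toy file layout invented for illustration (length-prefixed JSON records plus a sorted in-memory key index), not the real sstable format, and {{write_toy_table}} and {{read_partition}} are hypothetical names. It only shows the principle: look the key up in an index, seek, and read that one record.

```python
import bisect
import io
import json
import struct
from typing import BinaryIO, List, Optional, Tuple


def write_toy_table(partitions: dict, data: BinaryIO) -> Tuple[List[str], List[int]]:
    """Write partitions in key order; return parallel lists of keys and offsets."""
    keys, offsets = [], []
    for key in sorted(partitions):
        payload = json.dumps(partitions[key]).encode("utf-8")
        keys.append(key)
        offsets.append(data.tell())
        data.write(struct.pack(">I", len(payload)))  # 4-byte big-endian length prefix
        data.write(payload)
    return keys, offsets


def read_partition(key: str, keys: List[str], offsets: List[int],
                   data: BinaryIO) -> Optional[dict]:
    """Binary-search the index, then seek to and read a single record."""
    i = bisect.bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return None  # key is not present in this table
    data.seek(offsets[i])
    (length,) = struct.unpack(">I", data.read(4))
    return json.loads(data.read(length).decode("utf-8"))


data_file = io.BytesIO()
keys, offsets = write_toy_table(
    {"b": {"rows": 2}, "a": {"rows": 1}, "c": {"rows": 3}}, data_file
)
found = read_partition("b", keys, offsets, data_file)
missing = read_partition("z", keys, offsets, data_file)
```

The lookup cost is a binary search over the index plus one seek and one read, independent of how much other data the file holds.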
> Retire/replace sstable2json and json2sstable
> --------------------------------------------
>
> Key: CASSANDRA-7464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7464
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Priority: Minor
>
> Both tools are pretty awful. They are primarily meant for debugging (there are
> much more efficient and convenient ways to import/export data), but their
> output manages to be hard to handle both for humans and for tools (especially
> as soon as you have modern stuff like composites).
> There is value in having tools that export sstable contents into a format that
> is easy for humans and tools to manipulate for debugging, small hacks and
> general tinkering, but sstable2json and json2sstable are not that.
> So I propose that we deprecate those tools and consider writing better
> replacements. It shouldn't be too hard to come up with an output format that
> is more aware of modern concepts like composites, UDTs, ....