[ 
https://issues.apache.org/jira/browse/CASSANDRA-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046175#comment-14046175
 ] 

Sylvain Lebresne commented on CASSANDRA-7464:
---------------------------------------------

bq. I'm curious what a better tool's output would look like?

That's a good question, thanks for asking it :)

Honestly, I haven't thought about it yet. But imo what we'd want is something 
that:
# is reasonably easily human readable (but it can still be json so it's also 
easy to handle with tools), even if that means something verbose (we're not 
targeting performance).
# contains all the information the sstable stores so we can reverse it (but it 
doesn't have to be the json representation that is closest to the actual 
underlying sstable format).
# can be generated without needing to load the entire sstable in memory.
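
To make the third point concrete, here is a minimal Python sketch of writing a json array one record at a time, so that only the current record is ever in memory. The names {{write_records}} and {{fake_sstable_scan}} are hypothetical; the latter merely stands in for a real sstable scanner, which is not modelled here.

```python
import io
import json
from typing import IO, Iterable, Iterator


def write_records(records: Iterable[dict], out: IO[str]) -> int:
    """Write records as one JSON array, one record at a time.

    Only the current record is held in memory, so `records` can be a
    lazy iterator over an arbitrarily large input.
    """
    count = 0
    out.write("[\n")
    for record in records:
        if count:
            # Separator goes before every record except the first.
            out.write(",\n")
        out.write("  " + json.dumps(record))
        count += 1
    out.write("\n]\n")
    return count


def fake_sstable_scan() -> Iterator[dict]:
    # Hypothetical stand-in for a real sstable scanner.
    yield {"type": "partition", "partition_key": [{"name": "pk1", "value": 3}]}
    yield {"type": "row", "columns": [{"name": "ck", "value": 42}]}


buf = io.StringIO()
written = write_records(fake_sstable_scan(), buf)
```

The output is still a single valid json document, so existing json tooling can read it back.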

For instance (and that's just meant to illustrate what I have in mind, I 
haven't thought it through), I could imagine something along the lines of:
{noformat}
[
  {
    'type' : 'partition',
    'partition_key' : [
        { 'name' : 'pk1', 'value' : 3 },
        { 'name' : 'pk2', 'value' : 'foo' }
    ],
    'deletion_info' : { 'deletion_time' : 32423423, 'tstamp' : 324234234 }
  },
  {
    'type' : 'static_block',
    'columns' : [ 
        { 'name' : 'static_col', 'value' : 'foo', 'tstamp' : 32423423 },
    ],
  },
  {
    'type' : 'range_tombstone',
    'start' : [
        { 'name' : 'ck', 'value' : 10  },
    ],
    'end' : [
        { 'name' : 'ck', 'value' : 50  },
    ],
    'deletion_info' : { 'deletion_time' : 32423423, 'tstamp' : 324234234 }
  },
  {
    'type' : 'row',
    'columns' : [
        { 'name' : 'ck', 'value' : 42  },
        { 'name' : 'v1', 'value' : [ 'foo', 'bar' ], 'tstamp' : 213893242 },
        { 'name' : 'v2', 'value' : { 'field1' : 3, 'field2' : 'foo' }, 'tstamp' : 213893242, 'ttl' : 2133 },
    ],
    'tombstones' : [
        { 'name' : 'v3', 'deletion_time' : 214124124, 'tstamp' : 322342342 }
    ]
  },
  ...
]
{noformat}
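
As an illustration of how a tool might consume such output, the Python sketch below regroups the flat record list per partition. It rests on two assumptions that the proposal above does not spell out: that the output is emitted as strict json (double quotes, no trailing commas), and that every record following a 'partition' record belongs to that partition until the next one. The function name {{group_by_partition}} is hypothetical.

```python
import json
from typing import Iterable, Iterator


def group_by_partition(records: Iterable[dict]) -> Iterator[dict]:
    """Yield one group per 'partition' record, with the records that follow it."""
    current = None
    for record in records:
        if record["type"] == "partition":
            if current is not None:
                yield current
            current = {"partition": record, "entries": []}
        elif current is None:
            raise ValueError("record appears before any partition header")
        else:
            current["entries"].append(record)
    if current is not None:
        yield current


# Small strict-JSON sample in the spirit of the format sketched above.
sample = json.loads("""
[
  {"type": "partition", "partition_key": [{"name": "pk1", "value": 3}]},
  {"type": "static_block", "columns": [{"name": "static_col", "value": "foo", "tstamp": 32423423}]},
  {"type": "row", "columns": [{"name": "ck", "value": 42}]},
  {"type": "partition", "partition_key": [{"name": "pk1", "value": 4}]},
  {"type": "row", "columns": [{"name": "ck", "value": 7}]}
]
""")

groups = list(group_by_partition(sample))
```

The sample is loaded whole only because it is tiny; the grouping function itself works on any iterator, so it could just as well be fed by an incremental parser.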


Also, while we're at it, it would be nice if such a new tool were able to do 
stuff like "show me partition X for this sstable" (which would obviously be 
done without scanning the whole sstable).
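
A toy Python sketch of that kind of lookup, assuming the tool can consult an index from partition key to byte offset. Sstables do come with an index component, but its real format is not modelled here: the one-json-line-per-partition layout and the names {{write_with_index}} and {{read_partition}} are purely illustrative.

```python
import io
import json
from typing import BinaryIO, Dict, Optional


def write_with_index(partitions: Dict[str, dict], out: BinaryIO) -> Dict[str, int]:
    """Write one JSON line per partition and return key -> byte offset."""
    index = {}
    for key in sorted(partitions):
        index[key] = out.tell()
        out.write(json.dumps(partitions[key]).encode("utf-8") + b"\n")
    return index


def read_partition(data: BinaryIO, index: Dict[str, int], key: str) -> Optional[dict]:
    """Seek straight to one partition; no other partition is read."""
    offset = index.get(key)
    if offset is None:
        return None
    data.seek(offset)
    return json.loads(data.readline())


data = io.BytesIO()
index = write_with_index(
    {"a": {"rows": 1}, "b": {"rows": 2}, "c": {"rows": 3}}, data
)
found = read_partition(data, index, "b")
```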


> Retire/replace sstable2json and json2sstable
> --------------------------------------------
>
>                 Key: CASSANDRA-7464
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7464
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Priority: Minor
>
> Both tools are pretty awful. They are primarily meant for debugging (there are 
> much more efficient and convenient ways to import/export data), but their 
> output manages to be hard to handle both for humans and for tools (especially 
> as soon as you have modern stuff like composites).
> There is value in having tools to export sstable contents into a format that 
> is easy for humans and tools to manipulate for debugging, small hacks and 
> general tinkering, but sstable2json and json2sstable are not that.  
> So I propose that we deprecate those tools and consider writing better 
> replacements. It shouldn't be too hard to come up with an output format that 
> is more aware of modern concepts like composites, UDTs, ....



--
This message was sent by Atlassian JIRA
(v6.2#6252)
