[jira] [Commented] (CASSANDRA-17698) sstabledump errors when dumping data from index

Stefan Miklosovic (Jira) Thu, 13 Oct 2022 02:02:07 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616891#comment-17616891
 ]


Stefan Miklosovic commented on CASSANDRA-17698:
-----------------------------------------------

Hi [~maxwellguo],

thanks for the patch! This is a good question. I would say that we should not 
upload / commit sstables into the repository. I think the main reason these 
legacy tables were uploaded there is that Cassandra cannot create them 
anymore (well, because they are "legacy"), but we still want to parse them.

Ideally, you should create these sstables in the test and then run the tool 
against them. That way we know that, as Cassandra is being developed, it still 
knows how to export them, since the sstables would always be the current ones.

There are a lot of helper methods in org.apache.cassandra.SchemaLoader for 
creating keyspaces, tables and indexes on them with some predefined schema. 
Then you would populate that table (and hence its index) with data.

You can look into SSTableLoaderTest, where the schema is created in the 
"defineSchema" method (just pick what you find necessary). Then you may look 
into the "testLoadingSSTable" method in the same class; at the beginning it 
writes some rows via CQLSSTableWriter. Do not forget to flush via 
Util.flush(cfs), as shown in that method.

I would put all of this into "SSTableExportSchemaLoadingTest" as a new test 
method. You may also create a dedicated class in the same package as 
SSTableExportSchemaLoadingTest. I can imagine that there will be a lot of 
variations of indexes, and we should be able to test that every index type is 
dumpable.

If you look into SchemaLoader.schemaDefinition, you will see that it creates 
basically all keyspaces / indexes / counter caches and so on. We should be able 
to verify that we can dump all of them via that tool.
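
To make the "every type should be dumpable" idea concrete, here is a minimal, 
self-contained Java sketch of the shape such a check could take. It does not 
use any Cassandra classes: ValueType, the three sample types and 
findUndumpable are hypothetical stand-ins. The type that throws only mimics 
what PartitionerDefinedOrder.toJSONString does in the stack trace of this 
ticket. A real test would build the schema via SchemaLoader and run the tool 
against flushed sstables instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Simplified model, NOT Cassandra code: every "type" renders a value as JSON,
 * and one of them refuses to, as in the reported stack trace.
 */
public class DumpAllTypesSketch {

    /** Hypothetical stand-in for a column / clustering type. */
    public interface ValueType {
        String toJsonString(Object value);
    }

    // Two types that serialize fine and one that does not (assumed for illustration).
    public static final ValueType TEXT = value -> "\"" + value + "\"";
    public static final ValueType INT = value -> String.valueOf(value);
    public static final ValueType PARTITIONER_ORDER = value -> {
        throw new UnsupportedOperationException();
    };

    /** Tries to serialize the sample with every type and returns the names that failed. */
    public static List<String> findUndumpable(Map<String, ValueType> types, Object sample) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, ValueType> entry : types.entrySet()) {
            try {
                entry.getValue().toJsonString(sample);
            } catch (UnsupportedOperationException e) {
                // Collect instead of failing fast, so one run reports every broken type.
                failures.add(entry.getKey());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Map<String, ValueType> types = new LinkedHashMap<>();
        types.put("text", TEXT);
        types.put("int", INT);
        types.put("partitioner-order", PARTITIONER_ORDER);
        System.out.println("undumpable: " + findUndumpable(types, 1));
    }
}
```

Collecting the failures rather than stopping at the first exception means a 
single test run lists every variant that cannot be dumped, which is handy when 
there are many index variations to cover.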



> sstabledump errors when dumping data from index
> -----------------------------------------------
>
>                 Key: CASSANDRA-17698
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17698
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tool/sstable
>            Reporter: Stefan Miklosovic
>            Assignee: maxwellguo
>            Priority: Normal
>
> {code:java}
> cqlsh> CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> CREATE TABLE ks1.tb1 ( id text, name text, primary key (id));
> cqlsh> CREATE INDEX IF NOT EXISTS ON ks1.tb1(name);
> cqlsh> INSERT INTO ks1.tb1 (id, name ) VALUES ( '1', 'Joe');
> cqlsh> exit
> ./bin/nodetool flush
> ./tools/bin/sstabledump 
> data/data/ks1/tb1-1c3c5f10ee4711ecab82eda2f44200b3/.tb1_name_idx/nb-1-big-Data.db
>  
> [
>   {
>     "partition" : {
>       "key" : [ "Joe" ],
>       "position" : 0
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 17,
>         "clustering" : [ ] } ] } ]Exception in thread "main" java.lang.UnsupportedOperationException
>         at org.apache.cassandra.db.marshal.PartitionerDefinedOrder.toJSONString(PartitionerDefinedOrder.java:87)
>         at org.apache.cassandra.db.marshal.AbstractType.toJSONString(AbstractType.java:187)
>         at org.apache.cassandra.tools.JsonTransformer.serializeClustering(JsonTransformer.java:372)
>         at org.apache.cassandra.tools.JsonTransformer.serializeRow(JsonTransformer.java:269)
>         at org.apache.cassandra.tools.JsonTransformer.serializePartition(JsonTransformer.java:235)
>         at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
>         at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
>         at java.util.Iterator.forEachRemaining(Iterator.java:116)
>         at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
>         at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>         at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>         at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
>         at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
>         at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>         at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
>         at org.apache.cassandra.tools.JsonTransformer.toJson(JsonTransformer.java:113)
>         at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:214) {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
