[ 
https://issues.apache.org/jira/browse/CASSANDRA-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405939#comment-15405939
 ] 

Sylvain Lebresne commented on CASSANDRA-12335:
----------------------------------------------

Yes, super columns appear to be pretty broken.

The first problem is indeed the {{compound}} flag. We need super column 
families to be {{compound}} so we properly extract the first clustering 
component, i.e. so that {{column1}} in this example is correct. And doing that 
is enough to fix the thrift case from the description of this ticket, or kind 
of.
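
To make "extract the first clustering component" concrete, here is a rough sketch in Python (purely illustrative, not Cassandra code). It assumes the usual composite layout of a 2-byte big-endian length, the component bytes, then one end-of-component byte, which matches the {{column1}} bytes shown in the 3.0 cqlsh output of the description. The function name {{split_composite}} is made up for this example.

```python
import struct
from typing import List


def split_composite(raw: bytes) -> List[bytes]:
    """Split a composite-encoded name into its components.

    Assumed layout per component: 2-byte big-endian length,
    the component bytes, then 1 end-of-component byte.
    """
    components = []
    pos = 0
    while pos < len(raw):
        if pos + 2 > len(raw):
            raise ValueError("truncated component length")
        (length,) = struct.unpack_from(">H", raw, pos)
        pos += 2
        end = pos + length
        if end + 1 > len(raw):
            raise ValueError("truncated component")
        components.append(raw[pos:end])
        pos = end + 1  # skip the end-of-component byte
    return components


# The value cqlsh shows for column1 after the upgrade in the description.
raw = b"\x00\x04attr\x00\x00\x04name\x00"
parts = split_composite(raw)
first_clustering = parts[0]  # what column1 should contain
```

With the {{compound}} flag preserved, {{column1}} should hold only the first component rather than the whole encoded name.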

The one caveat is that the description is trying to use {{cassandra-cli}} on 
3.0 and that's just not going to work: {{cassandra-cli}} was relying on the old 
schema tables, and so the {{null}} output seen above is only a sign of that, 
not of the query being broken (the {{null}} is printed by {{cassandra-cli}} 
directly, the query doesn't even reach the server). But if you run the 
equivalent query directly over thrift, it works properly (if 
{{compound}} is preserved).

With that said, changing the {{compound}} flag only works for new upgrades. If 
sstables have already been upgraded, then they'll just have the wrong content 
for {{column1}}. And I also don't think we can make scrub fix that for us, 
because I don't think scrub can decide if it needs to fix something (say the 
column is a blob, we can't know by inspecting the value if it's proper or not). 
So the only solution I see would be to do a minor bump of the sstable format, 
and add code when reading the "old" format to extract the proper content.
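
A small sketch of why inspecting the value is not enough, and how a format version check would sidestep the problem. This is Python for illustration only; the helper names and the {{written_by_affected_format}} flag are hypothetical, and the composite layout (2-byte length, bytes, 1 end-of-component byte) is an assumption matching the bytes in the description.

```python
def looks_composite(raw: bytes) -> bool:
    """Heuristic: do the bytes parse cleanly as a sequence of
    (2-byte length, bytes, 1 end-of-component byte) components?"""
    pos = 0
    while pos < len(raw):
        if pos + 2 > len(raw):
            return False
        length = int.from_bytes(raw[pos:pos + 2], "big")
        pos += 2 + length + 1
    return len(raw) > 0 and pos == len(raw)


def first_component(raw: bytes) -> bytes:
    """Return the bytes of the first composite component."""
    length = int.from_bytes(raw[:2], "big")
    return raw[2:2 + length]


def read_column1(raw: bytes, written_by_affected_format: bool) -> bytes:
    """Decide from the sstable format, not from the content.

    Hypothetical flag: True for sstables written before the format bump,
    whose column1 holds the whole encoded name.
    """
    if written_by_affected_format:
        return first_component(raw)
    return raw


# A wrongly upgraded column1 and a legitimate user blob can be byte-for-byte
# identical, so looks_composite() alone cannot tell scrub what to do.
ambiguous = b"\x00\x04attr\x00\x00\x04name\x00"
```

The heuristic answers "yes" for the ambiguous value in both cases, while the version-based read gives a definite answer for each.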

But then there is CQL. I focused on super columns in thrift so much on 
CASSANDRA-8099 that I forgot about their CQL access. And that's a problem.

First, I don't think we should set the {{dense}} flag for super columns. Doing 
so implies that on upgrade, a super column ends up with 2 clustering columns 
defined, but that's not how the internal layout expects things. And in practice, 
doing so breaks the thrift part (for instance, the thrift query from the 
example in the description throws server side if you force the table to be 
{{dense}} when upgrading).

Still, we need to preserve backward compatibility for CQL, so we need those 2 
{{column2}} and {{value}} columns from a user point of view, even though we 
don't really have them internally. So I think we unfortunately need to special 
case the shit out of super column families in CQL. We need to silently add 
those 2 "fake" columns, which will never have anything internally, but are used 
to translate between what's exposed by CQL and what's actually stored. Ideally 
those columns should have a special "kind" to distinguish them, though adding a 
new {{Kind}} to {{ColumnDefinition}} might be an issue for backward 
compatibility at this point, so we may have to settle for {{REGULAR}} columns 
with some other means to distinguish them. Point is, we probably shouldn't make 
any of those definitions of kind {{CLUSTERING}} (even though from a user point 
of view, {{column2}} will work like a clustering) as there are way too many 
assumptions about those internally, and that's why setting {{dense == true}} 
breaks things.

With those fake columns added, we'll need to modify {{SelectStatement}}, 
{{UpdateStatement}} and {{DeleteStatement}}, so that they translate anything 
referring to the fake columns to how things are actually represented internally 
(and the reverse).
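
To illustrate the kind of translation meant here, a toy model in Python (not actual Cassandra code, and not how the statements are implemented). It assumes, for the sake of the example, that internally a partition maps each {{column1}} value to a set of subcolumn name/value pairs, while CQL exposes flat rows of {{(key, column1, column2, value)}}. All function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed internal shape: key -> column1 -> {subcolumn name: value}
Store = Dict[bytes, Dict[bytes, Dict[bytes, bytes]]]
# What CQL exposes: (key, column1, column2, value)
CqlRow = Tuple[bytes, bytes, bytes, bytes]


def select_rows(store: Store, key: bytes) -> List[CqlRow]:
    """Internal -> CQL: flatten each subcolumn into its own row,
    filling the fake column2 and value columns."""
    rows = []
    for column1, subcolumns in sorted(store.get(key, {}).items()):
        for column2, value in sorted(subcolumns.items()):
            rows.append((key, column1, column2, value))
    return rows


def update_row(store: Store, key: bytes, column1: bytes,
               column2: bytes, value: bytes) -> None:
    """CQL -> internal: a write to the fake columns becomes a write
    to one subcolumn under column1."""
    store.setdefault(key, {}).setdefault(column1, {})[column2] = value


def delete_row(store: Store, key: bytes, column1: bytes,
               column2: bytes) -> None:
    """CQL -> internal: deleting a CQL row removes a single subcolumn."""
    store.get(key, {}).get(column1, {}).pop(column2, None)


store: Store = {}
update_row(store, b"Bob", b"attr", b"name", b"Bob")
```

From the user's side {{column2}} behaves like a clustering column, but nothing in the model declares it as one: it only exists in the translation layer.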


> Super columns are broken after upgrading to 3.0
> -----------------------------------------------
>
>                 Key: CASSANDRA-12335
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12335
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jeremiah Jordan
>            Assignee: Sylvain Lebresne
>             Fix For: 3.0.x, 3.x
>
>
> Super Columns are broken after upgrading to cassandra-3.0 HEAD.  The below 
> script shows this.
> 2.1 cli output for get:
> {code}
> [default@test] get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;
> => (name=name, value=Bob, timestamp=1469724504357000)
> {code}
> cqlsh:
> {code}
> [default@test]
>  key          | blobAsText(column1)
> --------------+---------------------
>  0x53696d6f6e |                attr
>      0x426f62 |                attr
> {code}
> 3.0 cli:
> {code}
> [default@unknown] use test;
> unconfigured table schema_columnfamilies
> [default@test] get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;
> null
> [default@test]
> {code}
> cqlsh:
> {code}
>  key          | system.blobastext(column1)
> --------------+----------------------------------
>  0x53696d6f6e | \x00\x04attr\x00\x00\x04name\x00
>      0x426f62 | \x00\x04attr\x00\x00\x04name\x00
> {code}
> Run this from a directory with cassandra-3.0 checked out and compiled
> {code}
> ccm create -n 2 -v 2.1.14 testsuper
> echo "####################### Starting 2.1 #######################"
> ccm start
> MYFILE=`mktemp`
> echo "create keyspace test with placement_strategy = 
> 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = 
> {replication_factor:2};
> use test;
> create column family Sites with column_type = 'Super' and comparator = 
> 'BytesType' and subcomparator='UTF8Type';
> set Sites[utf8('Simon')][utf8('attr')]['name'] = utf8('Simon');
> set Sites[utf8('Bob')][utf8('attr')]['name'] = utf8('Bob');
> get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;" > $MYFILE
> ~/.ccm/repository/2.1.14/bin/cassandra-cli < $MYFILE
> rm $MYFILE
> ~/.ccm/repository/2.1.14/bin/nodetool -p 7100 flush
> ~/.ccm/repository/2.1.14/bin/nodetool -p 7200 flush
> ccm stop
> # run from cassandra-3.0 checked out and compiled
> ccm setdir
> echo "####################### Starting Current Directory 
> #######################"
> ccm start
> ./bin/nodetool -p 7100 upgradesstables
> ./bin/nodetool -p 7200 upgradesstables
> ./bin/nodetool -p 7100 enablethrift
> ./bin/nodetool -p 7200 enablethrift
> MYFILE=`mktemp`
> echo "use test;
> get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;" > $MYFILE
> ~/.ccm/repository/2.1.14/bin/cassandra-cli < $MYFILE
> rm $MYFILE
> {code}



