[ 
https://issues.apache.org/jira/browse/CASSANDRA-15811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17149502#comment-17149502
 ] 

Marcus Eriksson commented on CASSANDRA-15811:
---------------------------------------------

Posting my WIP branch for this 
[here|https://github.com/krummas/cassandra/commits/marcuse/15811-2] for 
discussion/early feedback.

This adds {{FORCE DROP COMPACT STORAGE}}

The approach is to hide these columns in basically the same way as we do while 
the table is still {{COMPACT STORAGE}}, but now by adding a flag to the 
{{ColumnDefinition}} stating that the column should be hidden. 
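
A minimal sketch of the hidden-flag idea, not the code from the branch: 
{{Column}} is a hypothetical stand-in for {{ColumnDefinition}}, and the 
{{hidden}} field and {{visibleNames}} helper are names assumed here for 
illustration only. The column stays in the schema, so replicas and coordinator 
agree on the table layout, and is only filtered out of what the user sees.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HiddenColumnsSketch {
    // Hypothetical stand-in for ColumnDefinition; only the parts needed here.
    static final class Column {
        final String name;
        final boolean hidden; // assumed flag: column exists in the schema but is not shown

        Column(String name, boolean hidden) {
            this.name = name;
            this.hidden = hidden;
        }
    }

    // Names a user-facing listing (for example a wildcard select) would expose.
    static List<String> visibleNames(List<Column> columns) {
        List<String> names = new ArrayList<>();
        for (Column c : columns) {
            if (!c.hidden) {
                names.add(c.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // A non-dense former compact table: column1 and value are kept but hidden.
        List<Column> table = Arrays.asList(
            new Column("key", false),
            new Column("column1", true),
            new Column("value", true),
            new Column("name", false));
        System.out.println(visibleNames(table)); // prints [key, name]
    }
}
```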

The branch also has a commit which allows users to create tables with hidden 
columns, for example to restore a backup, but I'm not entirely sure how/if we 
should allow that. Allowing ALTER to mark a column as hidden would let users 
who execute {{DROP COMPACT STORAGE}} without the {{FORCE}} by mistake "fix" 
the table by hiding {{column1}}/{{value}}.

My first attempt at this actually dropped the columns (including the 
clustering), but that got extremely complicated to handle in the case where the 
schema is not propagated everywhere - one replica might think we have a 
clustering while the coordinator doesn't.

cc [~ifesdjeen]/[~slebresne]

> Improve DROP COMPACT STORAGE
> ----------------------------
>
>                 Key: CASSANDRA-15811
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15811
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Schema
>            Reporter: Alex Petrov
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x
>
>
> DROP COMPACT STORAGE was introduced in CASSANDRA-10857 as one of the steps to 
> deprecate Thrift. However, current semantics of dropping compact storage 
> flags from tables reveal several columns that are usually empty (column1 and 
> value in non-dense case, value for dense columns, and a column with an empty 
> name for super column families). Showing these columns can confuse 
> application developers, especially ones that have never used Thrift and/or 
> made writes that assumed presence of those fields, and used compact storage 
> in 3.x because it has “compact” in the name.
> There’s not much we can do in a super column family case, especially 
> considering there’s no way to create a supercolumn family using CQL, but we 
> can improve dense and non-dense cases. We can scan sstables and make sure 
> there are no signs of Thrift writes in them, and if all sstables conform to 
> this rule, we can not only drop the flag, but also drop columns that are 
> supposed to be hidden. However, this is both not very user-friendly, and is 
> probably not worth development effort. 
> An alternative to scanning is to add {{FORCE DROP COMPACT}} syntax (or 
> something similar) that would just drop columns unconditionally. It is likely 
> that people who were using compact storage with Thrift know they were doing 
> that, so they'll usually use "regular" {{DROP COMPACT}}, without force, that 
> will simply reveal the columns as it does right now.
> Since fixing CASSANDRA-15778, and allowing an EmptyType column to actually 
> have data[*], required removing empty type validation, properly handling 
> compact storage starts making more sense. We'll solve it either by not 
> having the columns, hence not caring about values, or by keeping values 
> _and_ data, not requiring validation in this case. EmptyType field will have 
> to be handled differently though.
> [*] as it is possible to end up with sstables upgraded from 2.x or written in 
> 3.x before CASSANDRA-15373, which means not every 2.x upgraded or 3.x cluster 
> is guaranteed to have empty values in this column, and this behaviour, even 
> if undesired, might be used by people. 
> Open question: CASSANDRA-15373 adds validation to EmptyType that disallows 
> any non-empty value to be written to it, but we already allow creating a table 
> via CQL and still writing data into it with Thrift. It seems to have been 
> unintended, but it might have become a feature people rely on. If we simply 
> backport 15373 to 2.2 and 2.1, we’ll change and break behaviour. Given 
> no-one complained in 3.0 and 3.11, this assumption is unlikely though. 



