[
https://issues.apache.org/jira/browse/CASSANDRA-15504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016226#comment-17016226
]
Benedict Elliott Smith commented on CASSANDRA-15504:
----------------------------------------------------
It's even more complicated than you might think. These are some of the factors
that come to mind initially; the list is probably not complete:
# 3.0 format sstables persist data in a manner that requires us to know how
many bytes each value uses, and they do not record what the type was when the
sstable was written. So at minimum, to support this we would need to persist
this type information in all sstables (which we should anyway, but don't
currently), as opposed to relying on the system tables.
# We have to handle data from legacy sstables, which persist no information at
all about what data they contain. It is very possible to find poorly typed
legacy data floating around in them from before we had proper checks, when we
permitted mangling of type casts to write arbitrary things.
# So, we'd need (1), and we'd need to ensure we didn't support any such
operation until we had established that no dangerous files exist on any node
in the cluster (including refusing to restore them from backup or import
them, for instance). But wait, we're not done.
# Currently schema changes are also eventually consistent. This is slated to
be changed, but not for some time, and propagation will always be eventually
consistent, even if decision-making is serialized. So: what happens if a node
requests data for a field that used to be a different type and _still is_ on
the other node? How do we know what type we will receive? We will need to
verify the schema of the node we're communicating with, for each operation
between each pair of nodes. Which, again, is very likely to be implemented in
the future, but it's non-trivial, and not pressing.
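As a rough illustration of the last point, here is a toy sketch of a
per-operation schema check between two nodes. Everything in it is hypothetical
(the Node class, the "v1"/"v2" version strings, the SchemaMismatch error); it
is not how Cassandra's messaging works, only the shape of the check:

```python
import struct
from dataclasses import dataclass


class SchemaMismatch(Exception):
    """Raised when two nodes disagree about the schema (hypothetical)."""


@dataclass
class Node:
    name: str
    schema_version: str   # hypothetical version tag, e.g. "v1" or "v2"
    points_format: str    # struct format this node believes 'points' has

    def serve_points(self, value, requester_version):
        # Refuse to answer if the requester is on a different schema,
        # rather than sending bytes it would decode with the wrong type.
        if requester_version != self.schema_version:
            raise SchemaMismatch(
                f"{self.name} is on {self.schema_version}, "
                f"requester is on {requester_version}"
            )
        return struct.pack(self.points_format, value)


def read_points(coordinator, replica, value):
    # The coordinator decodes with its own idea of the column type, which
    # is only safe because serve_points checked the versions first.
    payload = replica.serve_points(value, coordinator.schema_version)
    return struct.unpack(coordinator.points_format, payload)[0]


coordinator = Node("coordinator", "v2", ">i")  # already sees 'points' as int
stale = Node("replica", "v1", ">h")            # still sees 'points' as smallint

try:
    read_points(coordinator, stale, 7)
    outcome = "decoded"
except SchemaMismatch:
    outcome = "rejected"
```

With the check in place the read against the stale replica is rejected
instead of being decoded under the wrong type.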
The long and the short of it is that schema behaviours were implemented back in
the Wild West era of Cassandra, and handling them correctly is a lot more
involved than the implementors originally imagined. So until we have time to do
it properly, we've had to disable features like this that can lead to corrupted
data through misinterpretation, however unlikely that might be.
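To make the misinterpretation risk from (1) concrete, here is a small sketch.
It assumes a deliberately simplified layout, with fixed-width big-endian
values written back to back and no per-value length, which is not the real
sstable format; the function names are made up for the example:

```python
import struct


def write_smallints(values):
    # Assumed toy layout: each smallint is 2 bytes, big-endian, no length prefix.
    return b"".join(struct.pack(">h", v) for v in values)


def read_fixed(buf, fmt):
    # Decode the buffer as a run of fixed-width values of the given format,
    # ignoring any trailing bytes that do not fill a whole value.
    size = struct.calcsize(fmt)
    usable = len(buf) - len(buf) % size
    return [struct.unpack_from(fmt, buf, off)[0] for off in range(0, usable, size)]


data = write_smallints([1, 2, 3, 4])

as_smallint = read_fixed(data, ">h")  # read with the type that was written
as_int = read_fixed(data, ">i")       # read as if the column had always been int

print(as_smallint)  # [1, 2, 3, 4]
print(as_int)       # [65538, 196612]
```

Nothing in the bytes says which reading is right: the reader has to already
know the width, and reading with the wrong one silently yields different
values rather than an error.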
That said, in the meantime it's certainly possible to do this as an operator;
it just requires some annoying surgery on your cluster. Or, as I say, we'd be
more than happy for a volunteer with the time to take up this task.
> INT is incompatible with previous type SMALLINT
> -----------------------------------------------
>
> Key: CASSANDRA-15504
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15504
> Project: Cassandra
> Issue Type: Bug
> Reporter: Marcus Truscello
> Priority: Normal
>
> With the release of Cassandra 3.11.5 and the fixing of CASSANDRA-14948, it
> now appears that you can no longer re-add a SMALLINT column as an INT type.
> This is rather surprising as any SMALLINT value should be representable by an
> INT type.
> The following example was run on Cassandra 3.11.5 on CentOS 7 installed from
> official RedHat repo:
>
>
> {noformat}
> cqlsh> CREATE KEYSPACE demo WITH replication = {'class':'SimpleStrategy',
> 'replication_factor' : 1};
> cqlsh> CREATE TABLE demo.demo_table (
> ... user_id BIGINT,
> ... created TIMESTAMP,
> ... points SMALLINT,
> ... PRIMARY KEY (user_id, created)
> ... ) WITH CLUSTERING ORDER BY (created DESC);
> cqlsh> ALTER TABLE demo.demo_table DROP points;
> cqlsh> ALTER TABLE demo.demo_table ADD points INT;
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot
> re-add previously dropped column 'points' of type int, incompatible with
> previous type smallint"
> {noformat}
>
>