[ 
https://issues.apache.org/jira/browse/CASSANDRA-14841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665610#comment-16665610
 ] 

Ariel Weisberg commented on CASSANDRA-14841:
--------------------------------------------

[~iamaleksey] in IRC
{noformat}
10:29 AM nope, that's safe, assuming you mean 'system'; 'system_schema' is also 
fine. things in system_distributed and system_traces is in the danger zone
10:29 AM ok, I see the problem
10:33 AM don't necessarily need new tables, but you can't just add a column to 
one of those, no
10:33 AM what I'm not seeing is what is doing the read
10:33 AM nothing in the code does it as far as I can see
10:33 AM code only writes
10:34 AM which would also be a problem if a writes comes from a 4.0 node in the 
mixed-mode
10:35 AM also do: read the comment for StorageService.maybeAddOrUpdateKeyspace()
10:35 AM and MigrationManager.forceAnnounceNewTable() that it calls
10:37 AM on startup 4.0 detects that the keyspace and the table exist, but the 
definitions are different, so it tries to force the migration with the new 
definition; that however will not be propagated to 3.x nodes because there is a 
schema barrier in place between different major-version nodes
10:38 AM that works as intended more or less, just annoying. although at least 
one problem is that the same 0 timestamp is reused, and for conflicting table 
params the new definition might never actually be applied depending on old/new 
value
10:39 AM that timestamp needs to be incremented on every major
10:39 AM I mentioned this before in one of the JIRAs
10:40 AM aweisberg: TL;DR is that reads and writes involving the new column in 
4.0 will fail on 3.0 side. not much you can do about it here
10:40 AM could avoid using those columns until the whole cluster is on the 
latest major
{noformat}
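
To make the timestamp point in the log above concrete, here is a minimal sketch of a last-write-wins merge. It is not Cassandra code: SchemaCell and reconcile are hypothetical names, and the tie-break rule (the greater value wins when timestamps are equal) is an assumption based on the "depending on old/new value" remark.

```java
import java.util.Objects;

/** Hypothetical model of one schema parameter value written with a timestamp. */
final class SchemaCell {
    final String value;
    final long timestamp;

    SchemaCell(String value, long timestamp) {
        this.value = Objects.requireNonNull(value);
        this.timestamp = timestamp;
    }

    /**
     * Last-write-wins merge of two versions of the same parameter.
     * ASSUMPTION: on equal timestamps the lexicographically greater value wins.
     */
    static SchemaCell reconcile(SchemaCell a, SchemaCell b) {
        if (a.timestamp != b.timestamp)
            return a.timestamp > b.timestamp ? a : b;
        return a.value.compareTo(b.value) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        SchemaCell oldDef = new SchemaCell("b-old-param", 0L);
        SchemaCell newSameTs = new SchemaCell("a-new-param", 0L);
        SchemaCell newBumpedTs = new SchemaCell("a-new-param", 1L);

        // Same timestamp: the old value happens to compare greater, so it survives.
        System.out.println(reconcile(oldDef, newSameTs).value);
        // Bumped timestamp: the new definition wins regardless of its value.
        System.out.println(reconcile(oldDef, newBumpedTs).value);
    }
}
```

With both definitions written at timestamp 0 the outcome depends only on how the values compare, which is why incrementing the timestamp on every major would make the new definition apply deterministically.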

We don't support cross-version streaming, so people can't run repair anyway, at 
least not unless they are fairly sophisticated about how they invoke repair and 
in what order they bounce nodes. They also probably can't be using vnodes. Maybe 
it's fine if this generates an error until the cluster is upgraded?

[~tommy_s] is it possible you have an external process that is reading from the 
repair history table that is causing this error to appear?

WRT tracing: creating a new table and writing to both seems a little sketchy 
since it will double the overhead of tracing even when not upgrading. Upgrade 
would also need to do a migration of data from the old table into the new 
table, and we can only perform the migration once all nodes are upgraded. I 
suppose once the cluster is no longer mixed version we could stop writing to 
the old table.

I'm not sure if the fix here should be documentation in NEWS.txt or 
documentation + changes to avoid generating errors in mixed version clusters by 
not doing the writes.
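
As a rough illustration of the second option, the sketch below leaves the new column out of a write until every known peer reports the new major version. This is not the Cassandra API: MixedVersionColumns, clusterFullyUpgraded, columnsToWrite and the version constant are hypothetical names, the peer major versions are assumed to be available from somewhere like gossip state, and the "coordinator" column is only illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical gate that skips new columns while the cluster is mixed version. */
final class MixedVersionColumns {
    // ASSUMPTION: the port columns first exist in major version 4.
    static final int FIRST_MAJOR_WITH_PORT_COLUMNS = 4;

    /** True only if every peer reports a major version that knows the new columns. */
    static boolean clusterFullyUpgraded(List<Integer> peerMajorVersions) {
        for (int major : peerMajorVersions)
            if (major < FIRST_MAJOR_WITH_PORT_COLUMNS)
                return false;
        return true;
    }

    /** Columns to include in a write; the new column is skipped in mixed mode. */
    static List<String> columnsToWrite(List<Integer> peerMajorVersions) {
        List<String> columns = new ArrayList<>();
        columns.add("coordinator"); // illustrative pre-existing column
        if (clusterFullyUpgraded(peerMajorVersions))
            columns.add("coordinator_port"); // column named in the reported error
        return columns;
    }

    public static void main(String[] args) {
        // One peer still on 3.x: the new column is left out.
        System.out.println(columnsToWrite(Arrays.asList(3, 4, 4)));
        // All peers on 4.x: the new column is written.
        System.out.println(columnsToWrite(Arrays.asList(4, 4, 4)));
    }
}
```

A gate like this only covers writes issued by the upgraded node itself; it would not help if something else is reading the new column from a 3.x replica.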

> Unknown column coordinator_port during deserialization
> ------------------------------------------------------
>
>                 Key: CASSANDRA-14841
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14841
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Tommy Stendahl
>            Assignee: Ariel Weisberg
>            Priority: Major
>
> When upgrading from 3.x to 4.0 I get exceptions in the old nodes once the 
> first 4.0 node starts up. I have tested to upgrade from both 3.0.15 and 
> 3.11.3 and get the same problem.
>  
> {noformat}
> 2018-10-22T11:12:05.060+0200 ERROR 
> [MessagingService-Incoming-/10.216.193.244] CassandraDaemon.java:228 
> Exception in thread Thread[MessagingService-Incoming-/10.216.193.244,5,main]
> java.lang.RuntimeException: Unknown column coordinator_port during 
> deserialization
> at org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:452) 
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at 
> org.apache.cassandra.db.filter.ColumnFilter$Serializer.deserialize(ColumnFilter.java:482)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
> at 
> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:760)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
> at 
> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:697)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
> at 
> org.apache.cassandra.io.ForwardingVersionedSerializer.deserialize(ForwardingVersionedSerializer.java:50)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) 
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]{noformat}
> I think it was introduced by CASSANDRA-7544.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
