[ 
https://issues.apache.org/jira/browse/CASSANDRA-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14991649#comment-14991649
 ] 

Sylvain Lebresne commented on CASSANDRA-10652:
----------------------------------------------

There seems to be a hole in how we deal with upgrades of the trace keyspace. 
Basically, on upgrade, even if the schema of a trace table has changed in 
{{TraceKeyspace}}, the new schema won't be used because the table already 
exists and the old schema is kept. That's not a 3.0 bug per se btw: I've 
manually checked that if you create a trace in 2.1 and upgrade to 2.2, you 
won't get the {{client}} column (added by CASSANDRA-8162). In practice, this 
means CASSANDRA-8162 is only visible on new 2.2 clusters and useless 
otherwise, which is kind of a problem.

Now, in a 2.1 -> 2.2 upgrade this has no more consequence than making that 
new {{client}} column inaccessible, but 3.0 is pickier: while the node will 
use the old metadata, the rest of the tracing code *assumes* the {{client}} 
column exists and will write it, and 3.0 doesn't like writes to columns that 
don't exist.
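
To make the failure mode concrete, here is a minimal sketch of a "written columns must be a subset of the known columns" check, which is the kind of check behind the "is not a subset of" error quoted in the issue description. Everything in it is an assumption for illustration: the class {{SubsetBitmapSketch}}, the {{encodeBitmap}} signature and the bitmap layout are hypothetical and are not Cassandra's actual {{Columns.Serializer}} code; only the column names are taken from the quoted error.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of a "subset of known columns" check.
// It is NOT Cassandra's actual Columns.Serializer code.
final class SubsetBitmapSketch {

    // Returns a bitmap where bit i is set when superset.get(i) appears in subset.
    // Assumes distinct column names and at most 64 columns in the superset.
    // Throws if subset contains a column that superset does not know about.
    static long encodeBitmap(List<String> subset, List<String> superset) {
        long bitmap = 0L;
        int matched = 0;
        for (int i = 0; i < superset.size(); i++) {
            if (subset.contains(superset.get(i))) {
                bitmap |= 1L << i;
                matched++;
            }
        }
        if (matched != subset.size()) {
            throw new IllegalStateException(subset + " is not a subset of " + superset);
        }
        return bitmap;
    }

    public static void main(String[] args) {
        // Columns the upgraded node still has in its (old) table metadata.
        List<String> oldSchema = Arrays.asList(
            "coordinator", "duration", "request", "started_at", "parameters");
        // Columns the newer tracing code assumes exist and writes.
        List<String> written = Arrays.asList(
            "client", "command", "coordinator", "duration",
            "request", "started_at", "parameters");

        // Writing only columns the old schema knows about succeeds.
        System.out.println(Long.toBinaryString(
            encodeBitmap(Arrays.asList("coordinator", "request"), oldSchema)));

        // Writing a column the old schema lacks is rejected.
        try {
            encodeBitmap(written, oldSchema);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that a strict check of this shape turns a stale table definition into a hard error, whereas a lenient one would leave the new column unused.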

Not sure what the best fix is yet. I'm not sure why we don't force the schema 
of those tracing tables to the most recent version. I'm probably going to try 
that and see if something breaks, but if someone more familiar with how this 
keyspace is handled wants to chime in with another idea, feel free.
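
As a rough illustration of what "forcing the schema to the most recent version" could mean, the sketch below compares the definition stored on the node with the expected one and adds whatever columns are missing. It is only a sketch under stated assumptions: tables are modelled as plain name-to-type maps, the class and method names ({{TraceSchemaReconcileSketch}}, {{missingColumns}}, {{forceToLatest}}) are hypothetical, and the CQL type names are my reading of the marshal types in the quoted error. It does not use any Cassandra API and is not a proposed patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: tables are modelled as column name -> CQL type maps.
// None of these names come from the Cassandra code base.
final class TraceSchemaReconcileSketch {

    // Columns present in the expected (most recent) definition but absent
    // from the definition currently stored on the node.
    static Map<String, String> missingColumns(Map<String, String> expected,
                                              Map<String, String> existing) {
        Map<String, String> missing = new LinkedHashMap<>();
        for (Map.Entry<String, String> column : expected.entrySet()) {
            if (!existing.containsKey(column.getKey())) {
                missing.put(column.getKey(), column.getValue());
            }
        }
        return missing;
    }

    // "Forces" the stored definition forward by adding whatever is missing.
    // Existing columns are left untouched and nothing is ever dropped.
    static Map<String, String> forceToLatest(Map<String, String> expected,
                                             Map<String, String> existing) {
        Map<String, String> merged = new LinkedHashMap<>(existing);
        merged.putAll(missingColumns(expected, existing));
        return merged;
    }

    public static void main(String[] args) {
        // Regular columns of the pre-upgrade sessions table (assumed types).
        Map<String, String> existing = new LinkedHashMap<>();
        existing.put("coordinator", "inet");
        existing.put("duration", "int");
        existing.put("request", "text");
        existing.put("started_at", "timestamp");
        existing.put("parameters", "map<text, text>");

        // Expected definition: the same columns plus the newer ones.
        Map<String, String> expected = new LinkedHashMap<>(existing);
        expected.put("client", "inet");
        expected.put("command", "text");

        System.out.println("missing: " + missingColumns(expected, existing));
        System.out.println("after:   " + forceToLatest(expected, existing).keySet());
    }
}
```

An additive-only merge like this is the conservative choice: it never removes or retypes a column that already has data, which is the part most likely to "break something".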

> Tracing prevents startup after upgrading
> ----------------------------------------
>
>                 Key: CASSANDRA-10652
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10652
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Carl Yeksigian
>            Priority: Blocker
>             Fix For: 3.0.0
>
>
> After upgrading from 2.1 to 3.0, the {{system_traces.sessions}} table is not 
> properly upgraded to include the {{client}} column added in CASSANDRA-8162. 
> Just upgrading from a clean 2.2 install to 3.0 won't show this error because 
> the column was already included in 2.2; the missing column just doesn't 
> break queries in that release.
> The errors I get when querying {{system_traces.sessions}}:
> {noformat}
> java.lang.RuntimeException: java.lang.IllegalStateException: 
> [ColumnDefinition{name=client, 
> type=org.apache.cassandra.db.marshal.InetAddressType, kind=REGULAR, 
> position=-1}, ColumnDefinition{name=command, 
> type=org.apache.cassandra.db.marshal.UTF8Type, kind=REGULAR, position=-1}, 
> ColumnDefinition{name=coordinator, 
> type=org.apache.cassandra.db.marshal.InetAddressType, kind=REGULAR, 
> position=-1}, ColumnDefinition{name=duration, 
> type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, position=-1}, 
> ColumnDefinition{name=request, type=org.apache.cassandra.db.marshal.UTF8Type, 
> kind=REGULAR, position=-1}, ColumnDefinition{name=started_at, 
> type=org.apache.cassandra.db.marshal.TimestampType, kind=REGULAR, 
> position=-1}, ColumnDefinition{name=parameters, 
> type=org.apache.cassandra.db.marshal.MapType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type),
>  kind=REGULAR, position=-1}] is not a subset of [coordinator duration request 
> started_at parameters]
>       at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2350)
>  ~[main/:na]
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_45]
>       at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  ~[main/:na]
>       at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [main/:na]
>       at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> Caused by: java.lang.IllegalStateException: [ColumnDefinition{name=client, 
> type=org.apache.cassandra.db.marshal.InetAddressType, kind=REGULAR, 
> position=-1}, ColumnDefinition{name=command, 
> type=org.apache.cassandra.db.marshal.UTF8Type, kind=REGULAR, position=-1}, 
> ColumnDefinition{name=coordinator, 
> type=org.apache.cassandra.db.marshal.InetAddressType, kind=REGULAR, 
> position=-1}, ColumnDefinition{name=duration, 
> type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, position=-1}, 
> ColumnDefinition{name=request, type=org.apache.cassandra.db.marshal.UTF8Type, 
> kind=REGULAR, position=-1}, ColumnDefinition{name=started_at, 
> type=org.apache.cassandra.db.marshal.TimestampType, kind=REGULAR, 
> position=-1}, ColumnDefinition{name=parameters, 
> type=org.apache.cassandra.db.marshal.MapType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type),
>  kind=REGULAR, position=-1}] is not a subset of [coordinator duration request 
> started_at parameters]
>       at 
> org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:531) 
> ~[main/:na]
>       at 
> org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:465) 
> ~[main/:na]
>       at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:178)
>  ~[main/:na]
>       at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:108)
>  ~[main/:na]
>       at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:96)
>  ~[main/:na]
>       at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:132)
>  ~[main/:na]
>       at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:87)
>  ~[main/:na]
>       at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:77)
>  ~[main/:na]
>       at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:298)
>  ~[main/:na]
>       at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:136)
>  ~[main/:na]
>       at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:128)
>  ~[main/:na]
>       at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:123)
>  ~[main/:na]
>       at 
> org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:65) 
> ~[main/:na]
>       at 
> org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:288) 
> ~[main/:na]
>       at 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1692)
>  ~[main/:na]
>       at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2346)
>  ~[main/:na]
>       ... 4 common frames omitted
> {noformat}
> This means that we cannot read the sessions once a trace has occurred on 3.0. 
> Much worse, once a trace gets into a commit log, Cassandra will fail to 
> start because replaying that file fails:
> {noformat}
> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: 
> Unexpected error deserializing mutation; saved to 
> /var/folders/j4/f48yz8cx0c302b21nf3fd1vh0000gn/T/mutation5130350977980177204dat.
>   This may be caused by replaying a mutation against a table with the same 
> name but incompatible schema.  Exception follows: java.lang.RuntimeException: 
> Unknown column client during deserialization
>       at 
> org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:633)
>  [main/:na]
>       at 
> org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:556)
>  [main/:na]
>       at 
> org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:509)
>  [main/:na]
>       at 
> org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:404)
>  [main/:na]
>       at 
> org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:151)
>  [main/:na]
>       at 
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:189) 
> [main/:na]
>       at 
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:169) 
> [main/:na]
>       at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:295) 
> [main/:na]
>       at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:561)
>  [main/:na]
>       at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689) 
> [main/:na]
> {noformat}



