[
https://issues.apache.org/jira/browse/CASSANDRA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400338#comment-15400338
]
Paulo Motta commented on CASSANDRA-12008:
-----------------------------------------
Thanks for the update! See follow-up below:
bq. I've added a new strategy, please let me know what do you think about it.
Instead of building a {{streamedRangesPerEndpoints Map<InetAddress,
Set<Range<Token>>}} manually, maybe it's better to modify {{getStreamedRanges}}
to take a description and keyspace as argument an return a {{Map<InetAddress,
Set<Range<Token>>}} by querying {{"SELECT * FROM system.streamed_ranges WHERE
operation = ? AND keyspace_name = ?"}}.
This way you can query {{getStreamedRanges}} directly to filter already
transferred ranges when iterating {{rangesToStreamByKeyspace}}.
bq. So instead we will obtain StreamSession from
StreamTransferTask.getSession() when each StreamTransferTask is complete i.e
when StreamStateStore.handleStreamEvent is invoked. All these means that we are
going to only pass its responsible keyspace.
I think we can simplify that and instead of adding {{transferTasks}} to
{{SessionCompleteEvent}} we can simply add the session description and
{{transferredRangesPerKeyspace}}, and that's all we will need to populate the
streamed ranges on {{StreamStateStore}}.
A minor nit is that the transferred ranges are always being overriden on
{{addTransferRanges}} while you should append to the existing set if it's
already present on {{transferredRangesPerKeyspace}}.
bq. Don't know if there's some problem with current implementation or there's
something weird in the set-up, but it skips twice the same range:
this is for different keyspaces, so you should add the keyspace name in the log
message so it's not confusing.
bq. I think it's the set-up itself since
StorageService.getChangedRangesForLeaving is also returning the same range twice
that's probably for the same reason as above, maybe it would be a good idea to
add the keyspace name in that log as well.
> Make decommission operations resumable
> --------------------------------------
>
> Key: CASSANDRA-12008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12008
> Project: Cassandra
> Issue Type: Improvement
> Components: Streaming and Messaging
> Reporter: Tom van der Woerdt
> Assignee: Kaide Mu
> Priority: Minor
>
> We're dealing with large data sets (multiple terabytes per node) and
> sometimes we need to add or remove nodes. These operations are very dependent
> on the entire cluster being up, so while we're joining a new node (which
> sometimes takes 6 hours or longer) a lot can go wrong and in a lot of cases
> something does.
> It would be great if the ability to retry streams was implemented.
> Example to illustrate the problem :
> {code}
> 03:18 PM ~ $ nodetool decommission
> error: Stream failed
> -- StackTrace --
> org.apache.cassandra.streaming.StreamException: Stream failed
> at
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
> at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
> at
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> at
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
> at
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
> at
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
> at
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210)
> at
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186)
> at
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430)
> at
> org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:622)
> at
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:486)
> at
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:274)
> at java.lang.Thread.run(Thread.java:745)
> 08:04 PM ~ $ nodetool decommission
> nodetool: Unsupported operation: Node in LEAVING state; wait for status to
> become normal or restart
> See 'nodetool help' or 'nodetool help <command>'.
> {code}
> Streaming failed, probably due to load :
> {code}
> ERROR [STREAM-IN-/<ipaddr>] 2016-06-14 18:05:47,275 StreamSession.java:520 -
> [Stream #<streamid>] Streaming error occurred
> java.net.SocketTimeoutException: null
> at
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211)
> ~[na:1.8.0_77]
> at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> ~[na:1.8.0_77]
> at
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> ~[na:1.8.0_77]
> at
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54)
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:268)
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> {code}
> If implementing retries is not possible, can we have a 'nodetool decommission
> resume'?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)