[ https://issues.apache.org/jira/browse/CASSANDRA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398414#comment-15398414 ]

Paulo Motta commented on CASSANDRA-12008:
-----------------------------------------

Thanks for the patch, looking good! See comments below:

bq. Not sure if I did it correctly, but as I did it, all StreamTransferTask from 
getSSTableSectionsForRanges will have the same ranges, so it may be redundant to 
add its ranges each time we invoke StreamTransferTask.addTransferFile

You're right. In this case, let's just store a {{Map<String, Set<Range<Token>>> 
transferredRangesPerKeyspace}} on {{StreamSession}} (populated by 
{{addTransferRanges}}) and use that instead in 
{{StreamStateStoreTest.handleStreamEvent}}, so we can get everything from the 
{{StreamSession}} itself.
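A minimal standalone sketch of the idea (not Cassandra's actual classes; {{Range}} here is a simplified stand-in for {{Range<Token>}}): the session accumulates transferred ranges per keyspace in {{addTransferRanges}}, and a set deduplicates repeated ranges automatically.

```java
import java.util.*;

// Simplified sketch, assuming a stand-in Range type: accumulate the ranges
// transferred per keyspace on the session, as addTransferRanges would.
public class TransferredRangesSketch {
    // stand-in for Range<Token>: the half-open interval [left, right)
    record Range(long left, long right) {}

    private final Map<String, Set<Range>> transferredRangesPerKeyspace = new HashMap<>();

    // called once per addTransferRanges invocation; a Set dedupes repeats
    public void addTransferRanges(String keyspace, Collection<Range> ranges) {
        transferredRangesPerKeyspace
            .computeIfAbsent(keyspace, k -> new HashSet<>())
            .addAll(ranges);
    }

    public Set<Range> getTransferredRanges(String keyspace) {
        return transferredRangesPerKeyspace.getOrDefault(keyspace, Collections.emptySet());
    }

    public static void main(String[] args) {
        TransferredRangesSketch s = new TransferredRangesSketch();
        s.addTransferRanges("ks1", List.of(new Range(0, 100), new Range(100, 200)));
        s.addTransferRanges("ks1", List.of(new Range(100, 200))); // duplicate, dedupes
        System.out.println(s.getTransferredRanges("ks1").size()); // prints 2
    }
}
```

Keeping this map on the session means the stream-complete handler can read everything it needs from the {{StreamSession}} rather than threading ranges through each {{StreamTransferTask}}.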

bq. If you don't mind, I'll leave it for later; I'll create another JIRA ticket 
for refactoring the existing resumable bootstrap code once this one is done.

Sounds good! :-)

bq. If we are running decommission for the first time, getStreamedRanges is still 
invoked, raising a nodetool exception "Undefined column name operation". To solve 
that we should probably modify the isDecommisioning flag behaviour, add a new 
flag indicating whether we are resuming or running for the first time, or add a 
new resumeDecommission method.

It seems {{getStreamedRanges}} is querying the 
[AVAILABLE_RANGES|https://github.com/kdmu/cassandra/blob/d546a0a855ef7c7060974920dd321caec4dbd2af/src/java/org/apache/cassandra/db/SystemKeyspace.java#L1328]
 table instead of {{STREAMED_RANGES}}; that's why it's generating the {{Undefined 
column name operation}} error.

bq. STREAMED_RANGES table is not updated; maybe it's because handleStreamEvent 
only updates the table if event.eventType == StreamEvent.Type.STREAM_COMPLETE, 
and it seems that our StreamEvent status is FAILED. We may need to change this 
to also update the table when the event failed.

Maybe it's not working because of the previous error? It would probably help to 
add a unit test to {{StreamStateStoreTest}} verifying that 
{{updateStreamedRanges}} populates the table correctly and that 
{{getStreamedRanges}} reads it back as expected. You can also add debug logs to 
troubleshoot.
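As a sketch of what such a round-trip check could verify (using an in-memory stand-in for the {{STREAMED_RANGES}} table; the real test would go through {{SystemKeyspace.updateStreamedRanges}} / {{getStreamedRanges}} and Cassandra's actual types):

```java
import java.util.*;

// In-memory stand-in for the STREAMED_RANGES table, keyed like the real table
// by (operation, peer, keyspace). Ranges are plain strings for illustration.
public class StreamedRangesRoundTripSketch {
    record Key(String operation, String peer, String keyspace) {}

    private final Map<Key, Set<String>> table = new HashMap<>();

    void updateStreamedRanges(String op, String peer, String ks, Set<String> ranges) {
        table.computeIfAbsent(new Key(op, peer, ks), k -> new HashSet<>()).addAll(ranges);
    }

    Set<String> getStreamedRanges(String op, String peer, String ks) {
        return table.getOrDefault(new Key(op, peer, ks), Collections.emptySet());
    }

    public static void main(String[] args) {
        StreamedRangesRoundTripSketch store = new StreamedRangesRoundTripSketch();
        store.updateStreamedRanges("decommission", "10.0.0.2", "ks1", Set.of("0-100"));
        // what was written must be readable back, so a resumed decommission can skip it
        System.out.println(store.getStreamedRanges("decommission", "10.0.0.2", "ks1")
                                .contains("0-100")); // prints true
    }
}
```

The point of the test is simply that a range recorded on stream completion is visible to a later resume; if the real implementation queries the wrong table, a round-trip test like this fails immediately.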

Minor nits:

* Remove {{resetStreamedRanges()}} method
* {{SystemKeyspace.getStreamedRanges}} is called from inside a for-loop, which 
may be inefficient; it's probably better to retrieve the result once before the 
loop and reuse it inside.
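A small sketch of the second nit, with a hypothetical {{getStreamedRanges}} stand-in that counts how many times the table is read: fetching once before the loop keeps the query count constant regardless of how many ranges are iterated.

```java
import java.util.*;

// Sketch: hoist the streamed-ranges query out of the loop.
// getStreamedRanges here is a counting stand-in, not Cassandra's method.
public class HoistQuerySketch {
    static int queryCount = 0;

    // hypothetical stand-in for SystemKeyspace.getStreamedRanges
    static Set<String> getStreamedRanges(String operation) {
        queryCount++;                       // count simulated table reads
        return Set.of("range-a", "range-b");
    }

    static List<String> planTransfers(List<String> ranges) {
        // fetch once, outside the loop, instead of once per iteration
        Set<String> alreadyStreamed = getStreamedRanges("decommission");
        List<String> toStream = new ArrayList<>();
        for (String r : ranges)
            if (!alreadyStreamed.contains(r))
                toStream.add(r);
        return toStream;
    }

    public static void main(String[] args) {
        System.out.println(planTransfers(List.of("range-a", "range-c"))); // [range-c]
        System.out.println(queryCount); // 1: one query regardless of range count
    }
}
```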

> Make decommission operations resumable
> --------------------------------------
>
>                 Key: CASSANDRA-12008
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12008
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>            Reporter: Tom van der Woerdt
>            Assignee: Kaide Mu
>            Priority: Minor
>
> We're dealing with large data sets (multiple terabytes per node) and 
> sometimes we need to add or remove nodes. These operations are very dependent 
> on the entire cluster being up, so while we're joining a new node (which 
> sometimes takes 6 hours or longer) a lot can go wrong and in a lot of cases 
> something does.
> It would be great if the ability to retry streams was implemented.
> Example to illustrate the problem :
> {code}
> 03:18 PM   ~ $ nodetool decommission
> error: Stream failed
> -- StackTrace --
> org.apache.cassandra.streaming.StreamException: Stream failed
>         at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>         at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
>         at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
>         at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>         at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>         at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>         at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210)
>         at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186)
>         at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430)
>         at 
> org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:622)
>         at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:486)
>         at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:274)
>         at java.lang.Thread.run(Thread.java:745)
> 08:04 PM   ~ $ nodetool decommission
> nodetool: Unsupported operation: Node in LEAVING state; wait for status to 
> become normal or restart
> See 'nodetool help' or 'nodetool help <command>'.
> {code}
> Streaming failed, probably due to load :
> {code}
> ERROR [STREAM-IN-/<ipaddr>] 2016-06-14 18:05:47,275 StreamSession.java:520 - 
> [Stream #<streamid>] Streaming error occurred
> java.net.SocketTimeoutException: null
>         at 
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211) 
> ~[na:1.8.0_77]
>         at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) 
> ~[na:1.8.0_77]
>         at 
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) 
> ~[na:1.8.0_77]
>         at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:268)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> {code}
> If implementing retries is not possible, can we have a 'nodetool decommission 
> resume'?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)