[jira] [Commented] (CASSANDRA-8801) Decommissioned nodes are willing to rejoin the cluster if restarted

Carl Yeksigian (JIRA) Wed, 29 Apr 2015 09:27:18 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519664#comment-14519664 ]


Carl Yeksigian commented on CASSANDRA-8801:
-------------------------------------------

Yeah, the stack trace is below. It happens every time during decommission; I'm 
running OS X 10.9 with Java 8.

{code}
java.io.IOError: java.io.IOException: No such file or directory
        at org.apache.cassandra.net.MessagingService.shutdown(MessagingService.java:737) ~[main/:na]
        at org.apache.cassandra.service.StorageService$6.run(StorageService.java:3278) ~[main/:na]
        at org.apache.cassandra.service.StorageService.unbootstrap(StorageService.java:3351) [main/:na]
        at org.apache.cassandra.service.StorageService.decommission(StorageService.java:3288) [main/:na]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_31]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_31]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_31]
        at java.lang.reflect.Method.invoke(Method.java:483) ~[na:1.8.0_31]
        at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) [na:1.8.0_31]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_31]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_31]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_31]
        at java.lang.reflect.Method.invoke(Method.java:483) ~[na:1.8.0_31]
        at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275) [na:1.8.0_31]
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) [na:1.8.0_31]
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) [na:1.8.0_31]
        at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) [na:1.8.0_31]
        at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) [na:1.8.0_31]
        at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) [na:1.8.0_31]
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) [na:1.8.0_31]
        at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) [na:1.8.0_31]
        at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1466) [na:1.8.0_31]
        at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76) [na:1.8.0_31]
        at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307) [na:1.8.0_31]
        at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1399) [na:1.8.0_31]
        at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:828) [na:1.8.0_31]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_31]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_31]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_31]
        at java.lang.reflect.Method.invoke(Method.java:483) ~[na:1.8.0_31]
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323) [na:1.8.0_31]
        at sun.rmi.transport.Transport$1.run(Transport.java:200) [na:1.8.0_31]
        at sun.rmi.transport.Transport$1.run(Transport.java:197) [na:1.8.0_31]
        at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_31]
        at sun.rmi.transport.Transport.serviceCall(Transport.java:196) [na:1.8.0_31]
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568) [na:1.8.0_31]
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826) [na:1.8.0_31]
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$240(TCPTransport.java:683) [na:1.8.0_31]
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$1/497297407.run(Unknown Source) [na:1.8.0_31]
        at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_31]
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682) [na:1.8.0_31]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_31]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_31]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_31]
Caused by: java.io.IOException: No such file or directory
        at sun.nio.ch.NativeThread.signal(Native Method) ~[na:1.8.0_31]
        at sun.nio.ch.ServerSocketChannelImpl.implCloseSelectableChannel(ServerSocketChannelImpl.java:283) ~[na:1.8.0_31]
        at java.nio.channels.spi.AbstractSelectableChannel.implCloseChannel(AbstractSelectableChannel.java:234) ~[na:1.8.0_31]
        at java.nio.channels.spi.AbstractInterruptibleChannel.close(AbstractInterruptibleChannel.java:115) ~[na:1.8.0_31]
        at sun.nio.ch.ServerSocketAdaptor.close(ServerSocketAdaptor.java:137) ~[na:1.8.0_31]
        at org.apache.cassandra.net.MessagingService$SocketThread.close(MessagingService.java:958) ~[main/:na]
        at org.apache.cassandra.net.MessagingService.shutdown(MessagingService.java:733) ~[main/:na]
        ... 43 common frames omitted
{code}

> Decommissioned nodes are willing to rejoin the cluster if restarted
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-8801
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8801
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Eric Stevens
>            Assignee: Brandon Williams
>             Fix For: 3.0
>
>         Attachments: 8801-v2.txt, 8801.txt
>
>
> This issue comes from the Cassandra user group.
> If a node which was successfully decommissioned gets restarted with its data 
> directory intact, it will rejoin the cluster immediately, going to {{UN}} and 
> beginning to serve client requests.
> This is wrong - the node has consistency issues, having missed any writes 
> while it was offline because no hinted handoffs were being kept.  And in the 
> best case scenario (it's spotted and remediated immediately), near-100% 
> overstreaming will still occur.
> Also, whatever reasons the operator had for decommissioning the node would 
> presumably still be valid, so this action may threaten cluster stability if 
> the node is underpowered or suffering hardware issues.
> But what elevates this to critical is that if the node had been offline 
> longer than gc_grace_seconds, it may cause permanent and unrecoverable 
> consistency issues due to data resurrection.
> h3. Recommendation:
> A node should remember that it was decommissioned and refuse to rejoin a 
> cluster without at least a -Dflag forcing it to.  
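
The recommendation quoted above (remember the decommission, refuse to rejoin unless a -D flag forces it) could be sketched roughly as follows. This is only an illustration and not the approach taken in the attached patches: the class name {{DecommissionGuard}}, the marker file name {{DECOMMISSIONED}}, and the system property {{example.override_decommission}} are all hypothetical, and a real node would more likely persist this state in its own system tables than in a marker file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical startup guard; an illustration only, not Cassandra's implementation. */
final class DecommissionGuard {
    // Both names below are made up for this sketch.
    static final String MARKER = "DECOMMISSIONED";
    static final String OVERRIDE_PROPERTY = "example.override_decommission";

    private final Path marker;

    DecommissionGuard(Path dataDirectory) {
        this.marker = dataDirectory.resolve(MARKER);
    }

    /** Persist the fact that this node was decommissioned, so it survives a restart. */
    void markDecommissioned() {
        try {
            Files.createDirectories(marker.getParent());
            Files.write(marker, new byte[0]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if a previous run recorded a decommission in this data directory. */
    boolean wasDecommissioned() {
        return Files.exists(marker);
    }

    /** A node may join if it was never decommissioned, or if the operator forces it. */
    boolean mayJoin(boolean override) {
        return override || !wasDecommissioned();
    }

    public static void main(String[] args) {
        Path dataDir = Paths.get(args.length > 0 ? args[0] : "data");
        DecommissionGuard guard = new DecommissionGuard(dataDir);
        // Boolean.getBoolean reads the named system property, e.g. set via -D on the command line.
        boolean override = Boolean.getBoolean(OVERRIDE_PROPERTY);
        if (!guard.mayJoin(override)) {
            System.err.println("This node was decommissioned; refusing to rejoin. "
                    + "Set -D" + OVERRIDE_PROPERTY + "=true to force.");
            System.exit(1);
        }
        System.out.println("Startup check passed; node may join.");
    }
}
```

With a check like this, restarting a decommissioned node with its data directory intact would fail fast instead of going to {{UN}}, while an operator who really wants the node back can still pass the override explicitly.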



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
