[jira] Commented: (CASSANDRA-713) Stacktrace when node taken offline

Jeff Lerman (JIRA) Wed, 23 Jun 2010 17:27:12 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881966#action_12881966
 ]


Jeff Lerman commented on CASSANDRA-713:
---------------------------------------

Hi all,

I just had this happen in Cassandra 0.6.1. We're only running two nodes right 
now, and the second one was barely accepting any requests; for the most part 
it was only being replicated to. The load went up to 9 consistently, so we 
investigated and noticed that its "Load" in nodetool was 2x as large as our 
other instance's. I cleared out the data and commitlogs, set autobootstrap to 
true, and put it back in.

This is where our case gets funky... We noticed the other instance's load 
going up a lot and saw that the one I had just re-added was not doing much. 
After a while of contemplating, I took the second one down again. Minutes 
later I found an open case about anticompaction happening before full 
bootstrapping occurs. I found the data/stream dir on the working instance and 
saw that it was complete... but I had already taken down the second one! So I 
deleted the stream dir to save space and figured I'd start the process again 
tomorrow.

A few hours later I am getting these Internal errors on writes:


ERROR [pool-1-thread-287117] 2010-06-23 19:16:51,754 Cassandra.java (line 1492) Internal error processing insert
java.lang.NullPointerException

The Cassandra process is still running, so I could SIGQUIT it for a thread 
dump if anyone is interested in this mystery.

Thanks,

Jeff

> Stacktrace when node taken offline
> ----------------------------------
>
>                 Key: CASSANDRA-713
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-713
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Ryan Daum
>            Assignee: Jaakko Laine
>             Fix For: 0.5
>
>
> I took a node offline last week and then attempted to re-bootstrap its token 
> range with a new cassandra install on the same IP. I made gossip forget about 
> the node by restarting all other instances, then brought up the new node. It 
> said it was bootstrapping, but it never finished bootstrapping after several 
> days. The node never showed up in the ring, but when I take it offline, I get 
> the following exception continually from all other nodes in the cluster:
> ERROR [pool-1-thread-8] 2010-01-18 21:01:32,405 Cassandra.java (line 1096) Internal error processing batch_insert
> java.lang.NullPointerException
>         at org.apache.cassandra.dht.BigIntegerToken.compareTo(BigIntegerToken.java:38)
>         at org.apache.cassandra.dht.BigIntegerToken.compareTo(BigIntegerToken.java:23)
>         at java.util.Collections.indexedBinarySearch(Collections.java:215)
>         at java.util.Collections.binarySearch(Collections.java:201)
>         at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedMapForEndpoints(AbstractReplicationStrategy.java:130)
>         at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedEndpoints(AbstractReplicationStrategy.java:76)
>         at org.apache.cassandra.service.StorageService.getHintedEndpointMap(StorageService.java:1183)
>         at org.apache.cassandra.service.StorageProxy.insertBlocking(StorageProxy.java:169)
>         at org.apache.cassandra.service.CassandraServer.doInsert(CassandraServer.java:466)
>         at org.apache.cassandra.service.CassandraServer.batch_insert(CassandraServer.java:445)
>         at org.apache.cassandra.service.Cassandra$Processor$batch_insert.process(Cassandra.java:1088)
>         at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:817)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> In addition, I get frequent UnavailableExceptions on the other nodes.
> I cannot remove the token range for this node because it never officially 
> joined the ring.
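
For readers following the trace above: the top frames show a compareTo call made from Collections.binarySearch throwing the NullPointerException. The sketch below is only an illustration of one way that shape of failure can arise, namely a null key being binary-searched against a sorted token list. The Token class and the searchThrowsNpe helper are hypothetical stand-ins written for this example; they are not Cassandra's BigIntegerToken or any code from the project, and the actual cause in this issue may differ (for instance, a non-null token wrapping a null value).

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class NullTokenSearch {
    // Hypothetical stand-in for a ring token; NOT Cassandra's BigIntegerToken.
    static final class Token implements Comparable<Token> {
        final BigInteger value;

        Token(BigInteger value) {
            this.value = value;
        }

        @Override
        public int compareTo(Token other) {
            // No null check: dereferencing a null `other` throws
            // NullPointerException from inside compareTo itself.
            return value.compareTo(other.value);
        }
    }

    // Returns true if binary-searching the sorted ring for `key`
    // throws NullPointerException, false if the search completes.
    static boolean searchThrowsNpe(List<Token> sortedRing, Token key) {
        try {
            Collections.binarySearch(sortedRing, key);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Token> ring = new ArrayList<>();
        ring.add(new Token(BigInteger.valueOf(10)));
        ring.add(new Token(BigInteger.valueOf(20)));
        ring.add(new Token(BigInteger.valueOf(30)));

        // A normal lookup with a real token completes without error.
        System.out.println(searchThrowsNpe(ring, new Token(BigInteger.valueOf(20))));
        // A null key reaches compareTo on a list element and fails there.
        System.out.println(searchThrowsNpe(ring, null));
    }
}
```

In this sketch the list elements are all valid; the failure comes purely from the key handed to the search, which would be consistent with a lookup for an endpoint that has no token recorded. That reading is an assumption, not something the report confirms.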

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
