[
https://issues.apache.org/jira/browse/CASSANDRA-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
david.pan updated CASSANDRA-742:
--------------------------------
Attachment: 742-write_failed_when_bootstrapping_down.patch
This patch is not a perfect solution for this issue, but with it I can sleep
soundly at night and deal with the failure the next morning. :-)
The patch removes the bootstrapping endpoint from the tokenMetadata when
other nodes detect that this node is down.
Write operations will still time out until the other nodes detect that the
bootstrapping node is down, but they succeed again once the other nodes have
removed the bootstrapping node from the pendingRanges.
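
To illustrate the idea, here is a minimal, self-contained sketch. It is not the
attached patch and not the real TokenMetadata class: the class name, the
String endpoints, and every method except getToken() are hypothetical
stand-ins. It only models the two behaviours discussed in this issue:
getToken() refusing endpoints that are not ring members, and a down
notification removing an endpoint that was only bootstrapping.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; NOT org.apache.cassandra.locator.TokenMetadata.
final class TokenMetadataSketch {
    // Endpoints that are full ring members, with their tokens.
    private final Map<String, Long> ringTokens = new HashMap<>();
    // Endpoints that are still bootstrapping (they receive writes but own no token yet).
    private final Set<String> bootstrapping = new HashSet<>();

    void addRingMember(String endpoint, long token) {
        ringTokens.put(endpoint, token);
    }

    void addBootstrapping(String endpoint) {
        bootstrapping.add(endpoint);
    }

    // True if writes are currently routed to this endpoint.
    boolean isWriteTarget(String endpoint) {
        return ringTokens.containsKey(endpoint) || bootstrapping.contains(endpoint);
    }

    // Models the reported failure: only ring members have a token.
    // An explicit throw is used so the behaviour does not depend on the -ea flag.
    long getToken(String endpoint) {
        Long token = ringTokens.get(endpoint);
        if (token == null) {
            throw new AssertionError(endpoint + " is not a member of the ring");
        }
        return token;
    }

    // Models the idea of the patch: once an endpoint is reported down,
    // forget it if it was only bootstrapping. Ring members are left untouched.
    void onDown(String endpoint) {
        bootstrapping.remove(endpoint);
    }

    public static void main(String[] args) {
        TokenMetadataSketch tm = new TokenMetadataSketch();
        tm.addRingMember("10.0.0.1", 100L);
        tm.addBootstrapping("10.0.0.9");

        try {
            tm.getToken("10.0.0.9"); // hint lookup for the dead bootstrapping node
        } catch (AssertionError e) {
            System.out.println("before onDown: " + e.getMessage());
        }

        tm.onDown("10.0.0.9");
        System.out.println("write target after onDown: " + tm.isWriteTarget("10.0.0.9"));
    }
}
```

In this model, writes stop considering the bootstrapping endpoint after
onDown() is called, so no token lookup is attempted for it. The window between
the node stopping and onDown() being called corresponds to the period in which
writes still fail.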
> write operation will throw internal error if the bootstrapping node is down
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-742
> URL: https://issues.apache.org/jira/browse/CASSANDRA-742
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.5
> Environment: linux2.6
> Reporter: david.pan
> Fix For: 0.6
>
> Attachments: 742-write_failed_when_bootstrapping_down.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> Steps to reproduce:
> 1) bootstrap a node A;
> 2) keep on inserting data while bootstrapping;
> 3) stop the service of the node A;
> 4) then the following exception was found:
> ERROR [pool-1-thread-9] 2010-01-26 10:32:39,688 Cassandra.java (line 1064) Internal error processing insert
> java.lang.AssertionError
>     at org.apache.cassandra.locator.TokenMetadata.getToken(TokenMetadata.java:213)
>     at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedMapForEndpoints(AbstractReplicationStrategy.java:142)
>     at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedEndpoints(AbstractReplicationStrategy.java:76)
>     at org.apache.cassandra.service.StorageService.getHintedEndpointMap(StorageService.java:1188)
>     at org.apache.cassandra.service.StorageProxy.insertBlocking(StorageProxy.java:169)
>     at org.apache.cassandra.service.CassandraServer.doInsert(CassandraServer.java:466)
>     at org.apache.cassandra.service.CassandraServer.insert(CassandraServer.java:417)
>     at org.apache.cassandra.service.Cassandra$Processor$insert.process(Cassandra.java:1056)
>     at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:817)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>     at java.lang.Thread.run(Thread.java:619)
> I traced the code and found that
> "org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedMapForEndpoints(Collection<InetAddress>)"
> selects a hinted endpoint for a dead endpoint, no matter whether it is a
> normal node or a bootstrapping node. To get the token of the endpoint, this
> method calls "tokenMetadata_.getToken(ep);", but getToken() asserts that
> the endpoint is a member of the ring. The bootstrapping endpoint is not yet
> a member, so the assertion fails and an internal error is thrown.
> This exception keeps being thrown until I re-bootstrap the node. This is
> really a big problem for me, because the bootstrapping takes 30 hours and
> my machines are not very reliable. I have to get out of bed at night to deal
> with this failure. :-(