I'd like to improve my mental model of how Cassandra bootstrapping
works.  My understanding is that bootstrapping is just an extra step
during a node's startup in which the node copies, from neighboring
nodes, the data that it should own according to its token; afterwards,
the node behaves like any other node.
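
To make that model concrete, here is a minimal sketch of token-based ownership on a ring. It is a toy, not Cassandra's code: the function names, the small integer tokens, and the "primary plus the next nodes clockwise" replica placement are all assumptions made for illustration.

```python
from bisect import bisect_left


def replicas_for(key_token, ring, replication_factor):
    """Return the nodes responsible for key_token.

    ring is a list of (token, node) pairs sorted by token. A node is
    assumed to own the range (previous token, its own token], and the
    extra replicas are assumed to be the next nodes clockwise.
    """
    if not ring:
        raise ValueError("ring has no tokens")
    tokens = [token for token, _ in ring]
    # First node whose token is >= the key's token, wrapping past the end.
    start = bisect_left(tokens, key_token) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


def add_node(ring, token, node):
    """Return a new sorted ring that includes the joining node."""
    return sorted(ring + [(token, node)])


ring = [(0, "A"), (100, "B"), (200, "C")]
print(replicas_for(120, ring, 2))      # ['C', 'A']

# After a hypothetical node D joins at token 150, it takes over the
# range (100, 150], which is the data it would copy while bootstrapping.
grown = add_node(ring, 150, "D")
print(replicas_for(120, grown, 2))     # ['D', 'C']
```

In this picture, the first question below is about when the equivalent of `add_node` becomes visible to the nodes that route reads and writes.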

If that's correct, I have a few follow-up questions:

- At what point does the new node get inserted into the hash ring so
that reads/writes for keys get directed to it?
- What are the semantics of bootstrapping a node that's been in the
cluster before and already has some data that's possibly outdated?
Should this work?  This might be useful if a node's been out of
commission for a sufficiently long period of time.
- If we pick a poor set of initial tokens, would it be sensible to
modify the tokens on existing nodes and then restart them with
bootstrapping in order to rebalance?
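
For the rebalancing question, a sketch of what a "good" set of tokens could look like. This assumes a partitioner whose tokens are integers spread uniformly over 0 to 2**127 (as with RandomPartitioner); the function name and the `token_space` parameter are hypothetical, and an order-preserving partitioner would need tokens chosen from the actual key distribution instead.

```python
def balanced_tokens(node_count, token_space=2**127):
    """Evenly spaced tokens so each node owns an equal slice of the ring.

    Assumes tokens are integers in [0, token_space) and that keys hash
    uniformly across that space.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * token_space // node_count for i in range(node_count)]


for node, token in enumerate(balanced_tokens(3)):
    print(node, token)
```

Comparing the current tokens against such a list would at least show how far each existing node would have to move.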

I've also noticed that I can get my Cassandra cluster into a weird
state via bootstrapping, where it stops accepting reads/writes.  I'm
on Cassandra 0.4.1.  A simple repro case is to start all 3 nodes of a
3-node cluster (replication factor of 2) with bootstrapping enabled.
Getting a key that I've inserted then fails with an
IndexOutOfBoundsException (Index: 0, Size: 0).  Later, around the
time of a LocationInfo flush, the BOOT-STRAPPER thread threw an
ArrayIndexOutOfBoundsException (-1).

DEBUG [pool-1-thread-2] 2009-10-28 02:18:19,907 CassandraServer.java (line 258) get
DEBUG [pool-1-thread-2] 2009-10-28 02:18:19,908 CassandraServer.java (line 307) multiget
ERROR [pool-1-thread-2] 2009-10-28 02:18:19,912 Cassandra.java (line 647) Internal error processing get
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
        at java.util.ArrayList.RangeCheck(ArrayList.java:547)
        at java.util.ArrayList.get(ArrayList.java:322)
        at org.apache.cassandra.locator.RackUnawareStrategy.getStorageTokens(RackUnawareStrategy.java:99)
        at org.apache.cassandra.locator.RackUnawareStrategy.getReadStorageEndPoints(RackUnawareStrategy.java:68)
        at org.apache.cassandra.locator.RackUnawareStrategy.getReadStorageEndPoints(RackUnawareStrategy.java:45)
        at org.apache.cassandra.service.StorageService.getReadStorageEndPoints(StorageService.java:949)
        at org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:296)
        at org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:100)
        at org.apache.cassandra.service.CassandraServer.multigetColumns(CassandraServer.java:271)
        at org.apache.cassandra.service.CassandraServer.multigetInternal(CassandraServer.java:325)
        at org.apache.cassandra.service.CassandraServer.multiget(CassandraServer.java:308)
        at org.apache.cassandra.service.CassandraServer.get(CassandraServer.java:259)
        at org.apache.cassandra.service.Cassandra$Processor$get.process(Cassandra.java:639)
        at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:627)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
 INFO [PERIODIC-FLUSHER-POOL:1] 2009-10-28 02:18:45,719 ColumnFamilyStore.java (line 369) LocationInfo has reached its threshold; switching in a fresh Memtable
 INFO [PERIODIC-FLUSHER-POOL:1] 2009-10-28 02:18:45,720 ColumnFamilyStore.java (line 1178) Enqueuing flush of Memtable(LocationInfo)@1048641931
 INFO [MEMTABLE-FLUSHER-POOL:1] 2009-10-28 02:18:45,721 Memtable.java (line 186) Flushing Memtable(LocationInfo)@1048641931
DEBUG [COMMIT-LOG-WRITER] 2009-10-28 02:18:45,877 CommitLog.java (line 466) discard completed log segments for CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1256696265813.log', position=423), column family 4. CFIDs are Keyspace1: TableMetadata(Standard2: 1, Super1: 0, Standard1: 2, StandardByUUID1: 3, }), system: TableMetadata(LocationInfo: 4, HintsColumnFamily: 5, }), Analytics: TableMetadata(total: 6, domain: 7, movie: 8, provider: 9, country: 10, }), }
DEBUG [COMMIT-LOG-WRITER] 2009-10-28 02:18:45,878 CommitLog.java (line 509) Marking replay position 423 on commit log /var/lib/cassandra/commitlog/CommitLog-1256696265813.log
 INFO [MEMTABLE-FLUSHER-POOL:1] 2009-10-28 02:18:45,878 Memtable.java (line 220) Completed flushing /var/lib/cassandra/data/system/LocationInfo-1-Data.db
DEBUG [BOOT-STRAPPER:1] 2009-10-28 02:18:45,954 BootStrapper.java (line 83) Exception was generated at : 10/28/2009 02:18:45 on thread BOOT-STRAPPER:1
-1
java.lang.ArrayIndexOutOfBoundsException: -1
        at java.util.ArrayList.get(ArrayList.java:324)
        at org.apache.cassandra.service.StorageService.getAllRanges(StorageService.java:886)
        at org.apache.cassandra.dht.BootStrapper.getRangesWithSourceTarget(BootStrapper.java:98)
        at org.apache.cassandra.dht.BootStrapper.run(BootStrapper.java:73)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
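
The "Index: 0, Size: 0" in the first trace means getStorageTokens indexed into a list that had no entries. A toy sketch of that failure mode is below; it is not Cassandra's code, `primary_token` is a hypothetical stand-in for a ring lookup, and it assumes (without confirmation from the source) that the list in question is the node's view of the ring's tokens.

```python
from bisect import bisect_left


def primary_token(sorted_tokens, key_token):
    """Hypothetical ring lookup: first token >= key_token, wrapping to index 0."""
    index = bisect_left(sorted_tokens, key_token)
    if index == len(sorted_tokens):
        index = 0  # wrap around the ring
    # With an empty token list this indexes position 0 of nothing and fails.
    return sorted_tokens[index]


print(primary_token([10, 20], 15))  # 20

try:
    primary_token([], 42)
except IndexError as exc:
    print("lookup on an empty ring failed:", exc)
```

If every node starts in bootstrap mode, one possible reading is that no node has yet entered the ring, so any lookup sees an empty token list.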

Edmond
