[jira] Commented: (CASSANDRA-195) Improve bootstrap algorithm

Sandeep Tata (JIRA) Wed, 12 Aug 2009 14:32:45 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742587#action_12742587
 ]


Sandeep Tata commented on CASSANDRA-195:
----------------------------------------


>> This should throw some kind of error -- either the other nodes are buggy, 
>> sending reads to a bootstrapping node, or your ops team has screwed up and 
>> is allowing client-level traffic to the node. neither should fail silently.

Good call. I'll throw a runtime exception here.

>> Similarly, we should add a verifyNotBootstrapping (better name?) call to 
>> CassandraServer instead of adding && !bootstrapping to all the StorageProxy 
>> calls. (There may be a place we can do this centrally by overriding the 
>> right method from the thrift-generated Cassandra interface.)

There's only one place in StorageProxy.readProtocol that we're testing for 
bootstrap mode.  I don't see why/how to move this to CassandraServer.


> + cf.addColumn(new Column(BOOTSTRAP, 
> BasicUtilities.shortToByteArray(isBootstrap?(short)1:(short)0)));
> + return new StorageMetadata(token, generation,isBootstrap);

> not to be anal, but watch the spacing on the ?: operator and method arguments 
> (this comment applies to other patch lines too)

Okay -- I'll do the right thing and move this into boolean methods in 
BasicUtilities.

>> I don't think storing the mode in ST and trying to second-guess the op is a 
>> good idea. If the op says -b, then we bootstrap. Otherwise we don't.

Yeah, I thought about this a bit ... I did this so that the op doesn't have to 
worry about re-starting correctly in bootstrap mode if the node died during 
bootstrap and got restarted. When it comes back up, you want it to resume 
bootstrap.
It makes it trickier to abort the bootstrap and startup in an unsafe mode. If 
that is really what the op wants to do, he should delete the entry in the ST 
from an admin app. I figured the normal thing would be to resume bootstrap and 
wanted to make that simpler/easier. I'd lean towards keeping this behaviour.

> + protected boolean localOnly;

>I think this means "send messages out to other nodes to bootstrap me" if true, 
>otherwise, bootstrap some other node. Is that right? It seems like those two 
>operations should be in difference classes, not a single class doing two 
>different things based on an if statement.

Yes -- that was the plan, but I ended up not using it. I'll clean this part out 
-- we're really using the same code to do either thing.

> + public static synchronized SSTableReader renameAndOpen(String dataFileName) 
> throws IOException

> fits better in SSTW -- Reader shouldn't be renaming things just as a 
> conceptual purity thing. :)

:-) Okay. Done.


> Improve bootstrap algorithm
> ---------------------------
>
>                 Key: CASSANDRA-195
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-195
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: all
>            Reporter: Sandeep Tata
>            Assignee: Sandeep Tata
>             Fix For: 0.5
>
>         Attachments: 195-v1.patch, 195-v2.patch
>
>
> When you add a node to an existing cluster and the map gets updated, the new 
> node may respond to read requests by saying it doesn't have any of the data 
> until it gets the data from the node(s) the previously owned this range (the 
> load-balancing code, when working properly can take care of this). While this 
> behaviour is compatible with eventual consistency, it would be much 
> friendlier for the new node not to "surface" in the EndPoint maps for reads 
> until it has transferred the data over from the old nodes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (CASSANDRA-195) Improve bootstrap algorithm

Reply via email to