Adar Dembo has posted comments on this change.

Change subject: design-docs: multi-master for 1.0 release
......................................................................


Patch Set 1:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/2527/1/docs/design-docs/multi-master-1.0.md
File docs/design-docs/multi-master-1.0.md:

Line 30: ## Gaps in the master
> good job with those markdown tags.
Yeah, I'm a pro now. Until the next time I use JIRA's "markdown", at which 
point I'll forget all about this.


Line 89: .
> this can also cause the cluster to be unbalanced right? maybe mention that
Seems unlikely, but I'll mention it.


Line 120: f
> points 2 and 3 seem even more serious than the title of the jira ticket. wa
This issue is perhaps the most complicated of the ones listed here, and I'm 
trying to shield readers from some of that complexity.

Point 2 is actually a non-issue due to the code referenced in KUDU-759. For the 
sake of simplicity, I'm assuming in this doc that this code has been removed 
(because it's a pretty bogus workaround for that specific issue). Point 3 is 
legitimate though.

I actually think that a stuck AlterTable() is more serious than reclaiming disk 
space from deleted tablets. Still think I should change KUDU-1353, though?


Line 130: ###
> These are the features to be implemented for 1.0 right? maybe mention that
Maybe, maybe not. For example, I can see us shipping 1.0 without fixing 
KUDU-500. Perhaps even without adding support for making master Raft config 
changes.


Line 147: XXX
> yeah probably remove
Done


Line 150: ####
> yeah likely file a ticket and leave this out
Filed KUDU-1372.


Line 165: #### Table, tablet, and tserver metrics
> same
Filed KUDU-1373.


Line 200: 2. All destructive actions taken by a tserver must be "fenced". That is, the
> only destructive or all the state changing operations?
What does broadening the definition buy us? Are we splitting semantic hairs or 
is there a real difference? Maybe you could provide an example?


Line 201: takes
> s/takes/take
Done


Line 204: current master term
> they should keep an opid (i.e. term and index) instead of just term (would 
I think someone (Mike, perhaps?) suggested that the term would be sufficient 
and the index was unnecessary, but I've reviewed the various design docs in 
gdocs and I can't find that suggestion.

Can you help me understand why the term is insufficient on its own?
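
To make the question concrete, here is a minimal sketch of how I picture the term-only check on the tserver side. The names (`TabletServer`, `handle_destructive_request`, `StaleMasterError`) are hypothetical and for illustration only; this is not Kudu code.

```python
class StaleMasterError(Exception):
    """Raised when a request carries a master term older than one already seen."""


class TabletServer:
    """Hypothetical tserver that fences destructive requests by master term."""

    def __init__(self):
        self.latest_master_term = 0  # highest master term seen so far
        self.deleted_tablets = []    # stand-in for real destructive side effects

    def handle_destructive_request(self, master_term, tablet_id):
        # Reject requests from a master whose term is behind the latest one seen.
        if master_term < self.latest_master_term:
            raise StaleMasterError(
                "request term %d < latest seen term %d"
                % (master_term, self.latest_master_term))
        # Remember the newest term, then perform the destructive action.
        self.latest_master_term = master_term
        self.deleted_tablets.append(tablet_id)
```

If the opid variant is what you have in mind, the comparison would presumably be over a `(term, index)` pair instead of the bare term; what I'm missing is a case where the index changes the outcome.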


Line 206: Ensure that the leader master replicates via Raft before triggering an
        :       action. It doesn't matter what is replicated (a no-op would suffice);
        :       a successful replication asserts that this master is still the leader.
> Need to think about this a bit further. I'm a bit worried that this is poin
To be fair, I think this is more complicated than option #1 at the moment, but 
I included it here for completeness and to prompt discussion.
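
As a starting point for that discussion, here is a rough sketch of the idea: the leader replicates a no-op and only acts if a majority of the master config acknowledges it. `Follower`, `confirm_leadership`, and `run_fenced` are hypothetical names, and the model ignores the window between the successful replication and the action itself.

```python
class Follower:
    """Hypothetical follower master that tracks the highest term it has seen."""

    def __init__(self, current_term):
        self.current_term = current_term

    def append_noop(self, leader_term):
        # A follower rejects entries from a leader whose term is behind its own.
        if leader_term < self.current_term:
            return False
        self.current_term = leader_term
        return True


def confirm_leadership(leader_term, followers):
    """Replicate a no-op; succeed only if a majority of the config acks it."""
    acks = 1  # the leader counts itself
    acks += sum(1 for f in followers if f.append_noop(leader_term))
    config_size = len(followers) + 1
    return acks > config_size // 2


def run_fenced(action, leader_term, followers):
    """Run the action only after the no-op replication succeeds."""
    if not confirm_leadership(leader_term, followers):
        return False
    action()
    return True
```

In this model, a deposed leader at term 1 whose peers have moved on to term 2 cannot gather a majority, so the action never runs.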


Line 212: partially replicated
        : operations
> are you talking about the ops that need more than one consensus round? didn
Sorry for the miscommunication. I'll measure some RPC sizes and update the doc 
with my conclusions.


-- 
To view, visit http://gerrit.cloudera.org:8080/2527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iad76012977a45370b72a04d608371cecf90442ef
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes
