Alexey Serbin has posted comments on this change.

Change subject: [catalog_manager] categorization of rw operation failures
......................................................................


Patch Set 25:

> I chased the bug. It's the following. Say node A is at term 10 and
 > is leader; the current TSK seq no is 0.
 > 1 - A starts CatalogManagerBgTasks::Run(), which runs since it's
 > leader, but takes a while to actually get to the part where
 > TryGenerateNewTskUnlocked() is called.
 > 2 - In the meanwhile A loses leadership, B takes over and
 > generates TSK 1, later TSK 2.
 > 3 - B loses leadership, A wins it again.
 > 4 - Before A gets a chance to run the "leader election callback"
 > the bg task from 1 completes (it can because it's leader again).
 > The TSK that gets written is 1, breaking monotonicity.
 > 
 > Note that this is a very contrived scenario that needs leadership
 > interleaving that is likely unrealistic when TSKs last for days.

One nit: in that scenario TSK 1 should fail to be written, so there is no TSK 1 
in the table, just TSK 0 and TSK 2.  If TSK 1 is already in the table, it 
would not be possible for the bg task to write a record with the same composite 
key.
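
A minimal sketch of this point, assuming a hypothetical in-memory table keyed 
only by the TSK sequence number (the real composite key is not modeled). 
TskTable and StaleWriteSucceeds are illustrative names, not Kudu code. It 
replays the interleaving above and shows that the stale write of TSK 1 can only 
go through when TSK 1 never reached the table:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the table holding TSK records.
// Assumption: rows are unique per sequence number.
class TskTable {
 public:
  // Returns true if the row was inserted, false if the key already exists.
  bool Insert(int64_t seq_no, const std::string& writer) {
    return rows_.emplace(seq_no, writer).second;
  }

  // Largest sequence number stored, or -1 if the table is empty.
  int64_t MaxSeqNo() const {
    return rows_.empty() ? -1 : rows_.rbegin()->first;
  }

 private:
  std::map<int64_t, std::string> rows_;
};

// Replays the leadership interleaving from the quoted scenario.
// b_wrote_tsk1 selects whether B's TSK 1 actually reached the table.
// Returns true if A's stale write succeeds, i.e. monotonicity is broken.
bool StaleWriteSucceeds(bool b_wrote_tsk1) {
  TskTable table;
  table.Insert(0, "A");  // A is leader, current TSK seq no is 0.

  // Step 1: A's bg task picks the next seq no, then stalls.
  const int64_t stale_next = table.MaxSeqNo() + 1;
  assert(stale_next == 1);

  // Step 2: B takes over and generates TSK 1, later TSK 2.
  if (b_wrote_tsk1) {
    table.Insert(1, "B");
  }
  table.Insert(2, "B");

  // Steps 3-4: A regains leadership and the stalled bg task resumes,
  // attempting to write the seq no it computed before losing leadership.
  return table.Insert(stale_next, "A");
}
```

Under these assumptions, StaleWriteSucceeds(true) returns false because the 
duplicate key is rejected, while StaleWriteSucceeds(false) returns true and 
leaves a TSK 1 record written after TSK 2.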

-- 
To view, visit http://gerrit.cloudera.org:8080/6170
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I826826049e3c08a6c8345949690cbbedaea32ff8
Gerrit-PatchSet: 25
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: No
