Alexey Serbin has posted comments on this change.

Change subject: [catalog_manager] categorization of rw operation failures
......................................................................

Patch Set 25:

> I chased the bug. It's the following. Say node A is at term 10 and
> is leader; the current TSK seq no is 0.
> 1 - A starts CatalogManagerBgTasks::Run(), which runs since it's
> leader, but takes a while to actually get to the part where
> TryGenerateNewTskUnlocked() is called.
> 2 - In the meanwhile A loses leadership, B takes over and
> generates TSK 1, later TSK 2.
> 3 - B loses leadership, A wins it again.
> 4 - Before A gets a chance to run the "leader election callback"
> the bg task from 1 completes (it can because it's leader again).
> The TSK that gets written is 1, breaking monotonicity.
>
> Note that this is a very contrived scenario that needs leadership
> interleaving that is likely unrealistic when TSKs last for days.

One nit: in that scenario TSK 1 should fail to be written, so there is no TSK 1 in the table, just TSK 0 and TSK 2. If TSK 1 were already in the table, it would not be possible for the bg task to write a record with the same composite key.

-- 
To view, visit http://gerrit.cloudera.org:8080/6170
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I826826049e3c08a6c8345949690cbbedaea32ff8
Gerrit-PatchSet: 25
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: No
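
To illustrate the nit about the composite key, here is a minimal sketch of the two outcomes. It is not Kudu code: `TskTable`, `Outcome`, and `RunScenario` are hypothetical names, and the key layout (entry type plus sequence number) is an assumption made for illustration. It reads the nit as saying that the stale write of TSK 1 can only go through if B's TSK 1 never reached the table.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for the table holding TSK records; not Kudu code.
// Rows are keyed by an assumed composite key: (entry type, sequence number).
class TskTable {
 public:
  using Key = std::pair<std::string, int64_t>;

  // Returns true if the row was written, false if the key already exists.
  bool Insert(int64_t seq_no) {
    return rows_.insert(Key{"TSK", seq_no}).second;
  }

  // Largest stored sequence number, or -1 if the table is empty.
  int64_t MaxSeqNo() const {
    return rows_.empty() ? -1 : rows_.rbegin()->second;
  }

 private:
  std::set<Key> rows_;
};

struct Outcome {
  bool stale_write_accepted;  // did A's delayed write of TSK 1 go through?
  bool monotonic;             // are sequence numbers still written in order?
};

// Replays the quoted steps. 'b_wrote_tsk1' selects whether B's TSK 1
// actually made it into the table during B's time as leader.
Outcome RunScenario(bool b_wrote_tsk1) {
  TskTable table;
  table.Insert(0);  // A is leader; current TSK seq no is 0.

  // Step 1: A's bg task picks the next seq no, then stalls.
  const int64_t stale_next = table.MaxSeqNo() + 1;  // 1

  // Step 2: B takes over and generates TSK 1, later TSK 2.
  if (b_wrote_tsk1) table.Insert(1);
  table.Insert(2);

  // Steps 3-4: A is leader again and the stalled task completes its write.
  const int64_t max_before = table.MaxSeqNo();  // 2
  const bool accepted = table.Insert(stale_next);
  return {accepted, !accepted || stale_next > max_before};
}
```

Under these assumptions, `RunScenario(true)` rejects the stale write as a duplicate key and ordering is preserved, while `RunScenario(false)` accepts it and TSK 1 lands after TSK 2, which is the monotonicity break described in the quoted text.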