Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14061 )
Change subject: [tablet] Fixed the bug of DeltaTracker::CountDeletedRows ...................................................................... Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/14061/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14061/2//COMMIT_MSG@38 PS2, Line 38: This is because there is DeltaTracker lack of lock protection when modify : the number of live rows in rowset_metadata_ and reset the deleted_row_count_. : This caused deleted_row_count_ to be duplicated when calculating the number : of live rows of DRS. It took me a while to understand the race. Below is a thread interleaving that I sketched which describes it: T1 T2 --- --- + In DT::Flush Take compact_flush_lock_ (excl) Take component_lock_ (excl) deleted_row_count_ = ... Release component_lock_ + In DT::FlushDMS Call RSMD::IncrementLiveRows --> RSMD::live_row_count now includes deleted_row_count_ + In DRS::CountLiveRows Take component_lock_ (shared) Call RSMD::live_row_count - DT::CountDeletedRows --> RSMD::live_row_count - deleted_row_count_ --> we double counted deleted_row_count_ !!! Take component_lock_ (excl) deleted_row_count_ = 0 Release component_lock_ Maybe you can clean this up and add it to the commit message? http://gerrit.cloudera.org:8080/#/c/14061/2/src/kudu/tablet/delta_tracker.cc File src/kudu/tablet/delta_tracker.cc: http://gerrit.cloudera.org:8080/#/c/14061/2/src/kudu/tablet/delta_tracker.cc@746 PS2, Line 746: RETURN_NOT_OK(rowset_metadata_->CommitRedoDeltaDataBlock(dms->id(), block_id)); Is it problematic that the live row count isn't changed atomically as part of this operation? Consider the following sequence: 1. One MM thread is in the middle of flushing a DMS, and makes it as far as IncrementLiveRows before a context switch. Meaning, the RSMD's live row count has changed, but neither the RSMD's last durable DMS ID nor its list of redo block IDs has been updated. 2. Meanwhile, another MM thread does a compaction, flushing the tablet metadata (and all RSMDs), including the updated live row count from #1. 3. The server crashes before the DMS flushing thread can call CommitRedoDeltaDataBlock and flush the tablet metadata. 4. On restart, the tablet is bootstrapped. Its metadata includes a live row count that reflects the DMS flush, but it's as if the DMS flush "never happened" because a) the new redo block ID wasn't committed, and b) the rowset's last durable DMS ID wasn't updated. As I see it, it seems like all of the RSMD changes in the DMS flush need to happen atomically; otherwise the live row count may not reflect reality after a crash. Do you think this is a real problem? If so, could we prove it in a unit test in some way? -- To view, visit http://gerrit.cloudera.org:8080/14061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9bb4456123087778c9dc799777c5990938a84fdf Gerrit-Change-Number: 14061 Gerrit-PatchSet: 2 Gerrit-Owner: Yao Xu <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu <[email protected]> Gerrit-Comment-Date: Wed, 14 Aug 2019 17:23:26 +0000 Gerrit-HasComments: Yes
