[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Adar Dembo has submitted this change and it was merged. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. KUDU-1473: fix some tablet lock usage in CatalogManager This was probably due to the refactoring done in commit 59ff89d. Now that we're releasing tablet write locks as early as possible, we need to reacquire them (in read mode) for some operations. The lock misuse reared its head most often in the Java test TestKuduTable.testGetLocations, but could trigger in just about any test that didn't wait for table creation to finish before accessing the table. For example, CreateTableITest.TestCreateWhenMajorityOfReplicasFailCreation was a little flaky too. The backtrace was always the same: cow_object.h:82] Check failed: lock_.HasReaders() || lock_.HasWriteLock() @ 0x7fecb643e37b kudu::CowObject<>::state() at ??:0 @ 0x7fecb6422fe9 kudu::master::CatalogManager::ProcessPendingAssignments() at ??:0 @ 0x7fecb6421d3a kudu::master::CatalogManagerBgTasks::Run() at ??:0 @ 0x7fecb645c1f7 boost::_mfi::mf0<>::operator()() at ??:0 @ 0x7fecb645c15b boost::_bi::list1<>::operator()<>() at ??:0 @ 0x7fecb645c104 boost::_bi::bind_t<>::operator()() at ??:0 @ 0x7fecb645bf2a boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0 @ 0x7fecb0b66d82 boost::function0<>::operator()() at ??:0 @ 0x7fecafba96c0 kudu::Thread::SuperviseThread() at ??:0 @ 0x423faa __tsan_thread_start_func at ??:0 @ 0x7fecb28459d1 start_thread at ??:0 @ 0x7fecacd108fd clone at ??:0 Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Reviewed-on: http://gerrit.cloudera.org:8080/3309 Reviewed-by: Mike Percy Tested-by: Adar Dembo --- M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h 2 files changed, 41 insertions(+), 26 deletions(-) Approvals: Mike Percy: Looks good to me, approved Adar Dembo: Verified -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Adar Dembo has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 7: Verified+1 Overriding Jenkins, another Java testDisconnect failure. -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Mike Percy has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Hello Jean-Daniel Cryans, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3309 to look at the new patch set (#7). Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. KUDU-1473: fix some tablet lock usage in CatalogManager This was probably due to the refactoring done in commit 59ff89d. Now that we're releasing tablet write locks as early as possible, we need to reacquire them (in read mode) for some operations. The lock misuse reared its head most often in the Java test TestKuduTable.testGetLocations, but could trigger in just about any test that didn't wait for table creation to finish before accessing the table. For example, CreateTableITest.TestCreateWhenMajorityOfReplicasFailCreation was a little flaky too. The backtrace was always the same: cow_object.h:82] Check failed: lock_.HasReaders() || lock_.HasWriteLock() @ 0x7fecb643e37b kudu::CowObject<>::state() at ??:0 @ 0x7fecb6422fe9 kudu::master::CatalogManager::ProcessPendingAssignments() at ??:0 @ 0x7fecb6421d3a kudu::master::CatalogManagerBgTasks::Run() at ??:0 @ 0x7fecb645c1f7 boost::_mfi::mf0<>::operator()() at ??:0 @ 0x7fecb645c15b boost::_bi::list1<>::operator()<>() at ??:0 @ 0x7fecb645c104 boost::_bi::bind_t<>::operator()() at ??:0 @ 0x7fecb645bf2a boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0 @ 0x7fecb0b66d82 boost::function0<>::operator()() at ??:0 @ 0x7fecafba96c0 kudu::Thread::SuperviseThread() at ??:0 @ 0x423faa __tsan_thread_start_func at ??:0 @ 0x7fecb28459d1 start_thread at ??:0 @ 0x7fecacd108fd clone at ??:0 Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 --- M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h 2 files changed, 41 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/09/3309/7 -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 7: Build Started http://104.196.14.100/job/kudu-gerrit/1799/ -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Adar Dembo has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 6: (1 comment) > (1 comment) > > I wish there was an easier way to hide the data object so that we > can only access it through the lock object. It's too easy to make > the mistake what this patch is fixing when using COWLocks. I think the complexity arises because sometimes we also want to lock objects by calling StartMutation()/AbortMutation()/CommitMutation() directly, after which we'd use one of the state accessors to make changes. It's usually done in places where lock objects are awkward to work with. http://gerrit.cloudera.org:8080/#/c/3309/6/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: Line 525: // Send the "create tablet request" to a particular peer. > Comment needs updating -- it's for all peers of a particular tablet Done -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Mike Percy has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 6: (1 comment) I wish there was an easier way to hide the data object so that we can only access it through the lock object. It's too easy to make the mistake what this patch is fixing when using COWLocks. http://gerrit.cloudera.org:8080/#/c/3309/6/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: Line 525: // Send the "create tablet request" to a particular peer. Comment needs updating -- it's for all peers of a particular tablet -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Adar Dembo has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 6: Verified+1 Another Java testDisconnect failure. -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Jean-Daniel Cryans has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Jean-Daniel Cryans has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/3309/5//COMMIT_MSG Commit Message: PS5, Line 13: It's still a mystery why the Java tests > Ah. I had changed the commit description in PS4 (via gerrit) then blew it a Alright good enough. -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 6: Build Started http://104.196.14.100/job/kudu-gerrit/1784/ -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Adar Dembo has uploaded a new patch set (#6). Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. KUDU-1473: fix some tablet lock usage in CatalogManager This was probably due to the refactoring done in commit 59ff89d. Now that we're releasing tablet write locks as early as possible, we need to reacquire them (in read mode) for some operations. The lock misuse reared its head most often in the Java test TestKuduTable.testGetLocations, but could trigger in just about any test that didn't wait for table creation to finish before accessing the table. For example, CreateTableITest.TestCreateWhenMajorityOfReplicasFailCreation was a little flaky too. The backtrace was always the same: cow_object.h:82] Check failed: lock_.HasReaders() || lock_.HasWriteLock() @ 0x7fecb643e37b kudu::CowObject<>::state() at ??:0 @ 0x7fecb6422fe9 kudu::master::CatalogManager::ProcessPendingAssignments() at ??:0 @ 0x7fecb6421d3a kudu::master::CatalogManagerBgTasks::Run() at ??:0 @ 0x7fecb645c1f7 boost::_mfi::mf0<>::operator()() at ??:0 @ 0x7fecb645c15b boost::_bi::list1<>::operator()<>() at ??:0 @ 0x7fecb645c104 boost::_bi::bind_t<>::operator()() at ??:0 @ 0x7fecb645bf2a boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0 @ 0x7fecb0b66d82 boost::function0<>::operator()() at ??:0 @ 0x7fecafba96c0 kudu::Thread::SuperviseThread() at ??:0 @ 0x423faa __tsan_thread_start_func at ??:0 @ 0x7fecb28459d1 start_thread at ??:0 @ 0x7fecacd108fd clone at ??:0 Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 --- M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h 2 files changed, 41 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/09/3309/6 -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Adar Dembo has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/3309/5//COMMIT_MSG Commit Message: PS5, Line 13: It's still a mystery why the Java tests > Can't get a regression test in C++? Ah. I had changed the commit description in PS4 (via gerrit) then blew it away when I pushed PS5. Short answer: at least one C++ test was flaky due to this, and I thought that was good enough as far as regression test coverage is concerned. -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Jean-Daniel Cryans has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/3309/5//COMMIT_MSG Commit Message: PS5, Line 13: It's still a mystery why the Java tests Can't get a regression test in C++? -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Adar Dembo has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 5: Verified+1 Overriding Jenkins, the one failure was a Java test (testDisconnect) in TSAN mode. Dan has been working on that one. -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 5: Build Started http://104.196.14.100/job/kudu-gerrit/1779/ -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3309 to look at the new patch set (#5). Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. KUDU-1473: fix some tablet lock usage in CatalogManager This was probably due to the refactoring done in commit 59ff89d. Now that we're releasing tablet write locks as early as possible, we need to reacquire them (in read mode) for some operations. It's still a mystery why the Java tests sometimes trigger the locking violations and the C++ tests don't. Per the backtrace in the bug report, any GetTableLocations() call to a table whose tablets have been assigned but are otherwise still under construction may trigger a violation, as GetTableLocations() acquires a tablet lock in read mode. Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 --- M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h 2 files changed, 41 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/09/3309/5 -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 4: Build Started http://104.196.14.100/job/kudu-gerrit/1777/ -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3309 to look at the new patch set (#4). Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. KUDU-1473: fix some tablet lock usage in CatalogManager This was probably due to the refactoring done in commit 59ff89d. Now that we're releasing tablet write locks as early as possible, we need to reacquire them (in read mode) for some operations. The lock misuse reared its head most often in the Java test TestKuduTable.testGetLocations, but could trigger in just about any test that didn't wait for table creation to finish before accessing the table. For example, CreateTableITest.TestCreateWhenMajorityOfReplicasFailCreation was a little flaky too. The backtrace was always the same: cow_object.h:82] Check failed: lock_.HasReaders() || lock_.HasWriteLock() @ 0x7fecb643e37b kudu::CowObject<>::state() at ??:0 @ 0x7fecb6422fe9 kudu::master::CatalogManager::ProcessPendingAssignments() at ??:0 @ 0x7fecb6421d3a kudu::master::CatalogManagerBgTasks::Run() at ??:0 @ 0x7fecb645c1f7 boost::_mfi::mf0<>::operator()() at ??:0 @ 0x7fecb645c15b boost::_bi::list1<>::operator()<>() at ??:0 @ 0x7fecb645c104 boost::_bi::bind_t<>::operator()() at ??:0 @ 0x7fecb645bf2a boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0 @ 0x7fecb0b66d82 boost::function0<>::operator()() at ??:0 @ 0x7fecafba96c0 kudu::Thread::SuperviseThread() at ??:0 @ 0x423faa __tsan_thread_start_func at ??:0 @ 0x7fecb28459d1 start_thread at ??:0 @ 0x7fecacd108fd clone at ??:0 Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 --- M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h 2 files changed, 41 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/09/3309/4 -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 3: Build Started http://104.196.14.100/job/kudu-gerrit/1761/ -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Adar Dembo has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/3309/2//COMMIT_MSG Commit Message: Line 15: GetTableLocations() call to a table whose tablets have been assigned but are > I'm not entirely following what GetTableLocations has to do with this bug - Yes, the crash is in ProcessPendingAssignments, but the DCHECK itself is: DCHECK(lock_.HasReaders() || lock_.HasWriteLock()); So we accessed an object without a lock while someone else had that object locked for reading or for writing. I don't think that "someone else" is the background thread, so I think it's got to be another client thread. A GetTableLocations() or GetTabletLocations() on the same table would lock some tablets for reading, and since the Java code in question does exactly that (create a table followed by two calls to GetTabletLocations() on the same table), it's conceivable that it's responsible. It's strange, though, since the Java client is supposed to wait for the table to be created before continuing. That's why I'm going to spend some more time digging into this. -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Todd Lipcon has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/3309/2//COMMIT_MSG Commit Message: Line 15: GetTableLocations() call to a table whose tablets have been assigned but are I'm not entirely following what GetTableLocations has to do with this bug - isn't the crash in ProcessPendingAssignments? -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 2: Build Started http://104.196.14.100/job/kudu-gerrit/1753/ -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 1: Build Started http://104.196.14.100/job/kudu-gerrit/1745/ -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1473: fix some tablet lock usage in CatalogManager
Hello Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/3309 to review the following change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. KUDU-1473: fix some tablet lock usage in CatalogManager This was probably due to the refactoring done in commit 59ff89d. Now that we're releasing tablet write locks as early as possible, we need to reacquire them (in read mode) for some operations. It's still a mystery why the Java tests sometimes trigger the locking violations and the C++ tests don't. Per the backtrace in the bug report, any GetTableLocations() call to a table whose tablets have been assigned but are otherwise still under construction may trigger a violation, as GetTableLocations() acquires a tablet lock in read mode. Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 --- M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h 2 files changed, 41 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/09/3309/1 -- To view, visit http://gerrit.cloudera.org:8080/3309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Todd Lipcon