[ 
https://issues.apache.org/jira/browse/KUDU-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved KUDU-1719.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.1.0

Commit in:
https://github.com/apache/kudu/commit/2426ef30650318bafd72cc49984aa7b5719b58e0

> Heap use-after-free and deadlock on Messenger::Init() failure
> -------------------------------------------------------------
>
>                 Key: KUDU-1719
>                 URL: https://issues.apache.org/jira/browse/KUDU-1719
>             Project: Kudu
>          Issue Type: Bug
>    Affects Versions: 1.0.1
>            Reporter: Sailesh Mukil
>              Labels: easyfix
>             Fix For: 1.1.0
>
>
> {code}
> Status MessengerBuilder::Build(shared_ptr<Messenger> *msgr) {
>   ...
>   gscoped_ptr<Messenger> new_msgr(new Messenger(*this));
>   Status build_status = new_msgr->Init();
>   if (!build_status.ok()) {
>     new_msgr->AllExternalReferencesDropped();
>     return build_status;
>   }
>   ...
> }
> {code}
> When new_msgr->Init() fails, AllExternalReferencesDropped() will be called 
> which does a retain_self_.reset() which is a shared pointer to itself. If 
> that was the last shared pointer reference to itself, it will call the 
> destructor ~Messenger(). However, when the gscoped pointer 'new_msgr' goes 
> out of scope, it too will call the destructor, but by this point, all the 
> members have already been freed causing a use-after-free.
> ----
> When Messenger::Init() fails before all the reactors have been Init()'ed, the 
> reactors that haven't been Init()'ed do not give up their reference to the 
> Messenger object. This is because this reference is only given up in 
> ReactorThread::RunThread(), which is not called unless reactor->Init() is 
> called:
> {code}
> void ReactorThread::RunThread() {
>   ...
>   DVLOG(6) << "Calling ReactorThread::RunThread()...";
>   loop_.run(0);
>   VLOG(1) << name() << " thread exiting.";
>   // No longer need the messenger. This causes the messenger to
>   // get deleted when all the reactors exit.
>   reactor_->messenger_.reset();
> }
> {code}
> So when the gscoped_ptr 'new_msgr' goes out of scope, it will call the 
> destructor which calls STLDeleteElements(&reactors_); where the Messenger 
> references are finally dropped. But when the last Messenger reference is 
> dropped, that too will call the destructor ~Messenger() which gets stuck on 
> the lock:
> {code}
> Messenger::~Messenger() {
>   std::lock_guard<percpu_rwlock> guard(lock_);  // Gets stuck on this lock
>   CHECK(closing_) << "Should have already shut down";
>   STLDeleteElements(&reactors_); // This will call the ~Messenger() 
> destructor again
> }
> {code}
> The easy fix for this is to make the gscoped_ptr 'new_msgr' a raw pointer 
> instead, and not explicitly free it since it will be freed by the shared 
> pointers pointing to it. (But this may look like buggy code, not sure if 
> there's a better way)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to