Ok, so I gave the futex code a second look.

First of all - I remembered why this is related to Go - as I explained in
https://github.com/cloudius-systems/osv/issues/853, the futex
code was very lightly used, but could be much more heavily used in Go - in
one network benchmark that Benoit was running, it was
used as much as 10,000 times per second. And it's possible that with such
heavy use, more bugs get exposed.

Second, I think maybe I found the bug that prompted Benoit's patch. The
FUTEX_WAIT code has

waitqueue &q = queues[uaddr];
sched::thread::wait_for(queues_mutex, tmr, q);

So, while we wait on this queue, the queues_mutex is NOT held.

So FUTEX_WAKE can run in parallel, and when it does, it will find this
queue, wake_one() on it, and if the queue becomes empty, it deletes the
queue object.
The problem with this is that waking the waiting thread doesn't guarantee
that it will run first (obviously). If it doesn't, the wait_for is woken up
with the object q already destroyed.
But this is a bug - when wait_for (see sched.hh) wakes up from a wait(), it
calls the disarm() method on the waitqueue. And that will crash.

Using a shared_ptr<waitqueue> instead of the waitqueue object directly will
solve this bug: FUTEX_WAKE will destroy the shared_ptr, but the object will
be not be destroyed until the FUTEX_WAIT code which is using a copy of it
will also drop it, after returning from the wake.

I bet there's a more efficient solution to this bug, not involving yet
another atomic operation (in shared_ptr), but since the performance of this
futex implementation sucks anyway (and this patch will only make a tiny
difference), I'm leaning towards accepting this patch.

I'll open an issue about this, and reconsider committing this patch with a
modified description. Thanks for bringing it back!






--
Nadav Har'El
n...@scylladb.com

On Tue, Mar 6, 2018 at 3:46 PM, Waldemar Kozaczuk <jwkozac...@gmail.com>
wrote:

> This patch adds logic to make sure that waitqueue object actually
> exists in queues map before it tries to dereference it.
>
> The content of this patch was authored by Benoit Canet.
>
> Signed-off-by: Waldemar Kozaczuk <jwkozac...@gmail.com>
> ---
>  linux.cc | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/linux.cc b/linux.cc
> index 25d28ee..0256c31 100644
> --- a/linux.cc
> +++ b/linux.cc
> @@ -54,7 +54,7 @@ extern "C" long gettid()
>  // The __cxa_guard_* functions only call futex in the rare case of
> contention,
>  // in fact so rarely that OSv existed for a year before anyone noticed
> futex
>  // was missing. So the performance of this implementation is not critical.
> -static std::unordered_map<void*, waitqueue> queues;
> +static std::unordered_map<void*, std::shared_ptr<waitqueue>> queues;
>  static mutex queues_mutex;
>  enum {
>      FUTEX_WAIT           = 0,
> @@ -71,12 +71,16 @@ int futex(int *uaddr, int op, int val, const struct
> timespec *timeout,
>      case FUTEX_WAIT:
>          WITH_LOCK(queues_mutex) {
>              if (*uaddr == val) {
> -                waitqueue &q = queues[uaddr];
> +                auto i = queues.find(uaddr);
> +                if (i == queues.end()) {
> +                    queues[uaddr] = std::make_shared<waitqueue>();
> +                }
> +                std::shared_ptr<waitqueue> q = queues[uaddr];
>                  if (timeout) {
>                      sched::timer tmr(*sched::thread::current());
>                      tmr.set(std::chrono::seconds(timeout->tv_sec) +
>                              std::chrono::nanoseconds(timeout->tv_nsec));
> -                    sched::thread::wait_for(queues_mutex, tmr, q);
> +                    sched::thread::wait_for(queues_mutex, tmr, *q);
>                      // FIXME: testing if tmr was expired isn't quite
> right -
>                      // we could have had both a wakeup and timer
> expiration
>                      // racing. It would be more correct to check if we
> were
> @@ -86,7 +90,7 @@ int futex(int *uaddr, int op, int val, const struct
> timespec *timeout,
>                          return -1;
>                      }
>                  } else {
> -                    q.wait(queues_mutex);
> +                    q->wait(queues_mutex);
>                  }
>                  return 0;
>              } else {
> @@ -104,11 +108,11 @@ int futex(int *uaddr, int op, int val, const struct
> timespec *timeout,
>              auto i = queues.find(uaddr);
>              if (i != queues.end()) {
>                  int waken = 0;
> -                while( (val > waken) && !(i->second.empty()) ) {
> -                    i->second.wake_one(queues_mutex);
> +                while( (val > waken) && !(i->second->empty()) ) {
> +                    i->second->wake_one(queues_mutex);
>                      waken++;
>                  }
> -                if(i->second.empty()) {
> +                if(i->second->empty()) {
>                      queues.erase(i);
>                  }
>                  return waken;
> --
> 2.7.4
>
> --
> You received this message because you are subscribed to the Google Groups
> "OSv Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to osv-dev+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to osv-dev+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to