Re: [PATCH] futex: fix a race condition between REQUEUE_PI and task death

2014-10-29 Thread Mike Galbraith
On Wed, 2014-10-29 at 21:28 -0700, Darren Hart wrote:
> On Thu, Oct 23, 2014 at 03:28:07PM -0400, Brian Silverman wrote:
> > Here's the test code:
> >
>
> I want to say "Thanks!" and pull it into futextest... but destroying filesystems
> and BIOS errors?!? may not be ideal failure detection modes.

Re: [PATCH] futex: fix a race condition between REQUEUE_PI and task death

2014-10-29 Thread Darren Hart
On Thu, Oct 23, 2014 at 03:28:07PM -0400, Brian Silverman wrote:
> Here's the test code:
>
I want to say "Thanks!" and pull it into futextest... but destroying filesystems and BIOS errors?!? may not be ideal failure detection modes.
(Apologies for being so late to this particular party).
--

Re: [PATCH] futex: fix a race condition between REQUEUE_PI and task death

2014-10-26 Thread Thomas Gleixner
On Sat, 25 Oct 2014, Brian Silverman wrote:
> On Sat, 25 Oct 2014, Thomas Gleixner wrote:
>
> > > pi_state_free and exit_pi_state_list both clean up futex_pi_state's.
> > > exit_pi_state_list takes the hb lock first, and most callers of
> > > pi_state_free do too. requeue_pi didn't, which causes

Re: [PATCH] futex: fix a race condition between REQUEUE_PI and task death

2014-10-25 Thread Brian Silverman
On Sat, 25 Oct 2014, Thomas Gleixner wrote:
> > pi_state_free and exit_pi_state_list both clean up futex_pi_state's.
> > exit_pi_state_list takes the hb lock first, and most callers of
> > pi_state_free do too. requeue_pi didn't, which causes lots of problems.
>
> "causes lots of problems" is not really a

Re: [PATCH] futex: fix a race condition between REQUEUE_PI and task death

2014-10-25 Thread Thomas Gleixner
On Thu, 23 Oct 2014, Brian Silverman wrote:
First of all. Nice catch!
> pi_state_free and exit_pi_state_list both clean up futex_pi_state's.
> exit_pi_state_list takes the hb lock first, and most callers of
> pi_state_free do too. requeue_pi didn't, which causes lots of problems.
"causes lots of

Re: [PATCH] futex: fix a race condition between REQUEUE_PI and task death

2014-10-23 Thread Mike Galbraith
(CCs more eyeballs)
On Thu, 2014-10-23 at 15:28 -0400, Brian Silverman wrote:
> Here's the test code:
Which took a 2 socket 28 core box (NOPREEMPT) out in short order. With patchlet applied, looks like it'll stay up (37 minutes and counting), I'll squeak if it explodes.
Tested-by: Mike

Re: [PATCH] futex: fix a race condition between REQUEUE_PI and task death

2014-10-23 Thread Brian Silverman
Here's the test code:
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <inttypes.h>
#include <linux/futex.h>
// Whether to use a pthread mutex+condvar or do the raw futex operations. Either
// one will break.
// There's less user-space code involved with the non-pthread version,

[PATCH] futex: fix a race condition between REQUEUE_PI and task death

2014-10-23 Thread Brian Silverman
pi_state_free and exit_pi_state_list both clean up futex_pi_state's. exit_pi_state_list takes the hb lock first, and most callers of pi_state_free do too. requeue_pi didn't, which causes lots of problems. Move the pi_state_free calls in requeue_pi to before it drops the hb locks which it's already
