Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Thomas Hellstrom
On 09/12/2013 11:50 PM, Maarten Lankhorst wrote: Op 12-09-13 18:44, Thomas Hellstrom schreef: On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: Op 12-09-13 17:36, Daniel Vetter schreef: On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra pet...@infradead.org wrote: So I'm poking around the

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Maarten Lankhorst
Op 13-09-13 08:44, Thomas Hellstrom schreef: On 09/12/2013 11:50 PM, Maarten Lankhorst wrote: Op 12-09-13 18:44, Thomas Hellstrom schreef: On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: Op 12-09-13 17:36, Daniel Vetter schreef: On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Thomas Hellstrom
On 09/13/2013 09:16 AM, Maarten Lankhorst wrote: Op 13-09-13 08:44, Thomas Hellstrom schreef: On 09/12/2013 11:50 PM, Maarten Lankhorst wrote: Op 12-09-13 18:44, Thomas Hellstrom schreef: On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: Op 12-09-13 17:36, Daniel Vetter schreef: On Thu, Sep

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Maarten Lankhorst
Op 13-09-13 09:46, Thomas Hellstrom schreef: On 09/13/2013 09:16 AM, Maarten Lankhorst wrote: Op 13-09-13 08:44, Thomas Hellstrom schreef: On 09/12/2013 11:50 PM, Maarten Lankhorst wrote: Op 12-09-13 18:44, Thomas Hellstrom schreef: On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: Op

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Thomas Hellstrom
On 09/13/2013 09:51 AM, Maarten Lankhorst wrote: Op 13-09-13 09:46, Thomas Hellstrom schreef: On 09/13/2013 09:16 AM, Maarten Lankhorst wrote: Op 13-09-13 08:44, Thomas Hellstrom schreef: On 09/12/2013 11:50 PM, Maarten Lankhorst wrote: Op 12-09-13 18:44, Thomas Hellstrom schreef: On

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Daniel Vetter
On Fri, Sep 13, 2013 at 7:33 AM, Thomas Hellstrom thellst...@vmware.com wrote: Given that all copy_to_user / copy_from_user paths are actually hit during testing, right? Ime it requires a bit of ingenuity to properly test this from userspace. We're using a few tricks in drm/i915 kernel testing:

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Peter Zijlstra
On Fri, Sep 13, 2013 at 09:46:03AM +0200, Thomas Hellstrom wrote: if (!bo_tryreserve()) { up_read mmap_sem(); // Release the mmap_sem to avoid deadlocks. bo_reserve(); // Wait for the BO to become available (interruptible) bo_unreserve(); // Where is

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Daniel Vetter
On Fri, Sep 13, 2013 at 10:23 AM, Thomas Hellstrom thellst...@vmware.com wrote: As previously mentioned, copy_from_user should return -EFAULT, since the VMAs are marked with VM_IO. It should not recurse into fault(), so evil user-space looses. I haven't put a printk in the code to prove this,

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Thomas Hellstrom
On 09/13/2013 10:32 AM, Daniel Vetter wrote: On Fri, Sep 13, 2013 at 10:23 AM, Thomas Hellstrom thellst...@vmware.com wrote: As previously mentioned, copy_from_user should return -EFAULT, since the VMAs are marked with VM_IO. It should not recurse into fault(), so evil user-space looses. I

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Daniel Vetter
On Fri, Sep 13, 2013 at 10:29 AM, Peter Zijlstra pet...@infradead.org wrote: On Fri, Sep 13, 2013 at 09:46:03AM +0200, Thomas Hellstrom wrote: if (!bo_tryreserve()) { up_read mmap_sem(); // Release the mmap_sem to avoid deadlocks. bo_reserve(); // Wait for the BO to

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Maarten Lankhorst
Op 13-09-13 10:23, Thomas Hellstrom schreef: On 09/13/2013 09:51 AM, Maarten Lankhorst wrote: Op 13-09-13 09:46, Thomas Hellstrom schreef: On 09/13/2013 09:16 AM, Maarten Lankhorst wrote: Op 13-09-13 08:44, Thomas Hellstrom schreef: On 09/12/2013 11:50 PM, Maarten Lankhorst wrote: Op

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Peter Zijlstra
On Fri, Sep 13, 2013 at 10:41:54AM +0200, Daniel Vetter wrote: On Fri, Sep 13, 2013 at 10:29 AM, Peter Zijlstra pet...@infradead.org wrote: On Fri, Sep 13, 2013 at 09:46:03AM +0200, Thomas Hellstrom wrote: if (!bo_tryreserve()) { up_read mmap_sem(); // Release the mmap_sem to avoid

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Thomas Hellstrom
On 09/13/2013 10:58 AM, Maarten Lankhorst wrote: Op 13-09-13 10:23, Thomas Hellstrom schreef: On 09/13/2013 09:51 AM, Maarten Lankhorst wrote: Op 13-09-13 09:46, Thomas Hellstrom schreef: On 09/13/2013 09:16 AM, Maarten Lankhorst wrote: Op 13-09-13 08:44, Thomas Hellstrom schreef: On

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Thomas Gleixner
On Thu, 12 Sep 2013, Peter Zijlstra wrote: On Thu, Sep 12, 2013 at 05:35:43PM +0100, Chris Wilson wrote: Not quite, as it would be possible for the evil userspace to trigger a GPU hang that would cause the sane userspace to spin indefinitely waiting for the error recovery to kick in.

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Thomas Gleixner
On Thu, 12 Sep 2013, Daniel Vetter wrote: On Thu, Sep 12, 2013 at 10:20 PM, Thomas Gleixner t...@linutronix.de wrote: I think for ttm drivers it's just execbuf being exploitable. But on drm/i915 we've had the same issue with the pwrite/pread ioctls, so a simple glBufferData(glMap) kind

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Thomas Gleixner
On Thu, 12 Sep 2013, Daniel Vetter wrote: On Thu, Sep 12, 2013 at 9:58 PM, Thomas Gleixner t...@linutronix.de wrote: On Thu, Sep 12, 2013 at 6:22 PM, Peter Zijlstra pet...@infradead.org wrote: If 'sane' userspace is never supposed to do this, then only insane userspace is going to

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-13 Thread Thomas Gleixner
On Thu, 12 Sep 2013, Daniel Vetter wrote: On Thu, Sep 12, 2013 at 6:22 PM, Peter Zijlstra pet...@infradead.org wrote: If 'sane' userspace is never supposed to do this, then only insane userspace is going to hurt from this and that's a GOOD (tm) thing, right? ;-) Afaik sane userspace

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Maarten Lankhorst
Op 12-09-13 17:06, Peter Zijlstra schreef: Hi Dave, So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra pet...@infradead.org wrote: So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 5:43 PM, Peter Zijlstra pet...@infradead.org wrote: The one in ttm is just bonghits to shut up lockdep: ttm can recurse into it's own pagefault handler and then deadlock, the trylock just keeps lockdep quiet. We've had that bug arise in drm/i915 due to some fun

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Chris Wilson
On Thu, Sep 12, 2013 at 06:22:10PM +0200, Peter Zijlstra wrote: On Thu, Sep 12, 2013 at 05:58:49PM +0200, Daniel Vetter wrote: On Thu, Sep 12, 2013 at 5:43 PM, Peter Zijlstra pet...@infradead.org wrote: The one in ttm is just bonghits to shut up lockdep: ttm can recurse into it's own

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Peter Zijlstra
On Thu, Sep 12, 2013 at 05:36:43PM +0200, Daniel Vetter wrote: The set_need_resched in i915_gem.c:i915_gem_fault can actually be removed. It was there to give the error handler a chance to sneak in and reset the hw/sw tracking when the gpu is dead. That hack goes back to the days when the

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 6:44 PM, Thomas Hellstrom thellst...@vmware.com wrote: I think a possible fix would be if fault() were allowed to return an error and drop the mmap_sem() before returning. Otherwise we need to track down all copy_to_user / copy_from_user which happen with bo::reserve

[Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Peter Zijlstra
Hi Dave, So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Peter Zijlstra
On Thu, Sep 12, 2013 at 05:11:25PM +0200, Maarten Lankhorst wrote: Op 12-09-13 17:06, Peter Zijlstra schreef: Hi Dave, So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Maarten Lankhorst
Op 12-09-13 17:36, Daniel Vetter schreef: On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra pet...@infradead.org wrote: So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Thomas Hellstrom
On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: Op 12-09-13 17:36, Daniel Vetter schreef: On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra pet...@infradead.org wrote: So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 6:22 PM, Peter Zijlstra pet...@infradead.org wrote: If 'sane' userspace is never supposed to do this, then only insane userspace is going to hurt from this and that's a GOOD (tm) thing, right? ;-) Afaik sane userspace doesn't hit the _deadlock_ (or lifelock if we have

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Peter Zijlstra
On Thu, Sep 12, 2013 at 05:35:43PM +0100, Chris Wilson wrote: Not quite, as it would be possible for the evil userspace to trigger a GPU hang that would cause the sane userspace to spin indefinitely waiting for the error recovery to kick in. So with FIFOn+1 preempting FIFOn its a live-lock

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 10:20 PM, Thomas Gleixner t...@linutronix.de wrote: I think for ttm drivers it's just execbuf being exploitable. But on drm/i915 we've had the same issue with the pwrite/pread ioctls, so a simple glBufferData(glMap) kind of recursion from gl clients blew the kernel to

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 9:58 PM, Thomas Gleixner t...@linutronix.de wrote: On Thu, Sep 12, 2013 at 6:22 PM, Peter Zijlstra pet...@infradead.org wrote: If 'sane' userspace is never supposed to do this, then only insane userspace is going to hurt from this and that's a GOOD (tm) thing,

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Thomas Hellstrom
On 09/12/2013 10:39 PM, Thomas Gleixner wrote: On Thu, 12 Sep 2013, Daniel Vetter wrote: On Thu, Sep 12, 2013 at 10:20 PM, Thomas Gleixner t...@linutronix.de wrote: I think for ttm drivers it's just execbuf being exploitable. But on drm/i915 we've had the same issue with the pwrite/pread

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Maarten Lankhorst
Op 12-09-13 18:44, Thomas Hellstrom schreef: On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: Op 12-09-13 17:36, Daniel Vetter schreef: On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra pet...@infradead.org wrote: So I'm poking around the preemption code and stumbled upon:

Re: [Intel-gfx] [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Thomas Hellstrom
On 09/12/2013 11:50 PM, Maarten Lankhorst wrote: Op 12-09-13 18:44, Thomas Hellstrom schreef: On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: Op 12-09-13 17:36, Daniel Vetter schreef: On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra pet...@infradead.org wrote: So I'm poking around the