Re: [PATCH] irq_work: Don't reinvent the wheel but use existing llist API

2017-11-09 Thread Chris Wilson
Quoting Frederic Weisbecker (2017-10-31 01:46:54)
> From: Byungchul Park 
> 
> Although llist provides proper APIs, irq_work open-codes the list walk
> instead of using them. Switch to the llist API.
> 
> Signed-off-by: Byungchul Park 
> Cc: Peter Zijlstra 
> Signed-off-by: Frederic Weisbecker 
> ---
>  kernel/irq_work.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/kernel/irq_work.c b/kernel/irq_work.c
> index bcf107c..e2ebe8c 100644
> --- a/kernel/irq_work.c
> +++ b/kernel/irq_work.c
> @@ -138,11 +138,7 @@ static void irq_work_run_list(struct llist_head *list)
> return;
>  
> llnode = llist_del_all(list);
> -   while (llnode != NULL) {
> -   work = llist_entry(llnode, struct irq_work, llnode);
> -
> -   llnode = llist_next(llnode);
> -
> +   llist_for_each_entry(work, llnode, llnode) {
> /*
>  * Clear the PENDING bit, after this point the @work
>  * can be re-used.

I haven't seen this reported yet, but this requires
llist_for_each_entry_safe(), because, as the comment notes, the work item
can be linked into another cpu's runlist as soon as the PENDING bit is
cleared.
-Chris
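
For reference, a minimal sketch of the suggested fix (illustrative, not
the applied patch; the loop body is abbreviated from irq_work_run_list()):

        struct irq_work *work, *tmp;
        struct llist_node *llnode;
        unsigned long flags;

        llnode = llist_del_all(list);
        llist_for_each_entry_safe(work, tmp, llnode, llnode) {
                /*
                 * The _safe iterator caches the next node before the body
                 * runs, so it is fine for @work to be re-queued on another
                 * cpu's list the moment PENDING is cleared below.
                 */
                flags = work->flags & ~IRQ_WORK_PENDING;
                xchg(&work->flags, flags);

                work->func(work);
        }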


Re: WARNING in drm_modeset_lock_all

2017-10-31 Thread Chris Wilson
Quoting syzbot (2017-10-27 09:09:50)
> This bug is generated by a dumb bot. It may contain errors.
> See https://goo.gl/tpsmEJ for details.
> Direct all questions to syzkal...@googlegroups.com.
> 
> syzbot will keep track of this bug report.
> Once a fix for this bug is committed, please reply to this email with:
> #syz fix: exact-commit-title
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid
> Note: if the crash happens again, it will cause creation of a new bug  
> report.

Can we use 

Reported-by: syzbot 


as a unique tag for tracking purposes?
-Chris


Re: WARNING in drm_modeset_lock_all

2017-10-30 Thread Chris Wilson
Quoting syzbot (2017-10-27 09:09:50)
> Hello,
> 
> syzkaller hit the following crash on  
> 6f20b7a58cb9c0fe00badcdfd65b1f4a8f28dfc6
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> 
> 
> 
> 
> [ cut here ]
> WARNING: CPU: 2 PID: 11675 at drivers/gpu/drm/drm_modeset_lock.c:92  
> drm_modeset_lock_all+0x1fc/0x2d0 drivers/gpu/drm/drm_modeset_lock.c:92
> Kernel panic - not syncing: panic_on_warn set ...
> 
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>   __dump_stack lib/dump_stack.c:16 [inline]
>   dump_stack+0x194/0x257 lib/dump_stack.c:52
>   fail_dump lib/fault-inject.c:51 [inline]
>   should_fail+0x8c0/0xa40 lib/fault-inject.c:149
>   should_failslab+0xec/0x120 mm/failslab.c:31
>   slab_pre_alloc_hook mm/slab.h:422 [inline]
>   slab_alloc mm/slab.c:3383 [inline]
>   kmem_cache_alloc_trace+0x4b/0x750 mm/slab.c:3625
>   kmalloc include/linux/slab.h:493 [inline]
>   kzalloc include/linux/slab.h:666 [inline]
>   drm_modeset_lock_all+0x49/0x2d0 drivers/gpu/drm/drm_modeset_lock.c:91
>   drm_mode_obj_get_properties_ioctl+0x87/0x2b0  
> drivers/gpu/drm/drm_mode_object.c:359
>   drm_ioctl_kernel+0x1e7/0x2e0 drivers/gpu/drm/drm_ioctl.c:735
>   drm_ioctl+0x72e/0xa50 drivers/gpu/drm/drm_ioctl.c:831
>   vfs_ioctl fs/ioctl.c:45 [inline]
>   do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:685
>   SYSC_ioctl fs/ioctl.c:700 [inline]
>   SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
>   entry_SYSCALL_64_fastpath+0x1f/0xbe

This bug is for the unexpected allocation failure inside
drm_modeset_lock_all() (in this case from should_fail). To properly
document this behaviour, we should probably use

diff --git a/drivers/gpu/drm/drm_modeset_lock.c 
b/drivers/gpu/drm/drm_modeset_lock.c
index e123497da0ca..963e23db0fe7 100644
--- a/drivers/gpu/drm/drm_modeset_lock.c
+++ b/drivers/gpu/drm/drm_modeset_lock.c
@@ -93,7 +93,7 @@ void drm_modeset_lock_all(struct drm_device *dev)
struct drm_modeset_acquire_ctx *ctx;
int ret;
 
-   ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+   ctx = kzalloc(sizeof(*ctx), GFP_KERNEL | __GFP_NOFAIL);
if (WARN_ON(!ctx))
return;

Then it will turn up on somebody's too-small-to-fail fix list.
-Chris


Re: [PATCH] drm/i915/selftests: Convert timers to use timer_setup()

2017-10-25 Thread Chris Wilson
Quoting Chris Wilson (2017-10-25 11:24:19)
> Quoting Chris Wilson (2017-10-24 17:17:09)
> > Quoting Kees Cook (2017-10-24 16:13:44)
> > > In preparation for unconditionally passing the struct timer_list pointer 
> > > to
> > > all timer callbacks, switch to using the new timer_setup() and 
> > > from_timer()
> > > to pass the timer pointer explicitly.
> > > 
> > > Cc: Jani Nikula 
> > > Cc: Joonas Lahtinen 
> > > Cc: Rodrigo Vivi 
> > > Cc: David Airlie 
> > > Cc: Tvrtko Ursulin 
> > > Cc: Chris Wilson 
> > > Cc: intel-...@lists.freedesktop.org
> > > Cc: dri-de...@lists.freedesktop.org
> > > Signed-off-by: Kees Cook 
> > 
> > Thank you for saving me from having to do this myself,
> > Reviewed-by: Chris Wilson 
> 
> I've a small batch of selftests patches queued, so added this one and
> will push to drm-intel-next-queued shortly.

Oh dear, major faux pas. There is no timer_setup_on_stack yet.
-Chris
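
The selftests use on-stack timers, which need the _on_stack variants so
that debugobjects can track their lifetime. A sketch of what the
conversion requires once that API exists (names as later merged; the
struct and callback here are illustrative):

        struct fake_timeout {
                struct timer_list timer;
                bool expired;
        };

        static void fake_timeout_fn(struct timer_list *t)
        {
                struct fake_timeout *ft = from_timer(ft, t, timer);

                ft->expired = true;
        }

        struct fake_timeout ft = {};

        timer_setup_on_stack(&ft.timer, fake_timeout_fn, 0);
        mod_timer(&ft.timer, jiffies + HZ);
        /* ... wait for expiry or completion ... */
        del_timer_sync(&ft.timer);
        destroy_timer_on_stack(&ft.timer);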


Re: [PATCH] drm/i915/selftests: Convert timers to use timer_setup()

2017-10-25 Thread Chris Wilson
Quoting Chris Wilson (2017-10-24 17:17:09)
> Quoting Kees Cook (2017-10-24 16:13:44)
> > In preparation for unconditionally passing the struct timer_list pointer to
> > all timer callbacks, switch to using the new timer_setup() and from_timer()
> > to pass the timer pointer explicitly.
> > 
> > Cc: Jani Nikula 
> > Cc: Joonas Lahtinen 
> > Cc: Rodrigo Vivi 
> > Cc: David Airlie 
> > Cc: Tvrtko Ursulin 
> > Cc: Chris Wilson 
> > Cc: intel-...@lists.freedesktop.org
> > Cc: dri-de...@lists.freedesktop.org
> > Signed-off-by: Kees Cook 
> 
> Thank you for saving me from having to do this myself,
> Reviewed-by: Chris Wilson 

I've a small batch of selftests patches queued, so added this one and
will push to drm-intel-next-queued shortly.
-Chris


Re: [PATCH] drm/i915/selftests: Convert timers to use timer_setup()

2017-10-24 Thread Chris Wilson
Quoting Kees Cook (2017-10-24 16:13:44)
> In preparation for unconditionally passing the struct timer_list pointer to
> all timer callbacks, switch to using the new timer_setup() and from_timer()
> to pass the timer pointer explicitly.
> 
> Cc: Jani Nikula 
> Cc: Joonas Lahtinen 
> Cc: Rodrigo Vivi 
> Cc: David Airlie 
> Cc: Tvrtko Ursulin 
> Cc: Chris Wilson 
> Cc: intel-...@lists.freedesktop.org
> Cc: dri-de...@lists.freedesktop.org
> Signed-off-by: Kees Cook 

Thank you for saving me from having to do this myself,
Reviewed-by: Chris Wilson 
-Chris
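
For context, the shape of the conversion described in the commit message
(the struct and field names here are illustrative, not from the i915
patch):

        struct widget {
                struct timer_list timer;
                bool fired;
        };

        /* Before: the callback receives an opaque unsigned long. */
        static void old_cb(unsigned long data)
        {
                struct widget *w = (struct widget *)data;

                w->fired = true;
        }
        setup_timer(&w->timer, old_cb, (unsigned long)w);

        /* After: the callback receives the timer pointer and recovers
         * the container with from_timer(). */
        static void new_cb(struct timer_list *t)
        {
                struct widget *w = from_timer(w, t, timer);

                w->fired = true;
        }
        timer_setup(&w->timer, new_cb, 0);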


Re: [Intel-gfx] [PATCH v3] drm/i915: Replace *_reference/unreference() or *_ref/unref with _get/put()

2017-10-08 Thread Chris Wilson
Quoting Harsha Sharma (2017-10-08 15:04:07)
> @@ -624,7 +624,7 @@ static bool intel_fbdev_init_bios(struct drm_device *dev,
> ifbdev->preferred_bpp = fb->base.format->cpp[0] * 8;
> ifbdev->fb = fb;
>  
> -   drm_framebuffer_reference(&ifbdev->fb->base);
> +   drm_framebuffer_put(&ifbdev->fb->base);

Whoops.
-Chris
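
The conversion must preserve the get semantics; drm_framebuffer_get() is
the new name for drm_framebuffer_reference(), so the hunk should read:

        -   drm_framebuffer_reference(&ifbdev->fb->base);
        +   drm_framebuffer_get(&ifbdev->fb->base);

As posted, the _put() drops a reference that was meant to be acquired,
leaving the framebuffer liable to be freed while still in use.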


Re: [PATCH 2/2] drm/i915: Use rcu instead of stop_machine in set_wedged

2017-10-06 Thread Chris Wilson
Quoting Daniel Vetter (2017-10-06 10:06:37)
> stop_machine is not really a locking primitive we should use, except
> when the hw folks tell us the hw is broken and that's the only way to
> work around it.
> 
> This patch tries to address the locking abuse of stop_machine() from
> 
> commit 20e4933c478a1ca694b38fa4ac44d99e659941f5
> Author: Chris Wilson 
> Date:   Tue Nov 22 14:41:21 2016 +
> 
> drm/i915: Stop the machine as we install the wedged submit_request handler
> 
> Chris said part of the reason for going with stop_machine() was that
> it adds no overhead to the fast-path. But these callbacks use irqsave
> spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast.
> 
> To stay as close as possible to the stop_machine semantics we first
> update all the submit function pointers to the nop handler, then call
> synchronize_rcu() to make sure no new requests can be submitted. This
> should give us exactly the huge barrier we want.
> 
> I pondered whether we should annotate engine->submit_request as __rcu
> and use rcu_assign_pointer and rcu_dereference on it. But the reason
> behind those is to make sure the compiler/cpu barriers are there for
> when you have an actual data structure you point at, to make sure all
> the writes are seen correctly on the read side. But we just have a
> function pointer, and .text isn't changed, so no need for these
> barriers and hence no need for annotations.
> 
> This should fix the following lockdep splat:
> 
> ==
> WARNING: possible circular locking dependency detected
> 4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G U
> --
> kworker/3:4/562 is trying to acquire lock:
>  (cpu_hotplug_lock.rw_sem){}, at: [] 
> stop_machine+0x1c/0x40
> 
> but task is already holding lock:
>  (&dev->struct_mutex){+.+.}, at: [] 
> i915_reset_device+0x1e8/0x260 [i915]
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #6 (&dev->struct_mutex){+.+.}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>__mutex_lock+0x86/0x9b0
>mutex_lock_interruptible_nested+0x1b/0x20
>i915_mutex_lock_interruptible+0x51/0x130 [i915]
>i915_gem_fault+0x209/0x650 [i915]
>__do_fault+0x1e/0x80
>__handle_mm_fault+0xa08/0xed0
>handle_mm_fault+0x156/0x300
>__do_page_fault+0x2c5/0x570
>do_page_fault+0x28/0x250
>page_fault+0x22/0x30
> 
> -> #5 (&mm->mmap_sem){}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>__might_fault+0x68/0x90
>_copy_to_user+0x23/0x70
>filldir+0xa5/0x120
>dcache_readdir+0xf9/0x170
>iterate_dir+0x69/0x1a0
>SyS_getdents+0xa5/0x140
>entry_SYSCALL_64_fastpath+0x1c/0xb1
> 
> -> #4 (&sb->s_type->i_mutex_key#5){}:
>down_write+0x3b/0x70
>handle_create+0xcb/0x1e0
>devtmpfsd+0x139/0x180
>kthread+0x152/0x190
>ret_from_fork+0x27/0x40
> 
> -> #3 ((complete)&req.done){+.+.}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>wait_for_common+0x58/0x210
>wait_for_completion+0x1d/0x20
>devtmpfs_create_node+0x13d/0x160
>device_add+0x5eb/0x620
>device_create_groups_vargs+0xe0/0xf0
>device_create+0x3a/0x40
>msr_device_create+0x2b/0x40
>cpuhp_invoke_callback+0xc9/0xbf0
>cpuhp_thread_fun+0x17b/0x240
>smpboot_thread_fn+0x18a/0x280
>kthread+0x152/0x190
>ret_from_fork+0x27/0x40
> 
> -> #2 (cpuhp_state-up){+.+.}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>cpuhp_issue_call+0x133/0x1c0
>__cpuhp_setup_state_cpuslocked+0x139/0x2a0
>__cpuhp_setup_state+0x46/0x60
>page_writeback_init+0x43/0x67
>pagecache_init+0x3d/0x42
>start_kernel+0x3a8/0x3fc
>x86_64_start_reservations+0x2a/0x2c
>x86_64_start_kernel+0x6d/0x70
>verify_cpu+0x0/0xfb
> 
> -> #1 (cpuhp_state_mutex){+.+.}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>__mutex_lock+0x86/0x9b0
>mutex_lock_nested+0x1b/0x20
>__cpuhp_setup_state_cpuslocked+0x53/0x2a0
>__cpuhp_setup_state+0x46/0x60
>page_alloc_init+0x28/0x30
>start_kernel+0x145/0x3fc
>x86_64_start_reservations+0x2a/0x2c
>x86_64_start_kernel+0x6d/0x70
>verify_cpu+0x0/0xfb
>
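
A minimal sketch of the wedging pattern Daniel describes above (the
engine iterator and nop handler names are assumed from the thread, and
all error handling is elided):

        struct intel_engine_cs *engine;
        enum intel_engine_id id;

        /* Swap every submission entry point for a nop... */
        for_each_engine(engine, i915, id)
                engine->submit_request = nop_submit_request;

        /*
         * ...then wait for all outstanding rcu_read_lock() sections to
         * finish, so no submitter can still be running the old handler.
         */
        synchronize_rcu();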

Re: [PATCH 1/2] drm/i915: Preallocate our mmu notifier workqueue to unbreak cpu hotplug deadlock

2017-10-06 Thread Chris Wilson
>lock(&dev_priv->mm_lock);
>   lock(cpu_hotplug_lock.rw_sem);
> 
>  *** DEADLOCK ***
> 
> 2 locks held by prime_mmap/1551:
>  #0:  (&mm->mmap_sem){}, at: [] 
> i915_gem_userptr_init__mmu_notifier+0x138/0x270 [i915]
>  #1:  (&dev_priv->mm_lock){+.+.}, at: [] 
> i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]
> 
> stack backtrace:
> CPU: 4 PID: 1551 Comm: prime_mmap Tainted: G U  
> 4.14.0-rc1-CI-CI_DRM_3118+ #1
> Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
> Call Trace:
>  dump_stack+0x68/0x9f
>  print_circular_bug+0x235/0x3c0
>  ? lockdep_init_map_crosslock+0x20/0x20
>  check_prev_add+0x430/0x840
>  __lock_acquire+0x1420/0x15e0
>  ? __lock_acquire+0x1420/0x15e0
>  ? lockdep_init_map_crosslock+0x20/0x20
>  lock_acquire+0xb0/0x200
>  ? apply_workqueue_attrs+0x17/0x50
>  cpus_read_lock+0x3d/0xb0
>  ? apply_workqueue_attrs+0x17/0x50
>  apply_workqueue_attrs+0x17/0x50
>  __alloc_workqueue_key+0x1d8/0x4d9
>  ? __lockdep_init_map+0x57/0x1c0
>  i915_gem_userptr_init__mmu_notifier+0x1fb/0x270 [i915]
>  i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
>  ? i915_gem_userptr_release+0x140/0x140 [i915]
>  drm_ioctl_kernel+0x69/0xb0
>  drm_ioctl+0x2f9/0x3d0
>  ? i915_gem_userptr_release+0x140/0x140 [i915]
>  ? __do_page_fault+0x2a4/0x570
>  do_vfs_ioctl+0x94/0x670
>  ? entry_SYSCALL_64_fastpath+0x5/0xb1
>  ? __this_cpu_preempt_check+0x13/0x20
>  ? trace_hardirqs_on_caller+0xe3/0x1b0
>  SyS_ioctl+0x41/0x70
>  entry_SYSCALL_64_fastpath+0x1c/0xb1
> RIP: 0033:0x7fbb83c39587
> RSP: 002b:7fff188dc228 EFLAGS: 0246 ORIG_RAX: 0010
> RAX: ffda RBX: 81492963 RCX: 7fbb83c39587
> RDX: 7fff188dc260 RSI: c0186473 RDI: 0003
> RBP: c90001487f88 R08:  R09: 7fff188dc2ac
> R10: 7fbb83efcb58 R11: 0246 R12: 
> R13: 0003 R14: c0186473 R15: 7fff188dc2ac
>  ? __this_cpu_preempt_check+0x13/0x20
> 
> v2: Set ret correctly when we raced with another thread.
> 
> v3: Use Chris' diff. Attach the right lockdep splat.
> 
> Cc: Chris Wilson 
> Cc: Tvrtko Ursulin 
> Cc: Joonas Lahtinen 
> Cc: Peter Zijlstra 
> Cc: Thomas Gleixner 
> Cc: Sasha Levin 
> Cc: Marta Lofstedt 
> Cc: Tejun Heo 
> References: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3180/shard-hsw3/igt@prime_mmap@test_userptr.html
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102939

The difference is that I would have s/Bugzilla/References/.
-Chris


Re: [PATCH 1/2] drm/i915: Preallocate our mmu notifier workqueue to unbreak cpu hotplug deadlock

2017-10-06 Thread Chris Wilson
Quoting Daniel Vetter (2017-10-06 10:06:36)
> 4.14-rc1 gained the fancy new cross-release support in lockdep, which
> seems to have uncovered a few more rules about what is allowed and
> isn't.
> 
> This one here seems to indicate that allocating a work-queue while
> holding mmap_sem is a no-go, so let's try to preallocate it.

But you haven't mentioned the other reason we might want to preallocate:
it reduces the mmap_sem/mm_lock coverage anyway.
-Chris
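
A sketch of the reordering being asked for (simplified; the real patch
also has to cope with a registration racing from another thread):

        struct workqueue_struct *wq;

        /*
         * Allocate the workqueue before taking mmap_sem/mm_lock, so that
         * the cpu_hotplug_lock acquired inside apply_workqueue_attrs()
         * is never nested under either of them.
         */
        wq = alloc_workqueue("i915-userptr-release", WQ_UNBOUND, 0);
        if (!wq)
                return ERR_PTR(-ENOMEM);

        down_write(&mm->mmap_sem);
        mutex_lock(&dev_priv->mm_lock);
        /* ... register the notifier, handing over wq ... */
        mutex_unlock(&dev_priv->mm_lock);
        up_write(&mm->mmap_sem);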


Re: [PATCH 2/2] drm/i915: Use rcu instead of stop_machine in set_wedged

2017-10-06 Thread Chris Wilson
Quoting Daniel Vetter (2017-10-06 10:06:37)
> stop_machine is not really a locking primitive we should use, except
> when the hw folks tell us the hw is broken and that's the only way to
> work around it.
> 
> This patch tries to address the locking abuse of stop_machine() from
> 
> commit 20e4933c478a1ca694b38fa4ac44d99e659941f5
> Author: Chris Wilson 
> Date:   Tue Nov 22 14:41:21 2016 +
> 
> drm/i915: Stop the machine as we install the wedged submit_request handler
> 
> Chris said part of the reason for going with stop_machine() was that
> it adds no overhead to the fast-path. But these callbacks use irqsave
> spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast.

I still want a discussion of the reason why keeping the normal path clean
and why an alternative is sought, here. That design leads into vv

> To stay as close as possible to the stop_machine semantics we first
> update all the submit function pointers to the nop handler, then call
> synchronize_rcu() to make sure no new requests can be submitted. This
> should give us exactly the huge barrier we want.
> 
> I pondered whether we should annotate engine->submit_request as __rcu
> and use rcu_assign_pointer and rcu_dereference on it. But the reason
> behind those is to make sure the compiler/cpu barriers are there for
> when you have an actual data structure you point at, to make sure all
> the writes are seen correctly on the read side. But we just have a
> function pointer, and .text isn't changed, so no need for these
> barriers and hence no need for annotations.
> 
> This should fix the following lockdep splat:
> 
> ==
> WARNING: possible circular locking dependency detected
> 4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G U
> --
> kworker/3:4/562 is trying to acquire lock:
>  (cpu_hotplug_lock.rw_sem){}, at: [] 
> stop_machine+0x1c/0x40
> 
> but task is already holding lock:
>  (&dev->struct_mutex){+.+.}, at: [] 
> i915_reset_device+0x1e8/0x260 [i915]
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #6 (&dev->struct_mutex){+.+.}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>__mutex_lock+0x86/0x9b0
>mutex_lock_interruptible_nested+0x1b/0x20
>i915_mutex_lock_interruptible+0x51/0x130 [i915]
>i915_gem_fault+0x209/0x650 [i915]
>__do_fault+0x1e/0x80
>__handle_mm_fault+0xa08/0xed0
>handle_mm_fault+0x156/0x300
>__do_page_fault+0x2c5/0x570
>do_page_fault+0x28/0x250
>page_fault+0x22/0x30
> 
> -> #5 (&mm->mmap_sem){}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>__might_fault+0x68/0x90
>_copy_to_user+0x23/0x70
>filldir+0xa5/0x120
>dcache_readdir+0xf9/0x170
>iterate_dir+0x69/0x1a0
>SyS_getdents+0xa5/0x140
>entry_SYSCALL_64_fastpath+0x1c/0xb1
> 
> -> #4 (&sb->s_type->i_mutex_key#5){}:
>down_write+0x3b/0x70
>handle_create+0xcb/0x1e0
>devtmpfsd+0x139/0x180
>kthread+0x152/0x190
>ret_from_fork+0x27/0x40
> 
> -> #3 ((complete)&req.done){+.+.}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>wait_for_common+0x58/0x210
>wait_for_completion+0x1d/0x20
>devtmpfs_create_node+0x13d/0x160
>device_add+0x5eb/0x620
>device_create_groups_vargs+0xe0/0xf0
>device_create+0x3a/0x40
>msr_device_create+0x2b/0x40
>cpuhp_invoke_callback+0xc9/0xbf0
>cpuhp_thread_fun+0x17b/0x240
>smpboot_thread_fn+0x18a/0x280
>kthread+0x152/0x190
>ret_from_fork+0x27/0x40
> 
> -> #2 (cpuhp_state-up){+.+.}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>cpuhp_issue_call+0x133/0x1c0
>__cpuhp_setup_state_cpuslocked+0x139/0x2a0
>__cpuhp_setup_state+0x46/0x60
>page_writeback_init+0x43/0x67
>pagecache_init+0x3d/0x42
>start_kernel+0x3a8/0x3fc
>x86_64_start_reservations+0x2a/0x2c
>x86_64_start_kernel+0x6d/0x70
>verify_cpu+0x0/0xfb
> 
> -> #1 (cpuhp_state_mutex){+.+.}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>__mutex_lock+0x86/0x9b0
>mutex_lock_nested+0x1b/0x20
>__cpuhp_setup_state_cpuslocked+0x53/0x2a0
>__cpuhp_setup_state+0x46/0x60
>page_alloc_init+0x28/0x30
>start_kernel+0x145

Re: [PATCH] drm/i915: Preallocate mmu notifier to unbreak cpu hotplug deadlock

2017-10-05 Thread Chris Wilson
Quoting Daniel Vetter (2017-10-05 14:22:06)
> 4.14-rc1 gained the fancy new cross-release support in lockdep, which
> seems to have uncovered a few more rules about what is allowed and
> isn't.
> 
> This one here seems to indicate that allocating a work-queue while
> holding mmap_sem is a no-go, so let's try to preallocate it.
> 
> Of course another way to break this chain would be somewhere in the
> cpu hotplug code, since this isn't the only trace we're finding now
> which goes through msr_create_device.
> 
> Full lockdep splat:
> 
> ==
> WARNING: possible circular locking dependency detected
> 4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G U
> --
> kworker/3:4/562 is trying to acquire lock:
>  (cpu_hotplug_lock.rw_sem){}, at: [] 
> stop_machine+0x1c/0x40
> 
> but task is already holding lock:
>  (&dev->struct_mutex){+.+.}, at: [] 
> i915_reset_device+0x1e8/0x260 [i915]
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #6 (&dev->struct_mutex){+.+.}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>__mutex_lock+0x86/0x9b0
>mutex_lock_interruptible_nested+0x1b/0x20
>i915_mutex_lock_interruptible+0x51/0x130 [i915]
>i915_gem_fault+0x209/0x650 [i915]
>__do_fault+0x1e/0x80
>__handle_mm_fault+0xa08/0xed0
>handle_mm_fault+0x156/0x300
>__do_page_fault+0x2c5/0x570
>do_page_fault+0x28/0x250
>page_fault+0x22/0x30
> 
> -> #5 (&mm->mmap_sem){}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>__might_fault+0x68/0x90
>_copy_to_user+0x23/0x70
>filldir+0xa5/0x120
>dcache_readdir+0xf9/0x170
>iterate_dir+0x69/0x1a0
>SyS_getdents+0xa5/0x140
>entry_SYSCALL_64_fastpath+0x1c/0xb1
> 
> -> #4 (&sb->s_type->i_mutex_key#5){}:
>down_write+0x3b/0x70
>handle_create+0xcb/0x1e0
>devtmpfsd+0x139/0x180
>kthread+0x152/0x190
>ret_from_fork+0x27/0x40
> 
> -> #3 ((complete)&req.done){+.+.}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>wait_for_common+0x58/0x210
>wait_for_completion+0x1d/0x20
>devtmpfs_create_node+0x13d/0x160
>device_add+0x5eb/0x620
>device_create_groups_vargs+0xe0/0xf0
>device_create+0x3a/0x40
>msr_device_create+0x2b/0x40
>cpuhp_invoke_callback+0xc9/0xbf0
>cpuhp_thread_fun+0x17b/0x240
>smpboot_thread_fn+0x18a/0x280
>kthread+0x152/0x190
>ret_from_fork+0x27/0x40
> 
> -> #2 (cpuhp_state-up){+.+.}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>cpuhp_issue_call+0x133/0x1c0
>__cpuhp_setup_state_cpuslocked+0x139/0x2a0
>__cpuhp_setup_state+0x46/0x60
>page_writeback_init+0x43/0x67
>pagecache_init+0x3d/0x42
>start_kernel+0x3a8/0x3fc
>x86_64_start_reservations+0x2a/0x2c
>x86_64_start_kernel+0x6d/0x70
>verify_cpu+0x0/0xfb
> 
> -> #1 (cpuhp_state_mutex){+.+.}:
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>__mutex_lock+0x86/0x9b0
>mutex_lock_nested+0x1b/0x20
>__cpuhp_setup_state_cpuslocked+0x53/0x2a0
>__cpuhp_setup_state+0x46/0x60
>page_alloc_init+0x28/0x30
>start_kernel+0x145/0x3fc
>x86_64_start_reservations+0x2a/0x2c
>x86_64_start_kernel+0x6d/0x70
>verify_cpu+0x0/0xfb
> 
> -> #0 (cpu_hotplug_lock.rw_sem){}:
>check_prev_add+0x430/0x840
>__lock_acquire+0x1420/0x15e0
>lock_acquire+0xb0/0x200
>cpus_read_lock+0x3d/0xb0
>stop_machine+0x1c/0x40
>i915_gem_set_wedged+0x1a/0x20 [i915]
>i915_reset+0xb9/0x230 [i915]
>i915_reset_device+0x1f6/0x260 [i915]
>i915_handle_error+0x2d8/0x430 [i915]
>hangcheck_declare_hang+0xd3/0xf0 [i915]
>i915_hangcheck_elapsed+0x262/0x2d0 [i915]
>process_one_work+0x233/0x660
>worker_thread+0x4e/0x3b0
>kthread+0x152/0x190
>ret_from_fork+0x27/0x40
> 
> other info that might help us debug this:
> 
> Chain exists of:
>   cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev->struct_mutex
> 
>  Possible unsafe locking scenario:
> 
>CPU0CPU1
>
>   lock(&dev->struct_mutex);
>lock(&mm->mmap_sem);
>lock(&dev->struct_mutex);
>   lock(cpu_hotplug_lock.rw_sem);
> 
>  *** DEADLOCK ***
> 
> 3 locks held by kworker/3:4/562:
>  #0:  ("events_long"){+.+.}, at: [] 
> process_one_work+0x1aa/0x660
>  #1:  ((&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: 
> [] process_one_work+0x1aa/0x660
>  #2:  (&dev->struct_mutex){+.+.}, at: [] 
> i915_reset_device+0x1e8/0x260 [i915]
> 
> stack bac

Re: [PATCH] drm/i915: Preallocate mmu notifier to unbreak cpu hotplug deadlock

2017-10-05 Thread Chris Wilson
>   lock(&dev->struct_mutex);
>lock(&mm->mmap_sem);
>lock(&dev->struct_mutex);
>   lock(cpu_hotplug_lock.rw_sem);
> 
>  *** DEADLOCK ***
> 
> 3 locks held by kworker/3:4/562:
>  #0:  ("events_long"){+.+.}, at: [] 
> process_one_work+0x1aa/0x660
>  #1:  ((&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: 
> [] process_one_work+0x1aa/0x660
>  #2:  (&dev->struct_mutex){+.+.}, at: [] 
> i915_reset_device+0x1e8/0x260 [i915]
> 
> stack backtrace:
> CPU: 3 PID: 562 Comm: kworker/3:4 Tainted: G U  
> 4.14.0-rc3-CI-CI_DRM_3179+ #1
> Hardware name:  /NUC7i5BNB, BIOS 
> BNKBL357.86A.0048.2017.0704.1415 07/04/2017
> Workqueue: events_long i915_hangcheck_elapsed [i915]
> Call Trace:
>  dump_stack+0x68/0x9f
>  print_circular_bug+0x235/0x3c0
>  ? lockdep_init_map_crosslock+0x20/0x20
>  check_prev_add+0x430/0x840
>  ? irq_work_queue+0x86/0xe0
>  ? wake_up_klogd+0x53/0x70
>  __lock_acquire+0x1420/0x15e0
>  ? __lock_acquire+0x1420/0x15e0
>  ? lockdep_init_map_crosslock+0x20/0x20
>  lock_acquire+0xb0/0x200
>  ? stop_machine+0x1c/0x40
>  ? i915_gem_object_truncate+0x50/0x50 [i915]
>  cpus_read_lock+0x3d/0xb0
>  ? stop_machine+0x1c/0x40
>  stop_machine+0x1c/0x40
>  i915_gem_set_wedged+0x1a/0x20 [i915]
>  i915_reset+0xb9/0x230 [i915]
>  i915_reset_device+0x1f6/0x260 [i915]
>  ? gen8_gt_irq_ack+0x170/0x170 [i915]
>  ? work_on_cpu_safe+0x60/0x60
>  i915_handle_error+0x2d8/0x430 [i915]
>  ? vsnprintf+0xd1/0x4b0
>  ? scnprintf+0x3a/0x70
>  hangcheck_declare_hang+0xd3/0xf0 [i915]
>  ? intel_runtime_pm_put+0x56/0xa0 [i915]
>  i915_hangcheck_elapsed+0x262/0x2d0 [i915]
>  process_one_work+0x233/0x660
>  worker_thread+0x4e/0x3b0
>  kthread+0x152/0x190
>  ? process_one_work+0x660/0x660
>  ? kthread_create_on_node+0x40/0x40
>  ret_from_fork+0x27/0x40
> Setting dangerous option reset - tainting kernel
> i915 :00:02.0: Resetting chip after gpu hang
> Setting dangerous option reset - tainting kernel
> i915 :00:02.0: Resetting chip after gpu hang
> 
> Cc: Chris Wilson 
> Cc: Tvrtko Ursulin 
> Cc: Joonas Lahtinen 
> Cc: Peter Zijlstra 
> Cc: Thomas Gleixner 
> Cc: Sasha Levin 
> Signed-off-by: Daniel Vetter 
> 
> --
> 
> Patch itself pretty much untested, I just want to figure out whether
> we should fix this (and similar backtraces going through
> msr_create_device) in i915, or whether the cpu hotplug folks will take
> care of them all.

Looking at the patch, we shrink the time under mmap_sem/mm_lock, so it
seems sensible (just from that pov).

> Afaict this is not because of changes in 4.14-rc1, but really only due
> to the new cross-release checks.
> 
> Note that the above trace is on top of -rc3 (plus plenty of drm/i915
> stuff), so should have Peter's recent lockdep fixes for cpu up vs.
> down already.
> -Daniel
> ---
>  drivers/gpu/drm/i915/i915_gem_userptr.c | 65 
> +++--
>  1 file changed, 29 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
> b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 2d4996de7331..afe166921572 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -160,36 +160,6 @@ static const struct mmu_notifier_ops 
> i915_gem_userptr_notifier = {
> .invalidate_range_start = i915_gem_userptr_mn_invalidate_range_start,
>  };
>  
> -static struct i915_mmu_notifier *
> -i915_mmu_notifier_create(struct mm_struct *mm)
> -{
> -   struct i915_mmu_notifier *mn;
> -   int ret;
> -
> -   mn = kmalloc(sizeof(*mn), GFP_KERNEL);
> -   if (mn == NULL)
> -   return ERR_PTR(-ENOMEM);
> -
> -   spin_lock_init(&mn->lock);
> -   mn->mn.ops = &i915_gem_userptr_notifier;
> -   mn->objects = RB_ROOT_CACHED;
> -   mn->wq = alloc_workqueue("i915-userptr-release", WQ_UNBOUND, 0);
> -   if (mn->wq == NULL) {
> -   kfree(mn);
> -   return ERR_PTR(-ENOMEM);
> -   }
> -
> -/* Protected by mmap_sem (write-lock) */
> -   ret = __mmu_notifier_register(&mn->mn, mm);
> -   if (ret) {
> -   destroy_workqueue(mn->wq);
> -   kfree(mn);
> -   return ERR_PTR(ret);
> -   }
> -
> -   return mn;
> -}
> -
>  static void
>  i915_gem_userptr_release__mmu_notifier(struct drm_i915_gem_object *obj)
>  {
> @@ -210,23 +180,46 @@ i915_gem_userptr_release__mmu_notifier(struct 
> drm_i915_gem_object *obj)
>  static struct i915_mmu_notifier *

Re: [PATCH] drm/i915/selftests: fix check for intel IOMMU

2017-10-05 Thread Chris Wilson
Quoting Arnd Bergmann (2017-10-05 13:07:22)
> An earlier bugfix tried to work around this build failure:
> 
> drivers/gpu/drm/i915/selftests/mock_gem_device.c: In function 
> 'mock_gem_device':
> drivers/gpu/drm/i915/selftests/mock_gem_device.c:151:20: error: 'struct 
> dev_archdata' has no member named 'iommu'
> 
> Checking for CONFIG_IOMMU_API is not sufficient as a compile-time
> test since that may be enabled in configurations that have neither
> INTEL_IOMMU not AMD_IOMMU enabled. This changes the check to
> INTEL_IOMMU instead, as this is the only case we actually care about.
> 
> Fixes: f46f156ea770 ("drm/i915/selftests: Only touch archdata.iommu when it 
> exists")
> Signed-off-by: Arnd Bergmann 

I'll take your wisdom. I was lost trying to nail down exactly what
config options we needed to check and settled for something that
appeared good enough (for the small selection of configs I tested).

Reviewed-by: Chris Wilson 
-Chris
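
For reference, the shape of the guard (the field name comes from the
quoted build error; a sketch of the mock device setup, not the verbatim
fix):

        #if IS_ENABLED(CONFIG_INTEL_IOMMU)
                /*
                 * dev_archdata only has an iommu member on Intel IOMMU
                 * builds; poison it so the fake device bypasses the iommu.
                 */
                pdev->dev.archdata.iommu = (void *)-1;
        #endif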


Re: [PATCH] dma-buf: remove redundant initialization of sg_table

2017-09-15 Thread Chris Wilson
Quoting Colin King (2017-09-15 00:05:16)
> From: Colin Ian King 
> 
> sg_table is being initialized and is never read before it is updated
> again later on, hence making the initialization redundant. Remove
> the initialization.
> 
> Detected by clang scan-build:
> "warning: Value stored to 'sg_table' during its initialization is
> never read"
> 
> Signed-off-by: Colin Ian King 
Reviewed-by: Chris Wilson 
-Chris


Re: [PATCH] drm/i915: remove redundant variable hw_check

2017-09-14 Thread Chris Wilson
Quoting Colin King (2017-09-14 17:21:54)
> From: Colin Ian King 
> 
> hw_check is being assigned and updated but is no longer being read,
> hence it is redundant and can be removed.
> 
> Detected by clang scan-build:
> "warning: Value stored to 'hw_check' during its initialization
> is never read"
> 
> Fixes: f6d1973db2d2 ("drm/i915: Move modeset state verifier calls")
> Signed-off-by: Colin Ian King 
Reviewed-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index f17275519484..ac261eaa5ba5 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -12359,7 +12359,6 @@ static void intel_atomic_commit_tail(struct 
> drm_atomic_state *state)
> struct drm_crtc_state *old_crtc_state, *new_crtc_state;
> struct drm_crtc *crtc;
> struct intel_crtc_state *intel_cstate;
> -   bool hw_check = intel_state->modeset;
> u64 put_domains[I915_MAX_PIPES] = {};
> unsigned crtc_vblank_mask = 0;
> int i;
> @@ -12376,7 +12375,6 @@ static void intel_atomic_commit_tail(struct 
> drm_atomic_state *state)
>  
> if (needs_modeset(new_crtc_state) ||
> to_intel_crtc_state(new_crtc_state)->update_pipe) {
> -   hw_check = true;
>  
> put_domains[to_intel_crtc(crtc)->pipe] =
> modeset_get_crtc_power_domains(crtc,
> -- 
> 2.14.1
> 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Intel-gfx] [PATCH v2 5/5] tools/testing/scatterlist: Test new __sg_alloc_table_from_pages

2017-09-06 Thread Chris Wilson
Quoting Tvrtko Ursulin (2017-09-06 13:10:57)
> 
> On 06/09/2017 11:48, Chris Wilson wrote:
> > All ascending. Interesting challenge for 3,2,1,0; it can be coalesced,
> > we just don't. I wonder if we are missing some like that. But for the
> 
Hm, how do you think descending pages could be coalesced? By
re-arranging the pages? But that would break everything, so I don't get it.

Wishful thinking; I wasn't considering the order inside the object, just
their physical addresses.
> 
> > moment, 0, 2, 1, would be a good addition to the above set.
> > 
> > Is there any value in checking overflows in this interface?
> 
> Overflows as in size > num_pages * PAGE_SIZE as passed in to 
> __sg_alloc_table_from_pages ? It is not checked in the implementation at 
> the moment and it looks it is harmless.

Just thinking aloud whether there was a way to get a mult/add overflow.
Given that we do page-size coalescing, the only avenue is a buggy max_seg.

Going back to the makefile, perhaps add the magic for ubsan as well?
-fsanitize=address -fsanitize=undefined
-Chris
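
i.e. on top of the quoted Makefile, something like:

        CFLAGS += -fsanitize=address -fsanitize=undefined
        LDFLAGS += -fsanitize=address -fsanitize=undefined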


Re: [PATCH v2 5/5] tools/testing/scatterlist: Test new __sg_alloc_table_from_pages

2017-09-06 Thread Chris Wilson
Quoting Tvrtko Ursulin (2017-09-05 11:24:03)
> From: Tvrtko Ursulin 
> 
> Exercise the new __sg_alloc_table_from_pages API (and through
> it also the old sg_alloc_table_from_pages), checking that the
> created table has the expected number of segments depending on
> the sequence of input pages and other conditions.
> 
> v2: Move to data driven for readability.
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Chris Wilson 
> Cc: linux-kernel@vger.kernel.org
> ---
>  tools/testing/scatterlist/Makefile   |  30 +
>  tools/testing/scatterlist/linux/mm.h | 125 
> +++
>  tools/testing/scatterlist/main.c |  74 +
>  3 files changed, 229 insertions(+)
>  create mode 100644 tools/testing/scatterlist/Makefile
>  create mode 100644 tools/testing/scatterlist/linux/mm.h
>  create mode 100644 tools/testing/scatterlist/main.c
> 
> diff --git a/tools/testing/scatterlist/Makefile 
> b/tools/testing/scatterlist/Makefile
> new file mode 100644
> index ..0867e0ef32d6
> --- /dev/null
> +++ b/tools/testing/scatterlist/Makefile
> @@ -0,0 +1,30 @@
> +CFLAGS += -I. -I../../include -g -O2 -Wall -fsanitize=address
> +LDFLAGS += -fsanitize=address
> +TARGETS = main
> +OFILES = main.o scatterlist.o
> +
> +ifeq ($(BUILD), 32)
> +CFLAGS += -m32
> +LDFLAGS += -m32
> +endif

Hmm, are there no HOST_CFLAGS? No. I wonder how menuconfig/xconfig get
compiled.

HOSTCC, HOSTCFLAGS.

hostprogs-y := main
always := $(hostprogs-y)

But nothing else seems to use HOSTCC in testing/selftests

> +targets: include $(TARGETS)
> +
> +main: $(OFILES)
> +
> +clean:
> +   $(RM) $(TARGETS) $(OFILES) scatterlist.c linux/scatterlist.h 
> linux/highmem.h linux/kmemleak.h asm/io.h
> +   @rmdir asm
> +
> +scatterlist.c: ../../../lib/scatterlist.c
> +   @sed -e 's/^static //' -e 's/__always_inline //' -e 's/inline //' < 
> $< > $@

I think I would have used

#define __always_inline inline
#include "../../../lib/scatterlist.c"

> diff --git a/tools/testing/scatterlist/main.c 
> b/tools/testing/scatterlist/main.c
> new file mode 100644
> index ..8ca5c8703eb7
> --- /dev/null
> +++ b/tools/testing/scatterlist/main.c
> @@ -0,0 +1,74 @@
> +#include 
> +#include 
> +
> +#include 
> +
> +#define MAX_PAGES (64)
> +
> +static void set_pages(struct page **pages, const unsigned *array, unsigned 
> num)
> +{
> +   unsigned int i;
> +
> +   assert(num < MAX_PAGES);
> +   for (i = 0; i < num; i++)
> +   pages[i] = (struct page *)(unsigned long)
> +  ((1 + array[i]) * PAGE_SIZE);

pfn_to_page(PFN_BIAS + array[i]) ? Ah, that relies on headers. Ok.

> +}
> +
> +#define pfn(...) (unsigned []){ __VA_ARGS__ }
> +
> +int main(void)
> +{
> +   const unsigned int sgmax = SCATTERLIST_MAX_SEGMENT;
> +   struct test {
> +   int alloc_ret;
> +   unsigned num_pages;
> +   unsigned *pfn;
> +   unsigned size;
> +   unsigned int max_seg;
> +   unsigned int expected_segments;
> +   } *test, tests[] = {
> +   { -EINVAL, 1, pfn(0), PAGE_SIZE, PAGE_SIZE + 1, 1 },
> +   { -EINVAL, 1, pfn(0), PAGE_SIZE, 0, 1 },
> +   { -EINVAL, 1, pfn(0), PAGE_SIZE, sgmax + 1, 1 },
> +   { 0, 1, pfn(0), PAGE_SIZE, sgmax, 1 },
> +   { 0, 1, pfn(0), 1, sgmax, 1 },
> +   { 0, 2, pfn(0, 1), 2 * PAGE_SIZE, sgmax, 1 },
> +   { 0, 3, pfn(0, 1, 3), 3 * PAGE_SIZE, sgmax, 2 },
> +   { 0, 4, pfn(0, 1, 3, 4), 4 * PAGE_SIZE, sgmax, 2 },
> +   { 0, 5, pfn(0, 1, 3, 4, 5), 5 * PAGE_SIZE, sgmax, 2 },
> +   { 0, 5, pfn(0, 1, 3, 4, 6), 5 * PAGE_SIZE, sgmax, 3 },
> +   { 0, 5, pfn(0, 1, 2, 3, 4), 5 * PAGE_SIZE, sgmax, 1 },
> +   { 0, 5, pfn(0, 1, 2, 3, 4), 5 * PAGE_SIZE, 2 * PAGE_SIZE, 3 },
> +   { 0, 6, pfn(0, 1, 2, 3, 4, 5), 6 * PAGE_SIZE, 2 * PAGE_SIZE, 
> 3 },
> +   { 0, 6, pfn(0, 2, 3, 4, 5, 6), 6 * PAGE_SIZE, 2 * PAGE_SIZE, 
> 4 },
> +   { 0, 6, pfn(0, 1, 3, 4, 5, 6), 6 * PAGE_SIZE, 2 * PAGE_SIZE, 
> 3 },

All ascending. Interesting challenge for 3,2,1,0; it can be coalesced,
we just don't. I wonder if we are missing some like that. But for the
moment, 0, 2, 1, would be a good addition to the above set.

Is there any value in checking overflows in this interface?

Lgtm, throw in a couple of inverted positions,
Reviewed-by: Chris Wilson 
-Chris
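
The suggested additions, in the same table format as the quoted patch;
the expected segment counts follow from only ascending, physically
contiguous pfns being coalesced, so every out-of-order page starts a new
segment:

        { 0, 3, pfn(0, 2, 1), 3 * PAGE_SIZE, sgmax, 3 },
        { 0, 4, pfn(3, 2, 1, 0), 4 * PAGE_SIZE, sgmax, 4 },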


Re: Abysmal scheduler performance in Linus' tree?

2017-09-06 Thread Chris Wilson
Quoting Andy Lutomirski (2017-09-06 06:13:39)
> I'm running e7d0c41ecc2e372a81741a30894f556afec24315 from Linus' tree
> today, and I'm seeing abysmal scheduler performance.  Running make -j4
> ends up with all the tasks on CPU 3 most of the time (on my
> 4-logical-thread laptop).  taskset -c 0 whatever puts whatever on CPU
> 0, but plain while true; do true; done puts the infinite loop on CPU 3
> right along with the make -j4 tasks.
> 
> This is on Fedora 26, and I don't think I'm doing anything weird.
> systemd has enabled the cpu controller, but it doesn't seem to have
> configured anything or created any non-root cgroups.
> 
> Just a heads up.  I haven't tried to diagnose it at all.

There is an issue on !llc machines arising from numa_wake_wide() where
processes are not spread widely across the low-power cores:
https://patchwork.kernel.org/patch/9875581/

The patch we are using to fix the regression is
https://cgit.freedesktop.org/drm-intel/commit/?h=topic/core-for-CI&id=6c362d9657293d700a8299acbeb2f1e24378f488
-Chris


Re: [RFC][PATCH] time: Fix ktime_get_raw() issues caused by incorrect base accumulation

2017-08-26 Thread Chris Wilson
Quoting John Stultz (2017-08-25 23:57:04)
> In commit fc6eead7c1e2 ("time: Clean up CLOCK_MONOTONIC_RAW time
> handling"), I mistakenly added the following:
> 
>  /* Update the monotonic raw base */
>  seconds = tk->raw_sec;
>  nsec = (u32)(tk->tkr_raw.xtime_nsec >> tk->tkr_raw.shift);
>  tk->tkr_raw.base = ns_to_ktime(seconds * NSEC_PER_SEC + nsec);
> 
> Which adds the raw_sec value and the shifted down raw xtime_nsec
> to the base value.
> 
> This is problematic as when calling ktime_get_raw(), we add the
> tk->tkr_raw.xtime_nsec and current offset, shift it down and add
> it to the raw base.
> 
> This results in the shifted down tk->tkr_raw.xtime_nsec being
> added twice.
> 
> My mistake, was that I was matching the monotonic base logic
> above:
> 
>  seconds = (u64)(tk->xtime_sec + tk->wall_to_monotonic.tv_sec);
>  nsec = (u32) tk->wall_to_monotonic.tv_nsec;
>  tk->tkr_mono.base = ns_to_ktime(seconds * NSEC_PER_SEC + nsec);
> 
> Which adds the wall_to_monotonic.tv_nsec value, but not the
> tk->tkr_mono.xtime_nsec value to the base.
> 
> The result of this is that ktime_get_raw() users (which are all
> internal users) see the raw time move faster than it should
> (the rate of which can vary with the current size of
> tkr_raw.xtime_nsec), which has resulted in problems, at least
> with graphics rendering performance.
> 
> To fix this, we simplify the tkr_raw.base accumulation to only
> accumulate the raw_sec portion, and do not include the
> tkr_raw.xtime_nsec portion, which will be added at read time.

This fixes our testcase of using ktime_get_raw() deltas. Thanks,
Tested-by: Chris Wilson 
-Chris
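
The resulting accumulation keeps only the seconds portion in the raw
base, per the description above (the read side contributes the shifted
tkr_raw.xtime_nsec):

        /* Update the monotonic raw base; seconds only, the nsec part
         * is added back at read time. */
        tk->tkr_raw.base = ns_to_ktime(tk->raw_sec * NSEC_PER_SEC);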


Re: [PATCH] sched/fair: Fix wake_affine() for !NUMA_BALANCING

2017-08-25 Thread Chris Wilson
Quoting Peter Zijlstra (2017-08-01 22:43:07)
> @@ -5356,20 +5295,115 @@ static int wake_wide(struct task_struct
> return 1;
>  }
>  
> +struct llc_stats {
> +   unsigned long nr_running;
> +   unsigned long load;
> +   unsigned long capacity;
> +   int has_capacity;
> +};
> +
> +static void get_llc_stats(struct llc_stats *stats, int cpu)
> +{
> +   struct sched_domain_shared *sds = 
> rcu_dereference(per_cpu(sd_llc_shared, cpu));
> +
> +   if (!sds) {
> +   memset(&stats, 0, sizeof(*stats));
 ^
Just in case, stray & before stats.
-Chris
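
i.e. the intended call zeroes the pointed-to struct rather than the
local pointer variable:

        -   memset(&stats, 0, sizeof(*stats));
        +   memset(stats, 0, sizeof(*stats));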


Re: [RFC][PATCH 4/4] time: Clean up CLOCK_MONOTONIC_RAW time handling

2017-08-25 Thread Chris Wilson
Quoting John Stultz (2017-05-27 04:33:55)
> Now that we fixed the sub-ns handling for CLOCK_MONOTONIC_RAW,
> remove the duplicitive tk->raw_time.tv_nsec, which can be
> stored in tk->tkr_raw.xtime_nsec (similarly to how its handled
> for monotonic time).

This patch breaks tkr_raw pretty badly. It is now accumulating 2x faster
than tkr_mono, with the occasional wacky jump between two reads.

e.g:

@@ -838,6 +840,11 @@ ktime_t ktime_get_raw(void)
 
} while (read_seqcount_retry(&tk_core.seq, seq));
 
+   pr_err("%s: raw.base=%llu, mono.base=%llu: nsecs=%llu, 
mono.nsecs=%llu\n",
+   __func__,
+   ktime_to_ns(base), ktime_to_ns(tk->tkr_mono.base),
+   nsecs, timekeeping_get_ns(&tk->tkr_mono));
+

gave
ktime_get_raw: raw.base=40255680121, mono.base=39940532364: 
nsecs=261321742, mono.nsecs=318630514
ktime_get_raw: raw.base=40339680124, mono.base=39940532364: 
nsecs=345836800, mono.nsecs=403109209

https://bugs.freedesktop.org/show_bug.cgi?id=102336

The patch still reverts cleanly and aiui was not part of the bugfixes,
but a cleanup afterwards; so can be reapplied at leisure.

> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: Miroslav Lichvar 
> Cc: Richard Cochran 
> Cc: Prarit Bhargava 
> Cc: Stephen Boyd 
> Cc: Kevin Brodsky 
> Cc: Will Deacon 
> Cc: Daniel Mentz 
> Signed-off-by: John Stultz 
> ---
>  arch/arm64/kernel/vdso.c|  6 ++---
>  include/linux/timekeeper_internal.h |  4 ++--
>  kernel/time/timekeeping.c   | 47 
> -
>  3 files changed, 29 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
> index d0cb007..7492d90 100644
> --- a/arch/arm64/kernel/vdso.c
> +++ b/arch/arm64/kernel/vdso.c
> @@ -220,10 +220,8 @@ void update_vsyscall(struct timekeeper *tk)
> if (!use_syscall) {
> /* tkr_mono.cycle_last == tkr_raw.cycle_last */
> vdso_data->cs_cycle_last= tk->tkr_mono.cycle_last;
> -   vdso_data->raw_time_sec = tk->raw_time.tv_sec;
> -   vdso_data->raw_time_nsec= (tk->raw_time.tv_nsec <<
> -  tk->tkr_raw.shift) +
> - tk->tkr_raw.xtime_nsec;
> +   vdso_data->raw_time_sec = tk->raw_sec;
> +   vdso_data->raw_time_nsec= tk->tkr_raw.xtime_nsec;
> vdso_data->xtime_clock_sec  = tk->xtime_sec;
> vdso_data->xtime_clock_nsec = tk->tkr_mono.xtime_nsec;
> vdso_data->cs_mono_mult = tk->tkr_mono.mult;
> diff --git a/include/linux/timekeeper_internal.h 
> b/include/linux/timekeeper_internal.h
> index 528cc86..8abf6df 100644
> --- a/include/linux/timekeeper_internal.h
> +++ b/include/linux/timekeeper_internal.h
> @@ -52,7 +52,7 @@ struct tk_read_base {
>   * @clock_was_set_seq: The sequence number of clock was set events
>   * @cs_was_changed_seq:The sequence number of clocksource change 
> events
>   * @next_leap_ktime:   CLOCK_MONOTONIC time value of a pending leap-second
> - * @raw_time:  Monotonic raw base time in timespec64 format
> + * @raw_sec:   CLOCK_MONOTONIC_RAW  time in seconds
>   * @cycle_interval:Number of clock cycles in one NTP interval
>   * @xtime_interval:Number of clock shifted nano seconds in one NTP
>   * interval.
> @@ -94,7 +94,7 @@ struct timekeeper {
> unsigned intclock_was_set_seq;
> u8  cs_was_changed_seq;
> ktime_t next_leap_ktime;
> -   struct timespec64   raw_time;
> +   u64 raw_sec;
>  
> /* The following members are for timekeeping internal use */
> u64 cycle_interval;
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 35d3ba3..0c3b8c1 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -72,6 +72,10 @@ static inline void tk_normalize_xtime(struct timekeeper 
> *tk)
> tk->tkr_mono.xtime_nsec -= (u64)NSEC_PER_SEC << 
> tk->tkr_mono.shift;
> tk->xtime_sec++;
> }
> +   while (tk->tkr_raw.xtime_nsec >= ((u64)NSEC_PER_SEC << 
> tk->tkr_raw.shift)) {
> +   tk->tkr_raw.xtime_nsec -= (u64)NSEC_PER_SEC << 
> tk->tkr_raw.shift;
> +   tk->raw_sec++;
> +   }
>  }
>  
>  static inline struct timespec64 tk_xtime(struct timekeeper *tk)
> @@ -287,12 +291,14 @@ static void tk_setup_internals(struct timekeeper *tk, 
> struct clocksource *clock)
>  /* if changing clocks, convert xtime_nsec shift units */
> if (old_clock) {
> int shift_change = clock->shift - old_clock->shift;
> -   if (shift_change < 0)
> +   if (shift_change < 0) {
> tk->tkr_mon

Re: [PATCH] sched/fair: Fix wake_affine() for !NUMA_BALANCING

2017-08-24 Thread Chris Wilson
We've stumbled across the same regression on Broxton/Apollolake (J3455).

Quoting Peter Zijlstra (2017-08-01 22:43:07)
> On Tue, Aug 01, 2017 at 03:26:51PM -0400, Josef Bacik wrote:
> > > @@ -7574,6 +7607,18 @@ static inline void update_sd_lb_stats(st
> > > env->dst_rq->rd->overload = overload;
> > > }
> > >  
> > > +   /*
> > > +* Since these are sums over groups they can contain some CPUs
> > > +* multiple times for the NUMA domains.
> > > +*
> > > +* Currently only wake_affine_llc() and find_busiest_group()
> > > +* uses these numbers, only the last is affected by this problem.
> > > +*
> > > +* XXX fix that.
> > > +*/
> > > +   WRITE_ONCE(shared->nr_running,  sds->total_running);
> > > +   WRITE_ONCE(shared->load,sds->total_load);
> > > +   WRITE_ONCE(shared->capacity,sds->total_capacity);
> > 
> > This panic's on boot for me because shared is NULL.  Same happens in
> > select_task_rq_fair when it tries to do the READ_ONCE.  Here is my .config 
> > in
> > case it's something strange with my config.  Thanks,
> 
> Nah, its just me being an idiot and booting the wrong kernel. Unless I
> messed up again, this one boots.
> 
> There is state during boot and hotplug where there are no domains, and
> thus also no shared. Simply ignoring things when that happens should be
> good enough I think.

This is still not as effective as the previous code in spreading across
siblings.

> +/*
> + * Can a task be moved from prev_cpu to this_cpu without causing a load
> + * imbalance that would trigger the load balancer?
> + *
> + * Since we're running on 'stale' values, we might in fact create an 
> imbalance
> + * but recomputing these values is expensive, as that'd mean iteration 2 
> cache
> + * domains worth of CPUs.
> + */
> +static bool
> +wake_affine_llc(struct sched_domain *sd, struct task_struct *p,
> +   int this_cpu, int prev_cpu, int sync)
> +{
> +   struct llc_stats prev_stats, this_stats;
> +   s64 this_eff_load, prev_eff_load;
> +   unsigned long task_load;
> +
> +   get_llc_stats(&prev_stats, prev_cpu);
> +   get_llc_stats(&this_stats, this_cpu);
> +
> +   /*
> +* If sync wakeup then subtract the (maximum possible)
> +* effect of the currently running task from the load
> +* of the current LLC.
> +*/
> +   if (sync) {
> +   unsigned long current_load = task_h_load(current);
> +
> +   /* in this case load hits 0 and this LLC is considered 'idle' 
> */
> +   if (current_load > this_stats.load)
> +   return true;
> +
> +   this_stats.load -= current_load;
> +   }
> +
> +   /*
> +* The has_capacity stuff is not SMT aware, but by trying to balance
> +* the nr_running on both ends we try and fill the domain at equal
> +* rates, thereby first consuming cores before siblings.
> +*/
> +
> +   /* if the old cache has capacity, stay there */
> +   if (prev_stats.has_capacity && prev_stats.nr_running < 
> this_stats.nr_running+1)
> +   return false;
> +
> +   /* if this cache has capacity, come here */
> +   if (this_stats.has_capacity && this_stats.nr_running < 
> prev_stats.nr_running+1)

I think you mean,

if (this_stats.has_capacity && this_stats.nr_running + 1 < 
prev_stats.nr_running)

and with that our particular graphics benchmark behaves similarly to as
before (the regression appears fixed). But I'll let Eero double check.

> +   return true;
> +
> +
> +   /*
> +* Check to see if we can move the load without causing too much
> +* imbalance.
> +*/
> +   task_load = task_h_load(p);
> +
> +   this_eff_load = 100;
> +   this_eff_load *= prev_stats.capacity;
> +
> +   prev_eff_load = 100 + (sd->imbalance_pct - 100) / 2;
> +   prev_eff_load *= this_stats.capacity;
> +
> +   this_eff_load *= this_stats.load + task_load;
> +   prev_eff_load *= prev_stats.load - task_load;
> +
> +   return this_eff_load <= prev_eff_load;
> +}


Re: [PATCH] vc4: Add an ioctl for labeling GEM BOs for summary stats

2017-07-28 Thread Chris Wilson
Quoting Eric Anholt (2017-07-25 19:27:25)
> Chris Wilson  writes:
> 
> > Quoting Eric Anholt (2017-06-22 21:50:54)
> >> This has proven immensely useful for debugging memory leaks and
> >> overallocation (which is a rather serious concern on the platform,
> >> given that we typically run at about 256MB of CMA out of up to 1GB
> >> total memory, with framebuffers that are about 8MB each).
> >> 
> >> The state of the art without this is to dump debug logs from every GL
> >> application, guess as to kernel allocations based on bo_stats, and try
> >> to merge that all together into a global picture of memory allocation
> >> state.  With this, you can add a couple of calls to the debug build of
> >> the 3D driver and get a pretty detailed view of GPU memory usage from
> >> /debug/dri/0/bo_stats (or when we debug print to dmesg on allocation
> >> failure).
> >> 
> >> The Mesa side currently labels at the gallium resource level (so you
> >> see that a 1920x20 pixmap has been created, presumably for the window
> >> system panel), but we could extend that to be even more useful with
> >> glObjectLabel() names being sent all the way down to the kernel.
> >> 
> >> (partial) example of sorted debugfs output with Mesa labeling all
> >> resources:
> >> 
> >>kernel BO cache:  16392kb BOs (3)
> >>tiling shadow 1920x1080:   8160kb BOs (1)
> >>resource 1920x1080@32/0:   8160kb BOs (1)
> >> scanout resource 1920x1080@32/0:   8100kb BOs (1)
> >> kernel:   8100kb BOs (1)
> >> 
> >> Signed-off-by: Eric Anholt 
> >> ---
> >
> >>  static void vc4_bo_stats_dump(struct vc4_dev *vc4)
> >>  {
> >
> > Now unused?
> 
> Still used from the splat when CMA allocation fails.  It's less
> catastrophic than it used to be, but still bad, so we're dumping to
> dmesg for now.
> 
> >> -   DRM_INFO("num bos allocated: %d\n",
> >> -vc4->bo_stats.num_allocated);
> >> -   DRM_INFO("size bos allocated: %dkb\n",
> >> -vc4->bo_stats.size_allocated / 1024);
> >> -   DRM_INFO("num bos used: %d\n",
> >> -vc4->bo_stats.num_allocated - vc4->bo_stats.num_cached);
> >> -   DRM_INFO("size bos used: %dkb\n",
> >> -(vc4->bo_stats.size_allocated -
> >> - vc4->bo_stats.size_cached) / 1024);
> >> -   DRM_INFO("num bos cached: %d\n",
> >> -vc4->bo_stats.num_cached);
> >> -   DRM_INFO("size bos cached: %dkb\n",
> >> -vc4->bo_stats.size_cached / 1024);
> >> +   int i;
> >> +
> >> +   for (i = 0; i < vc4->num_labels; i++) {
> >> +   if (!vc4->bo_labels[i].num_allocated)
> >> +   continue;
> >> +
> >> +   DRM_INFO("%30s: %6dkb BOs (%d)\n",
> >> +vc4->bo_labels[i].name,
> >> +vc4->bo_labels[i].size_allocated / 1024,
> >> +vc4->bo_labels[i].num_allocated);
> >> +   }
> >>  }
> >>  
> >>  #ifdef CONFIG_DEBUG_FS
> >
> >> +/* Must be called with bo_lock held. */
> >> +static void vc4_bo_set_label(struct drm_gem_object *gem_obj, int label)
> >> +{
> >> +   struct vc4_bo *bo = to_vc4_bo(gem_obj);
> >> +   struct vc4_dev *vc4 = to_vc4_dev(gem_obj->dev);
> >
> > lockdep_assert_held(&vc4->bo_lock);
> 
> Nice.  I've converted the other instances of this comment, too.
> 
> >> +
> >> +   vc4->bo_labels[label].num_allocated++;
> >> +   vc4->bo_labels[label].size_allocated += gem_obj->size;
> >
> > This gets called with label=-1 on destroy.
> 
> Oh, good catch, thanks.  This code got reworked a couple of times and I
> lost that check.
> 
> >> +   vc4->bo_labels[bo->label].num_allocated--;
> >> +   vc4->bo_labels[bo->label].size_allocated -= gem_obj->size;
> >
> > Ok, preassigned to TYPE_KERNEL on creation.
> >
> >> +   if (vc4->bo_labels[bo->label].num_allocated == 0 &&
> >> +   is_user_label(bo->label)) {
> >> +   /* Free user BO label slots on last unreference. */
> >> +   kfree(vc4->

Re: [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages

2017-07-28 Thread Chris Wilson
Quoting Tvrtko Ursulin (2017-07-27 10:05:03)
> From: Tvrtko Ursulin 
> 
> Drivers like i915 benefit from being able to control the maximum
> size of the sg coalesced segment while building the scatter-
> gather list.
> 
> Introduce and export the __sg_alloc_table_from_pages function
> which will allow it that control.
> 
> v2: Reorder parameters. (Chris Wilson)
> v3: Fix incomplete reordering in v2.
> v4: max_segment needs to be page aligned.
> v5: Rebase.
> v6: Rebase.
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Masahiro Yamada 
> Cc: linux-kernel@vger.kernel.org
> Cc: Chris Wilson 
> Reviewed-by: Chris Wilson  (v2)
> Cc: Joonas Lahtinen 

Still happy and checked the external user,
Reviewed-by: Chris Wilson 
-Chris
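
For reference, the exported interface under review takes the extra
max_segment argument between size and gfp_mask (signature as in this
series):

        int __sg_alloc_table_from_pages(struct sg_table *sgt,
                                        struct page **pages,
                                        unsigned int n_pages,
                                        unsigned int offset,
                                        unsigned long size,
                                        unsigned int max_segment,
                                        gfp_t gfp_mask);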


Re: [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations

2017-07-28 Thread Chris Wilson
Quoting Tvrtko Ursulin (2017-07-27 10:05:04)
> From: Tvrtko Ursulin 
> 
> With the addition of __sg_alloc_table_from_pages we can control
> the maximum coalescing size and eliminate a separate path for
> allocating backing store here.
> 
> Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
> SWIOTLB max segment size") this enables more compact sg lists to
> be created and so has a beneficial effect on workloads with many
> and/or large objects of this class.
> 
> v2:
>  * Rename helper to i915_sg_segment_size and fix swiotlb override.
>  * Commit message update.
> 
> v3:
>  * Actually include the swiotlb override fix.
> 
> v4:
>  * Regroup parameters a bit. (Chris Wilson)
> 
> v5:
>  * Rebase for swiotlb_max_segment.
>  * Add DMA map failure handling as in abb0deacb5a6
>("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").
> 
> v6: Handle swiotlb_max_segment() returning 1. (Joonas Lahtinen)
> 
> v7: Rebase.
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Chris Wilson 
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Chris Wilson  (v4)
> Cc: Joonas Lahtinen 
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 15 +++
>  drivers/gpu/drm/i915/i915_gem.c |  6 +--
>  drivers/gpu/drm/i915/i915_gem_userptr.c | 79 
> -
>  3 files changed, 45 insertions(+), 55 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 2c7456f4ed38..6383940e8d55 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2749,6 +2749,21 @@ static inline struct scatterlist *__sg_next(struct 
> scatterlist *sg)
>  (((__iter).curr += PAGE_SIZE) < (__iter).max) ||   \
>  ((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))
>  
> +static inline unsigned int i915_sg_segment_size(void)
> +{
> +   unsigned int size = swiotlb_max_segment();
> +
> +   if (size == 0)
> +   return SCATTERLIST_MAX_SEGMENT;
> +
> +   size = rounddown(size, PAGE_SIZE);

Looks like swiotlb_max_segment() is always page aligned when not 1.
And it returns bytes, ok.

Given that PAGE_SIZE is a power of two, you can use round_down() (a
pure mask) instead of rounddown() (which uses a modulo).

> +   /* swiotlb_max_segment_size can return 1 byte when it means one page. 
> */
> +   if (size < PAGE_SIZE)
> +   size = PAGE_SIZE;
> +
> +   return size;
> +}
> +
>  static inline const struct intel_device_info *
>  intel_info(const struct drm_i915_private *dev_priv)
>  {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 6faabf34f142..a60885d6231b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2339,7 +2339,7 @@ i915_gem_object_get_pages_gtt(struct 
> drm_i915_gem_object *obj)
> struct sgt_iter sgt_iter;
> struct page *page;
> unsigned long last_pfn = 0; /* suppress gcc warning */
> -   unsigned int max_segment;
> +   unsigned int max_segment = i915_sg_segment_size();
> gfp_t noreclaim;
> int ret;
>  
> @@ -2350,10 +2350,6 @@ i915_gem_object_get_pages_gtt(struct 
> drm_i915_gem_object *obj)
> GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
> GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
>  
> -   max_segment = swiotlb_max_segment();
> -   if (!max_segment)
> -   max_segment = rounddown(UINT_MAX, PAGE_SIZE);

Conversion to i915_sg_segment_size(), ok.

> -
> st = kmalloc(sizeof(*st), GFP_KERNEL);
> if (st == NULL)
> return ERR_PTR(-ENOMEM);
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
> b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index ccd09e8419f5..60c10d4118ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -399,64 +399,42 @@ struct get_pages_work {
> struct task_struct *task;
>  };
>  
> -#if IS_ENABLED(CONFIG_SWIOTLB)
> -#define swiotlb_active() swiotlb_nr_tbl()
> -#else
> -#define swiotlb_active() 0
> -#endif

Converted to i915_sg_segment_size(), nice.

> -static int
> -st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
> -{
> -   struct scatterlist *sg;
> -   int ret, n;
> -
> -   *st = kmalloc(sizeof(**st), GFP_KERNEL);
> -   if (*st == NULL)
> -   return -ENOMEM;
> -
> -   if (swiotlb_active()) {
> -   ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
> -   if (ret)
> -   goto err;
> -
>

Re: [Intel-gfx] [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow

2017-07-28 Thread Chris Wilson
Quoting Tvrtko Ursulin (2017-07-27 10:05:02)
> From: Tvrtko Ursulin 
> 
> Since the scatterlist length field is an unsigned int, make
> sure that sg_alloc_table_from_pages does not overflow it while
> coalescing pages to a single entry.
> 
> v2: Drop reference to future use. Use UINT_MAX.
> v3: max_segment must be page aligned.
> v4: Do not rely on compiler to optimise out the rounddown.
> (Joonas Lahtinen)
> v5: Simplified loops and use post-increments rather than
> pre-increments. Use PAGE_MASK and fix comment typo.
> (Andy Shevchenko)
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Masahiro Yamada 
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Chris Wilson  (v2)
> Cc: Joonas Lahtinen 
> Cc: Andy Shevchenko 
> ---
>  include/linux/scatterlist.h |  6 ++
>  lib/scatterlist.c   | 31 ---
>  2 files changed, 26 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index 205aefb4ed93..6dd2ddbc6230 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -21,6 +21,12 @@ struct scatterlist {
>  };
>  
>  /*
> + * Since the above length field is an unsigned int, below we define the 
> maximum
> + * length in bytes that can be stored in one scatterlist entry.
> + */
> +#define SCATTERLIST_MAX_SEGMENT (UINT_MAX & PAGE_MASK)
> +
> +/*
>   * These macros should be used after a dma_map_sg call has been done
>   * to get bus addresses of each of the SG entries and their lengths.
>   * You should only work with the number of sg entries dma_map_sg
> diff --git a/lib/scatterlist.c b/lib/scatterlist.c
> index dee0c5004e2f..7b2e74da2c44 100644
> --- a/lib/scatterlist.c
> +++ b/lib/scatterlist.c
> @@ -394,17 +394,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
> unsigned int offset, unsigned long size,
> gfp_t gfp_mask)
>  {
> -   unsigned int chunks;
> -   unsigned int i;
> -   unsigned int cur_page;
> +   const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
> +   unsigned int chunks, cur_page, seg_len, i;
> int ret;
> struct scatterlist *s;
>  
> /* compute number of contiguous chunks */
> chunks = 1;
> -   for (i = 1; i < n_pages; ++i)
> -   if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
> -   ++chunks;
> +   seg_len = 0;
> +   for (i = 1; i < n_pages; i++) {
> +   seg_len += PAGE_SIZE;
> +   if (seg_len >= max_segment ||
> +   page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
> +   chunks++;
> +   seg_len = 0;
> +   }
> +   }

Ok. It took a moment to realise that this also counts a chunk ending on
the last page correctly: the trailing run is covered by the initial
chunks = 1, and every boundary increments.
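
A standalone trace of the counting loop for anyone else working through
it (userspace sketch, assuming 4k pages):

#include <stdio.h>

#define PAGE_SIZE 4096u

int main(void)
{
	/* pfns of five pages: contiguous runs {10,11,12} and {20,21} */
	unsigned long pfn[] = { 10, 11, 12, 20, 21 };
	unsigned int n_pages = 5, max_segment = 2 * PAGE_SIZE;
	unsigned int chunks = 1, seg_len = 0, i;

	for (i = 1; i < n_pages; i++) {
		seg_len += PAGE_SIZE;
		if (seg_len >= max_segment ||
		    pfn[i] != pfn[i - 1] + 1) {
			chunks++;
			seg_len = 0;
		}
	}

	printf("chunks = %u\n", chunks); /* 3: {10,11}, {12}, {20,21} */
	return 0;
}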

> ret = sg_alloc_table(sgt, chunks, gfp_mask);
> if (unlikely(ret))
> @@ -413,17 +418,21 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
> /* merging chunks and putting them into the scatterlist */
> cur_page = 0;
> for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
> -   unsigned long chunk_size;
> -   unsigned int j;
> +   unsigned int j, chunk_size;
>  
> /* look for the end of the current chunk */
> -   for (j = cur_page + 1; j < n_pages; ++j)
> -   if (page_to_pfn(pages[j]) !=
> +   seg_len = 0;
> +   for (j = cur_page + 1; j < n_pages; j++) {
> +   seg_len += PAGE_SIZE;
> +   if (seg_len >= max_segment ||
> +   page_to_pfn(pages[j]) !=
> page_to_pfn(pages[j - 1]) + 1)
> break;
> +   }

Ok.

>  
> chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
> -       sg_set_page(s, pages[cur_page], min(size, chunk_size), 
> offset);
> +   sg_set_page(s, pages[cur_page],
> +   min_t(unsigned long, size, chunk_size), offset);
> size -= chunk_size;
> offset = 0;
> cur_page = j;

Reviewed-by: Chris Wilson 
-Chris


Re: [PATCH] vc4: Add an ioctl for labeling GEM BOs for summary stats

2017-07-25 Thread Chris Wilson
Quoting Eric Anholt (2017-06-22 21:50:54)
> This has proven immensely useful for debugging memory leaks and
> overallocation (which is a rather serious concern on the platform,
> given that we typically run at about 256MB of CMA out of up to 1GB
> total memory, with framebuffers that are about 8MB each).
> 
> The state of the art without this is to dump debug logs from every GL
> application, guess as to kernel allocations based on bo_stats, and try
> to merge that all together into a global picture of memory allocation
> state.  With this, you can add a couple of calls to the debug build of
> the 3D driver and get a pretty detailed view of GPU memory usage from
> /debug/dri/0/bo_stats (or when we debug print to dmesg on allocation
> failure).
> 
> The Mesa side currently labels at the gallium resource level (so you
> see that a 1920x20 pixmap has been created, presumably for the window
> system panel), but we could extend that to be even more useful with
> glObjectLabel() names being sent all the way down to the kernel.
> 
> (partial) example of sorted debugfs output with Mesa labeling all
> resources:
> 
>kernel BO cache:  16392kb BOs (3)
>tiling shadow 1920x1080:   8160kb BOs (1)
>resource 1920x1080@32/0:   8160kb BOs (1)
> scanout resource 1920x1080@32/0:   8100kb BOs (1)
> kernel:   8100kb BOs (1)
> 
> Signed-off-by: Eric Anholt 
> ---

>  static void vc4_bo_stats_dump(struct vc4_dev *vc4)
>  {

Now unused?

> -   DRM_INFO("num bos allocated: %d\n",
> -vc4->bo_stats.num_allocated);
> -   DRM_INFO("size bos allocated: %dkb\n",
> -vc4->bo_stats.size_allocated / 1024);
> -   DRM_INFO("num bos used: %d\n",
> -vc4->bo_stats.num_allocated - vc4->bo_stats.num_cached);
> -   DRM_INFO("size bos used: %dkb\n",
> -(vc4->bo_stats.size_allocated -
> - vc4->bo_stats.size_cached) / 1024);
> -   DRM_INFO("num bos cached: %d\n",
> -vc4->bo_stats.num_cached);
> -   DRM_INFO("size bos cached: %dkb\n",
> -vc4->bo_stats.size_cached / 1024);
> +   int i;
> +
> +   for (i = 0; i < vc4->num_labels; i++) {
> +   if (!vc4->bo_labels[i].num_allocated)
> +   continue;
> +
> +   DRM_INFO("%30s: %6dkb BOs (%d)\n",
> +vc4->bo_labels[i].name,
> +vc4->bo_labels[i].size_allocated / 1024,
> +vc4->bo_labels[i].num_allocated);
> +   }
>  }
>  
>  #ifdef CONFIG_DEBUG_FS

> +/* Must be called with bo_lock held. */
> +static void vc4_bo_set_label(struct drm_gem_object *gem_obj, int label)
> +{
> +   struct vc4_bo *bo = to_vc4_bo(gem_obj);
> +   struct vc4_dev *vc4 = to_vc4_dev(gem_obj->dev);

lockdep_assert_held(&vc4->bo_lock);

> +
> +   vc4->bo_labels[label].num_allocated++;
> +   vc4->bo_labels[label].size_allocated += gem_obj->size;

This gets called with label=-1 on destroy.

> +   vc4->bo_labels[bo->label].num_allocated--;
> +   vc4->bo_labels[bo->label].size_allocated -= gem_obj->size;

Ok, preassigned to TYPE_KERNEL on creation.

> +   if (vc4->bo_labels[bo->label].num_allocated == 0 &&
> +   is_user_label(bo->label)) {
> +   /* Free user BO label slots on last unreference. */
> +   kfree(vc4->bo_labels[bo->label].name);
> +   vc4->bo_labels[bo->label].name = NULL;
> +   }

This seems dubious. As a user I would expect the label I created to last
until I closed the fd. Otherwise if I ever close all bo of one type,
wait a few seconds for the cache to be reaped, then reallocate a new bo,
someone else may have claimed my label slot for a different name.

If that's guarded against, a comment would help.

> +
> +   bo->label = label;
> +}

> +int vc4_label_bo_ioctl(struct drm_device *dev, void *data,
> +  struct drm_file *file_priv)
> +{
> +   struct vc4_dev *vc4 = to_vc4_dev(dev);
> +   struct drm_vc4_label_bo *args = data;
> +   char *name;
> +   struct drm_gem_object *gem_obj;
> +   int ret = 0, label;
> +
> +   name = kmalloc(args->len + 1, GFP_KERNEL);
> +   if (!name)
> +   return -ENOMEM;
> +
> +   if (copy_from_user(name, (void __user *)(uintptr_t)args->name,
> +  args->len)) {
> +   kfree(name);
> +   return -EFAULT;
> +   }
> +   name[args->len] = 0;

if (!args->len)
return -EINVAL;

name = strndup_user(u64_to_user_ptr(args->name), args->len + 1);
if (IS_ERR(name))
return PTR_ERR(name);
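
For reference, a rough userspace model of the strndup_user() semantics
relied on here -- the real helper copies from userspace and returns
ERR_PTR(-EINVAL) if no NUL is found within the bound, or an ERR_PTR on
copy/allocation failure:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char *strndup_user_model(const char *from, long n)
{
	size_t len = strnlen(from, n);

	if (len == (size_t)n)
		return NULL; /* stands in for ERR_PTR(-EINVAL) */

	return strndup(from, len); /* NULL stands in for ERR_PTR(-ENOMEM) */
}

int main(void)
{
	char *name = strndup_user_model("tiling shadow 1920x1080", 64);

	printf("%s\n", name ? name : "(error)");
	free(name);
	return 0;
}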
-Chris


Re: [PATCH] drm/udl: Make page_flip asynchronous

2017-07-25 Thread Chris Wilson
Quoting Dawid Kurek (2017-07-07 06:48:49)
> In page_flip vblank is sent with no delay. Driver does not know when the
> actual update is present on the display and has no means for getting
> this information from a device. It is practically impossible to say
> exactly *when*, as there is also e.g. a USB delay.
> 
> When we are unable to determine when the vblank actually happens we may
> assume it will behave accordingly, i.e. it will present frames with
> proper timing. In the worst case scenario it should take up to duration
> of one frame (we may get new frame in the device just after presenting
> current one so we would need to wait for the whole frame).
> 
> Because of the asynchronous nature of the delay we need to synchronize:
>  * read/write vrefresh/page_flip data when changing mode and
>preparing/executing vblank
>  * USB requests to prevent interleaved access to URBs for two different
>frame buffers
> 
> All those changes are backports from ChromeOS:
>   1. https://chromium-review.googlesource.com/250622
>   2. https://chromium-review.googlesource.com/249450
>   partially, only change in udl_modeset.c for 'udl_flip_queue'
>   3. https://chromium-review.googlesource.com/321378
>   4. https://chromium-review.googlesource.com/324119
> + fixes for checkpatch and latest drm changes
> 
> Cc: h...@chromium.org
> Cc: marc...@chromium.org
> Cc: za...@chromium.org
> Cc: db...@google.com
> Signed-off-by: Dawid Kurek 

> +struct udl_flip_queue {
> +   struct mutex lock;
> +   struct workqueue_struct *wq;
> +   struct delayed_work work;
> +   struct drm_crtc *crtc;
> +   struct drm_pending_vblank_event *event;
> +   u64 flip_time; /*in jiffies */
> +   u64 vblank_interval; /*in jiffies */

jiffies are unsigned long. That avoids the complication of the reader
having to consider whether all the divides are 32bit safe.

Mallocing this struct seems a little overkill as opposed to embedding it
into the udl_device; flip_queue.wq provides the safety check after
initialisation.

I couldn't spot anything drastically wrong; certainly it looks complete.
I would have structured it such that everything went through the same
worker, with an hrtimer emulating the vblank. That model is then but a
stone's throw away from atomic.
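
i.e. something of this shape (a sketch only; the vb_timer, flip_work
and vblank_interval fields are invented for illustration and are not in
the posted patch):

static enum hrtimer_restart udl_fake_vblank(struct hrtimer *t)
{
	struct udl_device *udl = container_of(t, struct udl_device,
					      vb_timer);

	/* complete any pending flip from the one worker */
	schedule_work(&udl->flip_work);

	/* and re-arm for the next emulated vblank */
	hrtimer_forward_now(t, udl->vblank_interval);
	return HRTIMER_RESTART;
}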

Reviewed-by: Chris Wilson 
-Chris


Re: [Intel-gfx] [PATCH] drm/i915/selftests: Fix an error handling path in 'mock_gem_device()'

2017-07-20 Thread Chris Wilson
Quoting Tvrtko Ursulin (2017-07-20 11:09:53)
> 
> On 19/07/2017 23:35, Christophe JAILLET wrote:
> > Goto the right label in case of error, otherwise there is a leak.
> > This has been introduced by c5cf9a9147ff. In this patch a goto has not been
> > updated.
> > 
> > Fixes: c5cf9a9147ff ("drm/i915: Create a kmem_cache to allocate struct 
> > i915_priolist from")
> > Signed-off-by: Christophe JAILLET 
> > ---
> >   drivers/gpu/drm/i915/selftests/mock_gem_device.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c 
> > b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> > index 627e2aa09766..8cdec455cf7d 100644
> > --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> > +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> > @@ -206,7 +206,7 @@ struct drm_i915_private *mock_gem_device(void)
> >   mkwrite_device_info(i915)->ring_mask = BIT(0);
> >   i915->engine[RCS] = mock_engine(i915, "mock");
> >   if (!i915->engine[RCS])
> > - goto err_dependencies;
> > + goto err_priorities;
> >   
> >   i915->kernel_context = mock_context(i915, NULL);
> >   if (!i915->kernel_context)
> 
> Reviewed-by: Tvrtko Ursulin 

Thanks for the fix and review, pushed.
-Chris


Re: [REGRESSION 4.12] i915 Oops at intel_fbdev_invalidate()

2017-06-30 Thread Chris Wilson
Quoting Takashi Iwai (2017-06-30 16:38:46)
> Hi,
> 
> I hit an Oops with the latest Linus tree (4.12-rc7+) on a HSW machine
> like the following at boot:
> 
>  BUG: unable to handle kernel NULL pointer dereference at 00b0
>  IP: intel_fbdev_invalidate.isra.3+0xc/0x40 [i915]
>  Oops:  [#1] PREEMPT SMP
>  CPU: 2 PID: 833 Comm: X Not tainted 4.10.0-rc5-btest9+ #15
>  Hardware name: Hewlett-Packard HP ProBook 430 G1/1946, BIOS L73 Ver. 08.05 
> 2013/03/15
>  task: 917313db8000 task.stack: b6e70379c000
>  RIP: 0010:intel_fbdev_invalidate.isra.3+0xc/0x40 [i915]
>  RSP: 0018:b6e70379fde0 EFLAGS: 00010246
>  RAX:  RBX: 9172f70e1c00 RCX: 
>  RDX: 917313db8000 RSI:  RDI: 91731934d040
>  RBP: b6e70379fdf0 R08: 0002 R09: 91731934ead0
>  R10: 9173192f0368 R11: 0001 R12: 9173192f
>  R13: 9172f71016e8 R14: 9173227c8480 R15: 9172f71016c8
>  FS:  7f8cc8c3fa00() GS:9173c0a8() knlGS:
>  CS:  0010 DS:  ES:  CR0: 80050033
>  CR2: 00b0 CR3: 6273b000 CR4: 001406e0
>  Call Trace:
>   ? intel_fbdev_restore_mode+0x4e/0x70 [i915]
>   i915_driver_lastclose+0xe/0x20 [i915]
>   drm_lastclose+0x3b/0xf0 [drm]
>   drm_release+0x2b8/0x360 [drm]
>   __fput+0xd9/0x1e0
>   fput+0xe/0x10
>   task_work_run+0x83/0xa0
>   exit_to_usermode_loop+0x59/0x85
>   do_syscall_64+0xb3/0xd0
>   entry_SYSCALL64_slow_path+0x25/0x25
> 
> And git bisection leaded to the commit 
> fabef825626d7bd05a321e4427fdf31a169b5173
> drm/i915: Drop struct_mutex around frontbuffer flushes
> 
> The band-aid patch below seems fixing it.

See https://patchwork.freedesktop.org/patch/163261/
-Chris


Re: [PATCH] drm/i915: Fix an error checking test

2017-06-27 Thread Chris Wilson
Quoting Christophe JAILLET (2017-06-27 06:38:54)
> 'dma_buf_vmap' returns NULL on error, not an error pointer.
> 
> Fixes: 6cca22ede8a4 ("drm/i915: Add some mock tests for dmabuf interop")
> Signed-off-by: Christophe JAILLET 

Thanks for the fix, reviewed and pushed.
-Chris


Re: [next-20170609] WARNING: CPU: 3 PID: 71167 at lib/idr.c:157 idr_replace

2017-06-21 Thread Chris Wilson
Quoting Tejun Heo (2017-06-13 14:58:49)
> Cc'ing David Airlie.
> 
> This is from drm driver calling in idr_replace() w/ a negative id.
> Probably a silly bug in error handling path?

No, this is the validation of an invalid userspace handle. The drm ABI
for handles is supposed to be a full u32 range with 0 reserved for an
invalid handle (constrained by the idr_alloc ofc). The WARN was
introduced by 0a835c4f090a.
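
The WARN fires because a perfectly valid u32 handle above INT_MAX turns
negative once it is passed around as an int, e.g. (userspace sketch):

#include <stdio.h>

int main(void)
{
	unsigned int handle = 0x80000001; /* valid in the drm handle ABI */
	int id = (int)handle;             /* what idr_replace() receives */

	printf("%d\n", id);               /* prints -2147483647 */
	return 0;
}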
-Chris


Re: [PATCH] BUG-REPORT: snd-hda: hacked-together EPROBE_DEFER support

2017-06-21 Thread Chris Wilson
Quoting Daniel Vetter (2017-06-21 16:08:54)
> So back when the i915 power well support landed in
> 
> commit 99a2008d0b32d72dfc2a54e7be1eb698dd2e3bd6
> Author: Wang Xingchao 
> Date:   Thu May 30 22:07:10 2013 +0800
> 
> ALSA: hda - Add power-welll support for haswell HDA
> 
> the logic to handle the cross-module dependencies was hand-rolled using
> an async work item, and that just doesn't work.
> 
> The correct way to handle cross-module deps is either:
> - request_module + failing when the other module isn't there
> 
> OR
> 
> - failing the module load with EPROBE_DEFER.
> 
> You can't mix them, if you do then the entire load path just
> busy-spins blowing through cpu cycles forever with no way to stop
> this.
> 
> snd-hda-intel does mix it, because the hda codec drivers are loaded
> using request_module, but the i915 dependency is handled using
> PROBE_DEFER (or well, should be, but I haven't found any code at all).
> This is a major pain when trying to debug i915 load failures.
> 
> This patch here is a horrible hackish attempt at somewhat correctly
> wiring EPROBE_DEFER through. Stuff that's missing:
> - Check all the other places where load errors are conveniently
>   dropped on the floor.
> - Also fix up the firmware_cb path.
> - Drop the debug noise I've left in to make it clear this isn't
>   anything for merging.

This tames "hdaudio hdaudioC0D0: Unable to bind the codec" which was
continuously spewing previously, and now the system is usable again.
Thanks,
-Chris


Re: [PATCH 2/2] selftests: lib: prime_numbers: update presence check

2017-06-19 Thread Chris Wilson
Quoting Sumit Semwal (2017-06-19 14:44:32)
> The test for prime numbers doesn't differentiate between missing
> prime_numbers.ko and failure in prime_numbers.ko.
> 
> Update it to check for presence of the file itself to skip, therefore
> correctly exercising the test failure case.

modprobe -r shouldn't be executing the module? But you still need to
unload the module before you can load it with the selftest module
parameters. If you can't unload the module due to an earlier failure,
you cannot discern whether or not the module itself is at fault, so
you still want to SKIP.
-Chris


Re: [PATCH v11] drm: Unplug drm device when unregistering it (v8)

2017-06-01 Thread Chris Wilson
On Thu, Jun 01, 2017 at 02:26:01PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 01-06-17 14:22, Chris Wilson wrote:
> >On Thu, Jun 01, 2017 at 02:13:28PM +0200, Hans de Goede wrote:
> >>Hi,
> >>
> >>On 31-05-17 04:39, jeffy wrote:
> >>>Hi Hans,
> >>>
> >>>thanx for investigating :)
> >>>
> >>>On 05/30/2017 03:06 PM, Hans de Goede wrote:
> >>>>Hi,
> >>>>
> >>>>On 29-05-17 22:25, Chris Wilson wrote:
> >>>>>On Fri, Apr 14, 2017 at 11:15:04AM -0400, Sean Paul wrote:
> >>>>>>On Thu, Apr 13, 2017 at 03:32:44PM +0800, Jeffy Chen wrote:
> >>>>>>>After unbinding drm, the user space may still own the drm dev fd, and
> >>>>>>>may still be able to call drm ioctl.
> >>>>>>>
> >>>>>>>We're using an unplugged state to prevent something like that, so let's
> >>>>>>>reuse it here.
> >>>>>>>
> >>>>>>>Also drop drm_unplug_dev, because it would be unused after other
> >>>>>>>changes.
> >>>>>>>
> >>>>>>>Verified on rk3399 chromebook kevin(with cros 4.4 kernel), no more
> >>>>>>>crashes
> >>>>>>>when unbinding drm with ui service still running.
> >>>>>>>
> >>>>>>>v2: Fix some commit messages.
> >>>>>>>v3: Reuse unplug status.
> >>>>>>>v4: Add drm_device_set_plug_state helper.
> >>>>>>>v5: Fix hang when unregistering drm dev with open_count 0.
> >>>>>>>v6: Move drm_device_set_plug_state into drm_drv.
> >>>>>>>v7: Add missing drm_dev_unref in udl_drv.
> >>>>>>>v8: Fix compiler errors after enable udl.
> >>>>>>>
> >>>>>>>Signed-off-by: Jeffy Chen 
> >>>>>>>
> >>>>>>>---
> >>>>>>
> >>>>>>Hi Jeffy,
> >>>>>>Given the trouble we've had with this patch already, coupled with the
> >>>>>>fact that
> >>>>>>unbinding while userspace is active is a contrived/pathological case,
> >>>>>>I don't
> >>>>>>think it's worth picking this patch upstream.
> >>>>>>
> >>>>>>If it's really causing issues downstream, you can add my Reviewed-by
> >>>>>>for a CHROMIUM
> >>>>>>patch, but I'd rather not carry patches in the CrOS repo if we don't
> >>>>>>need to.
> >>>>>
> >>>>>Would a
> >>>>>
> >>>>>Fixes: a39be606f99d ("drm: Do a full device unregister when unplugging")
> >>>>>Reported-by: Marco Diego Aurélio Mesquita 
> >>>>>Cc: Hans de Goede 
> >>>>>
> >>>>>convince us to look into this patch again?
> >>>>
> >>>>The problem is this patch is wrong, see below.
> >>>>
> >>>>>>>  drivers/gpu/drm/drm_drv.c | 26 ++
> >>>>>>>  drivers/gpu/drm/udl/udl_drv.c |  3 ++-
> >>>>>>>  include/drm/drmP.h|  6 --
> >>>>>>>  include/drm/drm_drv.h |  1 -
> >>>>>>>  4 files changed, 12 insertions(+), 24 deletions(-)
> >>>>>>>
> >>>>>>>diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> >>>>>>>index b5c6bb4..e1da4d1 100644
> >>>>>>>--- a/drivers/gpu/drm/drm_drv.c
> >>>>>>>+++ b/drivers/gpu/drm/drm_drv.c
> >>>>>>>@@ -355,22 +355,6 @@ void drm_put_dev(struct drm_device *dev)
> >>>>>>>  }
> >>>>>>>  EXPORT_SYMBOL(drm_put_dev);
> >>>>>>>-void drm_unplug_dev(struct drm_device *dev)
> >>>>>>>-{
> >>>>>>>-/* for a USB device */
> >>>>>>>-drm_dev_unregister(dev);
> >>>>>>>-
> >>>>>>>-mutex_lock(&drm_global_mutex);
> >>>>>>>-
> >>>>>>>-drm_device_set_unplugged(dev);
> >>>>>>>-
> >>>>>>>-if (dev->open_count == 0) {
> >

Re: [PATCH v11] drm: Unplug drm device when unregistering it (v8)

2017-06-01 Thread Chris Wilson
On Thu, Jun 01, 2017 at 02:13:28PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 31-05-17 04:39, jeffy wrote:
> >Hi Hans,
> >
> >thanx for investigating :)
> >
> >On 05/30/2017 03:06 PM, Hans de Goede wrote:
> >>Hi,
> >>
> >>On 29-05-17 22:25, Chris Wilson wrote:
> >>>On Fri, Apr 14, 2017 at 11:15:04AM -0400, Sean Paul wrote:
> >>>>On Thu, Apr 13, 2017 at 03:32:44PM +0800, Jeffy Chen wrote:
> >>>>>After unbinding drm, the user space may still own the drm dev fd, and
> >>>>>may still be able to call drm ioctl.
> >>>>>
> >>>>>We're using an unplugged state to prevent something like that, so let's
> >>>>>reuse it here.
> >>>>>
> >>>>>Also drop drm_unplug_dev, because it would be unused after other
> >>>>>changes.
> >>>>>
> >>>>>Verified on rk3399 chromebook kevin(with cros 4.4 kernel), no more
> >>>>>crashes
> >>>>>when unbinding drm with ui service still running.
> >>>>>
> >>>>>v2: Fix some commit messages.
> >>>>>v3: Reuse unplug status.
> >>>>>v4: Add drm_device_set_plug_state helper.
> >>>>>v5: Fix hang when unregistering drm dev with open_count 0.
> >>>>>v6: Move drm_device_set_plug_state into drm_drv.
> >>>>>v7: Add missing drm_dev_unref in udl_drv.
> >>>>>v8: Fix compiler errors after enable udl.
> >>>>>
> >>>>>Signed-off-by: Jeffy Chen 
> >>>>>
> >>>>>---
> >>>>
> >>>>Hi Jeffy,
> >>>>Given the trouble we've had with this patch already, coupled with the
> >>>>fact that
> >>>>unbinding while userspace is active is a contrived/pathological case,
> >>>>I don't
> >>>>think it's worth picking this patch upstream.
> >>>>
> >>>>If it's really causing issues downstream, you can add my Reviewed-by
> >>>>for a CHROMIUM
> >>>>patch, but I'd rather not carry patches in the CrOS repo if we don't
> >>>>need to.
> >>>
> >>>Would a
> >>>
> >>>Fixes: a39be606f99d ("drm: Do a full device unregister when unplugging")
> >>>Reported-by: Marco Diego Aurélio Mesquita 
> >>>Cc: Hans de Goede 
> >>>
> >>>convince us to look into this patch again?
> >>
> >>The problem is this patch is wrong, see below.
> >>
> >>>>>  drivers/gpu/drm/drm_drv.c | 26 ++
> >>>>>  drivers/gpu/drm/udl/udl_drv.c |  3 ++-
> >>>>>  include/drm/drmP.h|  6 --
> >>>>>  include/drm/drm_drv.h |  1 -
> >>>>>  4 files changed, 12 insertions(+), 24 deletions(-)
> >>>>>
> >>>>>diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> >>>>>index b5c6bb4..e1da4d1 100644
> >>>>>--- a/drivers/gpu/drm/drm_drv.c
> >>>>>+++ b/drivers/gpu/drm/drm_drv.c
> >>>>>@@ -355,22 +355,6 @@ void drm_put_dev(struct drm_device *dev)
> >>>>>  }
> >>>>>  EXPORT_SYMBOL(drm_put_dev);
> >>>>>-void drm_unplug_dev(struct drm_device *dev)
> >>>>>-{
> >>>>>-/* for a USB device */
> >>>>>-drm_dev_unregister(dev);
> >>>>>-
> >>>>>-mutex_lock(&drm_global_mutex);
> >>>>>-
> >>>>>-drm_device_set_unplugged(dev);
> >>>>>-
> >>>>>-if (dev->open_count == 0) {
> >>>>>-drm_put_dev(dev);
> >>>>>-}
> >>>>>-mutex_unlock(&drm_global_mutex);
> >>>>>-}
> >>>>>-EXPORT_SYMBOL(drm_unplug_dev);
> >>>>>-
> >>>>>  /*
> >>>>>   * DRM internal mount
> >>>>>   * We want to be able to allocate our own "struct address_space"
> >>>>>to control
> >>>>>@@ -733,6 +717,13 @@ static void remove_compat_control_link(struct
> >>>>>drm_device *dev)
> >>>>>  kfree(name);
> >>>>>  }
> >>>>>+static inline void drm_device_set_plug_state(struct drm_device *dev,
> >>&

Re: [PATCH v6 5/6] drm/i915/gvt: Dmabuf support for GVT-g

2017-06-01 Thread Chris Wilson
   }
> + fb_info->fb_addr = gvt_dmabuf->plane_info.start;
> + fb_info->fb_size = gvt_dmabuf->plane_info.size;
> + fb_info->vgpu = vgpu;
> + obj->gvt_info = fb_info;
> +
> + dmabuf = i915_gem_prime_export(dev, &obj->base, DRM_CLOEXEC | DRM_RDWR);
> +
> + if (IS_ERR(dmabuf)) {
> + kfree(fb_info);
> + i915_gem_object_free(obj);
> + gvt_vgpu_err("export dma-buf failed\n");

Develop a habit of using onion unwinding for errors.

> + return PTR_ERR(dmabuf);
> + }
> +
> + ret = dma_buf_fd(dmabuf, DRM_CLOEXEC | DRM_RDWR);
> + if (ret < 0) {
> + kfree(fb_info);
> + i915_gem_object_free(obj);
> + gvt_vgpu_err("create dma-buf fd failed ret:%d\n", ret);
> + return ret;
> + }
> + gvt_dmabuf->fd = ret;
> +
> + return 0;
> +}
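
(For reference, the onion: acquire in order, unwind in exact reverse
order through goto labels, so each failure path undoes only what has
already been done. The generic shape, with step_*/undo_* as
placeholders for the real calls:

	err = step_a();
	if (err)
		return err;

	err = step_b();
	if (err)
		goto err_a;

	err = step_c();
	if (err)
		goto err_b;

	return 0;

err_b:
	undo_b();
err_a:
	undo_a();
	return err;

Applied here, the duplicated kfree(fb_info)/i915_gem_object_free(obj)
pairs collapse into shared labels.)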
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0c1cbe9..14be3b0 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1609,6 +1609,12 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void 
> *data,
>   if (!obj)
>   return -ENOENT;
>  
> + /* proxy gem object does not support setting domain */
> + if (i915_gem_object_is_proxy(obj)) {
> + i915_gem_object_put(obj);
> + return -EPERM;

Hook into the onion.

> + }
> +
>   /* Try to flush the object off the GPU without holding the lock.
>* We will repeat the flush holding the lock in the normal manner
>* to catch cases where we are gazumped.
> @@ -1677,6 +1683,12 @@ i915_gem_sw_finish_ioctl(struct drm_device *dev, void 
> *data,
>   if (!obj)
>   return -ENOENT;
>  
> + /* proxy gem obj does not support this operation */
> + if (i915_gem_object_is_proxy(obj)) {
> + i915_gem_object_put(obj);
> + return -EPERM;
> + }
> +
>   /* Pinned buffers may be scanout, so flush the cache */
>   i915_gem_object_flush_if_display(obj);
>   i915_gem_object_put(obj);
> @@ -1727,7 +1739,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>*/
>   if (!obj->base.filp) {
>   i915_gem_object_put(obj);
> - return -EINVAL;
> + return -EPERM;
>   }
>  
>   addr = vm_mmap(obj->base.filp, 0, args->size,
> @@ -3717,6 +3729,12 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, 
> void *data,
>   if (!obj)
>   return -ENOENT;
>  
> + /* proxy gem obj does not support setting caching mode */
> + if (i915_gem_object_is_proxy(obj)) {
> + i915_gem_object_put(obj);
> + return -EPERM;

Use the onion.
> + }
> +
>   if (obj->cache_level == level)
>   goto out;
>  
> @@ -4168,6 +4186,12 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void 
> *data,
>   if (!obj)
>   return -ENOENT;
>  
> + /* proxy gem obj does not support changing backing storage */
> + if (i915_gem_object_is_proxy(obj)) {
> + i915_gem_object_put(obj);

More onion.

> + return -EPERM;
> + }
> +
>   err = mutex_lock_interruptible(&obj->mm.lock);
>   if (err)
>   goto out;

-- 
Chris Wilson, Intel Open Source Technology Centre


[v4.12-rc3] Early boot panic on Broadwell

2017-06-01 Thread Chris Wilson
Hi guys,

I hit an early boot panic on a Broadwell laptop (xps13-9343) that I
bisected to:

commit cbed27cdf0e3f7ea3b2259e86b9e34df02be3fe4
Author: Mikulas Patocka 
Date:   Tue Apr 18 15:07:11 2017 -0400

x86/PAT: Fix Xorg regression on CPUs that don't support PAT

In the file arch/x86/mm/pat.c, there's a '__pat_enabled' variable. The
variable is set to 1 by default and the function pat_init() sets
__pat_enabled to 0 if the CPU doesn't support PAT.

However, on AMD K6-3 CPUs, the processor initialization code never calls
pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
returns true, even though the K6-3 CPU doesn't support PAT.

The result of this bug is that a kernel warning is produced when attempting 
to
start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
Another symptom of this bug is that the framebuffer driver doesn't set the
K6-3 MTRR registers:

  x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 
0xe400-0xe5ff], got write-combining
  [ cut here ]
  WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
  ...
  x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 
0xe400-0xe5ff], got write-combining

To fix the bug change pat_enabled() so that it returns true only if PAT
initialization was actually done.

Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
this_cpu_has(X86_FEATURE_PAT) in pat_ap_init(), so that we check the PAT
feature on the processor that is being initialized.

In my testing, I found that reverting the s/boot_cpu_has/this_cpu_has/
change was enough to restore working behaviour:

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 83a59a6..c537bfb 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -234,7 +234,7 @@ static void pat_bsp_init(u64 pat)
 
 static void pat_ap_init(u64 pat)
 {
-   if (!this_cpu_has(X86_FEATURE_PAT)) {
+   if (!boot_cpu_has(X86_FEATURE_PAT)) {
/*
 * If this happens we are on a secondary CPU, but switched to
 * PAT on the boot CPU. We have no way to undo PAT.

Seems scary enough that different cpus may have different features, but
that may just be a symptom of the boot phase?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[PATCH 1/3] x86-32: Teach copy_from_user to unroll .size=6/8

2017-06-01 Thread Chris Wilson
Two exception handling register moves are faster to inline than a call
to __copy_user_ll(). We already apply the conversion for a get_user()
call, so for symmetry we should also apply the optimisation to
copy_from_user.

Signed-off-by: Chris Wilson 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
---
 arch/x86/include/asm/uaccess_32.h | 25 +
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/uaccess_32.h 
b/arch/x86/include/asm/uaccess_32.h
index aeda9bb8af50..44d17d1ab07c 100644
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -23,30 +23,47 @@ static __always_inline unsigned long
 raw_copy_from_user(void *to, const void __user *from, unsigned long n)
 {
if (__builtin_constant_p(n)) {
-   unsigned long ret;
+   unsigned long ret = 0;
 
switch (n) {
case 1:
-   ret = 0;
__uaccess_begin();
__get_user_asm_nozero(*(u8 *)to, from, ret,
  "b", "b", "=q", 1);
__uaccess_end();
return ret;
case 2:
-   ret = 0;
__uaccess_begin();
__get_user_asm_nozero(*(u16 *)to, from, ret,
  "w", "w", "=r", 2);
__uaccess_end();
return ret;
case 4:
-   ret = 0;
__uaccess_begin();
__get_user_asm_nozero(*(u32 *)to, from, ret,
  "l", "k", "=r", 4);
__uaccess_end();
return ret;
+   case 6:
+   __uaccess_begin();
+   __get_user_asm_nozero(*(u32 *)to, from, ret,
+ "l", "k", "=r", 6);
+   if (likely(!ret))
+   __get_user_asm_nozero(*(u16 *)(4 + (char *)to),
+ (u16 __user *)(4 + (char 
__user *)from),
+ ret, "w", "w", "=r", 2);
+   __uaccess_end();
+   return ret;
+   case 8:
+   __uaccess_begin();
+   __get_user_asm_nozero(*(u32 *)to, from, ret,
+ "l", "k", "=r", 8);
+   if (likely(!ret))
+   __get_user_asm_nozero(*(u32 *)(4 + (char *)to),
+ (u32 __user *)(4 + (char 
__user *)from),
+ ret, "l", "k", "=r", 4);
+   __uaccess_end();
+   return ret;
}
}
return __copy_user_ll(to, (__force const void *)from, n);
-- 
2.11.0



[PATCH 2/3] x86-32: Expand static copy_to_user()

2017-06-01 Thread Chris Wilson
For known compile-time fixed sizes, teach x86-32 copy_to_user() to
convert them to the simpler put_user and inline it similar to the
optimisation applied to copy_from_user() and already used by x86-64.

Signed-off-by: Chris Wilson 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
---
 arch/x86/include/asm/uaccess_32.h | 48 +++
 1 file changed, 48 insertions(+)

diff --git a/arch/x86/include/asm/uaccess_32.h 
b/arch/x86/include/asm/uaccess_32.h
index 44d17d1ab07c..a02aa9db34ed 100644
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -16,6 +16,54 @@ unsigned long __must_check __copy_from_user_ll_nocache_nozero
 static __always_inline unsigned long __must_check
 raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 {
+   if (__builtin_constant_p(n)) {
+   unsigned long ret = 0;
+
+   switch (n) {
+   case 1:
+   __uaccess_begin();
+   __put_user_asm(*(u8 *)from, to, ret,
+   "b", "b", "iq", 1);
+   __uaccess_end();
+   return ret;
+   case 2:
+   __uaccess_begin();
+   __put_user_asm(*(u16 *)from, to, ret,
+   "w", "w", "ir", 2);
+   __uaccess_end();
+   return ret;
+   case 4:
+   __uaccess_begin();
+   __put_user_asm(*(u32 *)from, to, ret,
+   "l", "k", "ir", 4);
+   __uaccess_end();
+   return ret;
+   case 6:
+   __uaccess_begin();
+   __put_user_asm(*(u32 *)from, to, ret,
+   "l", "k", "ir", 4);
+   if (likely(!ret)) {
+   asm("":::"memory");
+   __put_user_asm(*(u16 *)(4 + (char *)from),
+   (u16 __user *)(4 + (char __user 
*)to),
+   ret, "w", "w", "ir", 2);
+   }
+   __uaccess_end();
+   return ret;
+   case 8:
+   __uaccess_begin();
+   __put_user_asm(*(u32 *)from, to, ret,
+   "l", "k", "ir", 4);
+   if (likely(!ret)) {
+   asm("":::"memory");
+   __put_user_asm(*(u32 *)(4 + (char *)from),
+   (u32 __user *)(4 + (char __user 
*)to),
+   ret, "l", "k", "ir", 4);
+   }
+   __uaccess_end();
+   return ret;
+   }
+   }
return __copy_user_ll((__force void *)to, from, n);
 }
 
-- 
2.11.0



x86: Static optimisations for copy_user

2017-05-31 Thread Chris Wilson
I was looking at the overhead of drmIoctl() in a microbenchmark that
repeatedly did a copy_from_user(.size=8) followed by a
copy_to_user(.size=8) as part of the DRM_IOCTL_I915_GEM_BUSY. I found
that if I force-inlined the get_user/put_user instead, the walltime of
the ioctl improved by about 20%. If copy_user_generic_unrolled was
used instead of copy_user_enhanced_fast_string, the microbenchmark
improved by 10%. Benchmarking on a few machines:

(Broadwell)
 benchmark_copy_user(hot):
   size   unrolled   string   fast-string
      1        158       77            79
      2        306      154           158
      4        614      308           317
      6        926      462           476
      8       1344      298           635
     12       1773      482           952
     16       2797      602          1269
     24       4020      903          1906
     32       5055     1204          2540
     48       6150     1806          3810
     64       9564     2409          5082
     96      13583     3612          6483
    128      18108     4815          8434

(Broxton)
 benchmark_copy_user(hot):
   size   unrolled   string   fast-string
      1        270       52            53
      2        364      106           109
      4        460      213           218
      6        486      305           312
      8       1250      253           437
     12       1009      332           625
     16       2059      514           897
     24       2624      672          1071
     32       3043     1014          1750
     48       3620     1499          2561
     64                 1971
     96       7499     2876          4772
    128                 3733          6088

which says that for this cache-hot case the rep mov microcode noticeably
underperforms in the microbenchmark. Though once we pass a few
cachelines, and definitely after exceeding the L1 cache, rep mov is the
clear winner. From cold, there is no difference in timings.

I can improve the microbenchmark by either force inlining the
raw_copy_*_user switches, or by switching to copy_user_generic_unrolled.
Both leave a sour taste. The switch is too big to be inlined, and if
called out-of-line the function call overhead negates its benefits.
Switching between fast-string and unrolled makes a presumption on
behaviour.

In the end, I limited this series to just adding a few extra
translations for statically known copy_*_user().
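
For anyone wanting to reproduce, the loop is nothing more exotic than
the following (userspace sketch; assumes an open i915 fd and a valid
GEM handle, error handling elided, header path may vary with libdrm):

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static void busy_loop(int fd, uint32_t handle, unsigned long count)
{
	struct drm_i915_gem_busy busy = { .handle = handle };

	/* each call is a copy_from_user(8) + copy_to_user(8) */
	while (count--)
		ioctl(fd, DRM_IOCTL_I915_GEM_BUSY, &busy);
}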
-Chris



[PATCH 3/3] x86-64: Inline 6/12 byte copy_user

2017-05-31 Thread Chris Wilson
Extend the list of replacements for compile-time known sizes to include
6/12 byte copies. These expand to two movs (along with their exception
table) and are cheaper to inline than the function call (similar to the
10 byte copy already handled).

Signed-off-by: Chris Wilson 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
---
 arch/x86/include/asm/uaccess_64.h | 42 +++
 1 file changed, 42 insertions(+)

diff --git a/arch/x86/include/asm/uaccess_64.h 
b/arch/x86/include/asm/uaccess_64.h
index c5504b9a472e..ff2d65baa988 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -71,6 +71,16 @@ raw_copy_from_user(void *dst, const void __user *src, 
unsigned long size)
  ret, "l", "k", "=r", 4);
__uaccess_end();
return ret;
+   case 6:
+   __uaccess_begin();
+   __get_user_asm_nozero(*(u32 *)dst, (u32 __user *)src,
+  ret, "l", "k", "=r", 6);
+   if (likely(!ret))
+   __get_user_asm_nozero(*(u16 *)(4 + (char *)dst),
+  (u16 __user *)(4 + (char __user *)src),
+  ret, "w", "w", "=r", 2);
+   __uaccess_end();
+   return ret;
case 8:
__uaccess_begin();
__get_user_asm_nozero(*(u64 *)dst, (u64 __user *)src,
@@ -87,6 +97,16 @@ raw_copy_from_user(void *dst, const void __user *src, 
unsigned long size)
   ret, "w", "w", "=r", 2);
__uaccess_end();
return ret;
+   case 12:
+   __uaccess_begin();
+   __get_user_asm_nozero(*(u64 *)dst, (u64 __user *)src,
+  ret, "q", "", "=r", 10);
+   if (likely(!ret))
+   __get_user_asm_nozero(*(u32 *)(8 + (char *)dst),
+  (u32 __user *)(8 + (char __user *)src),
+  ret, "l", "k", "=r", 4);
+   __uaccess_end();
+   return ret;
case 16:
__uaccess_begin();
__get_user_asm_nozero(*(u64 *)dst, (u64 __user *)src,
@@ -128,6 +148,17 @@ raw_copy_to_user(void __user *dst, const void *src, 
unsigned long size)
  ret, "l", "k", "ir", 4);
__uaccess_end();
return ret;
+   case 6:
+   __uaccess_begin();
+   __put_user_asm(*(u32 *)src, (u32 __user *)dst,
+  ret, "l", "k", "ir", 6);
+   if (likely(!ret)) {
+   asm("":::"memory");
+   __put_user_asm(2[(u16 *)src], 2 + (u16 __user *)dst,
+  ret, "w", "w", "ir", 2);
+   }
+   __uaccess_end();
+   return ret;
case 8:
__uaccess_begin();
__put_user_asm(*(u64 *)src, (u64 __user *)dst,
@@ -145,6 +176,17 @@ raw_copy_to_user(void __user *dst, const void *src, 
unsigned long size)
}
__uaccess_end();
return ret;
+   case 12:
+   __uaccess_begin();
+   __put_user_asm(*(u64 *)src, (u64 __user *)dst,
+  ret, "q", "", "er", 12);
+   if (likely(!ret)) {
+   asm("":::"memory");
+   __put_user_asm(2[(u32 *)src], 2 + (u32 __user *)dst,
+  ret, "l", "k", "ir", 4);
+   }
+   __uaccess_end();
+   return ret;
case 16:
__uaccess_begin();
__put_user_asm(*(u64 *)src, (u64 __user *)dst,
-- 
2.11.0



Re: [PATCH v11] drm: Unplug drm device when unregistering it (v8)

2017-05-29 Thread Chris Wilson
80a204 100644
> > --- a/include/drm/drmP.h
> > +++ b/include/drm/drmP.h
> > @@ -488,12 +488,6 @@ static __inline__ int drm_core_check_feature(struct 
> > drm_device *dev,
> > return ((dev->driver->driver_features & feature) ? 1 : 0);
> >  }
> >  
> > -static inline void drm_device_set_unplugged(struct drm_device *dev)
> > -{
> > -   smp_wmb();
> > -   atomic_set(&dev->unplugged, 1);
> > -}
> > -
> >  static inline int drm_device_is_unplugged(struct drm_device *dev)
> >  {
> > int ret = atomic_read(&dev->unplugged);
> > diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> > index 0fefc3f..eb63078 100644
> > --- a/include/drm/drm_drv.h
> > +++ b/include/drm/drm_drv.h
> > @@ -544,7 +544,6 @@ void drm_dev_unregister(struct drm_device *dev);
> >  void drm_dev_ref(struct drm_device *dev);
> >  void drm_dev_unref(struct drm_device *dev);
> >  void drm_put_dev(struct drm_device *dev);
> > -void drm_unplug_dev(struct drm_device *dev);
> >  
> >  int drm_dev_set_unique(struct drm_device *dev, const char *name);

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH v5 4/5] drm/i915/gvt: Dmabuf support for GVT-g

2017-05-26 Thread Chris Wilson
On Thu, May 25, 2017 at 02:28:25PM +0100, Chris Wilson wrote:
> On Tue, May 23, 2017 at 06:32:00PM +0800, Xiaoguang Chen wrote:
> > +   gtt_entries = (gen8_pte_t __iomem *)dev_priv->ggtt.gsm +
> > +   (fb_gma >> PAGE_SHIFT);
> > +   for_each_sg(st->sgl, sg, fb_size, i) {
> > +   sg->offset = 0;
> > +   sg->length = PAGE_SIZE;
> > +   sg_dma_address(sg) =
> > +   GEN8_DECODE_PTE(readq(>t_entries[i]));
> > +   sg_dma_len(sg) = PAGE_SIZE;
> 
> This assumes that the entries are PAGE_SIZE. This will not remain true.

Ok, we will only be supporting different page sizes for ppgtt. However,
it is probably better to use I915_GTT_PAGE_SIZE to match our insertions.
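
i.e. something like (the loop from the patch, adjusted; sketch only):

	for_each_sg(st->sgl, sg, fb_size, i) {
		sg->offset = 0;
		sg->length = I915_GTT_PAGE_SIZE;
		sg_dma_address(sg) =
			GEN8_DECODE_PTE(readq(&gtt_entries[i]));
		sg_dma_len(sg) = I915_GTT_PAGE_SIZE;
	}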
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH v5 4/5] drm/i915/gvt: Dmabuf support for GVT-g

2017-05-25 Thread Chris Wilson
iv->drm;
> + struct intel_vgpu_plane_info *info = args;
> +
> + info = intel_vgpu_get_plane_info(dev, vgpu, info->plane_id);
> + if (info == NULL)
> + return -EINVAL;
> +
> + return 0;
> +}
> +
> +int intel_vgpu_create_dmabuf(struct intel_vgpu *vgpu, void *args)
> +{
> + struct dma_buf *dmabuf;
> + struct drm_i915_gem_object *obj;
> + struct drm_device *dev = &vgpu->gvt->dev_priv->drm;
> + struct intel_vgpu_dmabuf *gvt_dmabuf = args;
> + struct intel_vgpu_plane_info *info;
> + int ret;
> +
> + info = intel_vgpu_get_plane_info(dev, vgpu,
> + gvt_dmabuf->plane_info.plane_id);
> + if (info == NULL)
> + return -EINVAL;
> +
> + obj = intel_vgpu_create_gem(dev, info);
> + if (obj == NULL) {
> + gvt_vgpu_err("create gvt gem obj failed:%d\n", vgpu->id);
> + return -EINVAL;

Of all the errors that can cause this, do you think the user is at fault?

> + }
> +
> + dmabuf = i915_gem_prime_export(dev, &obj->base, DRM_CLOEXEC | DRM_RDWR);
> +
> + if (IS_ERR(dmabuf)) {
> + gvt_vgpu_err("export dma-buf failed\n");
> + return -EINVAL;

Again, was this a user error? You even have the right error to hand!

> + }
> +
> + ret = dma_buf_fd(dmabuf, DRM_CLOEXEC | DRM_RDWR);
> + if (ret < 0) {
> + gvt_vgpu_err("create dma-buf fd failed ret:%d\n", ret);
> + return -EINVAL;

Still a user error? Not ret?

> + }
> + gvt_dmabuf->fd = ret;
> + gvt_dmabuf->plane_info = *info;
> +
> + return 0;
> +}

> diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
> index c42266c..763a8c5 100644
> --- a/drivers/gpu/drm/i915/gvt/gvt.h
> +++ b/drivers/gpu/drm/i915/gvt/gvt.h
> @@ -47,6 +47,7 @@
>  #include "render.h"
>  #include "cmd_parser.h"
>  #include "fb_decoder.h"
> +#include "dmabuf.h"
>  
>  #define GVT_MAX_VGPU 8
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0c1cbe9..54f7a0f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1609,6 +1609,10 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void 
> *data,
>   if (!obj)
>   return -ENOENT;
>  
> + /* The obj is a gvt dma-buf object and set domain is not supported */
> + if (i915_gem_object_is_gvt_dmabuf(obj))
> + return -EPERM;

Leaks the obj.

Ok, I'm slightly warming to this.

Newly broken ioctls:
i915_gem_set_caching_ioctl
i915_gem_set_domain_ioctl
i915_gem_sw_finish_ioctl
i915_gem_set_tiling_ioctl
i915_gem_madvise_ioctl

These should all be -EPERM.

Existing broken ioctls:
i915_gem_mmap_ioctl

Sadly this used -EINVAL. Might sneak a change in to use -EPERM for
consistency.

Misbehaving ioctls:
i915_gem_busy_ioctl
i915_gem_throttle_ioctl
i915_gem_wait_ioctl

These only work on the proxy bo and do not reflect the real object.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH -next] drm/vgem: Fix return value check in vgem_init()

2017-05-21 Thread Chris Wilson
On Sun, May 21, 2017 at 01:19:39AM +, Wei Yongjun wrote:
> From: Wei Yongjun 
> 
> In case of error, the function platform_device_register_simple() returns
> ERR_PTR() and never returns NULL. The NULL test in the return value
> check should be replaced with IS_ERR().
> 
> Fixes: 315f0242aa2b ("drm/vgem: Convert to a struct drm_device subclass")

This is wrong, the bug was introduced in
af33a9190d02 ("drm/vgem: Enable dmabuf import interfaces")
not that it matters since it is the same tag.
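
For the record, the corrected shape is simply:

	pdev = platform_device_register_simple("vgem", -1, NULL, 0);
	if (IS_ERR(pdev))
		return PTR_ERR(pdev);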

> Signed-off-by: Wei Yongjun 

Checked it is an ERR_PTR return on failure, so
Reviewed-by: Chris Wilson 
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH][drm-next] drm/i915: Check for allocation failure

2017-05-19 Thread Chris Wilson
On Fri, May 19, 2017 at 06:56:17PM +0100, Colin King wrote:
> From: Colin Ian King 
> 
> The memory allocation for C is not being null checked and hence we
> could end up with a null pointer dereference. Fix this with a null
> pointer check. (I really should have noticed this when I was fixing an
> earlier issue.)
> 
> Detected by CoverityScan, CID#1436406 ("Dereference null return")
> 
> Fixes: 47624cc3301b60 ("drm/i915: Import the kfence selftests for 
> i915_sw_fence")
> Signed-off-by: Colin Ian King 

Pushed, thanks very much for the patch.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH -next] drm/i915: Fix return value check in kfence selftests

2017-05-19 Thread Chris Wilson
On Fri, May 19, 2017 at 12:26:05AM +, Wei Yongjun wrote:
> From: Wei Yongjun 
> 
> Fix the return value check which testing the wrong variable.

Already fixed up yesterday:

commit ac0a73fb526100adc521ec2069623e47ca3997a8
Author: Colin Ian King 
Date:   Thu May 18 14:39:42 2017 +0100

drm/i915: Check C for null pointer rather than B

Thanks for the patch.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH] drm/i915: Check C for null pointer rather than B

2017-05-18 Thread Chris Wilson
On Thu, May 18, 2017 at 02:39:42PM +0100, Colin King wrote:
> From: Colin Ian King 
> 
> There are two occasions where pointer B is being checked for NULL
> when it should be pointer C instead. Fix these.
> 
> Detected by CoverityScan, CID#1436348,1436349 ("Logically Dead Code")
> 
> Fixes: 47624cc3301b60 ("drm/i915: Import the kfence selftests for 
> i915_sw_fence")
> Signed-off-by: Colin Ian King 

Thanks, obviously correct and not exercised by BAT so pushed.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH v2 4/5] drm/i915/gvt: Dmabuf support for GVT-g

2017-05-18 Thread Chris Wilson
n the GPU. We just can't use CPU mmaps.

However, we should reject things like tiling changes, cache level
changes, set-domain is also problematic.

> +
> + return obj;
> +}

> +int intel_vgpu_create_dmabuf(struct intel_vgpu *vgpu, void *args)
> +{
> + struct dma_buf *dmabuf;
> + struct drm_i915_gem_object *obj;
> + struct drm_device *dev = &vgpu->gvt->dev_priv->drm;
> + struct intel_vgpu_dmabuf *gvt_dmabuf = args;
> + struct intel_vgpu_plane_info *info;
> + int ret;
> +
> + info = intel_vgpu_get_plane_info(dev, vgpu, gvt_dmabuf->plane_id);
> + if (info == NULL)
> + return -EINVAL;
> +
> + vgpu->plane_info = info;
> + obj = intel_vgpu_create_gem(dev, vgpu);
> + if (obj == NULL) {
> + gvt_vgpu_err("create gvt gem obj failed:%d\n", vgpu->id);
> + return -EINVAL;
> + }
> + vgpu->obj_dmabuf = obj;
> +
> + ret = i915_gem_object_pin_pages(obj);

Why do you need to pin the pages? It has no relevance to the backing
storage, which can be discarded freely by the other end of the vgpu.
get_pages is just allocating an sg_table...
-chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH] mm: clarify why we want kmalloc before falling backto vmallock

2017-05-17 Thread Chris Wilson
subject s/vmallock/vmalloc/

On Wed, May 17, 2017 at 10:09:32AM +0200, Michal Hocko wrote:
> From: Michal Hocko 
> 
> While converting drm_[cm]alloc* helpers to kvmalloc* variants Chris
> Wilson has wondered why we want to try kmalloc before vmalloc fallback
> even for larger allocations requests. Let's clarify that one larger
> physically contiguous block is less likely to fragment memory than many
> scattered pages which can prevent more large blocks from being created.
> 
> Suggested-by: Chris Wilson 
> Signed-off-by: Michal Hocko 

It helped me understand the decisions made by the code, so
Reviewed-by: Chris Wilson 
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH 1/2] drm: replace drm_[cm]alloc* by kvmalloc alternatives

2017-05-17 Thread Chris Wilson
On Wed, May 17, 2017 at 11:03:50AM +0200, Michal Hocko wrote:
> On Wed 17-05-17 08:38:09, Chris Wilson wrote:
> > On Wed, May 17, 2017 at 08:55:08AM +0200, Michal Hocko wrote:
> > > From: Michal Hocko 
> > > 
> > > drm_[cm]alloc* has grown their own kvmalloc with vmalloc fallback
> > > implementations. MM has grown kvmalloc* helpers in the meantime. Let's
> > > use those because it a) reduces the code and b) MM has a better idea
> > > how to implement fallbacks (e.g. do not vmalloc before kmalloc is tried
> > > with __GFP_NORETRY).
> > > 
> > > drm_calloc_large needs to get __GFP_ZERO explicitly but it is the same
> > > thing as kvmalloc_array in principle.
> > > 
> > > Signed-off-by: Michal Hocko 
> > 
> > Just a little surprised that calloc_large users still exist.
> > 
> > Reviewed-by: Chris Wilson 
> 
> Thanks!
> 
> > One more feature request from mm, can we have the 
> > if (size != 0 && n > SIZE_MAX / size)
> > check exported by itself.
> 
> What do you exactly mean by exporting?

Just make it available to others so that little things like the choice
between SIZE_MAX and ULONG_MAX are consistent and actually reflect the
right limit (as dictated by kmalloc/kvmalloc/vmalloc...).

> Something like the following?
> I haven't compile tested it outside of mm with different config options.
> Sticking alloc_array_check into mm_types.h is kind of gross but I do not
> have a great idea where to put it. A new header doesn't seem nice.
> ---
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 7cb17c6b97de..f908b14ffc4c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -534,7 +534,7 @@ static inline void *kvzalloc(size_t size, gfp_t flags)
>  
>  static inline void *kvmalloc_array(size_t n, size_t size, gfp_t flags)
>  {
> - if (size != 0 && n > SIZE_MAX / size)
> + if (!alloc_array_check(n, size))
>   return NULL;
>  
>   return kvmalloc(n * size, flags);
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 45cdb27791a3..d7154b43a0d1 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -601,4 +601,10 @@ typedef struct {
>   unsigned long val;
>  } swp_entry_t;
>  
> +static inline bool alloc_array_check(size_t n, size_t size)
> +{
> + if (size != 0 && n > SIZE_MAX / size)
> + return false;
> + return true;

Just return size == 0 || n <= SIZE_MAX / size ?

Whether or not size being 0 makes for a sane user is another question.
The guideline is that size is the known constant from sizeof() or
whatever and n is the variable number to allocate.

But yes, that inline is what I want :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH] drm: use kvmalloc_array for drm_malloc*

2017-05-17 Thread Chris Wilson
On Wed, May 17, 2017 at 09:44:53AM +0200, Michal Hocko wrote:
> On Tue 16-05-17 12:09:08, Chris Wilson wrote:
> > On Tue, May 16, 2017 at 12:53:52PM +0200, Michal Hocko wrote:
> > > On Tue 16-05-17 10:31:19, Chris Wilson wrote:
> > > > On Tue, May 16, 2017 at 11:06:06AM +0200, Michal Hocko wrote:
> > > > > From: Michal Hocko 
> > > > > 
> > > > > drm_malloc* has grown their own kmalloc with vmalloc fallback
> > > > > implementations. MM has grown kvmalloc* helpers in the meantime. Let's
> > > > > use those because it a) reduces the code and b) MM has a better idea
> > > > > how to implement fallbacks (e.g. do not vmalloc before kmalloc is 
> > > > > tried
> > > > > with __GFP_NORETRY).
> > > > 
> > > > Better? The same idea. The only difference is that I was reluctant to
> > > > hand out large pages for long-lived objects. If that's the wisdom of
> > > > the core mm, so be it.
> > > 
> > > vmalloc tends to fragment physical memory more os it is preferable to
> > > try the physically contiguous request first and only fall back to
> > > vmalloc if the first attempt would be too costly or it fails.
> > 
> > Not relevant for the changelog in this patch, but it would be nice to
> > have that written in kvmalloc() as to why the scattering of 4k vmapped
> > pages prevents defragmentation when compared to allocating large pages.
> 
> Well, it is not as much about defragmentation because both vmapped and
> kmalloc allocations are very likely to be unmovable (at least
> currently). Theoretically there shouldn't be a problem to make vmapped
> pages movable as the ptes can be modified but this is not implemented...
> The problem is that vmapped pages are more likely to break up more
> larger order blocks. kmalloc will naturally break a single larger block.
> 
> > I have vague recollections of seeing the conversation, but a summary as
> > to the reason why kvmalloc prefers large pages will be good for future
> > reference.
> 
> Does the following sound better to you?
> 
> diff --git a/mm/util.c b/mm/util.c
> index 464df3489903..87499f8119f2 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -357,7 +357,10 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
>   WARN_ON_ONCE((flags & GFP_KERNEL) != GFP_KERNEL);
>  
>   /*
> -  * Make sure that larger requests are not too disruptive - no OOM
> +  * We want to attempt a large physically contiguous block first because
> +  * it is less likely to fragment multiple larger blocks and therefore
> +  * contribute to a long term fragmentation less than vmalloc fallback.
> +  * However make sure that larger requests are not too disruptive - no 
> OOM
>* killer and no allocation failure warnings as we have a fallback
>*/

Hmm, shouldn't we also teach vmalloc to allocate large chunks where
possible - even mixing huge and normal pages? As well as avoiding pinning
the pages and allowing migration.

That comment is helping me to understand why the decison is made to
favour kmalloc over vmalloc, thanks.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH 1/2] drm: replace drm_[cm]alloc* by kvmalloc alternatives

2017-05-17 Thread Chris Wilson
On Wed, May 17, 2017 at 08:55:08AM +0200, Michal Hocko wrote:
> From: Michal Hocko 
> 
> drm_[cm]alloc* has grown their own kvmalloc with vmalloc fallback
> implementations. MM has grown kvmalloc* helpers in the meantime. Let's
> use those because it a) reduces the code and b) MM has a better idea
> how to implement fallbacks (e.g. do not vmalloc before kmalloc is tried
> with __GFP_NORETRY).
> 
> drm_calloc_large needs to get __GFP_ZERO explicitly but it is the same
> thing as kvmalloc_array in principle.
> 
> Signed-off-by: Michal Hocko 

Just a little surprised that calloc_large users still exist.

Reviewed-by: Chris Wilson 

One more feature request from mm, can we have the 
if (size != 0 && n > SIZE_MAX / size)
check exported by itself. It is used by both kvmalloc_array and
kmalloc_array, and in my ioctls I have it open-coded as well to
differentiate between the -EINVAL (for bogus user values) and 
genuine -ENOMEM.
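
e.g. the open-coded shape in an ioctl looks like this (args and items
are invented for illustration):

	if (args->count > SIZE_MAX / sizeof(*items))
		return -EINVAL;	/* bogus user value */

	items = kvmalloc_array(args->count, sizeof(*items), GFP_KERNEL);
	if (!items)
		return -ENOMEM;	/* genuine allocation failure */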
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH] drm: use kvmalloc_array for drm_malloc*

2017-05-16 Thread Chris Wilson
On Tue, May 16, 2017 at 12:53:52PM +0200, Michal Hocko wrote:
> On Tue 16-05-17 10:31:19, Chris Wilson wrote:
> > On Tue, May 16, 2017 at 11:06:06AM +0200, Michal Hocko wrote:
> > > From: Michal Hocko 
> > > 
> > > drm_malloc* has grown their own kmalloc with vmalloc fallback
> > > implementations. MM has grown kvmalloc* helpers in the meantime. Let's
> > > use those because it a) reduces the code and b) MM has a better idea
> > > how to implement fallbacks (e.g. do not vmalloc before kmalloc is tried
> > > with __GFP_NORETRY).
> > 
> > Better? The same idea. The only difference is that I was reluctant to
> > hand out large pages for long-lived objects. If that's the wisdom of
> > the core mm, so be it.
> 
> vmalloc tends to fragment physical memory more so it is preferable to
> try the physically contiguous request first and only fall back to
> vmalloc if the first attempt would be too costly or it fails.

Not relevant for the changelog in this patch, but it would be nice to
have that written in kvmalloc() as to why the scattering of 4k vmapped
pages prevents defragmentation when compared to allocating large pages.
I have vague recollections of seeing the conversation, but a summary as
to the reason why kvmalloc prefers large pages will be good for future
reference.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH] drm: use kvmalloc_array for drm_malloc*

2017-05-16 Thread Chris Wilson
On Tue, May 16, 2017 at 11:06:06AM +0200, Michal Hocko wrote:
> From: Michal Hocko 
> 
> drm_malloc* has grown their own kmalloc with vmalloc fallback
> implementations. MM has grown kvmalloc* helpers in the meantime. Let's
> use those because it a) reduces the code and b) MM has a better idea
> how to implement fallbacks (e.g. do not vmalloc before kmalloc is tried
> with __GFP_NORETRY).

Better? The same idea. The only difference is that I was reluctant to
hand out large pages for long-lived objects. If that's the wisdom of the
core mm, so be it.
Reviewed-by: Chris Wilson 
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH v2] perf/core: Avoid removing shared pmu_context on unregister

2017-05-12 Thread Chris Wilson
On Fri, May 12, 2017 at 01:40:37PM -0700, David Carrillo-Cisneros wrote:
> On Fri, May 12, 2017 at 4:45 AM, Chris Wilson  
> wrote:
> > In commit 1fd7e4169954 ("perf/core: Remove perf_cpu_context::unique_pmu"),
> > the search for another user of the pmu_cpu_context was removed, and so
> > we unconditionally free it during perf_pmu_unregister. This leads to
> > random corruption later and a BUG at mm/percpu.c:689.
> >
> > v2: Check for shared pmu_contexts under the mutex.
> >
> > Fixes: 1fd7e4169954 ("perf/core: Remove perf_cpu_context::unique_pmu")
> > Signed-off-by: Chris Wilson 
> > Cc: David Carrillo-Cisneros 
> > Cc: Peter Zijlstra (Intel) 
> > Cc: Ingo Molnar 
> > Cc:  # v4.11+
> > ---
> >  kernel/events/core.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index aaefaa27e1a6..4f60f66b35ad 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -8983,10 +8983,12 @@ EXPORT_SYMBOL_GPL(perf_pmu_register);
> >  void perf_pmu_unregister(struct pmu *pmu)
> >  {
> > int remove_device;
> > +   int remove_context;
> >
> > mutex_lock(&pmus_lock);
> > remove_device = pmu_bus_running;
> > list_del_rcu(&pmu->entry);
> > +   remove_context = !find_pmu_context(pmu->task_ctx_nr);
> > mutex_unlock(&pmus_lock);
> >
> > /*
> > @@ -9005,7 +9007,8 @@ void perf_pmu_unregister(struct pmu *pmu)
> > device_del(pmu->dev);
> > put_device(pmu->dev);
> > }
> > -   free_pmu_context(pmu);
> > +   if (remove_context)
> > +   free_pmu_context(pmu);
> >  }
> 
> Shouldn't it be cleaner to keep the check in find_pmu_context, just as it
> was before commit 1fd7e4169954 ("perf/core: Remove
> perf_cpu_context::unique_pmu")?
> 
> (Code below untested)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 6e75a5c9412d..50d90cbf8418 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -8857,7 +8857,8 @@ static struct perf_cpu_context __percpu
> *find_pmu_context(int ctxn)
>  static void free_pmu_context(struct pmu *pmu)
>  {
> mutex_lock(&pmus_lock);
> -   free_percpu(pmu->pmu_cpu_context);
> +   if (!find_pmu_context(pmu->task_ctx_nr))
> +   free_percpu(pmu->pmu_cpu_context);
> mutex_unlock(&pmus_lock);

We have the problem that find_pmu_context looks for a matching
task_ctx_nr, but if a second pmu was registered since our list_del and
before our search, we would wrongly conclude that it was using our pmu
context, but it had actually allocated a new one for itself.

We could do a search by pmu_cpu_context instead, but seems overkill
compared to the remove_context approach.
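
If anyone does want the search keyed on the pmu_cpu_context itself, an
untested sketch (to be run under pmus_lock) might look like:

static bool pmu_cpu_context_shared(struct pmu *pmu)
{
        struct pmu *iter;

        lockdep_assert_held(&pmus_lock);

        list_for_each_entry(iter, &pmus, entry) {
                if (iter == pmu)
                        continue;

                if (iter->pmu_cpu_context == pmu->pmu_cpu_context)
                        return true;
        }

        return false;
}

but checking task_ctx_nr under the same lock, as remove_context does,
seems sufficient.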
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[PATCH v2] perf/core: Avoid removing shared pmu_context on unregister

2017-05-12 Thread Chris Wilson
In commit 1fd7e4169954 ("perf/core: Remove perf_cpu_context::unique_pmu"),
the search for another user of the pmu_cpu_context was removed, and so
we unconditionally free it during perf_pmu_unregister. This leads to
random corruption later and a BUG at mm/percpu.c:689.

v2: Check for shared pmu_contexts under the mutex.

Fixes: 1fd7e4169954 ("perf/core: Remove perf_cpu_context::unique_pmu")
Signed-off-by: Chris Wilson 
Cc: David Carrillo-Cisneros 
Cc: Peter Zijlstra (Intel) 
Cc: Ingo Molnar 
Cc:  # v4.11+
---
 kernel/events/core.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index aaefaa27e1a6..4f60f66b35ad 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8983,10 +8983,12 @@ EXPORT_SYMBOL_GPL(perf_pmu_register);
 void perf_pmu_unregister(struct pmu *pmu)
 {
int remove_device;
+   int remove_context;
 
mutex_lock(&pmus_lock);
remove_device = pmu_bus_running;
list_del_rcu(&pmu->entry);
+   remove_context = !find_pmu_context(pmu->task_ctx_nr);
mutex_unlock(&pmus_lock);
 
/*
@@ -9005,7 +9007,8 @@ void perf_pmu_unregister(struct pmu *pmu)
device_del(pmu->dev);
put_device(pmu->dev);
}
-   free_pmu_context(pmu);
+   if (remove_context)
+   free_pmu_context(pmu);
 }
 EXPORT_SYMBOL_GPL(perf_pmu_unregister);
 
-- 
2.11.0



Re: [PATCH] perf/core: Avoid removing shared pmu_context on unregister

2017-05-12 Thread Chris Wilson
On Fri, May 12, 2017 at 11:40:32AM +0100, Chris Wilson wrote:
> In commit 1fd7e4169954 ("perf/core: Remove perf_cpu_context::unique_pmu"),
> the search for another user of the pmu_cpu_context was removed, and so
> we unconditionally free it during perf_pmu_unregister. This leads to
> random corruption later and a BUG at mm/percpu.c:689.
> 
> Fixes: 1fd7e4169954 ("perf/core: Remove perf_cpu_context::unique_pmu")
> Signed-off-by: Chris Wilson 
> Cc: David Carrillo-Cisneros 
> Cc: Peter Zijlstra (Intel) 
> Cc: Ingo Molnar 
> Cc:  # v4.11+
> ---
>  kernel/events/core.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index aaefaa27e1a6..e62b6207925c 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -9005,7 +9005,8 @@ void perf_pmu_unregister(struct pmu *pmu)
>   device_del(pmu->dev);
>   put_device(pmu->dev);
>   }
> - free_pmu_context(pmu);
> + if (!find_pmu_context(pmu->task_ctx_nr))

Bleh, that find should be under some guard, such as the mutex we held
earlier.

> +     free_pmu_context(pmu);
>  }
>  EXPORT_SYMBOL_GPL(perf_pmu_unregister);
>  
> -- 
> 2.11.0
> 

-- 
Chris Wilson, Intel Open Source Technology Centre


[PATCH] perf/core: Avoid removing shared pmu_context on unregister

2017-05-12 Thread Chris Wilson
In commit 1fd7e4169954 ("perf/core: Remove perf_cpu_context::unique_pmu"),
the search for another user of the pmu_cpu_context was removed, and so
we unconditionally free it during perf_pmu_unregister. This leads to
random corruption later and a BUG at mm/percpu.c:689.

Fixes: 1fd7e4169954 ("perf/core: Remove perf_cpu_context::unique_pmu")
Signed-off-by: Chris Wilson 
Cc: David Carrillo-Cisneros 
Cc: Peter Zijlstra (Intel) 
Cc: Ingo Molnar 
Cc:  # v4.11+
---
 kernel/events/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index aaefaa27e1a6..e62b6207925c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9005,7 +9005,8 @@ void perf_pmu_unregister(struct pmu *pmu)
device_del(pmu->dev);
put_device(pmu->dev);
}
-   free_pmu_context(pmu);
+   if (!find_pmu_context(pmu->task_ctx_nr))
+   free_pmu_context(pmu);
 }
 EXPORT_SYMBOL_GPL(perf_pmu_unregister);
 
-- 
2.11.0



Re: [kernel-locking] question about structure field initialization

2017-05-12 Thread Chris Wilson
On Thu, May 11, 2017 at 03:00:02PM -0500, Gustavo A. R. Silva wrote:
> 
> Hello everybody,
> 
> While looking into Coverity ID 1402035 I ran into the following
> piece of code at kernel/locking/test-ww_mutex.c:197:
> 
> 197static int test_abba(bool resolve)
> 198{
> 199struct test_abba abba;
> 200struct ww_acquire_ctx ctx;
> 201int err, ret;
> 202
> 203ww_mutex_init(&abba.a_mutex, &ww_class);
> 204ww_mutex_init(&abba.b_mutex, &ww_class);
> 205INIT_WORK_ONSTACK(&abba.work, test_abba_work);
> 206init_completion(&abba.a_ready);
> 207init_completion(&abba.b_ready);
> 208abba.resolve = resolve;
> 209
> 210schedule_work(&abba.work);
> 211
> 212ww_acquire_init(&ctx, &ww_class);
> 213ww_mutex_lock(&abba.a_mutex, &ctx);
> 214
> 215complete(&abba.a_ready);
> 216wait_for_completion(&abba.b_ready);
> 217
> 218err = ww_mutex_lock(&abba.b_mutex, &ctx);
> 219if (resolve && err == -EDEADLK) {
> 220ww_mutex_unlock(&abba.a_mutex);
> 221ww_mutex_lock_slow(&abba.b_mutex, &ctx);
> 222err = ww_mutex_lock(&abba.a_mutex, &ctx);
> 223}
> 224
> 225if (!err)
> 226ww_mutex_unlock(&abba.b_mutex);
> 227ww_mutex_unlock(&abba.a_mutex);
> 228ww_acquire_fini(&ctx);
> 229
> 230flush_work(&abba.work);
> 231destroy_work_on_stack(&abba.work);
> 232
> 233ret = 0;
> 234if (resolve) {
> 235if (err || abba.result) {
> 236pr_err("%s: failed to resolve ABBA
> deadlock, A err=%d, B err=%d\n",
> 237   __func__, err, abba.result);
> 238ret = -EINVAL;
> 239}
> 240} else {
> 241if (err != -EDEADLK && abba.result != -EDEADLK) {
> 242pr_err("%s: missed ABBA deadlock, A
> err=%d, B err=%d\n",
> 243   __func__, err, abba.result);
> 244ret = -EINVAL;
> 245}
> 246}
> 247return ret;
> 248}
> 
> The issue here is that apparently abba.result is being used at lines
> 235, 237 and 241 without previous initialization.
> 
> It seems to me that this is an issue, but I may be overlooking something.
> Can someone help me to spot where exactly abba.result is being
> initialized, if at all?

You are only looking at half the code. Through the schedule_work() /
flush_work() pair it is indirectly executing test_abba_work(), and that
is where abba.result is set.
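
From memory, the worker half of the test looks roughly like this (see
test_abba_work() in kernel/locking/test-ww_mutex.c for the real thing):

static void test_abba_work(struct work_struct *work)
{
        struct test_abba *abba = container_of(work, typeof(*abba), work);
        struct ww_acquire_ctx ctx;
        int err;

        ww_acquire_init(&ctx, &ww_class);
        ww_mutex_lock(&abba->b_mutex, &ctx);

        complete(&abba->b_ready);
        wait_for_completion(&abba->a_ready);

        err = ww_mutex_lock(&abba->a_mutex, &ctx);
        if (abba->resolve && err == -EDEADLK) {
                ww_mutex_unlock(&abba->b_mutex);
                ww_mutex_lock_slow(&abba->a_mutex, &ctx);
                err = ww_mutex_lock(&abba->b_mutex, &ctx);
        }

        if (!err)
                ww_mutex_unlock(&abba->a_mutex);
        ww_mutex_unlock(&abba->b_mutex);
        ww_acquire_fini(&ctx);

        abba->result = err;     /* the initialization Coverity is missing */
}

The flush_work(&abba.work) at line 230 guarantees abba.result has been
written before it is read at lines 235/241.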
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH] drm/prime: include device.h

2017-05-10 Thread Chris Wilson
On Wed, May 10, 2017 at 07:40:01AM -0700, Laura Abbott wrote:
> Explictly add linux/device.h to ensure struct device is defined:
> 
> In file included from include/drm/drm_file.h:38:0,
>  from drivers/gpu/drm/drm_file.c:38:
> include/drm/drm_prime.h:71:14: warning: 'struct device' declared inside
> parameter list will not be visible outside of this definition or
> declaration
>   struct device *attach_dev);

We aren't using it for anything other than an opaque pointer, so a
struct device;
forward decl will suffice?
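
i.e. at the top of drm_prime.h, something like:

struct device;

struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev,
                                                struct dma_buf *dma_buf,
                                                struct device *attach_dev);

since the prototype only ever takes a pointer, nothing needs the full
definition from linux/device.h.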
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCHv4 3/3] drm/vgem: Enable dmabuf import interfaces

2017-05-05 Thread Chris Wilson
On Thu, May 04, 2017 at 09:09:54PM -0700, Joe Perches wrote:
> On Thu, 2017-05-04 at 21:25 +0100, Chris Wilson wrote:
> > On Thu, May 04, 2017 at 11:45:48AM -0700, Laura Abbott wrote:
> > > 
> > > Enable the GEM dma-buf import interfaces in addition to the export
> > > interfaces. This lets vgem be used as a test source for other allocators
> > > (e.g. Ion).
> > > 
> > > Reviewed-by: Chris Wilson 
> > > Signed-off-by: Laura Abbott 
> > > ---
> > > v4: Use new drm_gem_prime_import_dev function
> > > ---
> > >  static const struct vm_operations_struct vgem_gem_vm_ops = {
> > > @@ -114,12 +142,8 @@ static void vgem_postclose(struct drm_device *dev, 
> > > struct drm_file *file)
> > >   kfree(vfile);
> > >  }
> > >  
> > > -/* ioctls */
> > > -
> > > -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
> > > -   struct drm_file *file,
> > > -   unsigned int *handle,
> > > -   unsigned long size)
> > > +static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device 
> > > *dev,
> > > + unsigned long size)
> > 
> > I'm going to guess that doesn't line up anymore. If checkpatch isn't
> > complaining, then sorry for the noise.
> 
> Because of the very long identifiers, perhaps a
> nicer way to write this is like:
> 
> static struct drm_vgem_gem_object *
> __vgen_gem_create(struct drm_device *dev, unsigned long size);

Yes, we frequently use that pattern for very_long_function_names.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH 7/9] drm/i915: Combine substrings for a message in gen6_drpc_info()

2017-05-04 Thread Chris Wilson
On Thu, May 04, 2017 at 10:48:10PM +0200, SF Markus Elfring wrote:
> >> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> >> @@ -1529,8 +1529,8 @@ static int gen6_drpc_info(struct seq_file *m)
> >>  
> >>forcewake_count = 
> >> READ_ONCE(dev_priv->uncore.fw_domain[FW_DOMAIN_ID_RENDER].wake_count);
> >>if (forcewake_count) {
> >> -  seq_puts(m, "RC information inaccurate because somebody "
> >> -  "holds a forcewake reference \n");
> >> +  seq_puts(m,
> >> +   "RC information inaccurate because somebody holds a 
> >> forcewake reference.\n");
> > 
> > And now you break the 80col rule. Blind adherence to checkpatch is 
> > impossible.
> 
> Have you got any other coding style preferences around the grepping
> of longer message strings from such source code?

I personally use long strings (because they are less hassle to write),
except when they are ridiculously long. But checkpatch complains either
way, so checkpatch itself is not a reason to make a change.

Certainly grepping for a complete seq_printf() is unlikely (i.e. you had
to open the debugfs file to see it, so you must already know where to
look in the code).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCHv4 3/3] drm/vgem: Enable dmabuf import interfaces

2017-05-04 Thread Chris Wilson
On Thu, May 04, 2017 at 11:45:48AM -0700, Laura Abbott wrote:
> 
> Enable the GEM dma-buf import interfaces in addition to the export
> interfaces. This lets vgem be used as a test source for other allocators
> (e.g. Ion).
> 
> Reviewed-by: Chris Wilson 
> Signed-off-by: Laura Abbott 
> ---
> v4: Use new drm_gem_prime_import_dev function
> ---
>  static const struct vm_operations_struct vgem_gem_vm_ops = {
> @@ -114,12 +142,8 @@ static void vgem_postclose(struct drm_device *dev, 
> struct drm_file *file)
>   kfree(vfile);
>  }
>  
> -/* ioctls */
> -
> -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
> -   struct drm_file *file,
> -   unsigned int *handle,
> -   unsigned long size)
> +static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
> + unsigned long size)

I'm going to guess that doesn't line up anymore. If checkpatch isn't
complaining, then sorry for the noise.

> +static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
> +   struct drm_file *file,
> +   unsigned int *handle,
> +       unsigned long size)

Ditto.

Lgtm, so r-b still good.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCHv4 1/3] drm/vgem: Add a dummy platform device

2017-05-04 Thread Chris Wilson
On Thu, May 04, 2017 at 11:45:46AM -0700, Laura Abbott wrote:
> 
> The vgem driver is currently registered independent of any actual
> device. Some usage of the dmabuf APIs require an actual device structure
> to do anything. Register a dummy platform device for use with dmabuf.
> 
> Reviewed-by: Chris Wilson 
> Signed-off-by: Laura Abbott 
> ---
> v4: Switch from the now removed platformdev to a static platform device.

I was thinking of avoiding the static, i.e.

static struct vgem_device {
struct drm_device drm;

struct device *platform;
} *vgem_device;

vgem_init():

vgem_device = kzalloc(sizeof(*vgem_device), GFP_KERNEL);

ret = drm_dev_init(&vgem_device->drm, &vgem_drv, NULL);

vgem_device->platform = platform_device_register_simple("vgem", -1, NULL, 0);

And then platform_device_unregister() should be done in a new
vgem_drv.release callback.

I'm not going to insist upon it as I can send a patch to move over to
the "modern" drm_device subclassing later.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCHv4 2/3] drm/prime: Introduce drm_gem_prime_import_dev

2017-05-04 Thread Chris Wilson
On Thu, May 04, 2017 at 11:45:47AM -0700, Laura Abbott wrote:
> 
> The existing drm_gem_prime_import function uses the underlying
> struct device of a drm_device for attaching to a dma_buf. Some drivers
> (notably vgem) may not have an underlying device structure. Offer
> an alternate function to attach using any available device structure.
> 
> Signed-off-by: Laura Abbott 
> ---
> v4: Alternate implemntation to take an arbitrary struct dev instead of just
> a platform device.
> 
> This was different enough that I dropped the previous Reviewed-by
> ---
>  drivers/gpu/drm/drm_prime.c | 30 --
>  include/drm/drm_prime.h |  5 +
>  2 files changed, 29 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 9fb65b7..5ad9a26 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -595,15 +595,18 @@ int drm_gem_prime_handle_to_fd(struct drm_device *dev,
>  EXPORT_SYMBOL(drm_gem_prime_handle_to_fd);
>  
>  /**
> - * drm_gem_prime_import - helper library implementation of the import 
> callback
> + * drm_gem_prime_import_dev - core implementation of the import callback
>   * @dev: drm_device to import into
>   * @dma_buf: dma-buf object to import
> + * @attach_dev: struct device to dma_buf attach
>   *
> - * This is the implementation of the gem_prime_import functions for GEM 
> drivers
> - * using the PRIME helpers.
> + * This is the core of drm_gem_prime_import. It's designed to be called by
> + * drivers who want to use a different device structure than dev->dev for
> + * attaching via dma_buf.
>   */
> -struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
> - struct dma_buf *dma_buf)
> +struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev,
> + struct dma_buf *dma_buf,
> + struct device *attach_dev)

My critique would be that this should be called
drm_gem_prime_import_for_device()

Either way (though naturally I like my suggestion ;),
Reviewed-by: Chris Wilson 
-Chris
>   

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH 7/9] drm/i915: Combine substrings for a message in gen6_drpc_info()

2017-05-04 Thread Chris Wilson
On Thu, May 04, 2017 at 06:59:23PM +0200, SF Markus Elfring wrote:
> From: Markus Elfring 
> Date: Thu, 4 May 2017 14:15:00 +0200
> 
> The script "checkpatch.pl" pointed information out like the following.
> 
> WARNING: quoted string split across lines
> 
> Thus fix the affected source code place.
> 
> Signed-off-by: Markus Elfring 
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index 6f3119d40c50..dbd52ea89fb4 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1529,8 +1529,8 @@ static int gen6_drpc_info(struct seq_file *m)
>  
>   forcewake_count = 
> READ_ONCE(dev_priv->uncore.fw_domain[FW_DOMAIN_ID_RENDER].wake_count);
>   if (forcewake_count) {
> - seq_puts(m, "RC information inaccurate because somebody "
> - "holds a forcewake reference \n");
> + seq_puts(m,
> +  "RC information inaccurate because somebody holds a 
> forcewake reference.\n");

And now you break the 80col rule. Blind adherence to checkpatch is
impossible.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH 3/9] drm/i915: Replace 14 seq_printf() calls by seq_puts()

2017-05-04 Thread Chris Wilson
On Thu, May 04, 2017 at 06:54:16PM +0200, SF Markus Elfring wrote:
> From: Markus Elfring 
> Date: Thu, 4 May 2017 13:20:47 +0200
> 
> Some strings which did not contain data format specifications should be put
> into a sequence. Thus use the corresponding function "seq_puts".

debugfs / seq_file is not performance critical. Familiar idiomatic code is
much preferred over continually switching between seq_printf and seq_puts.

And don't even start on converting seq_printf / seq_puts to seq_putc...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCHv3 2/3] drm/prime: Introduce drm_gem_prime_import_platform

2017-05-03 Thread Chris Wilson
On Wed, May 03, 2017 at 05:07:03PM +0200, Daniel Vetter wrote:
> On Wed, May 03, 2017 at 07:40:51AM -0700, Laura Abbott wrote:
> > On 05/03/2017 12:39 AM, Daniel Vetter wrote:
> > > On Tue, May 02, 2017 at 09:22:13PM +0100, Chris Wilson wrote:
> > >> On Tue, May 02, 2017 at 10:02:07AM -0700, Laura Abbott wrote:
> > >>>  /**
> > >>> + * drm_gem_prime_import_platform - alternate implementation of the 
> > >>> import callback
> > >>> + * @dev: drm_device to import into
> > >>> + * @dma_buf: dma-buf object to import
> > >>> + *
> > >>> + * This is identical to drm_gem_prime_import except the device used 
> > >>> for dma_buf
> > >>> + * attachment is an internal platform device instead of the standard 
> > >>> device
> > >>> + * structure. The use of this function should be limited to drivers 
> > >>> that do not
> > >>> + * set up an underlying device structure.
> > >>> + */
> > >>> +struct drm_gem_object *drm_gem_prime_import_platform(struct drm_device 
> > >>> *dev,
> > >>
> > >> Simpler solution will be for the caller to provide the platformdev?
> > >>
> > >> That works nicely for the vgem case, I think.
> > > 
> > > Yeah looking at this again, do we really need this patch? Couldn't we
> > > instead change patch 1 to first allocate the fake platform device, then
> > > pass that to drm_dev_alloc (instead of NULL like we do now)?
> > > 
> > 
> > That was what I proposed in the first version and it was rejected.
> > It's useful to have at least one driver with a NULL device for testing
> > edge cases.
> 
> Oh drat :( I'd say dropping the coverage for NULL testing is ok, there's
> no other driver than vgem using this. And now that we have vgem dma-buf
> (or will, soonish) I'd expect that the same will hold for vkms, if it ever
> happens.

This series creates vgem->platformdev which we can just pass to
drm_gem_prime_import_platform() (or equivalent drm_gem_prime function that
takes an explicit dev). It was a bit of a surprise that import_platform
didn't take the platformdev after going to the trouble of creating
vgem->platformdev.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCHv3 2/3] drm/prime: Introduce drm_gem_prime_import_platform

2017-05-02 Thread Chris Wilson
On Tue, May 02, 2017 at 10:02:07AM -0700, Laura Abbott wrote:
> The existing drm_gem_prime_import function uses the underlying
> struct device of a drm_device for attaching to a dma_buf. Some drivers
> (notably vgem) may not have an underlying device structure. Offer
> an alternate function to attach using a platform device associated
> with drm_device.
> 
> Cc: intel-...@lists.freedesktop.org
> Reviewed-by: Chris Wilson 
> Signed-off-by: Laura Abbott 
> ---
> v3: Rebase to drm-misc-next. Prototype moved to a new header file. Comments
> added for new function. Brought back drm_device->platformdev as it had been
> removed in 76adb460fd93 ("drm: Remove the struct drm_device platformdev 
> field").
> I'm not entirely thrilled about this since the platformdev removal was good
> cleanup and this feels like a small step backwards. I don't know of a better
> approach though.
> ---
>  drivers/gpu/drm/drm_prime.c | 49 
> +++--
>  include/drm/drmP.h  |  2 ++
>  include/drm/drm_prime.h |  4 
>  3 files changed, 44 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 9fb65b7..a557a4b 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -594,16 +594,9 @@ int drm_gem_prime_handle_to_fd(struct drm_device *dev,
>  }
>  EXPORT_SYMBOL(drm_gem_prime_handle_to_fd);
>  
> -/**
> - * drm_gem_prime_import - helper library implementation of the import 
> callback
> - * @dev: drm_device to import into
> - * @dma_buf: dma-buf object to import
> - *
> - * This is the implementation of the gem_prime_import functions for GEM 
> drivers
> - * using the PRIME helpers.
> - */
> -struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
> - struct dma_buf *dma_buf)
> +static struct drm_gem_object *__drm_gem_prime_import(struct drm_device *dev,
> + struct dma_buf *dma_buf,
> + struct device *attach_dev)
>  {
>   struct dma_buf_attachment *attach;
>   struct sg_table *sgt;
> @@ -625,7 +618,7 @@ struct drm_gem_object *drm_gem_prime_import(struct 
> drm_device *dev,
>   if (!dev->driver->gem_prime_import_sg_table)
>   return ERR_PTR(-EINVAL);
>  
> - attach = dma_buf_attach(dma_buf, dev->dev);
> + attach = dma_buf_attach(dma_buf, attach_dev);
>   if (IS_ERR(attach))
>   return ERR_CAST(attach);
>  
> @@ -655,9 +648,43 @@ struct drm_gem_object *drm_gem_prime_import(struct 
> drm_device *dev,
>  
>   return ERR_PTR(ret);
>  }
> +
> +/**
> + * drm_gem_prime_import - helper library implementation of the import 
> callback
> + * @dev: drm_device to import into
> + * @dma_buf: dma-buf object to import
> + *
> + * This is the implementation of the gem_prime_import functions for GEM 
> drivers
> + * using the PRIME helpers.
> + */
> +struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
> + struct dma_buf *dma_buf)
> +{
> + return __drm_gem_prime_import(dev, dma_buf, dev->dev);
> +}
>  EXPORT_SYMBOL(drm_gem_prime_import);
>  
>  /**
> + * drm_gem_prime_import_platform - alternate implementation of the import 
> callback
> + * @dev: drm_device to import into
> + * @dma_buf: dma-buf object to import
> + *
> + * This is identical to drm_gem_prime_import except the device used for 
> dma_buf
> + * attachment is an internal platform device instead of the standard device
> + * structure. The use of this function should be limited to drivers that do 
> not
> + * set up an underlying device structure.
> + */
> +struct drm_gem_object *drm_gem_prime_import_platform(struct drm_device *dev,

Simpler solution will be for the caller to provide the platformdev?

That works nicely for the vgem case, I think.
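
i.e. keep one helper that takes the attach device explicitly, sketch:

struct drm_gem_object *
drm_gem_prime_import_dev(struct drm_device *dev,
                         struct dma_buf *dma_buf,
                         struct device *attach_dev);

and then vgem's gem_prime_import callback reduces to a one-liner
passing &dev->platformdev->dev, without the drm core ever having to
know about platformdev.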

> + struct dma_buf *dma_buf)
> +{
> + if (WARN_ON_ONCE(!dev->platformdev))
> + return NULL;
> +
> + return __drm_gem_prime_import(dev, dma_buf, &dev->platformdev->dev);
> +}
> +EXPORT_SYMBOL(drm_gem_prime_import_platform);

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [RFC PATCH 5/6] drm/i915/gvt: dmabuf support for GVT-g

2017-04-28 Thread Chris Wilson
->plane_id);
> + if (c != NULL) {
> + info->start = c->base;
> + info->width = c->width;
> + info->height = c->height;
> + info->stride = c->width * (c->bpp / 8);
> + info->tiled = 0;
> + info->x_pos = c->x_pos;
> + info->y_pos = c->y_pos;
> + info->size = (((info->stride * c->height * c->bpp) / 8) 
> +
> + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
> + } else {
> + gvt_dbg_core("invalid cursor plane\n");
> + return -EINVAL;
> + }
> + } else {
> + gvt_vgpu_err("invalid plane id:%d\n", info->plane_id);
> + return -EINVAL;
> + }
> +
> + if (info->start & (PAGE_SIZE - 1)) {
> + gvt_vgpu_err("Not aligned fb address:0x%x\n", info->start);
> + return -EINVAL;
> + }
> + if (((info->start >> PAGE_SHIFT) + info->size) >
> + ggtt_total_entries(&dev_priv->ggtt)) {
> + gvt_vgpu_err("Invalid GTT offset or size\n");
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> +static struct drm_i915_gem_object *intel_vgpu_create_gem_from_vgpuid(
> + struct drm_device *dev, struct intel_vgpu *vgpu,
> + struct intel_vgpu_dmabuf *info)
> +{
> + struct drm_i915_gem_object *obj;
> + int ret;
> +
> + ret = intel_vgpu_get_plane_info(dev, vgpu, info);
> + if (ret) {
> + gvt_vgpu_err("get plane info failed:%d\n", info->plane_id);
> + return NULL;

Propagate errors.
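
i.e. something like

        ret = intel_vgpu_get_plane_info(dev, vgpu, info);
        if (ret) {
                gvt_vgpu_err("get plane info failed:%d\n", info->plane_id);
                return ERR_PTR(ret);
        }

and have the callers check IS_ERR()/PTR_ERR() rather than collapsing
every failure into NULL.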

> + }
> + obj = intel_vgpu_create_gem(dev, info);
> +
> + return obj;
> +}
> +
> +int intel_vgpu_query_dmabuf(struct intel_vgpu *vgpu, void *args)
> +{
> + struct drm_device *dev = &vgpu->gvt->dev_priv->drm;
> + int ret;
> + struct intel_vgpu_dmabuf *info = args;
> +
> + ret = intel_vgpu_get_plane_info(dev, vgpu, info);
> + if (ret) {
> + gvt_vgpu_err("get plane info failed:%d\n", info->plane_id);
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> +int intel_vgpu_generate_dmabuf(struct intel_vgpu *vgpu, void *args)
> +{
> + struct dma_buf *dmabuf;
> + struct drm_i915_gem_object *obj;
> + struct drm_device *dev = &vgpu->gvt->dev_priv->drm;
> + int ret;
> + struct intel_vgpu_dmabuf *info = args;
> + struct dma_buf_export_info exp_info = {
> + .exp_name = KBUILD_MODNAME,
> + .owner = THIS_MODULE };
> +
> + obj = intel_vgpu_create_gem_from_vgpuid(dev, vgpu, info);
> + if (obj == NULL) {
> + gvt_vgpu_err("create gvt gem obj failed:%d\n", vgpu->id);
> + return -EINVAL;
> + }
> +
> + exp_info.ops = &i915_dmabuf_ops;

Please put back my private i915_dmabuf_ops from where you stole it from.

Just call dmabuf = i915_gem_prime_export(dev, obj, DRM_CLOEXEC | DRM_RDWR);

> + exp_info.size = obj->base.size;
> + exp_info.flags = DRM_CLOEXEC | DRM_RDWR;
> + exp_info.priv = &obj->base;
> + exp_info.resv = obj->resv;
> +
> + dmabuf = drm_gem_dmabuf_export(dev, &exp_info);
> + if (IS_ERR(dmabuf)) {
> + gvt_vgpu_err("intel vgpu export dma-buf failed\n");
> + mutex_unlock(&dev->object_name_lock);
> + return -EINVAL;
> + }
> +
> + ret = dma_buf_fd(dmabuf, exp_info.flags);
> + if (ret < 0) {
> + gvt_vgpu_err("intel vgpu create dma-buf fd failed\n");
> + return ret;
> + }
> + info->fd = ret;
> +
> + return 0;
> +}
> diff --git a/drivers/gpu/drm/i915/gvt/dmabuf.h 
> b/drivers/gpu/drm/i915/gvt/dmabuf.h
> new file mode 100644
> index 000..c590f4a
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gvt/dmabuf.h
> @@ -0,0 +1,50 @@
> +/*
> + * Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 
> THE
> + * SOFTWARE.
> + *
> + */
> +
> +#ifndef _GVT_DMABUF_H_
> +#define _GVT_DMABUF_H_
> +
> +#define INTEL_VGPU_QUERY_DMABUF  0
> +#define INTEL_VGPU_GENERATE_DMABUF   1
> +
> +struct intel_vgpu_dmabuf {

This looks to be uapi. What's it doing here?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH] pstore: Solve lockdep warning by moving inode locks

2017-04-28 Thread Chris Wilson
On Thu, Apr 27, 2017 at 04:20:37PM -0700, Kees Cook wrote:
> Lockdep complains about a possible deadlock between mount and unlink
> (which is technically impossible), but fixing this improves possible
> future multiple-backend support, and keeps locking in the right order.

I have merged your for-next/pstore branch (which included this patch,
so I hope I chose correctly ;) into our CI. That should exercise it on
the machines that we originally found the lockdep splat.

Thanks,
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [RFC PATCHv2 0/3] dma_buf import support for vgem

2017-04-27 Thread Chris Wilson
On Wed, Apr 26, 2017 at 02:12:27PM -0700, Laura Abbott wrote:
> Hi,
> 
> This is v2 of my proposal to add dma_buf import functions for vgem.
> Big changes from v1:
> 
> - A device is required for dma_buf attach to work. The existing vgem driver
> intentionally does not use one as it provides a good way to test the DRM
> framework. This approach instead puts a dummy platform device in the existing
> drm_device->platformdev field and uses that for attaching.
> - Native vgem buffers can still be faulted in a page at a time without
> requiring the entire buffer be resident in memory.
> 
> I'm still marking this as RFC as I haven't had a chance to finish
> a userspace test that can be integrated into igt.

Note, that it will be good to cc:intel-gfx@ so that our CI does run it
over the existing vgem tests.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [RFC PATCHv2 3/3] drm/vgem: Enable dmabuf import interfaces

2017-04-27 Thread Chris Wilson
On Wed, Apr 26, 2017 at 02:12:30PM -0700, Laura Abbott wrote:
> Enable the GEM dma-buf import interfaces in addition to the export
> interfaces. This lets vgem be used as a test source for other allocators
> (e.g. Ion).
> 
> Signed-off-by: Laura Abbott 
> ---
> v2: Don't require vgem allocated buffers to existing in memory before 
> importing.
> ---
>  drivers/gpu/drm/vgem/vgem_drv.c | 133 
> +++-
>  drivers/gpu/drm/vgem/vgem_drv.h |   2 +
>  2 files changed, 106 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
> index 1b02e56..73a619a 100644
> --- a/drivers/gpu/drm/vgem/vgem_drv.c
> +++ b/drivers/gpu/drm/vgem/vgem_drv.c
> @@ -34,6 +34,9 @@
>  #include 
>  #include 
>  #include 
> +
> +#include 
> +
>  #include "vgem_drv.h"
>  
>  #define DRIVER_NAME  "vgem"
> @@ -46,6 +49,12 @@ static void vgem_gem_free_object(struct drm_gem_object 
> *obj)
>  {
>   struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
>  
> + if (vgem_obj->pages)
> + drm_free_large(vgem_obj->pages);

Just drm_free_large(vgem_obj->pages); the NULL check is unnecessary, as
drm_free_large(NULL) falls through to kfree(), which is NULL safe.

The series lgtm, doesn't compromise on any of the existing tests.

All 3,
Reviewed-by: Chris Wilson 

You could always get Joonas to nitpick over style...

> +
> + if (obj->import_attach)
> + drm_prime_gem_destroy(obj, vgem_obj->table);
> +
>   drm_gem_object_release(obj);
>   kfree(vgem_obj);
>  }
> @@ -56,26 +65,48 @@ static int vgem_gem_fault(struct vm_fault *vmf)
>   struct drm_vgem_gem_object *obj = vma->vm_private_data;
>   /* We don't use vmf->pgoff since that has the fake offset */
>   unsigned long vaddr = vmf->address;
> - struct page *page;
> -
> - page = shmem_read_mapping_page(file_inode(obj->base.filp)->i_mapping,
> -(vaddr - vma->vm_start) >> PAGE_SHIFT);
> - if (!IS_ERR(page)) {
> - vmf->page = page;
> - return 0;
> - } else switch (PTR_ERR(page)) {
> - case -ENOSPC:
> - case -ENOMEM:
> - return VM_FAULT_OOM;
> - case -EBUSY:
> - return VM_FAULT_RETRY;
> - case -EFAULT:
> - case -EINVAL:
> - return VM_FAULT_SIGBUS;
> - default:
> - WARN_ON_ONCE(PTR_ERR(page));
> - return VM_FAULT_SIGBUS;
> + int ret;
> + loff_t num_pages;
> + pgoff_t page_offset;
> + page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
> +
> + num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
> +
> + if (page_offset > num_pages)
> + return VM_FAULT_SIGBUS;
> +
> + if (obj->pages) {
> + get_page(obj->pages[page_offset]);
> + vmf->page = obj->pages[page_offset];
> + ret = 0;
> + } else {
> + struct page *page;
> +
> + page = shmem_read_mapping_page(
> + file_inode(obj->base.filp)->i_mapping,
> + page_offset);
> + if (!IS_ERR(page)) {
> + vmf->page = page;
> + ret = 0;
> + } else switch (PTR_ERR(page)) {
> + case -ENOSPC:
> + case -ENOMEM:
> + ret = VM_FAULT_OOM;
> + break;
> + case -EBUSY:
> +     ret = VM_FAULT_RETRY;
> + break;
> + case -EFAULT:
> + case -EINVAL:
> + ret = VM_FAULT_SIGBUS;
> + break;
> + default:
> + WARN_ON(PTR_ERR(page));
> + ret = VM_FAULT_SIGBUS;

break;
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] GPU hangs and X shot down with 4.11-rc6

2017-04-25 Thread Chris Wilson
On Tue, Apr 25, 2017 at 06:41:20PM +0200, Michal Hocko wrote:
> Hi,
> I have just experienced X being shut down once with 4.11-rc2 and 2 times
> with 4.11-rc6 kernel.  I do not remember seeing something like this
> before but it is quite possible I was just lucky to not trigger this
> issue before. It always happened while I was working on a presentation
> in LibreOffice which I do very seldom. The kernel log contains:
> 
> [ 7456.721893] [drm] GPU HANG: ecode 9:0:0x86dd, in Xorg [3594], reason: 
> Hang on render ring, action: reset
> [ 7456.721897] [drm] GPU hangs can indicate a bug anywhere in the entire gfx 
> stack, including userspace.
> [ 7456.721898] [drm] Please file a _new_ bug report on bugs.freedesktop.org 
> against DRI -> DRM/Intel
> [ 7456.721900] [drm] drm/i915 developers can then reassign to the right 
> component if it's not a kernel issue.
> [ 7456.721901] [drm] The gpu crash dump is required to analyze gpu hangs, so 
> please always attach it.
> [ 7456.721902] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> [ 7456.721925] drm/i915: Resetting chip after gpu hang
> [ 7456.722117] [drm] RC6 on
> [ 7456.734588] [drm] GuC firmware load skipped
> [ 7464.686209] drm/i915: Resetting chip after gpu hang
> [ 7464.686284] [drm] RC6 on
> [ 7464.702469] [drm] GuC firmware load skipped
> [ 7472.686180] drm/i915: Resetting chip after gpu hang
> [ 7472.686241] [drm] RC6 on
> [ 7472.704565] [drm] GuC firmware load skipped
> [ 7480.686179] drm/i915: Resetting chip after gpu hang
> [ 7480.686241] [drm] RC6 on
> [ 7480.704583] [drm] GuC firmware load skipped
> [ 7493.678130] drm/i915: Resetting chip after gpu hang
> [ 7493.678206] [drm] RC6 on
> [ 7493.696505] [drm] GuC firmware load skipped
> 
> The kernel message tells that the problem might be anywhere and I should
> report to freedesktop but I haven't changed the userspace recently so it
> smells more like a kernel bug to me. Does this ring bells? The GPU crash
> dump is attached in case it is useful.

There are lots of very similar GPU hangs for mesa across a wide range of
kernels, with several reports noting a correlation with libreoffice.

At first glance, I would say you were just unlucky to hit it.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [lkp-robot] [locking/ww] 57dd924e54: BUG:soft_lockup-CPU##stuck_for#s

2017-04-14 Thread Chris Wilson
On Fri, Apr 14, 2017 at 10:59:27AM +0800, kernel test robot wrote:
> 
> FYI, we noticed the following commit:
> 
> commit: 57dd924e541a98219bf3a508623db2e0c07e75a7 ("locking/ww-mutex: Limit 
> stress test to 2 seconds")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> 
> in testcase: boot
> 
> on test machine: qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 4G
> 
> caused below changes (please refer to attached dmesg/kmsg for entire 
> log/backtrace):
> 
> 
> +--+++
> |  | 44fe84459f | 57dd924e54 |
> +--+++
> | boot_successes   | 0  | 0  |
> | boot_failures| 8  | 8  |
> | INFO:task_blocked_for_more_than#seconds  | 8  ||
> | calltrace:debug_show_all_locks   | 8  ||
> | calltrace:ww_mutex_lock_slow | 7  ||
> | Kernel_panic-not_syncing:hung_task:blocked_tasks | 8  ||
> | calltrace:stress_inorder_work| 1  ||
> | IP-Config:Auto-configuration_of_network_failed   | 0  | 4  |
> | BUG:soft_lockup-CPU##stuck_for#s | 0  | 4  |
> | Kernel_panic-not_syncing:softlockup:hung_tasks   | 0  | 4  |
> +--+++
> 
> 
> 
> [  273.632796] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! 
> [swapper/0:1]
> [  273.675540] irq event stamp: 4865008
> [  273.685130] hardirqs last  enabled at (4865007): [] 
> restore_regs_and_iret+0x0/0x1d
> [  273.698804] hardirqs last disabled at (4865008): [] 
> apic_timer_interrupt+0x98/0xb0
> [  273.712354] softirqs last  enabled at (4865006): [] 
> __do_softirq+0x1cf/0x22a
> [  273.725144] softirqs last disabled at (4864999): [] 
> irq_exit+0x56/0xa6
> [  273.737170] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 
> 4.11.0-rc2-00029-g57dd924 #1
> [  273.748511] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> 1.9.3-20161025_171302-gandalf 04/01/2014
> [  273.763553] task: 89a4bae9c040 task.stack: 89a4baea
> [  273.772653] RIP: 0010:erase_augmented+0x79/0x23b
> [  273.781081] RSP: :89a4baea3e20 EFLAGS: 0296 ORIG_RAX: 
> ff10
> [  273.792428] RAX:  RBX: 91ce7a38 RCX: 
> 89a4bae9c040
> [  273.802992] RDX: 91ce7b50 RSI:  RDI: 
> 91ce7328
> [  273.813680] RBP: 89a4baea3e50 R08: 91ce7560 R09: 
> 
> [  273.824245] R10:  R11: 89a4bae9c768 R12: 
> 
> [  273.834880] R13: 91ce7a38 R14: 91ce7a39 R15: 
> 91ce8200
> [  273.845602] FS:  () GS:89a4bbc0() 
> knlGS:
> [  273.857595] CS:  0010 DS:  ES:  CR0: 80050033
> [  273.884265] CR2:  CR3: 0d419000 CR4: 
> 06a0
> [  273.894885] Call Trace:
> [  273.898736]  rbtree_test_init+0x1c3/0x27f
> [  273.904839]  ? irq_poll_setup+0x96/0x96
> [  273.910698]  ? set_debug_rodata+0x20/0x20
> [  273.916791]  do_one_initcall+0xa2/0x17c
> [  273.922687]  ? set_debug_rodata+0x20/0x20
> [  273.928744]  kernel_init_freeable+0x202/0x2a3
> [  273.935363]  ? rest_init+0x16d/0x16d
> [  273.940809]  kernel_init+0xf/0x142
> [  273.946056]  ? rest_init+0x16d/0x16d
> [  273.951528]  ret_from_fork+0x31/0x40
> [  273.956974] Code: 45 10 75 0b e8 e5 db e1 ff 4d 89 65 10 eb 17 e8 da db e1 
> ff 4d 89 65 08 eb 0c e8 cf db e1 ff 4c 89 25 e7 ca 3b 04 e8 c3 db e1 ff <4d> 
> 85 e4 74 0e e8 b9 db e1 ff 4d 89 34 24 e9 4e 01 00 00 e8 ab 

It looks like (summary + trace) that we have moved the timeout from the
ww_mutex stress test to the rbtree tests.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH 3/3] drm/vc4: Add support for dma-buf fencing.

2017-04-11 Thread Chris Wilson
On Mon, Apr 10, 2017 at 06:44:14PM -0700, Eric Anholt wrote:
> This is needed for proper synchronization with display on another DRM
> device (pl111 or tinydrm) with buffers produced by vc4 V3D.  Fixes the
> new igt vc4_dmabuf_poll testcase, and rendering of one of the glmark2
> desktop tests on pl111+vc4.
> 
> This doesn't yet introduce waits on other device's fences before vc4's
> rendering/display, because I don't have testcases for them.
> 
> Signed-off-by: Eric Anholt 
> Cc: Gustavo Padovan 
> ---
> +static void vc4_fence_release(struct dma_fence *fence)
> +{
> + struct vc4_fence *f = to_vc4_fence(fence);
> +
> + kfree_rcu(f, base.rcu);
> +}

Unless you have a plan to do more here, looks like you can just use
the default dma_fence_free as the release callback.
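
i.e. drop vc4_fence_release() and just set

        .release = dma_fence_free,

in your dma_fence_ops; dma_fence_free() is the kfree_rcu(fence, rcu)
helper, so the behaviour is identical.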
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH 2/5] i915: flush gem obj freeing workqueues to add accuracy to the i915 shrinker

2017-04-10 Thread Chris Wilson
On Fri, Apr 07, 2017 at 06:48:58PM +0200, Andrea Arcangeli wrote:
> On Fri, Apr 07, 2017 at 04:30:11PM +0100, Chris Wilson wrote:
> > Not getting hangs is a good sign, but lockdep doesn't like it:
> > 
> > [  460.684901] WARNING: CPU: 1 PID: 172 at kernel/workqueue.c:2418 
> > check_flush_dependency+0x92/0x130
> > [  460.684924] workqueue: PF_MEMALLOC task 172(kworker/1:1H) is flushing 
> > !WQ_MEM_RECLAIM events:__i915_gem_free_work [i915]
> > 
> > If I allocated the workqueue with WQ_MEM_RECLAIM, it complains bitterly
> > as well.
> 
> So in PF_MEMALLOC context we can't flush a workqueue with
> !WQ_MEM_RECLAIM.
> 
>   system_wq = alloc_workqueue("events", 0, 0);
> 
> My point is synchronize_rcu_expedited will still push its work in
> the same system_wq workqueue...
> 
>   /* Marshall arguments & schedule the expedited grace period. */
>   rew.rew_func = func;
>   rew.rew_rsp = rsp;
>   rew.rew_s = s;
>   INIT_WORK_ONSTACK(&rew.rew_work, wait_rcu_exp_gp);
>   schedule_work(&rew.rew_work);
> 
> It's also using schedule_work, so either the above is a false
> positive, or we've still a problem with synchronize_rcu_expedited,
> just a reproducible issue anymore after we stop running it under the
> struct_mutex.

We still do have a problem with using synchronize_rcu_expedited() from
the shrinker as we maybe under someone else's mutex is that involved in
its own RCU dance.

> Even synchronize_sched will wait on the system_wq if
> synchronize_rcu_expedited has been issued in parallel by some other
> code, it's just there's no check for it because it's not invoking
> flush_work to wait.

Right.
 
> The deadlock happens if we flush_work() while holding any lock that
> may be taken by any of the workqueues that could be queued there. i915
> makes sure to flush_work only if the struct_mutex was released (not
> my initial version) but we're not checking if any of the other
> system_wq workqueues could possibly taking a lock that may have been
> hold during the allocation that invoked reclaim, I suppose that is the
> problem left, but I don't see how flush_work is special about it,
> synchronize_rcu_expedited would still have the same issue no? (despite
> no lockdep warning)
> 
> I suspect this means synchronize_rcu_expedited() is not usable in
> reclaim context and lockdep should warn if PF_MEMALLOC is set when
> synchronize_rcu_expedited/synchronize_sched/synchronize_rcu are
> called.

Yes.

> Probably to fix this we should create a private workqueue for both RCU
> and i915 and stop sharing the system_wq with the rest of the system
> (and of course set WQ_MEM_RECLAIM in both workqueues). This makes sure
> when we call synchronize_rcu_expedited; flush_work from the shrinker,
> we won't risk waiting on other random work that may be taking locks
> that are hold by the code that invoked reclaim during an allocation.

We simply do not need to do our own synchronize_rcu* -- it's only used
to flush our slab frees on the off chance that (a) we have any and (b)
we do manage to free a whole slab. It is not the bulk of the memory that
we return to the system from the shrinker.

In the other thread, I stated that we should simply remove it. The
kswapd reclaim path should try to reclaim RCU slabs (by doing a
synchronize_sched or equivalent).

> The macro bug of waiting on system_wq 100% of the time while always
> holding the struct_mutex is gone, but we need to perfect this further
> and stop using the system_wq for RCU and i915 shrinker work.

Agreed. My preference is to simply not do it and leave the dangling RCU
to the core reclaim paths.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: Linux 4.11: Reported regressions as of Sunday, 2017-04-09

2017-04-09 Thread Chris Wilson
On Sun, Apr 09, 2017 at 07:15:49PM +0200, Thorsten Leemhuis wrote:
> Desc: i915 gpu hangs under load
> Repo: 2017-03-22 
> https://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg116227.html 
> https://bugs.freedesktop.org/show_bug.cgi?id=100181 
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1372182.html
> Stat: 2017-04-07 
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1372182.html
> Note: A few patches from Andrea (who's seeing problems as well) are being 
> discussed

Note, Andrea saw a completely different problem unrelated to the GPU
hang, dating back to ~4.9. Both reported issues have been fixed, with
the earlier report already in mainline but Andrea's fix is not yet in a
PR.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [RFC PATCH 2/2] drm/vgem: Enable dmabuf import interfaces

2017-04-07 Thread Chris Wilson
On Fri, Apr 07, 2017 at 09:18:30AM -0700, Laura Abbott wrote:
> On 04/07/2017 12:39 AM, Chris Wilson wrote:
> > On Thu, Apr 06, 2017 at 04:18:33PM -0700, Laura Abbott wrote:
> >>
> >> Enable the GEM dma-buf import interfaces in addition to the export
> >> interfaces. This lets vgem be used as a test source for other allocators
> >> (e.g. Ion).
> >>
> >> +int vgem_gem_get_pages(struct drm_vgem_gem_object *obj)
> >> +{
> >> +  struct page **pages;
> >> +
> >> +  if (obj->pages)
> >> +  return 0;
> >> +
> >> +  pages = drm_gem_get_pages(&obj->base);
> >> +  if (IS_ERR(pages)) {
> >> +  return PTR_ERR(pages);
> >> +  }
> >> +
> >> +  obj->pages = pages;
> > 
> > That's a significant loss in functionality (it requires the entire
> > object to fit into physical memory at the same time and requires a large
> > vmalloc for 32b systems), for what? vgem only has the ability to mmap
> > and export a fd -- both of which you already have if you have the dmabuf
> > fd. The only extra interface is the signaling, which does not yet have a
> > direct correspondence on dmabuf.
> > 
> > (An obvious way to keep both would be to move the get_pages to importing
> > and then differentiate in the faulthandler where to get the page from.)
> > 
> 
> Thanks for pointing this out. I'll look to keep the existing behavior.
> 
> > Can you provide more details on how you are using vgem to justify the
> > changes? I'm also not particularly happy in losing testing of a virtual
> > platform device from our CI.
> > -Chris
> > 
> 
> There exists a test module in the Ion directory
> (drivers/staging/android/ion/ion_test.c) today but it's not actually
> Ion specific. It registers a platform device and then provides an
> ioctl interface to perform a dma_buf attach and map. I proposed
> moving this to a generic dma-buf test module
> (https://marc.info/?l=dri-devel&m=148952187004611&w=2) and Daniel
> suggested that this was roughly what vgem was supposed to do.

mmap(dma_buf_fd) gives you DMA_BUF_IOC_TEST_DMA_MAPPING, and that's the
only facility already available via vgem.

DMA_BUF_IOC_TEST_KERNEL_MAPPING would be equivalent to
read/write(dma_buf_fd).

> Adding the import methods for vgem covers all of what the
> Ion test was doing and we don't have to invent yet another ioctl
> interface and framework for attaching and mapping. 

I don't think it does :)

To muddy the waters further, I've also done something similar:
https://cgit.freedesktop.org/drm-intel/tree/drivers/gpu/drm/i915/selftests/mock_dmabuf.c
https://cgit.freedesktop.org/drm-intel/tree/drivers/gpu/drm/i915/selftests/i915_gem_dmabuf.c

But this feels like a good enough reason for implementing read/write ops
for the dma-buf fd, and then we can test the dma_buf->ops->kmap on i915
as well, directly from i915.ko. Sounds fun, I'll see if I can cook
something up - and we can then see if that suits your testing as well.
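
Very roughly, and glossing over begin/end_cpu_access and the error
paths, I imagine the read side being something like (one page per call
through the existing kmap interface):

static ssize_t dma_buf_read(struct file *file, char __user *buf,
                            size_t count, loff_t *ppos)
{
        struct dma_buf *dmabuf = file->private_data;
        unsigned long pgoff = *ppos >> PAGE_SHIFT;
        ssize_t ret;
        void *vaddr;

        if (*ppos >= dmabuf->size)
                return 0;

        /* keep it simple: never cross a page boundary per call */
        ret = min_t(size_t, count, PAGE_SIZE - offset_in_page(*ppos));

        vaddr = dma_buf_kmap(dmabuf, pgoff);
        if (!vaddr)
                return -ENOMEM;

        if (copy_to_user(buf, vaddr + offset_in_page(*ppos), ret))
                ret = -EFAULT;
        else
                *ppos += ret;

        dma_buf_kunmap(dmabuf, pgoff, vaddr);

        return ret;
}

with the write side being the mirror image.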
 
> Can you clarify about what you mean about 'virtual platform device'?

vgem_device = drm_dev_alloc(&vgem_driver, NULL);

helps exercise some more corner cases of the drm core, that we have
unwittingly broken in the past.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH 2/5] i915: flush gem obj freeing workqueues to add accuracy to the i915 shrinker

2017-04-07 Thread Chris Wilson
On Fri, Apr 07, 2017 at 03:06:00PM +0200, Andrea Arcangeli wrote:
> On Fri, Apr 07, 2017 at 11:02:11AM +0100, Chris Wilson wrote:
> > On Fri, Apr 07, 2017 at 01:23:44AM +0200, Andrea Arcangeli wrote:
> > > Waiting a RCU grace period only guarantees the work gets queued, but
> > > until after the queued workqueue returns, there's no guarantee the
> > > memory was actually freed. So flush the work to provide better
> > > guarantees to the reclaim code in addition of waiting a RCU grace
> > > period to pass.
> > 
> > We are not allowed to call flush_work() from the shrinker, the workqueue
> > doesn't have and can't have the right reclaim flags.
> 
> I figured the flush_work had to be conditional to "unlock" being true
> too in the i915 shrinker (not only synchronize_rcu_expedited()), and I
> already fixed that bit, but I didn't think it would be a problem to
> wait for the workqueue as long as reclaim didn't recurse on the
> struct_mutex (it is a problem if unlock is false of course as we would
> be back to square one). I didn't get further hangs and I assume I've
> been running a couple of synchronize_rcu_expedited() and flush_work (I
> should add dynamic tracing to be sure).

Not getting hangs is a good sign, but lockdep doesn't like it:

[  460.684901] WARNING: CPU: 1 PID: 172 at kernel/workqueue.c:2418 
check_flush_dependency+0x92/0x130
[  460.684924] workqueue: PF_MEMALLOC task 172(kworker/1:1H) is flushing 
!WQ_MEM_RECLAIM events:__i915_gem_free_work [i915]

If I allocated the workqueue with WQ_MEM_RECLAIM, it complains bitterly
as well.

> Also note, I didn't get any lockdep warning when I reproduced the
> workqueue hang in 4.11-rc5 so at least as far as lockdep is concerned
> there's no problem to call synchronize_rcu_expedited and it couldn't
> notice we were holding the struct_mutex while waiting for the new
> workqueue to run.

Yes, that is concerning. I think it's due to the coupling via
the struct completion that is not being picked up by lockdep, and I hope
the "crossrelease" patches would fix the lack of warnings.

> Also note recursing on the lock (unlock false case) is something
> nothing else does, I'm not sure if it's worth the risk and if you
> shouldn't just call mutex_trylock in the shrinker instead of
> mutex_trylock_recursive. One thing was to recurse on the lock
> internally in the same context, but recursing through the whole
> reclaim is more dubious as safe.

We know. We don't use trylock in order to reduce the frequency of users'
oom. Peter added mutex_trylock_recursive() because we already were doing
recursive locking in the shrinker and although we know we shouldn't,
getting rid of the recursion is something we are doing, but slowly.

> You could start dropping objects and wiping vmas and stuff in the
> middle of some kmalloc/alloc_pages that doesn't expect it and then
> crash for other reasons. So this reclaim recursion model of the
> shinker is quite unique and quite challenging to proof as safe if you
> keep using mutex_trylock_recursive in i915_gem_shrinker_scan.

I know. Trying to stay on top of all the kmallocs under the struct_mutex
and being aware that the shrinker can and will undo your objects as you
work is a continual battle. And catches everyone working on i915.ko by
surprise. Our policy to avoid surprises is based around pin before alloc.
 
> Lock recursion in all other places could be dropped without runtime
> downsides, the only place mutex_trylock_recursive makes a design
> difference and makes sense to be used is in i915_gem_shrinker_scan,
> the rest are implementation issues not fundamental shrinker design and
> it'd be nice if those other mutex_trylock_recursive would all be
> removed and the only one that is left is in i915_gem_shrinker_scan and
> nowhere else (or to drop it also from i915_gem_shrinker_scan).

We do need it for shrinker_count as well: if we just report 0 objects,
the shrinker_scan callback will be skipped, iirc. And we need it for
direct calls to i915_gem_shrink(), which themselves may or may not be
underneath the struct_mutex at the time.

> mutex_trylock_recursive() should also be patched to use
> READ_ONCE(__mutex_owner(lock)) because currently it breaks C.

I don't follow,

static inline struct task_struct *__mutex_owner(struct mutex *lock)
{
return (struct task_struct *)(atomic_long_read(&lock->owner) & ~0x07);
}

The atomic read is equivalent to READ_ONCE(). What's broken here? (I
guess strict aliasing and pointer cast?)
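
For reference, mutex_trylock_recursive() itself is currently just
(quoting from memory):

enum mutex_trylock_recursive_enum {
        MUTEX_TRYLOCK_FAILED    = 0,
        MUTEX_TRYLOCK_SUCCESS   = 1,
        MUTEX_TRYLOCK_RECURSIVE,
};

static inline enum mutex_trylock_recursive_enum
mutex_trylock_recursive(struct mutex *lock)
{
        if (unlikely(__mutex_owner(lock) == current))
                return MUTEX_TRYLOCK_RECURSIVE;

        return mutex_trylock(lock);
}

so the owner load already goes through the atomic_long_read() above.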

> In the whole kernel i915 and msm drm are the only two users of such
> function in fact.

Yes, Peter will continue to remind us to fix our code and complain until
it is.

> Another thin

Re: [PATCH 3/5] i915: initialize the free_list of the fencing atomic_helper

2017-04-07 Thread Chris Wilson
On Fri, Apr 07, 2017 at 01:23:45AM +0200, Andrea Arcangeli wrote:
> Just in case the llist model changes and NULL isn't valid
> initialization.
> 
> Signed-off-by: Andrea Arcangeli 

Applied, thanks.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH 2/5] i915: flush gem obj freeing workqueues to add accuracy to the i915 shrinker

2017-04-07 Thread Chris Wilson
On Fri, Apr 07, 2017 at 01:23:44AM +0200, Andrea Arcangeli wrote:
> Waiting a RCU grace period only guarantees the work gets queued, but
> until after the queued workqueue returns, there's no guarantee the
> memory was actually freed. So flush the work to provide better
> guarantees to the reclaim code in addition of waiting a RCU grace
> period to pass.

We are not allowed to call flush_work() from the shrinker, the workqueue
doesn't have and can't have the right reclaim flags.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH 5/5] i915: fence workqueue optimization

2017-04-07 Thread Chris Wilson
On Fri, Apr 07, 2017 at 01:23:47AM +0200, Andrea Arcangeli wrote:
> Insist to run llist_del_all() until the free_list is found empty, this
> may avoid having to schedule more workqueues.

The work will already be scheduled (every time we add the first element,
the work is scheduled, and the scheduled bit is cleared before the work
is executed). So we aren't saving the kworker from having to process
another work, but we may leave that work with nothing to do. The
question is whether we want to trap the kworker here, and presumably
you will also want to add a cond_resched() between passes.
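
i.e. the work function becomes something like (sketch):

static void __i915_gem_free_work(struct work_struct *work)
{
        struct drm_i915_private *i915 =
                container_of(work, struct drm_i915_private, mm.free_work);
        struct llist_node *freed;

        /* Keep draining until the list stays empty, yielding between
         * passes so that we don't monopolise the kworker.
         */
        while ((freed = llist_del_all(&i915->mm.free_list))) {
                __i915_gem_free_objects(i915, freed);
                cond_resched();
        }
}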
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [RFC PATCH 2/2] drm/vgem: Enable dmabuf import interfaces

2017-04-07 Thread Chris Wilson
On Thu, Apr 06, 2017 at 04:18:33PM -0700, Laura Abbott wrote:
> 
> Enable the GEM dma-buf import interfaces in addition to the export
> interfaces. This lets vgem be used as a test source for other allocators
> (e.g. Ion).
> 
> +int vgem_gem_get_pages(struct drm_vgem_gem_object *obj)
> +{
> + struct page **pages;
> +
> + if (obj->pages)
> + return 0;
> +
> + pages = drm_gem_get_pages(&obj->base);
> + if (IS_ERR(pages)) {
> + return PTR_ERR(pages);
> + }
> +
> + obj->pages = pages;

That's a significant loss in functionality (it requires the entire
object to fit into physical memory at the same time and requires a large
vmalloc for 32b systems), for what? vgem only has the ability to mmap
and export a fd -- both of which you already have if you have the dmabuf
fd. The only extra interface is the signaling, which does not yet have a
direct correspondence on dmabuf.

(An obvious way to keep both would be to move the get_pages to importing
and then differentiate in the faulthandler where to get the page from.)
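
i.e. the faulthandler stays page-at-a-time for native objects and only
consults the pinned array for imports, roughly:

static int vgem_gem_fault(struct vm_fault *vmf)
{
        struct vm_area_struct *vma = vmf->vma;
        struct drm_vgem_gem_object *obj = vma->vm_private_data;
        pgoff_t page_offset = (vmf->address - vma->vm_start) >> PAGE_SHIFT;
        struct page *page;

        if (obj->pages) {
                /* imported: pages were pinned at import time */
                page = obj->pages[page_offset];
                get_page(page);
        } else {
                /* native: demand-fault from shmemfs, as today */
                page = shmem_read_mapping_page(
                                file_inode(obj->base.filp)->i_mapping,
                                page_offset);
                if (IS_ERR(page))
                        return VM_FAULT_SIGBUS;
        }

        vmf->page = page;
        return 0;
}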

Can you provide more details on how you are using vgem to justify the
changes? I'm also not particularly happy in losing testing of a virtual
platform device from our CI.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[tip:locking/core] locking/ww-mutex: Limit stress test to 2 seconds

2017-03-30 Thread tip-bot for Chris Wilson
Commit-ID:  57dd924e541a98219bf3a508623db2e0c07e75a7
Gitweb: http://git.kernel.org/tip/57dd924e541a98219bf3a508623db2e0c07e75a7
Author: Chris Wilson 
AuthorDate: Fri, 10 Mar 2017 10:57:33 +
Committer:  Ingo Molnar 
CommitDate: Thu, 30 Mar 2017 09:49:47 +0200

locking/ww-mutex: Limit stress test to 2 seconds

Use a timeout rather than a fixed number of loops to avoid running for
very long periods, such as under the kbuilder VMs.

Reported-by: kernel test robot 
Signed-off-by: Chris Wilson 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Boqun Feng 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20170310105733.6444-1-ch...@chris-wilson.co.uk
Signed-off-by: Ingo Molnar 
---
 kernel/locking/test-ww_mutex.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c
index 90d8d88..39f56c8 100644
--- a/kernel/locking/test-ww_mutex.c
+++ b/kernel/locking/test-ww_mutex.c
@@ -353,8 +353,8 @@ static int test_cycle(unsigned int ncpus)
 struct stress {
struct work_struct work;
struct ww_mutex *locks;
+   unsigned long timeout;
int nlocks;
-   int nloops;
 };
 
 static int *get_random_order(int count)
@@ -434,7 +434,7 @@ retry:
}
 
ww_acquire_fini(&ctx);
-   } while (--stress->nloops);
+   } while (!time_after(jiffies, stress->timeout));
 
kfree(order);
kfree(stress);
@@ -496,7 +496,7 @@ static void stress_reorder_work(struct work_struct *work)
ww_mutex_unlock(ll->lock);
 
ww_acquire_fini(&ctx);
-   } while (--stress->nloops);
+   } while (!time_after(jiffies, stress->timeout));
 
 out:
list_for_each_entry_safe(ll, ln, &locks, link)
@@ -522,7 +522,7 @@ static void stress_one_work(struct work_struct *work)
__func__, err);
break;
}
-   } while (--stress->nloops);
+   } while (!time_after(jiffies, stress->timeout));
 
kfree(stress);
 }
@@ -532,7 +532,7 @@ static void stress_one_work(struct work_struct *work)
 #define STRESS_ONE BIT(2)
 #define STRESS_ALL (STRESS_INORDER | STRESS_REORDER | STRESS_ONE)
 
-static int stress(int nlocks, int nthreads, int nloops, unsigned int flags)
+static int stress(int nlocks, int nthreads, unsigned int flags)
 {
struct ww_mutex *locks;
int n;
@@ -574,7 +574,7 @@ static int stress(int nlocks, int nthreads, int nloops, unsigned int flags)
INIT_WORK(&stress->work, fn);
stress->locks = locks;
stress->nlocks = nlocks;
-   stress->nloops = nloops;
+   stress->timeout = jiffies + 2*HZ;
 
queue_work(wq, &stress->work);
nthreads--;
@@ -618,15 +618,15 @@ static int __init test_ww_mutex_init(void)
if (ret)
return ret;
 
-   ret = stress(16, 2*ncpus, 1<<10, STRESS_INORDER);
+   ret = stress(16, 2*ncpus, STRESS_INORDER);
if (ret)
return ret;
 
-   ret = stress(16, 2*ncpus, 1<<10, STRESS_REORDER);
+   ret = stress(16, 2*ncpus, STRESS_REORDER);
if (ret)
return ret;
 
-   ret = stress(4095, hweight32(STRESS_ALL)*ncpus, 1<<12, STRESS_ALL);
+   ret = stress(4095, hweight32(STRESS_ALL)*ncpus, STRESS_ALL);
if (ret)
return ret;
 


Re: Regression in i915 for 4.11-rc1 - bisected to commit 69df05e11ab8

2017-03-23 Thread Chris Wilson
On Thu, Mar 23, 2017 at 01:19:43PM -0500, Larry Finger wrote:
> Since kernel 4.11-rc1, my desktop (Plasma5/KDE) has encountered
> intermittent hangs with the following information in the logs:
> 
> linux-4v1g.suse kernel: [drm] GPU HANG: ecode 7:0:0xf3ce, in
> plasmashell [1283], reason: Hang on render ring, action: reset
> linux-4v1g.suse kernel: [drm] GPU hangs can indicate a bug anywhere
> in the entire gfx stack, including userspace.
> linux-4v1g.suse kernel: [drm] Please file a _new_ bug report on
> bugs.freedesktop.org against DRI -> DRM/Intel
> linux-4v1g.suse kernel: [drm] drm/i915 developers can then reassign
> to the right component if it's not a kernel issue.
> linux-4v1g.suse kernel: [drm] The gpu crash dump is required to
> analyze gpu hangs, so please always attach it.
> linux-4v1g.suse kernel: [drm] GPU crash dump saved to 
> /sys/class/drm/card0/error
> linux-4v1g.suse kernel: drm/i915: Resetting chip after gpu hang
> 
> This problem was added to
> https://bugs.freedesktop.org/show_bug.cgi?id=99380, but it probably
> is a different bug, as the OP in that report has problems with
> kernel 4.10.x, whereas my problem did not appear until 4.11.

Close. Actually, that patch touches code you are not using (oa-perf and
gvt); the real culprit was e8a9c58fcd9a ("drm/i915: Unify active context
tracking between legacy/execlists/guc").

The fix

commit 5d4bac5503fcc67dd7999571e243cee49371aef7
Author: Chris Wilson 
Date:   Wed Mar 22 20:59:30 2017 +

drm/i915: Restore marking context objects as dirty on pinning

Commit e8a9c58fcd9a ("drm/i915: Unify active context tracking between
legacy/execlists/guc") converted the legacy intel_ringbuffer submission
to the same context pinning mechanism as execlists - that is to pin the
context until the subsequent request is retired. Previously it used the
vma retirement of the context object to keep itself pinned until the
next request (after i915_vma_move_to_active()). In the conversion, I
missed that the vma retirement was also responsible for marking the
object as dirty. Mark the context object as dirty when pinning
(equivalent to execlists) which ensures that if the context is swapped
out due to mempressure or suspend/hibernation, when it is loaded back in
it does so with the previous state (and not all zero).

Fixes: e8a9c58fcd9a ("drm/i915: Unify active context tracking between 
legacy/execlists/guc")
Reported-by: Dennis Gilmore 
Reported-by: Mathieu Marquer 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99993
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100181
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Cc:  # v4.11-rc1
Link: 
http://patchwork.freedesktop.org/patch/msgid/20170322205930.12762-1-ch...@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin 

went in this morning and so will be upstreamed ~next week.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH] drm/i915/kvmgt: avoid dereferencing a potentially null info pointer

2017-03-23 Thread Chris Wilson
On Thu, Mar 23, 2017 at 12:22:30PM +, Colin King wrote:
> From: Colin Ian King 
> 
> info is being checked to see if it is a null pointer, however, vgpu is
> dereferencing info before this check, leading to a potential null
> pointer dereference.  If info is null, then the error message is
> printed by the macro gvt_vgpu_err, and this requires vgpu to exist. We
> can use a null vgpu as the macro has a sanity check to see if vgpu is
> null, so this is OK.

It is never NULL, it gets checked by its only caller.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH] fs/pstore: Perform erase from a worker

2017-03-21 Thread Chris Wilson
On Tue, Mar 21, 2017 at 02:58:48PM +0900, Namhyung Kim wrote:
> Hello,
> 
> On Mon, Mar 20, 2017 at 10:49:16AM -0700, Kees Cook wrote:
> > On Fri, Mar 17, 2017 at 2:52 AM, Chris Wilson  
> > wrote:
> > > In order to prevent a cyclic recursion between psi->read_mutex and the
> > > inode_lock, we need to move the pse->erase to a worker.
> > >
> > > [  605.374955] ==
> > > [  605.381281] [ INFO: possible circular locking dependency detected ]
> > > [  605.387679] 4.11.0-rc2-CI-CI_DRM_2352+ #1 Not tainted
> > > [  605.392826] ---
> > > [  605.399196] rm/7298 is trying to acquire lock:
> > > [  605.403720]  (&psinfo->read_mutex){+.+.+.}, at: [] 
> > > pstore_unlink+0x3f/0xa0
> > > [  605.412300]
> > > [  605.412300] but task is already holding lock:
> > > [  605.418237]  (&sb->s_type->i_mutex_key#14){++}, at: 
> > > [] vfs_unlink+0x4c/0x190
> > > [  605.427397]
> > > [  605.427397] which lock already depends on the new lock.
> > > [  605.427397]
> > > [  605.435770]
> > > [  605.435770] the existing dependency chain (in reverse order) is:
> > > [  605.443396]
> > > [  605.443396] -> #1 (&sb->s_type->i_mutex_key#14){++}:
> > > [  605.450347]lock_acquire+0xc9/0x220
> > > [  605.454551]down_write+0x3f/0x70
> > > [  605.458484]pstore_mkfile+0x1f4/0x460
> > > [  605.462835]pstore_get_records+0x17a/0x320
> > > [  605.467664]pstore_fill_super+0xa4/0xc0
> > > [  605.472205]mount_single+0x89/0xb0
> > > [  605.476314]pstore_mount+0x13/0x20
> > > [  605.480411]mount_fs+0xf/0x90
> > > [  605.484122]vfs_kern_mount+0x66/0x170
> > > [  605.488464]do_mount+0x190/0xd50
> > > [  605.492397]SyS_mount+0x90/0xd0
> > > [  605.496212]entry_SYSCALL_64_fastpath+0x1c/0xb1
> > > [  605.501496]
> > > [  605.501496] -> #0 (&psinfo->read_mutex){+.+.+.}:
> > > [  605.507747]__lock_acquire+0x1ac0/0x1bb0
> > > [  605.512401]lock_acquire+0xc9/0x220
> > > [  605.516594]__mutex_lock+0x6e/0x990
> > > [  605.520755]mutex_lock_nested+0x16/0x20
> > > [  605.525279]pstore_unlink+0x3f/0xa0
> > > [  605.529465]vfs_unlink+0xb5/0x190
> > > [  605.533477]do_unlinkat+0x24c/0x2a0
> > > [  605.537672]SyS_unlinkat+0x16/0x30
> > > [  605.541781]entry_SYSCALL_64_fastpath+0x1c/0xb1
> > 
> > If I'm reading this right it's a race between mount and unlink...
> > that's quite a corner case. :)
> > 
> > > [  605.547067]
> > > [  605.547067] other info that might help us debug this:
> > > [  605.547067]
> > > [  605.555221]  Possible unsafe locking scenario:
> > > [  605.555221]
> > > [  605.561280]        CPU0                    CPU1
> > > [  605.565883]        ----                    ----
> > > [  605.570502]   lock(&sb->s_type->i_mutex_key#14);
> > > [  605.575217]                               lock(&psinfo->read_mutex);
> > > [  605.581803]                               lock(&sb->s_type->i_mutex_key#14);
> > > [  605.589159]   lock(&psinfo->read_mutex);
> > 
> > I haven't had time to dig much yet, but I wonder if the locking order
> > on unlink could just be reversed, and the deadlock would go away?
> 
> IIUC, the unlink path locks a file in the root directory, while the
> mount path locks the root directory.  Maybe we can use a subclass?
> (not tested)
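
A rough sketch of that subclass idea, equally untested: annotate the
root-directory lock with its own nesting level so it no longer chains
with the per-file i_mutex taken on the unlink path. The helper shape is
assumed, not existing pstore code.

static struct dentry *pstore_lock_root(struct super_block *sb)
{
        struct dentry *root = sb->s_root;

        /* I_MUTEX_PARENT marks this as a parent-directory lock, giving
         * it a different lockdep subclass from the child inode locked
         * on the unlink path. */
        inode_lock_nested(d_inode(root), I_MUTEX_PARENT);
        return root;
}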

More puzzling, or just my confusion, reports from our CI farm say that
this patch breaks removing objects from pstore. :|

Will look forward to better suggestions on how to avoid the lockdep splat.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH 2/3] drm/i915: split out check for noncontiguous pfn range

2017-03-21 Thread Chris Wilson
On Tue, Mar 21, 2017 at 09:56:52AM +, Chris Wilson wrote:
> On Mon, Mar 20, 2017 at 10:40:28AM +0100, Arnd Bergmann wrote:
> > We get a warning with gcc-7 about a pointless comparison when
> > using a linear memmap:
> > 
> > drivers/gpu/drm/i915/selftests/scatterlist.c: In function 'alloc_table':
> > drivers/gpu/drm/i915/selftests/scatterlist.c:219:66: error: self-comparison 
> > always evaluates to false [-Werror=tautological-compare]
> > 
> > Splitting out the comparison into a separate function avoids the warning
> > and makes it slightly more obvious what happens.
> 
> I worried whether gcc will simply get smarter and apply its warning more
> consistently, but I agree the helper is good documentation.
>  
> > Fixes: 935a2f776aa5 ("drm/i915: Add some selftests for sg_table 
> > manipulation")
> > Signed-off-by: Arnd Bergmann 
> Reviewed-by: Chris Wilson 

And applied.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH 1/3] drm/i915: allocate mock file pointer dynamically

2017-03-21 Thread Chris Wilson
On Mon, Mar 20, 2017 at 10:40:27AM +0100, Arnd Bergmann wrote:
> A struct file is a bit too large to put on the kernel stack in general
> and triggers a warning for low settings of CONFIG_FRAME_WARN:
> 
> drivers/gpu/drm/i915/selftests/mock_drm.c: In function 'mock_file':
> drivers/gpu/drm/i915/selftests/mock_drm.c:46:1: error: the frame size of 1328 
> bytes is larger than 1280 bytes [-Werror=frame-larger-than=]
> drivers/gpu/drm/i915/selftests/mock_drm.c: In function 'mock_file_free':
> drivers/gpu/drm/i915/selftests/mock_drm.c:54:1: error: the frame size of 1312 
> bytes is larger than 1280 bytes [-Werror=frame-larger-than=]
> 
> It's also slightly dangerous to leave a reference to a stack object
> in the drm_file structure after leaving the stack frame.
> This changes the code to just allocate the object dynamically
> and release it when we are done with it.
> 
> Fixes: 66d9cb5d805a ("drm/i915: Mock the GEM device for self-testing")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/gpu/drm/i915/selftests/mock_drm.c | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/selftests/mock_drm.c 
> b/drivers/gpu/drm/i915/selftests/mock_drm.c
> index 113dec05c7dc..18514065c93d 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_drm.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_drm.c
> @@ -32,15 +32,15 @@ static inline struct inode fake_inode(struct 
> drm_i915_private *i915)
>  struct drm_file *mock_file(struct drm_i915_private *i915)
>  {
>   struct inode inode = fake_inode(i915);
> - struct file filp = {};
> + struct file *filp = kzalloc(sizeof(struct file), GFP_KERNEL);
>   struct drm_file *file;
>   int err;
>  
> - err = drm_open(&inode, &filp);
> + err = drm_open(&inode, filp);
>   if (unlikely(err))
>   return ERR_PTR(err);
>  
> - file = filp.private_data;
> + file = filp->private_data;

What I don't like about this approach is that we leak the unwanted
inode/file. We only want the drm_file here, and any access above that
from inside i915.ko is fubar.
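
At minimum the error path should free it again. A sketch against the
patch above (mock_file_free() would still need the pointer stashed
somewhere reachable, e.g. off the drm_file):

        err = drm_open(&inode, filp);
        if (unlikely(err)) {
                kfree(filp); /* don't leak the mock struct file */
                return ERR_PTR(err);
        }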
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH 2/3] drm/i915: split out check for noncontiguous pfn range

2017-03-21 Thread Chris Wilson
On Mon, Mar 20, 2017 at 10:40:28AM +0100, Arnd Bergmann wrote:
> We get a warning with gcc-7 about a pointless comparison when
> using a linear memmap:
> 
> drivers/gpu/drm/i915/selftests/scatterlist.c: In function 'alloc_table':
> drivers/gpu/drm/i915/selftests/scatterlist.c:219:66: error: self-comparison 
> always evaluates to false [-Werror=tautological-compare]
> 
> Splitting out the comparison into a separate function avoids the warning
> and makes it slightly more obvious what happens.

I worried whether gcc will simply get smarter and apply its warning more
consistently, but I agree the helper is good documentation.
 
> Fixes: 935a2f776aa5 ("drm/i915: Add some selftests for sg_table manipulation")
> Signed-off-by: Arnd Bergmann 
Reviewed-by: Chris Wilson 
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[PATCH] fs/pstore: Perform erase from a worker

2017-03-17 Thread Chris Wilson
In order to prevent a cyclic recursion between psi->read_mutex and the
inode_lock, we need to move the pse->erase to a worker.

[  605.374955] ==
[  605.381281] [ INFO: possible circular locking dependency detected ]
[  605.387679] 4.11.0-rc2-CI-CI_DRM_2352+ #1 Not tainted
[  605.392826] ---
[  605.399196] rm/7298 is trying to acquire lock:
[  605.403720]  (&psinfo->read_mutex){+.+.+.}, at: [] pstore_unlink+0x3f/0xa0
[  605.412300]
[  605.412300] but task is already holding lock:
[  605.418237]  (&sb->s_type->i_mutex_key#14){++}, at: [] vfs_unlink+0x4c/0x190
[  605.427397]
[  605.427397] which lock already depends on the new lock.
[  605.427397]
[  605.435770]
[  605.435770] the existing dependency chain (in reverse order) is:
[  605.443396]
[  605.443396] -> #1 (&sb->s_type->i_mutex_key#14){++}:
[  605.450347]lock_acquire+0xc9/0x220
[  605.454551]down_write+0x3f/0x70
[  605.458484]pstore_mkfile+0x1f4/0x460
[  605.462835]pstore_get_records+0x17a/0x320
[  605.467664]pstore_fill_super+0xa4/0xc0
[  605.472205]mount_single+0x89/0xb0
[  605.476314]pstore_mount+0x13/0x20
[  605.480411]mount_fs+0xf/0x90
[  605.484122]vfs_kern_mount+0x66/0x170
[  605.488464]do_mount+0x190/0xd50
[  605.492397]SyS_mount+0x90/0xd0
[  605.496212]entry_SYSCALL_64_fastpath+0x1c/0xb1
[  605.501496]
[  605.501496] -> #0 (&psinfo->read_mutex){+.+.+.}:
[  605.507747]__lock_acquire+0x1ac0/0x1bb0
[  605.512401]lock_acquire+0xc9/0x220
[  605.516594]__mutex_lock+0x6e/0x990
[  605.520755]mutex_lock_nested+0x16/0x20
[  605.525279]pstore_unlink+0x3f/0xa0
[  605.529465]vfs_unlink+0xb5/0x190
[  605.533477]do_unlinkat+0x24c/0x2a0
[  605.537672]SyS_unlinkat+0x16/0x30
[  605.541781]entry_SYSCALL_64_fastpath+0x1c/0xb1
[  605.547067]
[  605.547067] other info that might help us debug this:
[  605.547067]
[  605.555221]  Possible unsafe locking scenario:
[  605.555221]
[  605.561280]        CPU0                    CPU1
[  605.565883]        ----                    ----
[  605.570502]   lock(&sb->s_type->i_mutex_key#14);
[  605.575217]                               lock(&psinfo->read_mutex);
[  605.581803]                               lock(&sb->s_type->i_mutex_key#14);
[  605.589159]   lock(&psinfo->read_mutex);
[  605.593156]
[  605.593156]  *** DEADLOCK ***
[  605.593156]
[  605.599214] 3 locks held by rm/7298:
[  605.602896]  #0:  (sb_writers#11){.+.+..}, at: [] mnt_want_write+0x1f/0x50
[  605.611490]  #1:  (&sb->s_type->i_mutex_key#14/1){+.+...}, at: [] do_unlinkat+0x11c/0x2a0
[  605.621417]  #2:  (&sb->s_type->i_mutex_key#14){++}, at: [] vfs_unlink+0x4c/0x190
[  605.630995]
[  605.630995] stack backtrace:
[  605.635450] CPU: 7 PID: 7298 Comm: rm Not tainted 4.11.0-rc2-CI-CI_DRM_2352+ #1
[  605.642999] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F21 01/06/2017
[  605.652305] Call Trace:
[  605.654814]  dump_stack+0x67/0x92
[  605.658184]  print_circular_bug+0x1e0/0x2e0
[  605.662465]  __lock_acquire+0x1ac0/0x1bb0
[  605.34]  ? retint_kernel+0x2d/0x2d
[  605.670456]  lock_acquire+0xc9/0x220
[  605.674112]  ? pstore_unlink+0x3f/0xa0
[  605.677970]  ? pstore_unlink+0x3f/0xa0
[  605.681818]  __mutex_lock+0x6e/0x990
[  605.685456]  ? pstore_unlink+0x3f/0xa0
[  605.689791]  ? pstore_unlink+0x3f/0xa0
[  605.694124]  ? vfs_unlink+0x4c/0x190
[  605.698310]  mutex_lock_nested+0x16/0x20
[  605.702859]  pstore_unlink+0x3f/0xa0
[  605.707021]  vfs_unlink+0xb5/0x190
[  605.711024]  do_unlinkat+0x24c/0x2a0
[  605.715194]  SyS_unlinkat+0x16/0x30
[  605.719275]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  605.724543] RIP: 0033:0x7f8b08073ed7
[  605.728676] RSP: 002b:7ffe70eff628 EFLAGS: 0206 ORIG_RAX: 0107
[  605.736929] RAX: ffda RBX: 8147ea93 RCX: 7f8b08073ed7
[  605.744711] RDX:  RSI: 0145 RDI: ff9c
[  605.752512] RBP: c9000338ff88 R08: 0003 R09: 
[  605.760276] R10: 015e R11: 0206 R12: 
[  605.768040] R13: 7ffe70eff750 R14: 0144ff70 R15: 01451230
[  605.775800]  ? __this_cpu_preempt_check+0x13/0x20

Reported-by: Tomi Sarvela 
Fixes: e9e360b08a44 ("pstore: Protect unlink with read_mutex")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100234
Signed-off-by: Chris Wilson 
Cc: Tomi Sarvela 
Cc: Kees Cook 
Cc: Anton Vorontsov 
Cc: Colin Cross 
Cc: Tony Luck 
Cc: Stefan Hajnoczi 
Cc: Namhyung Kim 
Cc:  # v4.10+
---
 fs/pstore/inode.c| 37 ++---
 fs/pstore/platform.c |  2 ++
 2 files changed, 32 insertions(+), 7 deletions(-)

Re: [PATCH RESEND v1] locking/ww_mutex: Prevent read of uninitialized memory

2017-03-14 Thread Chris Wilson
On Tue, Mar 14, 2017 at 03:44:02PM -0400, Robert Foss wrote:
> On "missed ABBA deadlock" abba.result is read, but not initialized
> in all situations.
> 
> Detected by CoverityScan, CID#1402035 ("Uninitialized scalar variable")

It's unconditionally set by the worker, and only accessed by the parent
after the flush_work() barrier.
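
In miniature (simplified shapes, not the actual test code):

struct abba_test {
        struct work_struct work;
        int result;
};

static void abba_work(struct work_struct *work)
{
        struct abba_test *t = container_of(work, struct abba_test, work);

        t->result = 0; /* written unconditionally before completion */
}

static int run_abba(void)
{
        struct abba_test t = { .result = -1 };

        INIT_WORK_ONSTACK(&t.work, abba_work);
        schedule_work(&t.work);
        flush_work(&t.work); /* barrier: the worker has finished */
        destroy_work_on_stack(&t.work);

        return t.result; /* ordered after the worker's write */
}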
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [linux-mmotm] i915_gem_userptr_get_pages: possible circular locking dependency detected

2017-03-14 Thread Chris Wilson
On Tue, Mar 14, 2017 at 10:21:09PM +0900, Sergey Senozhatsky wrote:
> Hello,
> 
> [  530.698622] ==
> [  530.698623] WARNING: possible circular locking dependency detected
> [  530.698626] 4.11.0-rc2-mm1-dbg-00167-gdb8a9941614c-dirty #222 Not tainted
> [  530.698627] --
> [  530.698628] Xorg/343 is trying to acquire lock:
> [  530.698630]  (&mm->mmap_sem){++}, at: [] 
> i915_gem_userptr_get_pages+0x60/0x29c [i915]
> [  530.698702] 
>but task is already holding lock:
> [  530.698703]  (&obj->mm.lock){+.+.+.}, at: [] 
> __i915_gem_object_get_pages+0x21/0x62 [i915]
> [  530.698763] 
>which lock already depends on the new lock.

Yup, just seen it myself. This particular cycle is a non-issue, but
there is another possibility for a mmap_sem recursion in the ggtt fault
handler.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH] drm/i915: annote drop_caches debugfs interface with lockdep

2017-03-12 Thread Chris Wilson
On Sun, Mar 12, 2017 at 08:27:16PM +0100, Daniel Vetter wrote:
> The trouble we have is that we can't really test all the shrinker
> recursion stuff exhaustively in BAT because any kind of thrashing
> stress test just takes too long.
> 
> But that leaves a really big gap open, since shrinker recursions are
> one of the most annoying bugs. Now lockdep already has support for
> checking allocation deadlocks:
> 
> - Direct reclaim paths are marked up with
>   lockdep_set_current_reclaim_state() and
>   lockdep_clear_current_reclaim_state().
> 
> - Any allocation paths are marked with lockdep_trace_alloc().
> 
> If we simply mark up our debugfs with the reclaim annotations, any
> code and locks taken in there will automatically complete the picture
> with any allocation paths we already have, as long as we have a simple
> testcase in BAT which throws out a few objects using this interface.
> Not stress test or thrashing needed at all.
> 
> v2: Need to EXPORT_SYMBOL_GPL to make it compile as a module.
> 
> Cc: Chris Wilson 
> Cc: Peter Zijlstra 
> Cc: Ingo Molnar 
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Chris Wilson  (v1)
> Signed-off-by: Daniel Vetter 
> 
> --
> 
> Peter/Ingo,
> 
> We want this to validate the i915 shrinker locking in our fast tests
> without thrashing badly (that takes too long, we can only thrash in
> the extended runs). Can you pls take a look and if it's ok ack for
> merging through drm-intel.git?
> 
> Thanks, Daniel
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 2 ++
>  kernel/locking/lockdep.c| 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index 82fb005a5e22..0f1d6c4a212b 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -4273,11 +4273,13 @@ i915_drop_caches_set(void *data, u64 val)
>   if (val & (DROP_RETIRE | DROP_ACTIVE))
>   i915_gem_retire_requests(dev_priv);
>  
> + lockdep_set_current_reclaim_state(GFP_KERNEL);
>   if (val & DROP_BOUND)
>   i915_gem_shrink(dev_priv, LONG_MAX, I915_SHRINK_BOUND);
>  
>   if (val & DROP_UNBOUND)
>   i915_gem_shrink(dev_priv, LONG_MAX, I915_SHRINK_UNBOUND);
> + lockdep_clear_current_reclaim_state();
>  
>   if (val & DROP_SHRINK_ALL)
>   i915_gem_shrink_all(dev_priv);

Best to move the clear to here.
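
That is, so that the DROP_SHRINK_ALL pass is covered by the same
annotation:

        lockdep_set_current_reclaim_state(GFP_KERNEL);
        if (val & DROP_BOUND)
                i915_gem_shrink(dev_priv, LONG_MAX, I915_SHRINK_BOUND);

        if (val & DROP_UNBOUND)
                i915_gem_shrink(dev_priv, LONG_MAX, I915_SHRINK_UNBOUND);

        if (val & DROP_SHRINK_ALL)
                i915_gem_shrink_all(dev_priv);
        lockdep_clear_current_reclaim_state();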
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[PATCH] locking/ww-mutex: Limit stress test to 2s

2017-03-10 Thread Chris Wilson
Use a timeout rather than a fixed number of loops to avoid running for
very long periods, such as under the kbuilder VMs.

Reported-by: kernel test robot 
Signed-off-by: Chris Wilson 
Cc: Boqun Feng 
Cc: Peter Zijlstra 
---
 kernel/locking/test-ww_mutex.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c
index 6b7abb334ca6..c66d584f898c 100644
--- a/kernel/locking/test-ww_mutex.c
+++ b/kernel/locking/test-ww_mutex.c
@@ -353,8 +353,8 @@ static int test_cycle(unsigned int ncpus)
 struct stress {
struct work_struct work;
struct ww_mutex *locks;
+   unsigned long timeout;
int nlocks;
-   int nloops;
 };
 
 static int *get_random_order(int count)
@@ -433,7 +433,7 @@ static void stress_inorder_work(struct work_struct *work)
__func__, err);
break;
}
-   } while (--stress->nloops);
+   } while (!time_after(jiffies, stress->timeout));
 
ww_acquire_fini(&ctx);
 
@@ -495,7 +495,7 @@ static void stress_reorder_work(struct work_struct *work)
dummy_load(stress);
list_for_each_entry(ll, &locks, link)
ww_mutex_unlock(ll->lock);
-   } while (--stress->nloops);
+   } while (!time_after(jiffies, stress->timeout));
 
ww_acquire_fini(&ctx);
 
@@ -523,7 +523,7 @@ static void stress_one_work(struct work_struct *work)
__func__, err);
break;
}
-   } while (--stress->nloops);
+   } while (!time_after(jiffies, stress->timeout));
 
kfree(stress);
 }
@@ -533,7 +533,7 @@ static void stress_one_work(struct work_struct *work)
 #define STRESS_ONE BIT(2)
 #define STRESS_ALL (STRESS_INORDER | STRESS_REORDER | STRESS_ONE)
 
-static int stress(int nlocks, int nthreads, int nloops, unsigned int flags)
+static int stress(int nlocks, int nthreads, unsigned int flags)
 {
struct ww_mutex *locks;
int n;
@@ -575,7 +575,7 @@ static int stress(int nlocks, int nthreads, int nloops, unsigned int flags)
INIT_WORK(&stress->work, fn);
stress->locks = locks;
stress->nlocks = nlocks;
-   stress->nloops = nloops;
+   stress->timeout = jiffies + 2*HZ;
 
queue_work(wq, &stress->work);
nthreads--;
@@ -619,15 +619,15 @@ static int __init test_ww_mutex_init(void)
if (ret)
return ret;
 
-   ret = stress(16, 2*ncpus, 1<<10, STRESS_INORDER);
+   ret = stress(16, 2*ncpus, STRESS_INORDER);
if (ret)
return ret;
 
-   ret = stress(16, 2*ncpus, 1<<10, STRESS_REORDER);
+   ret = stress(16, 2*ncpus, STRESS_REORDER);
if (ret)
return ret;
 
-   ret = stress(4095, hweight32(STRESS_ALL)*ncpus, 1<<12, STRESS_ALL);
+   ret = stress(4095, hweight32(STRESS_ALL)*ncpus, STRESS_ALL);
if (ret)
return ret;
 
-- 
2.11.0



Re: [lkp-robot] [locking/ww_mutex] 857811a371: INFO:task_blocked_for_more_than#seconds

2017-03-08 Thread Chris Wilson
On Wed, Mar 08, 2017 at 09:08:54AM +0800, kernel test robot wrote:
> 
> FYI, we noticed the following commit:
> 
> commit: 857811a37129f5d2ba162d7be3986eff44724014 ("locking/ww_mutex: Adjust 
> the lock number for stress test")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: boot
> 
> on test machine: qemu-system-i386 -enable-kvm -m 320M
> 
> caused below changes (please refer to attached dmesg/kmsg for entire 
> log/backtrace):

Now that the test is running, it takes too long. :)

wait_for_completion_interruptible() would stop the hung task check? That
leaves the NMI watchdog to check whether we hit a deadlock between the
workers.

And add a timeout to the stress test.
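
Roughly, on the wait side, with 'done' standing for whatever completion
the test already waits on:

        /* The hung-task detector only watches TASK_UNINTERRUPTIBLE
         * sleepers, so an interruptible wait does not trip it. */
        ret = wait_for_completion_interruptible(&done);
        if (ret)
                return ret; /* interrupted by a signal */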
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

