ccf vs iommu vs drm locking fun

2014-09-04 Thread Rob Clark
So, I was looking at the below lockdep splat, and discussing it a bit
w/ sboyd on IRC, and came to a slightly disturbing realization...

The interaction between prepare_lock and the debugfs bits is a little
worrying.  In particular, it is probably not a good idea to assume
that anyone who needs to grab prepare_lock does not already hold
mmap_sem.  Avoiding mmap_sem, or locks that interact w/ mmap_sem, is
going to be pretty hard, at least for gpu drivers that are using
iommus that are using CCF ;-)
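To make that concrete, the cycle in the splat boils down to something
like this (a condensed sketch: the function names match the trace
below, but the helper and its arguments are paraphrased, not literal
msm driver code):

#include <linux/iommu.h>
#include <linux/mutex.h>
#include <drm/drmP.h>

/*
 * Illustrative only: how a gem-free path ends up taking prepare_lock
 * while holding dev->struct_mutex.
 */
static void msm_gem_free_sketch(struct drm_gem_object *obj,
				struct iommu_domain *domain,
				unsigned long iova, size_t size)
{
	struct drm_device *dev = obj->dev;

	mutex_lock(&dev->struct_mutex);		/* link #3 in the chain */

	/*
	 * qcom_iommu_unmap() grabs qcom_iommu_lock (#4) and then calls
	 * clk_prepare() so the IOMMU clocks are on for the TLB flush,
	 * i.e. prepare_lock (#0) is taken under struct_mutex.  Lockdep
	 * already saw prepare_lock -> i_mutex (#1, clk debugfs init),
	 * i_mutex -> mmap_sem (#2, readdir faulting in a user buffer),
	 * and mmap_sem -> struct_mutex (drm_gem_mmap), so this closes
	 * the cycle.
	 */
	iommu_unmap(domain, iova, size);

	mutex_unlock(&dev->struct_mutex);
}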

BR,
-R


--

[15928.894558]
[15928.894609] ======================================================
[15928.895145] [ INFO: possible circular locking dependency detected ]
[15928.901141] 3.17.0-rc1-00050-g07a489b #802 Tainted: G        W
[15928.907335] -------------------------------------------------------
[15928.907348] Xorg.bin/5413 is trying to acquire lock:
[15928.907417]  (prepare_lock){+.+.+.}, at: [] clk_prepare_lock+0x88/0xfc
[15928.907424]
[15928.907424] but task is already holding lock:
[15928.907508]  (qcom_iommu_lock){+.+...}, at: [] qcom_iommu_unmap+0x1c/0x1f0
[15928.907519]
[15928.907519] which lock already depends on the new lock.
[15928.907519]
[15928.907532]
[15928.907532] the existing dependency chain (in reverse order) is:
[15928.907575]
[15928.907575] -> #4 (qcom_iommu_lock){+.+...}:
[15928.907611][] qcom_iommu_map+0x28/0x450
[15928.907634][] iommu_map+0xc8/0x12c
[15928.907662][] msm_iommu_map+0xb4/0x130
[15928.907681][] msm_gem_get_iova_locked+0x9c/0xe8
[15928.907714][] msm_gem_get_iova+0x4c/0x64
[15928.907765][] mdp4_kms_init+0x4c4/0x6c0
[15928.907813][] msm_load+0x2ac/0x34c
[15928.907846][] drm_dev_register+0xac/0x108
[15928.907868][] drm_platform_init+0x50/0xf0
[15928.907892][] try_to_bring_up_master.part.3+0xc8/0x108
[15928.907913][] component_master_add_with_match+0xa8/0x104
[15928.907934][] msm_pdev_probe+0x64/0x70
[15928.907955][] platform_drv_probe+0x2c/0x60
[15928.907983][] driver_probe_device+0x108/0x234
[15928.908003][] bus_for_each_drv+0x64/0x98
[15928.908040][] device_attach+0x78/0x8c
[15928.908082][] bus_probe_device+0x88/0xac
[15928.908126][] deferred_probe_work_func+0x68/0x9c
[15928.908182][] process_one_work+0x1a0/0x40c
[15928.908214][] worker_thread+0x44/0x4d8
[15928.908237][] kthread+0xd8/0xec
[15928.908262][] ret_from_fork+0x14/0x2c
[15928.908291]
[15928.908291] -> #3 (&dev->struct_mutex){+.+.+.}:
[15928.908311][] drm_gem_mmap+0x38/0xd0
[15928.908329][] msm_gem_mmap+0xc/0x5c
[15928.908358][] mmap_region+0x35c/0x6c8
[15928.908377][] do_mmap_pgoff+0x314/0x398
[15928.908398][] vm_mmap_pgoff+0x84/0xb4
[15928.908416][] SyS_mmap_pgoff+0x94/0xbc
[15928.908436][] ret_fast_syscall+0x0/0x48
[15928.908463]
[15928.908463] -> #2 (&mm->mmap_sem){++++++}:
[15928.908512][] filldir64+0x68/0x180
[15928.908558][] dcache_readdir+0x188/0x22c
[15928.908593][] iterate_dir+0x9c/0x11c
[15928.908616][] SyS_getdents64+0x78/0xe8
[15928.908640][] ret_fast_syscall+0x0/0x48
[15928.908671]
[15928.908671] -> #1 (&sb->s_type->i_mutex_key#3){+.+.+.}:
[15928.908706][] __create_file+0x58/0x1dc
[15928.908728][] debugfs_create_dir+0x1c/0x24
[15928.908761][] clk_debug_create_subtree+0x20/0x170
[15928.908790][] clk_debug_init+0xec/0x14c
[15928.908816][] do_one_initcall+0x8c/0x1c8
[15928.908846][] kernel_init_freeable+0x13c/0x1dc
[15928.908873][] kernel_init+0x8/0xe8
[15928.908898][] ret_from_fork+0x14/0x2c
[15928.908925]
[15928.908925] -> #0 (prepare_lock){+.+.+.}:
[15928.908948][] mutex_lock_nested+0x70/0x3e8
[15928.908970][] clk_prepare_lock+0x88/0xfc
[15928.909001][] clk_prepare+0xc/0x24
[15928.909022][] __enable_clocks.isra.4+0x18/0xa4
[15928.909041][] __flush_iotlb_va+0xe0/0x114
[15928.909071][] qcom_iommu_unmap+0xac/0x1f0
[15928.909093][] iommu_unmap+0x9c/0xe8
[15928.909112][] msm_iommu_unmap+0x64/0x84
[15928.909130][] msm_gem_free_object+0x11c/0x338
[15928.909149][] drm_gem_object_handle_unreference_unlocked+0xfc/0x130
[15928.909166][] drm_gem_object_release_handle+0x50/0x68
[15928.909199][] idr_for_each+0xa8/0xdc
[15928.909225][] drm_gem_release+0x1c/0x28
[15928.909258][] drm_release+0x370/0x428
[15928.909302][] __fput+0x98/0x1e8
[15928.909339][] task_work_run+0xb0/0xfc
[15928.909386][] do_exit+0x2ec/0x948
[15928.909415][] do_group_exit+0x4c/0xb8
[15928.909455][] get_signal+0x28c/0x6ac
[15928.909507][] do_signal+0xc4/0x3e4
[15928.909548][] do_work_pending+0xb4/0xc4
[15928.909584][] work_pending+0xc/0x20
[15928.909595]
[15928.909595] other info that might help us debug this:
[15928.909595]

ccf vs iommu vs drm locking fun

2014-09-04 Thread Stephen Boyd
On 09/04/14 17:46, Rob Clark wrote:
> So, I was looking at the below lockdep splat, and discussing it a bit
> w/ sboyd on IRC, and came to a slightly disturbing realization...
>
> The interaction between prepare_lock and the debugfs bits is a little
> worrying.  In particular, it is probably not a good idea to assume
> that anyone who needs to grab prepare_lock does not already hold
> mmap_sem.  Avoiding mmap_sem, or locks that interact w/ mmap_sem, is
> going to be pretty hard, at least for gpu drivers that are using
> iommus that are using CCF ;-)

I'm thinking one way to fix this is to replace the tree traversal for
debugfs registration with a list iteration of all registered clocks.
That way we don't hold the prepare mutex across debugfs directory/file
creation. This should break the chain.

Now that debugfs isn't a hierarchy, this becomes a lot easier: we just
need to keep a linked list of all the clocks that are registered. I
already have that patch for my wwmutex series, but I didn't convert
debugfs to use it. Two patches to follow.
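
Roughly along these lines (a sketch only, not the actual patches: it
assumes an hlist_node 'debug_node' added to struct clk, and reuses a
per-clock entry helper like clk.c's clk_debug_create_one()):

#include <linux/clk-provider.h>
#include <linux/debugfs.h>
#include <linux/init.h>
#include <linux/list.h>
#include <linux/mutex.h>

static HLIST_HEAD(clk_debug_list);
static DEFINE_MUTEX(clk_debug_lock);	/* protects clk_debug_list only */
static struct dentry *rootdir;

static void clk_debug_register(struct clk *clk)
{
	mutex_lock(&clk_debug_lock);
	hlist_add_head(&clk->debug_node, &clk_debug_list);
	if (rootdir)	/* debugfs already up: create entries right away */
		clk_debug_create_one(clk, rootdir);
	mutex_unlock(&clk_debug_lock);
}

static int __init clk_debug_init(void)
{
	struct clk *clk;

	rootdir = debugfs_create_dir("clk", NULL);

	/*
	 * Walk the flat list instead of the clk tree; prepare_lock is
	 * never held here, so the prepare_lock -> i_mutex link in the
	 * splat goes away.
	 */
	mutex_lock(&clk_debug_lock);
	hlist_for_each_entry(clk, &clk_debug_list, debug_node)
		clk_debug_create_one(clk, rootdir);
	mutex_unlock(&clk_debug_lock);

	return 0;
}
late_initcall(clk_debug_init);

The key point is that prepare_lock never wraps a debugfs_create_*()
call anymore; clk registration just adds to the list.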

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation