Hi Laurent,

On 5/24/2017 4:56 PM, Laurent Pinchart wrote:
> Hello,
> On Wednesday 24 May 2017 16:01:45 Sricharan R wrote:
>> On 5/24/2017 4:12 AM, Russell King - ARM Linux wrote:
>>> On Wed, May 24, 2017 at 12:46:51AM +0300, Laurent Pinchart wrote:
>>>> On Tuesday 23 May 2017 18:53:19 Russell King - ARM Linux wrote:
>>>>> On Tue, May 23, 2017 at 05:55:57PM +0100, Robin Murphy wrote:
>>>>>> On 23/05/17 17:25, Russell King - ARM Linux wrote:
>>>>>>> So, I've come to apply this patch (since it's finally landed in the
>>>>>>> patch system), and I'm not convinced that the commit message is really
>>>>>>> up to scratch.
>>>>>>> The current commit message looks like this:
>>>>>>> "   ARM: 8674/1: dma-mapping: Reset the device's dma_ops
>>>>>>>     arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(),
>>>>>>>     dma_ops should be cleared in the teardown path. Otherwise this
>>>>>>>     causes problem when the probe of device is retried after being
>>>>>>>     deferred. The device's iommu structures are cleared after
>>>>>>>     EPROBEDEFER error, but on the next try dma_ops will still be set
>>>>>>>     to old value, which is not right."
>>>>>>> It is obviously a fix, but a fix for which patch?  Looking at the
>>>>>>> history, we have "arm: dma-mapping: Don't override dma_ops in
>>>>>>> arch_setup_dma_ops()" which I would have guessed is the appropriate
>>>>>>> one, but this post-dates your patch (it's very recent, v4.12-rc
>>>>>>> recent.)
>>>>>>> So, I need more description about the problem you were seeing when
>>>>>>> you first proposed this patch.
>>>>>>> How does leaving the dma_ops in place prior to "arm: dma-mapping:
>>>>>>> Don't override dma_ops in arch_setup_dma_ops()" cause problems for
>>>>>>> deferred probing?
>>>>>>> What patch is your change trying to fix?  In other words, how far
>>>>>>> back does this patch need to be backported?
>>>>>> In effect, it's fixing a latent inconsistency that's been present since
>>>>>> its introduction in 4bb25789ed28. However, that has clearly not proven
>>>>>> to be an issue in practice since then. With 09515ef5ddad we start
>>>>>> actually calling arch_teardown_dma_ops() in a manner that might leave
>>>>>> things partially initialised if anyone starts removing and reprobing
>>>>>> drivers, but so far that's still from code inspection[1] rather than
>>>>>> anyone hitting it.
>>>>>> Given that the changes which tickle it are fresh in -rc1 I'd say
>>>>>> there's no need to backport this, but at the same time it shouldn't do
>>>>>> any damage if you really want to.
>>>>> Well, looking at this, I'm not convinced that much of it is correct.
>>>>> 1) set_dma_ops() is in arch_setup_dma_ops() but this patch adds
>>>>>    the unsetting of the DMA ops inside arm_teardown_iommu_dma_ops()
>>>>>    rather than arch_teardown_dma_ops().
>>>>>    This doesn't strike me as being particularly symmetric.
>>>>>    arm_teardown_iommu_dma_ops() is arm_setup_iommu_dma_ops()'s
>>>>>    counterpart.
>>>>> 2) arch_setup_dma_ops(), the recent patch to add the existing dma_ops
>>>>>    check, and Xen - Xen wants to override the DMA ops if in the
>>>>>    initial domain.  It's not clear (at least to me) whether the recent
>>>>>    patch adding the dma_ops check took account of this or not.
>>>>> 3) random places seem to fiddle with the dma_ops - notice that
>>>>>    arm_iommu_detach_device() sets the dma_ops to NULL.
>>>>>    In fact, I think moving __arm_iommu_detach_device() into
>>>>>    arm_iommu_detach_device(), calling arm_iommu_detach_device(),
>>>>>    and getting rid of the explicit set_dma_ops(, NULL) in this
>>>>>    path would be a good first step.
>>>>> 4) I think arch_setup_dma_ops() is over-complex.
>>>>> So, in summary, this code is a mess today, and that means it's not
>>>>> obviously correct - which is bad.  This needs sorting.
>>>> We've reached the same conclusion independently, but I'll refrain from
>>>> commenting on whether that's a good or bad thing :-)
>>>> I don't think this patch should be applied, as it could break Xen (and
>>>> other platforms/drivers that set the DMA operations manually) by
>>>> resetting DMA operations at device remove() time even if they have been
>>>> set independently of arch_setup_dma_ops().
>>> That will only occur if the dma ops have been overriden once the DMA
>>> operations have been setup via arch_setup_dma_ops. What saves it from
>>> wholesale NULLing of the DMA operations is the check for a valid
>>> dma_iommu_mapping structure in arm_teardown_iommu_dma_ops().  This only
>>> exists when arm_setup_iommu_dma_ops() has attached a mapping to the
>>> device.
> Unfortunately I don't think that's always the case. The dma_iommu_mapping is 
> also set by direct callers of arm_iommu_attach_device(), namely
> - the Renesas R-Car IOMMU driver (ipmmu-vmsa)
> - the Mediatek IOMMU driver (mtk-iommu-v1)
> - the Exynos DRM driver
> - the OMAP3 ISP driver
> All these need to be fixed, but that's not v4.12-rc material. At least in the 
> ipmmu-vmsa case, which is the one I noticed the problem with, 
> arm_iommu_attach_device() is called before arch_setup_dma_ops(). 
> arch_setup_dma_ops() then exits immediately when called due to the
>         if (dev->dma_ops)
>                 return;
> check at the beginning of the function. We must ensure that in that case 
> arch_teardown_dma_ops() will not remove the mapping or set the DMA ops to 
> NULL, and testing to_dma_iommu_mapping(dev) won't help.

Right, agree. Same issue highlighted in #1, #2 below and the fixes posted [1]
for 4.12-rc.

>> Right, only if the dma ops are set and no dma_iommu_mapping is created for
>> the device, then arch_teardown_iommu_dma_ops does nothing.
>> Firstly, this patch for resetting the dma_ops in teardown was required
>> only when arch_setup_dma_ops has created both the mapping and dma_ops
>> for the device. Because mappings that were created in arch_setup_dma_ops
>> are cleared in teardown, so dma ops should also be reset. But this can be
>> done by calling arm_iommu_detach_device() from arm_teardown_iommu_dma_ops
>> to avoid explicitly calling set_dma_ops again, probably this what was
>> suggested in #3 above ?
>> Really sorry for the mess, but below cleanups looks required otherwise,
>> 1) set_dma_ops is called for mach-highbank, mach-mevbu during the
>>    BUS_NOTIFY_ADD_DEVICE. That should be removed and made to come from
>>    DT path and arch_setup_dma_ops. dmabounce.c also does set_dma_ops,
>>    not very sure how to handle that (swiotlb ?), are call
>>    dmabounce_register_dev during the device's probe instead to have the
>>    dma_set_ops overriding later in probe ?
> All this needs to be addressed, but it's definitely not v4.12-rc material.
>> 2) arm_iommu_attach_device is called from ipmmu-vmsa.c, mtk_iommu_v1.c,
>>    iommu drivers, from the iommu add_device callback, called
>>    from BUS_NOTIFY_ADD_DEVICE notifier. This is a problem because,
>>    with probe-deferral, this can be overridden in arch_setup_dma_ops
>>    during device probe and cleared in teardown path. But the add_device
>>    callback notifier is not called again when the device gets reprobed
>>    again.
>>    With probe deferral, add_device callback also gets called from
>>    of_iommu_configure during device probe, so the above drivers should
>>    be adapted to properly register the iommu_ops to have its add_device
>>    called from of_iommu_configure path. mtk_iommu_v1.c seems to be fine,
>>    but ipmmu-vmsa.c should be adapted. otherwise if the devices attached
>>    to those iommus call arm_iommu_attach_device from its probe path to
>>    override the default ops set in arch_setup_dma_ops, then all is fine.
>>    This seems to be the case with exynos_drm_iommu.c, omap3isp/isp.c.
> Same here, this needs to be addressed, but not in v4.12-rc. We need a simpler 
> fix for v4.12-rc.

Right, not for 4.12-rc, but as a better cleanup beyond, so
wanted to discuss the right sequence further. The fixes posted are the ones
for 4.12-rc.

>>    If the above two are done, the overridding of the default dma_ops and
>>    mapping should happen after arch_setup_dma_ops is called and also
>>    overridden every time the device gets reprobed. That should help to get
>>    rid of couple of fixes that has been added.
>> 3) As Laurent already pointed out earlier, return error codes from some of
>>    the IOMMU apis needs to standardized.
>>    Please let me know if its right way of doing it ?
> Again, the patch I propose is the simplest v4.12-rc fix I can think of, short 
> of reverting your complete IOMMU probe deferral patch series. Let's focus on 
> the v4.12-rc fix, and then discuss how to move forward in v4.13 and beyond.


[1] https://lkml.org/lkml/2017/5/23/411

"QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
Code Aurora Forum, hosted by The Linux Foundation
iommu mailing list

Reply via email to