Re: Consult on ARM SMMU debugfs
On 2021-01-15 22:47, Robin Murphy wrote: On 2021-01-15 15:14, Russell King - ARM Linux admin wrote: On Mon, Jan 11, 2021 at 08:01:48PM +, Robin Murphy wrote: On 2021-01-07 02:45, chenxiang (M) wrote: Hi Will,� Robin or other guys, When debugging SMMU/SVA issue on huawei ARM64 board, we find that it lacks of enough debugfs for ARM SMMU driver (such as the value of STE/CD which we need to check sometimes). Currently it creates top-level iommu directory in debugfs, but there is no debugfs for ARM SMMU driver specially. Do you know whether ARM have the plan to do that recently? FWIW I don't think I've ever felt the need to need to inspect the Stream Table on a live system. So far the nature of the STE code has been simple enough that it's very hard for any given STE to be *wrong* - either it's set up as expected and thus works fine, or it's not initialised at all and you get C_BAD_STE, where 99% of the time you then just cross-reference the Stream ID against the firmware and find that the DT/IORT is wrong. Similarly I don't think I've even even *seen* an issue that could be attributed to a context descriptor, although I appreciate that as we start landing more PASID and SVA support the scope for that starts to widen considerably. Feel free to propose a patch if you believe it would be genuinely useful and won't just bit-rot into a maintenance burden, but it's not something that's on our roadmap here. I do think that the IOMMU stuff needs better debugging. I've hit the WARN_ON() in __arm_lpae_map(), and it's been pretty much undebuggable, so I've resorted to putting the IOMMU into bypass mode permanently to work around the issue. The reason that it's undebuggable is if one puts printk() or trace statements in the code, boots the platform, you get flooded with those debugging messages, because every access to the rootfs generates and tears down a mapping. It would be nice to be able to inspect the IOMMU page tables and state of the IOMMU, rather than having to resort to effectively disabling the IOMMU. Certainly once we get to stuff like unpinned VFIO, having the ability to inspect pagetables for arbitrary IOMMU API usage will indeed be useful. From the DMA mapping perspective, though, unless you're working on the io-pgtable code itself it's not really going to tell you much that dumping the mappings from dma-debug can't already. FWIW whenever I encounter that particular warning in iommu-dma context, I don't care where the existing mapping is pointing, since it's merely a symptom of the damage already having been done. At that point I'd usually go off and audit all the DMA API calls in the offending driver, since it's typically caused by corruption in the IOVA allocator from passing the wrong size in a dma_unmap_*() call, and those can often be spotted by inspection. For active debugging, what you really want to know is the *history* of operations around that IOVA, since you're primarily interested in the request that last mapped it, then the corresponding unmap request for nominally the same buffer (which allowed the IOVA region to be freed for reuse) that for some reason didn't cover one or more pages that it should have. The IOMMU API tracepoints can be a handy tool there. Currently IOMMU trace events are not straight forward to decode if there are multiple devices attached. For ex: consider below: map: IOMMU: iova=0x00035000 paddr=0x000113be2000 size=4096 unmap: IOMMU: iova=0x00034000 size=4096 unmapped_size=4096 unmap: IOMMU: iova=0x00035000 size=4096 unmapped_size=4096 map: IOMMU: iova=0x00036000 paddr=0x0001164d8000 size=4096 map: IOMMU: iova=0x00037000 paddr=0x0001164da000 size=4096 unmap: IOMMU: iova=0x00036000 size=4096 unmapped_size=4096 unmap: IOMMU: iova=0x00037000 size=4096 unmapped_size=4096 How about making it more useful adding the device name as well? Ex: map: IOMMU:ae0.mdss iova=0x0002b000 paddr=0x00010a9e6000 size=8192 map: IOMMU:ae0.mdss iova=0x0002d000 paddr=0x00010a9ec000 size=21790 map: IOMMU:ae0.mdss iova=0x00241000 paddr=0x00010c40 size=59392 map: IOMMU:a60.dwc3 iova=0x0004a000 paddr=0x00010a821000 size=4096 map: IOMMU:a60.dwc3 iova=0x00049000 paddr=0x00010a82 size=4096 unmap: IOMMU:a60.dwc3 iova=0x0004a000 size=4096 unmapped_size=4096 unmap: IOMMU:a60.dwc3 iova=0x00049000 size=4096 unmapped_size=4096 We have been carrying a local patch downstream like forever, I can post a patch if you guys think it is useful in general. Thanks Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Consult on ARM SMMU debugfs
On 2021-01-15 15:14, Russell King - ARM Linux admin wrote: On Mon, Jan 11, 2021 at 08:01:48PM +, Robin Murphy wrote: On 2021-01-07 02:45, chenxiang (M) wrote: Hi Will,� Robin or other guys, When debugging SMMU/SVA issue on huawei ARM64 board, we find that it lacks of enough debugfs for ARM SMMU driver (such as the value of STE/CD which we need to check sometimes). Currently it creates top-level iommu directory in debugfs, but there is no debugfs for ARM SMMU driver specially. Do you know whether ARM have the plan to do that recently? FWIW I don't think I've ever felt the need to need to inspect the Stream Table on a live system. So far the nature of the STE code has been simple enough that it's very hard for any given STE to be *wrong* - either it's set up as expected and thus works fine, or it's not initialised at all and you get C_BAD_STE, where 99% of the time you then just cross-reference the Stream ID against the firmware and find that the DT/IORT is wrong. Similarly I don't think I've even even *seen* an issue that could be attributed to a context descriptor, although I appreciate that as we start landing more PASID and SVA support the scope for that starts to widen considerably. Feel free to propose a patch if you believe it would be genuinely useful and won't just bit-rot into a maintenance burden, but it's not something that's on our roadmap here. I do think that the IOMMU stuff needs better debugging. I've hit the WARN_ON() in __arm_lpae_map(), and it's been pretty much undebuggable, so I've resorted to putting the IOMMU into bypass mode permanently to work around the issue. The reason that it's undebuggable is if one puts printk() or trace statements in the code, boots the platform, you get flooded with those debugging messages, because every access to the rootfs generates and tears down a mapping. It would be nice to be able to inspect the IOMMU page tables and state of the IOMMU, rather than having to resort to effectively disabling the IOMMU. Certainly once we get to stuff like unpinned VFIO, having the ability to inspect pagetables for arbitrary IOMMU API usage will indeed be useful. From the DMA mapping perspective, though, unless you're working on the io-pgtable code itself it's not really going to tell you much that dumping the mappings from dma-debug can't already. FWIW whenever I encounter that particular warning in iommu-dma context, I don't care where the existing mapping is pointing, since it's merely a symptom of the damage already having been done. At that point I'd usually go off and audit all the DMA API calls in the offending driver, since it's typically caused by corruption in the IOVA allocator from passing the wrong size in a dma_unmap_*() call, and those can often be spotted by inspection. For active debugging, what you really want to know is the *history* of operations around that IOVA, since you're primarily interested in the request that last mapped it, then the corresponding unmap request for nominally the same buffer (which allowed the IOVA region to be freed for reuse) that for some reason didn't cover one or more pages that it should have. The IOMMU API tracepoints can be a handy tool there. Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Consult on ARM SMMU debugfs
On Mon, Jan 11, 2021 at 08:01:48PM +, Robin Murphy wrote: > On 2021-01-07 02:45, chenxiang (M) wrote: > > Hi Will, Robin or other guys, > > > > When debugging SMMU/SVA issue on huawei ARM64 board, we find that it > > lacks of enough debugfs for ARM SMMU driver (such as > > > > the value of STE/CD which we need to check sometimes). Currently it > > creates top-level iommu directory in debugfs, but there is no debugfs > > > > for ARM SMMU driver specially. Do you know whether ARM have the plan to > > do that recently? > > FWIW I don't think I've ever felt the need to need to inspect the Stream > Table on a live system. So far the nature of the STE code has been simple > enough that it's very hard for any given STE to be *wrong* - either it's set > up as expected and thus works fine, or it's not initialised at all and you > get C_BAD_STE, where 99% of the time you then just cross-reference the > Stream ID against the firmware and find that the DT/IORT is wrong. > > Similarly I don't think I've even even *seen* an issue that could be > attributed to a context descriptor, although I appreciate that as we start > landing more PASID and SVA support the scope for that starts to widen > considerably. > > Feel free to propose a patch if you believe it would be genuinely useful and > won't just bit-rot into a maintenance burden, but it's not something that's > on our roadmap here. I do think that the IOMMU stuff needs better debugging. I've hit the WARN_ON() in __arm_lpae_map(), and it's been pretty much undebuggable, so I've resorted to putting the IOMMU into bypass mode permanently to work around the issue. The reason that it's undebuggable is if one puts printk() or trace statements in the code, boots the platform, you get flooded with those debugging messages, because every access to the rootfs generates and tears down a mapping. It would be nice to be able to inspect the IOMMU page tables and state of the IOMMU, rather than having to resort to effectively disabling the IOMMU. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Consult on ARM SMMU debugfs
在 2021/1/12 4:01, Robin Murphy 写道: On 2021-01-07 02:45, chenxiang (M) wrote: Hi Will, Robin or other guys, When debugging SMMU/SVA issue on huawei ARM64 board, we find that it lacks of enough debugfs for ARM SMMU driver (such as the value of STE/CD which we need to check sometimes). Currently it creates top-level iommu directory in debugfs, but there is no debugfs for ARM SMMU driver specially. Do you know whether ARM have the plan to do that recently? FWIW I don't think I've ever felt the need to need to inspect the Stream Table on a live system. So far the nature of the STE code has been simple enough that it's very hard for any given STE to be *wrong* - either it's set up as expected and thus works fine, or it's not initialised at all and you get C_BAD_STE, where 99% of the time you then just cross-reference the Stream ID against the firmware and find that the DT/IORT is wrong. Similarly I don't think I've even even *seen* an issue that could be attributed to a context descriptor, although I appreciate that as we start landing more PASID and SVA support the scope for that starts to widen considerably. Thank you for your reply. I aggree that it is very hard for the content of STE/CD to be wrong in current code, but there are more upsteaming features(such as SVA/vSMMU) which are related to SMMU, when debugging with those features, it is useful for us to locate issues if there are interfaces to dump those info. And also when debugging together with hardware guys, the content of dump is important for them too. Feel free to propose a patch if you believe it would be genuinely useful and won't just bit-rot into a maintenance burden, but it's not something that's on our roadmap here. OK, we are considering about incorporating following requirements into the plan: - Dump the value of STE/CD of devices - Dump page table entries of devices - Dump the entries info of CMDQ/EVENTQ Thanks, Robin. . ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Consult on ARM SMMU debugfs
On 2021-01-07 02:45, chenxiang (M) wrote: Hi Will, Robin or other guys, When debugging SMMU/SVA issue on huawei ARM64 board, we find that it lacks of enough debugfs for ARM SMMU driver (such as the value of STE/CD which we need to check sometimes). Currently it creates top-level iommu directory in debugfs, but there is no debugfs for ARM SMMU driver specially. Do you know whether ARM have the plan to do that recently? FWIW I don't think I've ever felt the need to need to inspect the Stream Table on a live system. So far the nature of the STE code has been simple enough that it's very hard for any given STE to be *wrong* - either it's set up as expected and thus works fine, or it's not initialised at all and you get C_BAD_STE, where 99% of the time you then just cross-reference the Stream ID against the firmware and find that the DT/IORT is wrong. Similarly I don't think I've even even *seen* an issue that could be attributed to a context descriptor, although I appreciate that as we start landing more PASID and SVA support the scope for that starts to widen considerably. Feel free to propose a patch if you believe it would be genuinely useful and won't just bit-rot into a maintenance burden, but it's not something that's on our roadmap here. Thanks, Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Consult on ARM SMMU debugfs
Hi Will, Robin or other guys, When debugging SMMU/SVA issue on huawei ARM64 board, we find that it lacks of enough debugfs for ARM SMMU driver (such as the value of STE/CD which we need to check sometimes). Currently it creates top-level iommu directory in debugfs, but there is no debugfs for ARM SMMU driver specially. Do you know whether ARM have the plan to do that recently? Best regards, Shawn ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu