Re: [PATCH v5 0/5] iommu/arm-smmu: adreno-smmu page fault handling
On Tue, Jul 6, 2021 at 10:12 PM John Stultz wrote:
>
> On Sun, Jul 4, 2021 at 11:16 AM Rob Clark wrote:
> >
> > I suspect you are getting a dpu fault, and need:
> >
> > https://lore.kernel.org/linux-arm-msm/CAF6AEGvTjTUQXqom-xhdh456tdLscbVFPQ+iud1H1gHc8A2=h...@mail.gmail.com/
> >
> > I suppose Bjorn was expecting me to send that patch
>
> If it's helpful, I applied that and it got the db845c booting mainline
> again for me (along with some reverts for a separate ext4 shrinker
> crash).
>
> Tested-by: John Stultz
>

Thanks, I'll send a patch shortly

BR,
-R
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 0/5] iommu/arm-smmu: adreno-smmu page fault handling
On Sun, Jul 4, 2021 at 11:16 AM Rob Clark wrote:
>
> I suspect you are getting a dpu fault, and need:
>
> https://lore.kernel.org/linux-arm-msm/CAF6AEGvTjTUQXqom-xhdh456tdLscbVFPQ+iud1H1gHc8A2=h...@mail.gmail.com/
>
> I suppose Bjorn was expecting me to send that patch

If it's helpful, I applied that and it got the db845c booting mainline
again for me (along with some reverts for a separate ext4 shrinker
crash).

Tested-by: John Stultz

thanks
-john
Re: [PATCH v5 0/5] iommu/arm-smmu: adreno-smmu page fault handling
On Sun 04 Jul 13:20 CDT 2021, Rob Clark wrote:

> I suspect you are getting a dpu fault, and need:
>
> https://lore.kernel.org/linux-arm-msm/CAF6AEGvTjTUQXqom-xhdh456tdLscbVFPQ+iud1H1gHc8A2=h...@mail.gmail.com/
>
> I suppose Bjorn was expecting me to send that patch
>

No, I left that discussion with the same understanding as you... But I
ended up side tracked by some other craziness.

Did you post this somewhere or would you still like me to test it and
spin a patch?

Regards,
Bjorn

> BR,
> -R
>
> On Sun, Jul 4, 2021 at 5:53 AM Dmitry Baryshkov wrote:
> >
> > Hi,
> >
> > I've had splash screen disabled on my RB3. However once I've enabled it,
> > I've got the attached crash during the boot on the msm/msm-next. It
> > looks like it is related to this particular set of changes.
> >
> > On 11/06/2021 00:44, Rob Clark wrote:
> > > From: Rob Clark
> > >
> > > This picks up an earlier series[1] from Jordan, and adds additional
> > > support needed to generate GPU devcore dumps on iova faults.  Original
> > > description:
> > >
> > > This is a stack to add an Adreno GPU specific handler for pagefaults. The first
> > > patch starts by wiring up report_iommu_fault for arm-smmu. The next patch adds
> > > a adreno-smmu-priv function hook to capture a handful of important debugging
> > > registers such as TTBR0, CONTEXTIDR, FSYNR0 and others. This is used by the
> > > third patch to print more detailed information on page fault such as the TTBR0
> > > for the pagetable that caused the fault and the source of the fault as
> > > determined by a combination of the FSYNR1 register and an internal GPU
> > > register.
> > >
> > > This code provides a solid base that we can expand on later for even more
> > > extensive GPU side page fault debugging capabilities.
> > >
> > > v5: [Rob] Use RBBM_STATUS3.SMMU_STALLED_ON_FAULT to detect case where
> > >     GPU snapshotting needs to avoid crashdumper, and check the
> > >     RBBM_STATUS3.SMMU_STALLED_ON_FAULT in GPU hang irq paths
> > > v4: [Rob] Add support to stall SMMU on fault, and let the GPU driver
> > >     resume translation after it has had a chance to snapshot the GPUs
> > >     state
> > > v3: Always clear FSR even if the target driver is going to handle resume
> > > v2: Fix comment wording and function pointer check per Rob Clark
> > >
> > > [1] https://lore.kernel.org/dri-devel/20210225175135.91922-1-jcro...@codeaurora.org/
> > >
> > > Jordan Crouse (3):
> > >   iommu/arm-smmu: Add support for driver IOMMU fault handlers
> > >   iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault
> > >     info
> > >   drm/msm: Improve the a6xx page fault handler
> > >
> > > Rob Clark (2):
> > >   iommu/arm-smmu-qcom: Add stall support
> > >   drm/msm: devcoredump iommu fault support
> > >
> > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c       |  23 +++-
> > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c       | 110 +++-
> > >  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  42 ++--
> > >  drivers/gpu/drm/msm/adreno/adreno_gpu.c     |  15 +++
> > >  drivers/gpu/drm/msm/msm_gem.h               |   1 +
> > >  drivers/gpu/drm/msm/msm_gem_submit.c        |   1 +
> > >  drivers/gpu/drm/msm/msm_gpu.c               |  48 +
> > >  drivers/gpu/drm/msm/msm_gpu.h               |  17 +++
> > >  drivers/gpu/drm/msm/msm_gpummu.c            |   5 +
> > >  drivers/gpu/drm/msm/msm_iommu.c             |  22 +++-
> > >  drivers/gpu/drm/msm/msm_mmu.h               |   5 +-
> > >  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c  |  50 +
> > >  drivers/iommu/arm/arm-smmu/arm-smmu.c       |   9 +-
> > >  drivers/iommu/arm/arm-smmu/arm-smmu.h       |   2 +
> > >  include/linux/adreno-smmu-priv.h            |  38 ++-
> > >  15 files changed, 367 insertions(+), 21 deletions(-)
> > >
> >
> >
> > --
> > With best wishes
> > Dmitry
Re: [PATCH v5 0/5] iommu/arm-smmu: adreno-smmu page fault handling
I suspect you are getting a dpu fault, and need:

https://lore.kernel.org/linux-arm-msm/CAF6AEGvTjTUQXqom-xhdh456tdLscbVFPQ+iud1H1gHc8A2=h...@mail.gmail.com/

I suppose Bjorn was expecting me to send that patch

BR,
-R

On Sun, Jul 4, 2021 at 5:53 AM Dmitry Baryshkov wrote:
>
> Hi,
>
> I've had splash screen disabled on my RB3. However once I've enabled it,
> I've got the attached crash during the boot on the msm/msm-next. It
> looks like it is related to this particular set of changes.
>
> On 11/06/2021 00:44, Rob Clark wrote:
> > From: Rob Clark
> >
> > This picks up an earlier series[1] from Jordan, and adds additional
> > support needed to generate GPU devcore dumps on iova faults.  Original
> > description:
> >
> > This is a stack to add an Adreno GPU specific handler for pagefaults. The first
> > patch starts by wiring up report_iommu_fault for arm-smmu. The next patch adds
> > a adreno-smmu-priv function hook to capture a handful of important debugging
> > registers such as TTBR0, CONTEXTIDR, FSYNR0 and others. This is used by the
> > third patch to print more detailed information on page fault such as the TTBR0
> > for the pagetable that caused the fault and the source of the fault as
> > determined by a combination of the FSYNR1 register and an internal GPU
> > register.
> >
> > This code provides a solid base that we can expand on later for even more
> > extensive GPU side page fault debugging capabilities.
> >
> > v5: [Rob] Use RBBM_STATUS3.SMMU_STALLED_ON_FAULT to detect case where
> >     GPU snapshotting needs to avoid crashdumper, and check the
> >     RBBM_STATUS3.SMMU_STALLED_ON_FAULT in GPU hang irq paths
> > v4: [Rob] Add support to stall SMMU on fault, and let the GPU driver
> >     resume translation after it has had a chance to snapshot the GPUs
> >     state
> > v3: Always clear FSR even if the target driver is going to handle resume
> > v2: Fix comment wording and function pointer check per Rob Clark
> >
> > [1] https://lore.kernel.org/dri-devel/20210225175135.91922-1-jcro...@codeaurora.org/
> >
> > Jordan Crouse (3):
> >   iommu/arm-smmu: Add support for driver IOMMU fault handlers
> >   iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault
> >     info
> >   drm/msm: Improve the a6xx page fault handler
> >
> > Rob Clark (2):
> >   iommu/arm-smmu-qcom: Add stall support
> >   drm/msm: devcoredump iommu fault support
> >
> >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c       |  23 +++-
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c       | 110 +++-
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  42 ++--
> >  drivers/gpu/drm/msm/adreno/adreno_gpu.c     |  15 +++
> >  drivers/gpu/drm/msm/msm_gem.h               |   1 +
> >  drivers/gpu/drm/msm/msm_gem_submit.c        |   1 +
> >  drivers/gpu/drm/msm/msm_gpu.c               |  48 +
> >  drivers/gpu/drm/msm/msm_gpu.h               |  17 +++
> >  drivers/gpu/drm/msm/msm_gpummu.c            |   5 +
> >  drivers/gpu/drm/msm/msm_iommu.c             |  22 +++-
> >  drivers/gpu/drm/msm/msm_mmu.h               |   5 +-
> >  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c  |  50 +
> >  drivers/iommu/arm/arm-smmu/arm-smmu.c       |   9 +-
> >  drivers/iommu/arm/arm-smmu/arm-smmu.h       |   2 +
> >  include/linux/adreno-smmu-priv.h            |  38 ++-
> >  15 files changed, 367 insertions(+), 21 deletions(-)
> >
>
>
> --
> With best wishes
> Dmitry
Re: [PATCH v5 0/5] iommu/arm-smmu: adreno-smmu page fault handling
Hi,

I've had splash screen disabled on my RB3. However once I've enabled it,
I've got the attached crash during the boot on the msm/msm-next. It
looks like it is related to this particular set of changes.

On 11/06/2021 00:44, Rob Clark wrote:
> From: Rob Clark
>
> This picks up an earlier series[1] from Jordan, and adds additional
> support needed to generate GPU devcore dumps on iova faults.  Original
> description:
>
> This is a stack to add an Adreno GPU specific handler for pagefaults. The first
> patch starts by wiring up report_iommu_fault for arm-smmu. The next patch adds
> a adreno-smmu-priv function hook to capture a handful of important debugging
> registers such as TTBR0, CONTEXTIDR, FSYNR0 and others. This is used by the
> third patch to print more detailed information on page fault such as the TTBR0
> for the pagetable that caused the fault and the source of the fault as
> determined by a combination of the FSYNR1 register and an internal GPU
> register.
>
> This code provides a solid base that we can expand on later for even more
> extensive GPU side page fault debugging capabilities.
>
> v5: [Rob] Use RBBM_STATUS3.SMMU_STALLED_ON_FAULT to detect case where
>     GPU snapshotting needs to avoid crashdumper, and check the
>     RBBM_STATUS3.SMMU_STALLED_ON_FAULT in GPU hang irq paths
> v4: [Rob] Add support to stall SMMU on fault, and let the GPU driver
>     resume translation after it has had a chance to snapshot the GPUs
>     state
> v3: Always clear FSR even if the target driver is going to handle resume
> v2: Fix comment wording and function pointer check per Rob Clark
>
> [1] https://lore.kernel.org/dri-devel/20210225175135.91922-1-jcro...@codeaurora.org/
>
> Jordan Crouse (3):
>   iommu/arm-smmu: Add support for driver IOMMU fault handlers
>   iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault
>     info
>   drm/msm: Improve the a6xx page fault handler
>
> Rob Clark (2):
>   iommu/arm-smmu-qcom: Add stall support
>   drm/msm: devcoredump iommu fault support
>
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c       |  23 +++-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c       | 110 +++-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  42 ++--
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c     |  15 +++
>  drivers/gpu/drm/msm/msm_gem.h               |   1 +
>  drivers/gpu/drm/msm/msm_gem_submit.c        |   1 +
>  drivers/gpu/drm/msm/msm_gpu.c               |  48 +
>  drivers/gpu/drm/msm/msm_gpu.h               |  17 +++
>  drivers/gpu/drm/msm/msm_gpummu.c            |   5 +
>  drivers/gpu/drm/msm/msm_iommu.c             |  22 +++-
>  drivers/gpu/drm/msm/msm_mmu.h               |   5 +-
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c  |  50 +
>  drivers/iommu/arm/arm-smmu/arm-smmu.c       |   9 +-
>  drivers/iommu/arm/arm-smmu/arm-smmu.h       |   2 +
>  include/linux/adreno-smmu-priv.h            |  38 ++-
>  15 files changed, 367 insertions(+), 21 deletions(-)
>

--
With best wishes
Dmitry

Handling Cmd: set_active:a
SetActiveSlot: _a already active slot
Handling Cmd: download:061b8800
Download Finished
Handling Cmd: boot
A/B retry count NOT decremented
Booting Into Mission Mode
No dtbo partition is found, Skip dtbo
Exit key detection timer
GetVmData: making ScmCall to get HypInfo
GetVmData: No Vm data present! Status = (0x3)
No Ffbm cookie found, ignore: Not Found
Memory Base Address: 0x8000
Decompressing kernel image start: 32313 ms
Decompressing kernel image done: 35477 ms
BootLinux: failed to get dtbo image
DTB offset is incorrect, kernel image does not have appended DTB
Cmdline: ignore_loglevel console=ttyMSM0,115200n8 clk_ignore_unused pd_ignore_unused earlycon pcie_pme=nomsi root=PARTLABEL=userdata rootwait androidboot.bootdevice=1d84000.ufshc androidboot.serialno=8f186bb6 androidboot.baseband=msm msm_drm.dsi_display0=
RAM Partitions
Add Base: 0x8000 Available Length: 0xFDFA
WARNING: Unsupported EFI_RAMPARTITION_PROTOCOL
ERROR: Could not get splash memory region node
kaslr-Seed is added to chosen node
Shutting Down UEFI Boot Services: 36987 ms
BDS: LogFs sync skipped, Unsupported
App Log Flush : 0 ms
Exit BS[37141] UEFI End
[0.00] Booting Linux on physical CPU 0x00 [0x517f803c]
[0.00] Linux version 5.13.0-rc3-00115-g9b6193ea776c-dirty (lumag@eriador) (aarch64-linux-gnu-gcc (Debian 11.1.0-1) 11.1.0, GNU ld (GNU Binutils for Debian) 2.35.2) #142 SMP PREEMPT Sun Jul 4 15:25:15 MSK 2021
[0.00] Machine model: Thundercomm Dragonboard 845c
[0.00] printk: debug: ignoring loglevel setting.
[0.00] earlycon: qcom_geni0 at MMIO 0x00a84000 (options '115200n8')
[0.00] printk: bootconsole [qcom_geni0] enabled
[0.00] efi: UEFI not found.
[0.00] NUMA: No NUMA configuration found
[0.00] NUMA: Faking a node at [mem 0x8000-0x00017df9]
[0.00] NUMA: NODE_DATA [mem 0x17d78c200-0x17d78dfff]
[0.00] Zone ranges:
[0.00] DMA [mem
[PATCH v5 0/5] iommu/arm-smmu: adreno-smmu page fault handling
From: Rob Clark

This picks up an earlier series[1] from Jordan, and adds additional
support needed to generate GPU devcore dumps on iova faults.  Original
description:

This is a stack to add an Adreno GPU specific handler for pagefaults. The first
patch starts by wiring up report_iommu_fault for arm-smmu. The next patch adds
a adreno-smmu-priv function hook to capture a handful of important debugging
registers such as TTBR0, CONTEXTIDR, FSYNR0 and others. This is used by the
third patch to print more detailed information on page fault such as the TTBR0
for the pagetable that caused the fault and the source of the fault as
determined by a combination of the FSYNR1 register and an internal GPU
register.

This code provides a solid base that we can expand on later for even more
extensive GPU side page fault debugging capabilities.

v5: [Rob] Use RBBM_STATUS3.SMMU_STALLED_ON_FAULT to detect case where
    GPU snapshotting needs to avoid crashdumper, and check the
    RBBM_STATUS3.SMMU_STALLED_ON_FAULT in GPU hang irq paths
v4: [Rob] Add support to stall SMMU on fault, and let the GPU driver
    resume translation after it has had a chance to snapshot the GPUs
    state
v3: Always clear FSR even if the target driver is going to handle resume
v2: Fix comment wording and function pointer check per Rob Clark

[1] https://lore.kernel.org/dri-devel/20210225175135.91922-1-jcro...@codeaurora.org/

Jordan Crouse (3):
  iommu/arm-smmu: Add support for driver IOMMU fault handlers
  iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault
    info
  drm/msm: Improve the a6xx page fault handler

Rob Clark (2):
  iommu/arm-smmu-qcom: Add stall support
  drm/msm: devcoredump iommu fault support

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c       |  23 +++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c       | 110 +++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  42 ++--
 drivers/gpu/drm/msm/adreno/adreno_gpu.c     |  15 +++
 drivers/gpu/drm/msm/msm_gem.h               |   1 +
 drivers/gpu/drm/msm/msm_gem_submit.c        |   1 +
 drivers/gpu/drm/msm/msm_gpu.c               |  48 +
 drivers/gpu/drm/msm/msm_gpu.h               |  17 +++
 drivers/gpu/drm/msm/msm_gpummu.c            |   5 +
 drivers/gpu/drm/msm/msm_iommu.c             |  22 +++-
 drivers/gpu/drm/msm/msm_mmu.h               |   5 +-
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c  |  50 +
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |   9 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.h       |   2 +
 include/linux/adreno-smmu-priv.h            |  38 ++-
 15 files changed, 367 insertions(+), 21 deletions(-)

--
2.31.1
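Editor's illustration (not part of the series): the flow described in the
cover letter and its v3/v4 changelog entries can be pictured with a small toy
model. This is only a sketch under stated assumptions. It is plain Python, not
kernel code, and every name in it (ToySmmu, ToyGpu, FaultInfo, on_fault,
recover) is hypothetical and does not correspond to any function in the
patches. It models three things the text states: fault registers are captured
for the driver, the SMMU can stall on a fault until the GPU driver has
snapshotted state and resumes translation, and FSR is cleared regardless of
who handles the resume.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FaultInfo:
    """Hypothetical record of the registers named in the cover letter."""
    ttbr0: int
    contextidr: int
    fsynr0: int
    fsynr1: int
    iova: int


@dataclass
class ToySmmu:
    """Toy stand-in for one SMMU context bank (an assumption, not the kernel model)."""
    stall_enabled: bool = True
    stalled: bool = False
    fsr: int = 0
    handler: Optional[Callable[["ToySmmu", FaultInfo], bool]] = None
    log: list = field(default_factory=list)

    def fault(self, info: FaultInfo, fsr_bits: int = 0x2) -> bool:
        """Raise a fault; return True if a driver handler took ownership."""
        self.fsr = fsr_bits
        self.stalled = self.stall_enabled  # v4: stall translation on fault
        self.log.append("fault")
        handled = bool(self.handler and self.handler(self, info))
        self.fsr = 0  # v3: always clear FSR, even if the driver will resume
        self.log.append("fsr-cleared")
        if self.stalled and not handled:
            self.resume()  # nobody took ownership, so resume immediately
        return handled

    def resume(self) -> None:
        """Resume translation if currently stalled."""
        if self.stalled:
            self.stalled = False
            self.log.append("resume")


class ToyGpu:
    """Toy GPU driver: records the fault, snapshots later, then resumes the SMMU."""

    def __init__(self, smmu: ToySmmu) -> None:
        self.smmu = smmu
        self.pending: Optional[FaultInfo] = None
        self.snapshots: list = []
        smmu.handler = self.on_fault

    def on_fault(self, smmu: ToySmmu, info: FaultInfo) -> bool:
        self.pending = info
        return True  # the driver takes responsibility for resuming

    def recover(self) -> None:
        """Snapshot state while the SMMU is still stalled, then resume."""
        if self.pending is None:
            return
        info = self.pending
        self.snapshots.append(
            {"ttbr0": info.ttbr0, "iova": info.iova, "smmu_stalled": self.smmu.stalled}
        )
        self.pending = None
        self.smmu.resume()


if __name__ == "__main__":
    smmu = ToySmmu()
    gpu = ToyGpu(smmu)
    smmu.fault(FaultInfo(ttbr0=0x1000, contextidr=1, fsynr0=0, fsynr1=0, iova=0xDEAD000))
    gpu.recover()
    print(smmu.log, gpu.snapshots)
```

In this model the ordering is the point: the snapshot is taken while
translation is still stalled, and resume happens only afterwards. How the real
driver schedules that work is not described in this thread, so the recover()
step here is purely an assumption.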