[PATCH] drm/amdgpu: add IP's FW information to devcoredump

2024-03-26 Thread Sunil Khatri
Add FW information of all the IP's in the devcoredump. Signed-off-by: Sunil Khatri --- .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c | 122 ++ 1 file changed, 122 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c b/drivers/gpu/drm/amd/amdgpu

[PATCH] drm/amdgpu: add support of bios dump in devcoredump

2024-03-26 Thread Sunil Khatri
Dump the bios binary in the devcoredump. Signed-off-by: Sunil Khatri --- .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c | 20 +++ 1 file changed, 20 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c

[PATCH] drm/amdgpu: fix function implicit declaration error

2024-03-21 Thread Sunil Khatri
in both the cases the build does not fail. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 95028f57cb56..f771b2042a43

[PATCH v2] drm/amdgpu: refactor code to split devcoredump code

2024-03-20 Thread Sunil Khatri
Refactor the devcoredump code into new files: its functionality is being expanded further, so it is better to split it out and give devcoredump its own file. v2: Fix the implicit function declaration build failure caught by the arm compiler, using #ifdef Cc: Ivan Lipski Signed-off-by: Sunil Khatri --- drivers

[PATCH] drm/amdgpu: refactor code to split devcoredump code

2024-03-20 Thread Sunil Khatri
Refactor the devcoredump code into new files: its functionality is being expanded further, so it is better to split it out and give devcoredump its own file. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c | 218

[PATCH v3] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Sunil Khatri
-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_coreinfo.c | 146 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_coreinfo.h | 33 + drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 117 +-- 4 files changed, 182 insertions

[PATCH v2] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Sunil Khatri
-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_coreinfo.c | 146 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_coreinfo.h | 33 + drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 117 +-- 4 files changed, 182 insertions

[PATCH v2] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Sunil Khatri
-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_devinfo.c | 151 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 118 +-- 4 files changed, 157 insertions

[PATCH] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Sunil Khatri
-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_devinfo.c | 151 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 118 +-- 4 files changed, 157 insertions

[PATCH] drm/amdgpu: remove the adev check for NULL

2024-03-18 Thread Sunil Khatri
adev is a global data structure and isn't expected to be NULL, so remove the redundant adev check from the devcoredump code. CC: Dan Carpenter Signed-off-by: Sunil Khatri Suggested-by: Dan Carpenter --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 57 ++- 1 file

[PATCH] drm/amdgpu: add the hw_ip version of all IP's

2024-03-15 Thread Sunil Khatri
Add all the IP's version information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 62 +++ 1 file changed, 62 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu

[PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-12 Thread Sunil Khatri
Add firmware version information of each IP and each instance where applicable. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 122 ++ 1 file changed, 122 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm

[PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-12 Thread Sunil Khatri
Add all the IP's information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c

[PATCH v2] drm/amdgpu: add ring buffer information in devcoredump

2024-03-11 Thread Sunil Khatri
Add relevant ringbuffer information such as rptr, wptr, rb mask, ring name, ring size and also the ring contents for each ring on a gpu reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 21 + 1 file changed, 21 insertions(+) diff --git

[PATCH] drm/amdgpu: add ring buffer information in devcoredump

2024-03-11 Thread Sunil Khatri
Add relevant ringbuffer information such as rptr, wptr, ring name, ring size and also the ring contents for each ring on a gpu reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 21 + 1 file changed, 21 insertions(+) diff --git a/drivers/gpu

[PATCH] drm/amdgpu: add all ringbuffer information in devcoredump

2024-03-11 Thread Sunil Khatri
Add ringbuffer information such as: rptr, wptr, ring name, ring size and also the ring contents for each ring on a gpu reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 21 + 1 file changed, 21 insertions(+) diff --git a/drivers/gpu/drm/amd

[PATCH v3] drm/amdgpu: add vm fault information to devcoredump

2024-03-08 Thread Sunil Khatri
fault observed Faulty page starting at address: 0x Protection fault status register: 0x301031 VRAM is lost due to GPU reset! Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 12 1 file changed, 12 insertions(+) diff --git a/drivers/gpu/drm

[PATCH v2 2/2] drm/amdgpu: add vm fault information to devcoredump

2024-03-07 Thread Sunil Khatri
fault observed Faulty page starting at address: 0x Protection fault status register: 0x301031 VRAM is lost due to GPU reset! Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git

[PATCH v2 0/2] Add pagefault support for devcoredump

2024-03-07 Thread Sunil Khatri
Add support of devcoredump from global object of amdgpu_device Sunil Khatri (2): drm/amdgpu: add recent pagefault info in vm_manager drm/amdgpu: add vm fault information to devcoredump drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 14 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

[PATCH v2 1/2] drm/amdgpu: add recent pagefault info in vm_manager

2024-03-07 Thread Sunil Khatri
Currently page fault information is stored per vm, which could be freed or stale during reset. Add pagefault information to the vm_manager, which is a global space for vm's and remains valid across a reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8

[PATCH 2/2] drm/amdgpu: add vm fault information to devcoredump

2024-03-07 Thread Sunil Khatri
fault observed Faulty page starting at address 0x Protection fault status register:0x301031 VRAM is lost due to GPU reset! Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git

[PATCH 1/2] drm/amdgpu: add recent pagefault info in vm_manager

2024-03-07 Thread Sunil Khatri
Currently page fault information is stored per vm, which could be freed or stale during reset. Add pagefault information to the vm_manager, which is a global space for vm's and remains valid across a reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8

[PATCH 0/2] Add pagefault support for devcoredump

2024-03-07 Thread Sunil Khatri
Add support of devcoredump from global object of amdgpu_device Sunil Khatri (2): drm/amdgpu: add recent pagefault info in vm_manager drm/amdgpu: add vm fault information to devcoredump drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 14 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

[PATCH v2] drm/amdgpu: add vm fault information to devcoredump

2024-03-06 Thread Sunil Khatri
fault observed Faulty page starting at address 0x Protection fault status register:0x301031 VRAM is lost due to GPU reset! Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 14 +- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 1 + 2 files changed

[PATCH] drm/amdgpu: add vm fault information to devcoredump

2024-03-06 Thread Sunil Khatri
fault observed for GPU family:143 Faulty page starting at address 0x Protection fault status register:0x301031 VRAM is lost due to GPU reset! Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 15 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h

[PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Sunil Khatri
When a page fault interrupt is raised, there is a lot more information that is useful for developers analysing the pagefault. Add all such information to the last cached pagefault from the interrupt handler. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9

[PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Sunil Khatri
When a page fault interrupt is raised, there is a lot more information that is useful for developers analysing the pagefault. Add all such information to the last cached pagefault from the interrupt handler. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9

[PATCH v3] drm/amdgpu: add ring timeout information in devcoredump

2024-03-05 Thread Sunil Khatri
-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 14 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 1 + 2 files changed, 15 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c index a59364e9b6ed

[PATCH v2] drm/amdgpu: add ring timeout information in devcoredump

2024-03-05 Thread Sunil Khatri
-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 15 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 2 ++ 2 files changed, 17 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c index a59364e9b6ed