Re: [PATCH v2] drm/amdgpu: add ring timeout information in devcoredump

2024-03-05 Thread Khatri, Sunil
On 3/5/2024 6:40 PM, Christian König wrote: Am 05.03.24 um 12:58 schrieb Sunil Khatri: Add ring timeout related information in the amdgpu devcoredump file for debugging purposes. During the gpu recovery process the registered call is triggered and add the debug information in data file

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 8:34 PM, Christian König wrote: Am 06.03.24 um 15:29 schrieb Alex Deucher: On Wed, Mar 6, 2024 at 8:04 AM Khatri, Sunil wrote: On 3/6/2024 6:12 PM, Christian König wrote: Am 06.03.24 um 11:40 schrieb Khatri, Sunil: On 3/6/2024 3:37 PM, Christian König wrote: Am 06.03.24 um

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 6:12 PM, Christian König wrote: Am 06.03.24 um 11:40 schrieb Khatri, Sunil: On 3/6/2024 3:37 PM, Christian König wrote: Am 06.03.24 um 10:04 schrieb Sunil Khatri: When a page fault interrupt is raised there is a lot more information that is useful for developers to analyse

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 3:37 PM, Christian König wrote: Am 06.03.24 um 10:04 schrieb Sunil Khatri: When a page fault interrupt is raised there is a lot more information that is useful for developers to analyse the pagefault. Well actually those information are not that interesting because they are

Re: [PATCH 2/2] drm/amdgpu: add vm fault information to devcoredump

2024-03-07 Thread Khatri, Sunil
On 3/8/2024 12:44 AM, Alex Deucher wrote: On Thu, Mar 7, 2024 at 12:00 PM Sunil Khatri wrote: Add page fault information to the devcoredump. Output of devcoredump: AMDGPU Device Coredump version: 1 kernel: 6.7.0-amd-staging-drm-next module: amdgpu time: 29.725011811 process_name:

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 9:59 PM, Alex Deucher wrote: On Wed, Mar 6, 2024 at 11:21 AM Khatri, Sunil wrote: On 3/6/2024 9:45 PM, Alex Deucher wrote: On Wed, Mar 6, 2024 at 11:06 AM Khatri, Sunil wrote: On 3/6/2024 9:07 PM, Christian König wrote: Am 06.03.24 um 16:13 schrieb Khatri, Sunil: On 3/6

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 9:19 PM, Alex Deucher wrote: On Wed, Mar 6, 2024 at 10:32 AM Alex Deucher wrote: On Wed, Mar 6, 2024 at 10:13 AM Khatri, Sunil wrote: On 3/6/2024 8:34 PM, Christian König wrote: Am 06.03.24 um 15:29 schrieb Alex Deucher: On Wed, Mar 6, 2024 at 8:04 AM Khatri, Sunil wrote

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
, we just need to provide faulting address, Fault status register with gpu family to decode the fault along with process information. Regards Sunil Khatri On 3/6/2024 9:56 PM, Khatri, Sunil wrote: On 3/6/2024 9:49 PM, Christian König wrote: Am 06.03.24 um 17:06 schrieb Khatri, Sunil: On 3/6

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 9:07 PM, Christian König wrote: Am 06.03.24 um 16:13 schrieb Khatri, Sunil: On 3/6/2024 8:34 PM, Christian König wrote: Am 06.03.24 um 15:29 schrieb Alex Deucher: On Wed, Mar 6, 2024 at 8:04 AM Khatri, Sunil wrote: On 3/6/2024 6:12 PM, Christian König wrote: Am 06.03.24 um

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 9:45 PM, Alex Deucher wrote: On Wed, Mar 6, 2024 at 11:06 AM Khatri, Sunil wrote: On 3/6/2024 9:07 PM, Christian König wrote: Am 06.03.24 um 16:13 schrieb Khatri, Sunil: On 3/6/2024 8:34 PM, Christian König wrote: Am 06.03.24 um 15:29 schrieb Alex Deucher: On Wed, Mar 6

RE: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
...@vger.kernel.org; Joshi, Mukul ; Paneer Selvam, Arunpravin ; Khatri, Sunil Subject: [PATCH] drm/amdgpu: cache in more vm fault information When a page fault interrupt is raised there is a lot more information that is useful for developers to analyse the pagefault. Add all such information

Re: [PATCH] drm/amdgpu: cache in more vm fault information

2024-03-06 Thread Khatri, Sunil
On 3/6/2024 9:49 PM, Christian König wrote: Am 06.03.24 um 17:06 schrieb Khatri, Sunil: On 3/6/2024 9:07 PM, Christian König wrote: Am 06.03.24 um 16:13 schrieb Khatri, Sunil: On 3/6/2024 8:34 PM, Christian König wrote: Am 06.03.24 um 15:29 schrieb Alex Deucher: On Wed, Mar 6, 2024 at 8

Re: [PATCH v2 2/2] drm/amdgpu: add vm fault information to devcoredump

2024-03-08 Thread Khatri, Sunil
On 3/8/2024 2:39 PM, Christian König wrote: Am 07.03.24 um 21:50 schrieb Sunil Khatri: Add page fault information to the devcoredump. Output of devcoredump: AMDGPU Device Coredump version: 1 kernel: 6.7.0-amd-staging-drm-next module: amdgpu time: 29.725011811 process_name:

Re: [PATCH] drm/amdgpu: add the hw_ip version of all IP's

2024-03-15 Thread Khatri, Sunil
On 3/15/2024 6:45 PM, Alex Deucher wrote: On Fri, Mar 15, 2024 at 8:13 AM Sunil Khatri wrote: Add all the IP's version information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri This looks great. Reviewed-by: Alex Deucher Thanks Alex ---

Re: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-14 Thread Khatri, Sunil
On 3/14/2024 11:40 AM, Sharma, Shashank wrote: On 14/03/2024 06:58, Khatri, Sunil wrote: On 3/14/2024 2:06 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:42 AM Sunil Khatri wrote: Add firmware version information of each IP and each instance where applicable. Is there a way we can

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-14 Thread Khatri, Sunil
On 3/14/2024 8:12 PM, Alex Deucher wrote: On Thu, Mar 14, 2024 at 1:44 AM Khatri, Sunil wrote: On 3/14/2024 1:58 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:41 AM Sunil Khatri wrote: Add all the IP's information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri

RE: [PATCH v2] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Khatri, Sunil
...@vger.kernel.org; Zhang, Hawking ; Kuehling, Felix ; Lazar, Lijo ; Khatri, Sunil Subject: [PATCH v2] drm/amdgpu: refactor code to reuse system information Refactor the code so debugfs and devcoredump can reuse the common information and avoid unnecessary copy of it. created a new file which

Re: [PATCH] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Khatri, Sunil
On 3/19/2024 7:19 PM, Lazar, Lijo wrote: On 3/19/2024 6:02 PM, Sunil Khatri wrote: Refactor the code so debugfs and devcoredump can reuse the common information and avoid unnecessary copy of it. created a new file which would be the right place to hold functions which will be used between

Re: [PATCH] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Khatri, Sunil
Sent a new patch based on discussion with Alex. On 3/19/2024 8:34 PM, Christian König wrote: Am 19.03.24 um 15:59 schrieb Alex Deucher: On Tue, Mar 19, 2024 at 10:56 AM Christian König wrote: Am 19.03.24 um 15:26 schrieb Alex Deucher: On Tue, Mar 19, 2024 at 8:32 AM Sunil Khatri wrote:

Re: [PATCH] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Khatri, Sunil
On 3/19/2024 7:43 PM, Lazar, Lijo wrote: On 3/19/2024 7:27 PM, Khatri, Sunil wrote: On 3/19/2024 7:19 PM, Lazar, Lijo wrote: On 3/19/2024 6:02 PM, Sunil Khatri wrote: Refactor the code so debugfs and devcoredump can reuse the common information and avoid unnecessary copy of it. created

Re: [PATCH v2] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Khatri, Sunil
On 3/19/2024 8:07 PM, Christian König wrote: Am 19.03.24 um 15:25 schrieb Sunil Khatri: Refactor the code so debugfs and devcoredump can reuse the common information and avoid unnecessary copy of it. created a new file which would be the right place to hold functions which will be used

RE: [PATCH] drm/amdgpu: add the hw_ip version of all IP's

2024-03-15 Thread Khatri, Sunil
; linux-ker...@vger.kernel.org; Khatri, Sunil Subject: [PATCH] drm/amdgpu: add the hw_ip version of all IP's Add all the IP's version information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 62 +++ 1 file changed

Re: [PATCH] drm/amdgpu: refactor code to reuse system information

2024-03-19 Thread Khatri, Sunil
Validated the code by using the function in same way as ioctl would use in devcoredump and getting the valid values. Also this would be the container of the information that we need to share between ioctl, debugfs and devcoredump and keep updating this based on information needed. On

Re: [PATCH] drm/amdgpu: add vm fault information to devcoredump

2024-03-07 Thread Khatri, Sunil
On 3/7/2024 6:10 PM, Christian König wrote: Am 07.03.24 um 09:37 schrieb Khatri, Sunil: On 3/7/2024 1:47 PM, Christian König wrote: Am 06.03.24 um 19:19 schrieb Sunil Khatri: Add page fault information to the devcoredump. Output of devcoredump: AMDGPU Device Coredump version: 1

Re: [PATCH] drm/amdgpu: add vm fault information to devcoredump

2024-03-06 Thread Khatri, Sunil
...@vger.kernel.org; Joshi, Mukul ; Paneer Selvam, Arunpravin ; Khatri, Sunil Subject: [PATCH] drm/amdgpu: add vm fault information to devcoredump Add page fault information to the devcoredump. Output of devcoredump: AMDGPU Device Coredump version: 1 kernel: 6.7.0-amd-staging-drm-next module

Re: [PATCH] drm/amdgpu: add vm fault information to devcoredump

2024-03-07 Thread Khatri, Sunil
On 3/7/2024 1:47 PM, Christian König wrote: Am 06.03.24 um 19:19 schrieb Sunil Khatri: Add page fault information to the devcoredump. Output of devcoredump: AMDGPU Device Coredump version: 1 kernel: 6.7.0-amd-staging-drm-next module: amdgpu time: 29.725011811 process_name:

RE: [PATCH] drm/amdgpu: add all ringbuffer information in devcoredump

2024-03-11 Thread Khatri, Sunil
-ker...@vger.kernel.org; Khatri, Sunil Subject: [PATCH] drm/amdgpu: add all ringbuffer information in devcoredump Add ringbuffer information such as: rptr, wptr, ring name, ring size and also the ring contents for each ring on a gpu reset. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd

Re: [PATCH] drm/amdgpu: add ring buffer information in devcoredump

2024-03-11 Thread Khatri, Sunil
On 3/11/2024 7:29 PM, Christian König wrote: Am 11.03.24 um 13:22 schrieb Sunil Khatri: Add relevant ringbuffer information such as rptr, wptr, ring name, ring size and also the ring contents for each ring on a gpu reset. Signed-off-by: Sunil Khatri ---

Re: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-13 Thread Khatri, Sunil
c: amd-...@lists.freedesktop.org ; dri-devel@lists.freedesktop.org ; linux-ker...@vger.kernel.org ; Khatri, Sunil Subject: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's Add firmware version information of each IP and each instance where applicable. Signed-off-by: Sunil Khatri --- drivers/gpu/d

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-13 Thread Khatri, Sunil
c: amd-...@lists.freedesktop.org ; dri-devel@lists.freedesktop.org ; linux-ker...@vger.kernel.org ; Khatri, Sunil Subject: [PATCH 1/2] drm/amdgpu: add the IP information of the soc Add all the IP's information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-13 Thread Khatri, Sunil
On 3/14/2024 1:58 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:41 AM Sunil Khatri wrote: Add all the IP's information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 19 +++ 1 file changed, 19 insertions(+)

Re: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-13 Thread Khatri, Sunil
On 3/14/2024 2:06 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:42 AM Sunil Khatri wrote: Add firmware version information of each IP and each instance where applicable. Is there a way we can share some common code with devcoredump, debugfs, and the info IOCTL? All three places need

Re: [PATCH] drm/amdgpu: add support of bios dump in devcoredump

2024-03-26 Thread Khatri, Sunil
On 3/26/2024 10:23 PM, Alex Deucher wrote: On Tue, Mar 26, 2024 at 10:38 AM Sunil Khatri wrote: dump the bios binary in the devcoredump. Signed-off-by: Sunil Khatri --- .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c | 20 +++ 1 file changed, 20 insertions(+) diff --git

Re: [PATCH] drm/amdgpu: add IP's FW information to devcoredump

2024-03-27 Thread Khatri, Sunil
On 3/28/2024 8:38 AM, Alex Deucher wrote: On Tue, Mar 26, 2024 at 1:31 PM Sunil Khatri wrote: Add FW information of all the IP's in the devcoredump. Signed-off-by: Sunil Khatri Might want to include the vbios version info as well, e.g., atom_context->name atom_context->vbios_pn