Hi Bin,

Yes, with this change applied, its not necessary to explicitly initialize the kernel bo handles in ISP driver. We can continue using kmalloc() in isp4if_gpu_mem_alloc() of ISP driver.

Thanks,
Pratap


On 11/9/2025 10:12 PM, Du, Bin wrote:
Thans Sultan & Pratap,

So, based on the discussion, the ISP driver will retain its current implementation and won’t do unnecessary init to *bo before calling isp_kernel_buffer_alloc().

On 11/7/2025 4:51 AM, Nirujogi, Pratap wrote:


On 11/6/2025 3:31 PM, Sultan Alsawaf wrote:
Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.


On Thu, Nov 06, 2025 at 03:08:44PM -0500, Nirujogi, Pratap wrote:


On 11/6/2025 1:58 PM, Sultan Alsawaf wrote:
Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.


On Thu, Nov 06, 2025 at 12:46:51PM -0600, Mario Limonciello wrote:
On 11/5/25 5:21 PM, Sultan Alsawaf wrote:
From: Sultan Alsawaf <[email protected]>

When the BO pointer provided to amdgpu_bo_create_kernel() points to
non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address
rather than allocate a new BO.

This functionality is never desired for allocating ISP buffers. A new BO should always be created when isp_kernel_buffer_alloc() is called, per the
description for isp_kernel_buffer_alloc().

Are you just going off descriptions, or is there a problem with reuse?

I am going based off the ISP4 driver's usage of isp_kernel_buffer_alloc().

This fixes the following crash I experienced on v5 of the ISP4 patchset:

    [  175.485627] BUG: unable to handle page fault for address: 00007f6b1092e888
    [  175.485799] #PF: supervisor read access in kernel mode
    [  175.485840] #PF: error_code(0x0000) - not-present page
    [  175.485869] PGD 0 P4D 0
    [  175.485889] Oops: Oops: 0000 [#1] SMP
    [  175.485908] CPU: 15 UID: 1000 PID: 57022 Comm: cheese Tainted: G     U  W           6.17.7 #1 PREEMPT
    [  175.485921] Tainted: [U]=USER, [W]=WARN
    [  175.485933] Hardware name: HP HP ZBook Ultra G1a 14 inch Mobile Workstation PC/8D01, BIOS X89 Ver. 01.03.00 04/25/2025     [  175.485943] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 [amdgpu]     [  175.485961] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 89 04 24 e8 d6
    [  175.485976] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202
    [  175.485988] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: 0000000000000000     [  175.486002] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: ffff97b14e097b20     [  175.486012] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: ffff97b085af04c8     [  175.486023] R10: 0000000000ffffff R11: ffff97b085af0560 R12: 0000000000000002     [  175.486031] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: ffff97b0a0f00000     [  175.486037] FS:  00007faf60ffe6c0(0000) GS:ffff97cfa7133000(0000) knlGS:0000000000000000
    [  175.486046] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  175.486058] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: 0000000000f50ef0
    [  175.486067] PKRU: 55555554
    [  175.486072] Call Trace:
    [  175.486081]  <TASK>
    [  175.486092]  ? smu_cmn_send_smc_msg_with_param+0xc0/0x360 [amdgpu]
    [  175.486102]  amdgpu_bo_create_kernel+0x15/0x70 [amdgpu]
    [  175.486110]  isp_kernel_buffer_alloc+0x56/0xa0 [amdgpu]
    [  175.486118]  isp4if_gpu_mem_alloc.isra.0+0x45/0x70 [amd_capture]
    [  175.486126]  isp4if_start+0x3a/0x320 [amd_capture]
    [  175.486141]  isp4sd_set_power+0x96/0x1e0 [amd_capture]
    [  175.486148]  pipeline_pm_power_one+0xf2/0x110 [videodev]
    [  175.486155]  pipeline_pm_power+0x51/0x90 [videodev]
    [  175.486161]  v4l2_pipeline_pm_use+0x3b/0x60 [videodev]
    [  175.486169]  isp4vid_qops_start_streaming+0x22/0x140 [amd_capture]     [  175.486176]  ? isp4vid_qops_buffer_queue+0xb1/0x140 [amd_capture]
    [  175.486185]  vb2_start_streaming+0x79/0x140 [videobuf2_common]
    [  175.486192]  vb2_core_qbuf+0x3b5/0x480 [videobuf2_common]
    [  175.486200]  vb2_qbuf+0x98/0x100 [videobuf2_v4l2]
    [  175.486208]  __video_do_ioctl+0x480/0x4b0 [videodev]
    [  175.486219]  video_usercopy+0x1e5/0x760 [videodev]
    [  175.486226]  ? v4l_s_output+0x50/0x50 [videodev]
    [  175.486237]  v4l2_ioctl+0x45/0x50 [videodev]
    [  175.486245]  __x64_sys_ioctl+0x77/0xc0
    [  175.486251]  do_syscall_64+0x48/0xbf0
    [  175.486260]  entry_SYSCALL_64_after_hwframe+0x50/0x58
    [  175.486268] RIP: 0033:0x7fb03371674d
    [  175.486275] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00     [  175.486282] RSP: 002b:00007faf60ffc9d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010     [  175.486292] RAX: ffffffffffffffda RBX: 00007fafb40050b0 RCX: 00007fb03371674d     [  175.486301] RDX: 00007faf60ffca70 RSI: 00000000c058560f RDI: 000000000000002c     [  175.486306] RBP: 00007faf60ffca20 R08: 13455f273d2513b9 R09: 0000000000000210     [  175.486313] R10: 0000000000000215 R11: 0000000000000246 R12: 00007faf9801c4b0     [  175.486318] R13: 0000000000000001 R14: 00007faf60ffcad0 R15: 0000000000000001
    [  175.486324]  </TASK>
    [  175.486330] Modules linked in: ccm hid_sensor_prox hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_hub rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device amd_capture videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc pinctrl_amdisp i2c_designware_amdisp uhid cmac algif_hash algif_skcipher af_alg bnep uinput nls_iso8859_1 vfat fat snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm joydev snd_soc_dmic snd_acp_pcm mousedev intel_rapl_msr snd_sof_amd_acp70 snd_sof_amd_acp63 snd_hda_scodec_cs35l56_spi intel_rapl_common snd_ctl_led snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_hda_codec_alc269 snd_sof_amd_renoir snd_hda_scodec_component snd_sof_amd_acp snd_hda_codec_realtek_lib snd_sof_pci snd_sof_xtensa_dsp snd_hda_codec_generic snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match amdgpu snd_amd_sdw_acpi snd_hda_codec_atihdmi soundwire_amd soundwire_generic_allocation snd_hda_codec_hdmi     [  175.486373]  soundwire_bus snd_soc_sdca snd_soc_core snd_compress snd_hda_intel ac97_bus snd_pcm_dmaengine mt7925e drm_panel_backlight_quirks amdxcp snd_hda_codec snd_rpl_pci_acp6x mt7925_common btusb drm_buddy cdc_mbim mt792x_lib snd_acp_pci cdc_wdm btrtl drm_exec snd_hda_core snd_amd_acpi_mach mt76_connac_lib snd_hda_scodec_cs35l56_i2c btintel snd_acp_legacy_common drm_suballoc_helper cdc_ncm snd_intel_dspcfg mt76 snd_hda_scodec_cs35l56 snd_pci_acp6x cdc_ether drm_ttm_helper btbcm snd_intel_sdw_acpi snd_hda_cirrus_scodec snd_hwdep usbnet ttm snd_pci_acp5x btmtk snd_soc_cs35l56_shared ucsi_acpi kvm_amd mac80211 snd_pcm r8152 i2c_algo_bit snd_rn_pci_acp3x typec_ucsi snd_soc_cs_amp_lib libarc4 spd5118 bluetooth mii drm_display_helper snd_timer cs_dsp kvm typec snd_acp_config hp_wmi snd cfg80211 libphy amdxdna roles cec snd_soc_acpi ecdh_generic sp5100_tco hid_multitouch irqbypass sparse_keymap rfkill rapl mdio_bus video gpu_sched amd_pmf wmi_bmof snd_pci_acp3x soundcore amdtee i2c_hid_acpi serial_multi_instantiate     [  175.486384]  i2c_hid amd_sfh thunderbolt wireless_hotkey amd_pmc platform_profile wmi mac_hid i2c_piix4 i2c_smbus i2c_dev sg crypto_user loop nfnetlink ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee dm_mod polyval_clmulni ghash_clmulni_intel aesni_intel nvme nvme_core nvme_keyring serio_raw ccp nvme_auth
    [  175.486394] CR2: 00007f6b1092e888
    [  175.486402] ---[ end trace 0000000000000000 ]---
    [  175.486409] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 [amdgpu]     [  175.486416] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 89 04 24 e8 d6
    [  175.486422] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202
    [  175.486429] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: 0000000000000000     [  175.486433] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: ffff97b14e097b20     [  175.486439] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: ffff97b085af04c8     [  175.486444] R10: 0000000000ffffff R11: ffff97b085af0560 R12: 0000000000000002     [  175.486448] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: ffff97b0a0f00000     [  175.486455] FS:  00007faf60ffe6c0(0000) GS:ffff97cfa7133000(0000) knlGS:0000000000000000
    [  175.486460] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  175.486464] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: 0000000000f50ef0
    [  175.486470] PKRU: 55555554

Admittedly, it's my fault that ISP4 stopped zeroing the BO pointer argument (&mem_info->mem_handle) passed to isp_kernel_buffer_alloc() [1]. But since this
issue slipped past Bin and presumably others who reviewed the code, it
highlights that isp_kernel_buffer_alloc() is not working as expected in this respect, and the description for isp_kernel_buffer_alloc() reinforces this.

Thanks Sultan for suggesting this fix. Yes, its hard to believe that this
slipped through until now.

Instead of initializing *bo=NULL, I suggest ensuring *bo is actually NULL
before calling amdgpu_bo_create_kernel().

int isp_kernel_buffer_alloc(...)
{
       ...
       if (WARN_ON(*bo))
               return -EINVAL;
       ...
       ret = amdgpu_bo_create_kernel(..)
       ...
}

This way the caller will get to know parameters passed are invalid and can
take care of initializing the handles appropriately.

Hi Pratap,

I am opposed to this idea for multiple reasons:

1. *bo is an output parameter from isp_kernel_buffer_alloc(), so the input value
    should not matter.

2. The only correct value for *bo is NULL when isp_kernel_buffer_alloc() passes     it to amdgpu_bo_create_kernel(). Since there is only one correct value, there     is no reason to burden callers of isp_kernel_buffer_alloc() with intializing
    *bo to NULL.

3. This adds more code, another WARN_ON(), and another error case to
    isp_kernel_buffer_alloc(). All of that can be eliminated entirely by just
    initializing *bo to NULL in isp_kernel_buffer_alloc(), as I've done.

4. The reason *bo needs to be NULL is an implementation detail that is internal     to isp_kernel_buffer_alloc(), because amdgpu_bo_create_kernel() needs it to     be NULL in order to allocate a new buffer. isp_kernel_buffer_alloc() should     handle its own internal implementation details instead of punting the
    responsibility onto callers.

Sultan

Either approach works for me. My main intention is to ensure the caller passes the BO handle initialized to NULL in this case. That said, initializing *bo = NULL as you explained is perfectly fine too.

Reviewed-by: Pratap Nirujogi <[email protected]>


Thanks,
Pratap


Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call.

Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for isp buffers")
Signed-off-by: Sultan Alsawaf <[email protected]>
---
    drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 2 ++
    1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/ gpu/drm/amd/amdgpu/amdgpu_isp.c
index 9cddbf50442a..37270c4dab8d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c
@@ -280,6 +280,8 @@ int isp_kernel_buffer_alloc(struct device *dev, u64 size,
      if (ret)
              return ret;
+   /* Ensure *bo is NULL so a new BO will be created */
+   *bo = NULL;
      ret = amdgpu_bo_create_kernel(adev,
                                    size,
                                    ISP_MC_ADDR_ALIGN,


[1] https://lore.kernel.org/all/aNB0P18ytI1KopWI@sultan-box/

Sultan




Reply via email to