Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
New issues: - [High] Missing bounds check on `rdev->num_cores` against `soc_data->num_cores` causes an out-of-bounds array write. Pre-existing issues: - [Critical] Early registration of the DRM device creates a race window leading to a NULL pointer dereference in `rocket_open`. - [High] Array indexing mismatch on unbind/rebind causes state corruption and failed lookups. - [High] Race condition during global device initialization and array modification due to lack of locking. - [High] Failure in `rocket_device_init()` leaves an `ERR_PTR` in the global `rdev` variable, causing a panic on the next probe. -- commit d76917aa771a1a958c02a6121211647bc25a0160 Author: Midgy BALON <[email protected]> accel: rocket: Derive DMA width and core count from match data The probe already has the per-SoC match data, which now records the core count and DMA width. Use it for the cores array allocation and the device DMA mask instead of re-scanning the device tree for available core nodes. > diff --git a/drivers/accel/rocket/rocket_device.c > b/drivers/accel/rocket/rocket_device.c > index 46e6ee1e72c5f..6186f4faa3a2a 100644 > --- a/drivers/accel/rocket/rocket_device.c > +++ b/drivers/accel/rocket/rocket_device.c [ ... ] > @@ -27,17 +25,14 @@ struct rocket_device *rocket_device_init(struct > platform_device *pdev, > ddev = &rdev->ddev; > dev_set_drvdata(dev, rdev); > > - for_each_compatible_node(core_node, NULL, "rockchip,rk3588-rknn-core") > - if (of_device_is_available(core_node)) > - num_cores++; > - > - rdev->cores = devm_kcalloc(dev, num_cores, sizeof(*rdev->cores), > GFP_KERNEL); > + rdev->cores = devm_kcalloc(dev, soc_data->num_cores, > sizeof(*rdev->cores), > + GFP_KERNEL); [Severity: Critical] This is a pre-existing issue, but does rocket_device_init() register the DRM device prematurely? The first core to probe invokes rocket_device_init(), which immediately calls drm_dev_register() at the end of the function. This exposes the DRM device to userspace before the caller can initialize the core array elements. If userspace quickly opens the DRM device node: rocket_open() rocket_iommu_domain_create(rdev->cores[0].dev) Since the device pointer in the array has not yet been assigned in rocket_probe() and is still NULL, this results in a NULL pointer dereference. > if (!rdev->cores) > return ERR_PTR(-ENOMEM); > > dma_set_max_seg_size(dev, UINT_MAX); > > - err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40)); > + err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(soc_data->dma_bits)); > if (err) > return ERR_PTR(err); > > diff --git a/drivers/accel/rocket/rocket_drv.c > b/drivers/accel/rocket/rocket_drv.c > index 384c38e13acce..c18840e5aff76 100644 > --- a/drivers/accel/rocket/rocket_drv.c > +++ b/drivers/accel/rocket/rocket_drv.c > @@ -159,11 +159,15 @@ static const struct drm_driver rocket_drm_driver = { > > static int rocket_probe(struct platform_device *pdev) > { > + const struct rocket_soc_data *soc_data = > of_device_get_match_data(&pdev->dev); > int ret; > > + if (!soc_data) > + return -EINVAL; > + > if (rdev == NULL) { [Severity: High] This isn't a bug introduced by this patch, but does this lack locking? The kernel driver core can probe multiple matching rockchip,rk3588-rknn-core devices in parallel. Concurrent probes could observe rdev == NULL simultaneously, instantiating the DRM device multiple times and leaking memory. Additionally, concurrent reads and increments of rdev->num_cores can cause multiple threads to overwrite the same array slot in rdev->cores[] while leaving others uninitialized. > /* First core probing, initialize DRM device. */ > - rdev = rocket_device_init(drm_dev, &rocket_drm_driver); > + rdev = rocket_device_init(drm_dev, &rocket_drm_driver, > soc_data); [Severity: High] This is a pre-existing issue, but what happens if rocket_device_init() fails? If rocket_device_init() returns an error pointer, it is directly assigned to the global rdev variable. The error path correctly returns the error but fails to reset the static global rdev back to NULL. When the driver core invokes rocket_probe() for the next matching device tree node, the rdev == NULL check evaluates to false because rdev contains an error pointer. The probe will then attempt to access rdev->num_cores, causing an invalid pointer dereference. > if (IS_ERR(rdev)) { > dev_err(&pdev->dev, "failed to initialize rocket > device\n"); > return PTR_ERR(rdev); > } > } [Severity: High] Can this cause an out-of-bounds array write? The allocation of the rdev->cores array now uses a fixed size defined by soc_data->num_cores instead of dynamically counting the device tree nodes. However, rocket_probe() is called by the driver core for every matching DT node, and it unconditionally uses the auto-incrementing rdev->num_cores as the array index later in the function. If a malformed or custom device tree contains more core nodes than soc_data->num_cores, rocket_probe() will be invoked more times than the array can hold, writing past the end of the rdev->cores buffer. [ ... ] > @@ -214,6 +218,7 @@ static void rocket_remove(struct platform_device *pdev) > } [Severity: High] This isn't a bug introduced by this patch, but does the array indexing handle unbind and rebind correctly? When a core is unbound, rocket_remove() decrements rdev->num_cores without compacting the remaining active cores in the rdev->cores array to fill the empty slot. Consequently, find_core_for_dev() (which relies on rdev->num_cores for its loop bound) fails to find any remaining cores located at indices >= the new rdev->num_cores, preventing their cleanup. Furthermore, if a core is re-bound, rocket_probe() will use the decremented rdev->num_cores as the index and overwrite an existing active core, causing resource leaks and state corruption. > > static const struct rocket_soc_data rk3588_soc_data = { > + .num_cores = 3, > .dma_bits = 40, > }; -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=2
