Thank you for your contribution! Sashiko AI review found 2 potential issue(s) 
to consider:
- [Critical] Use-After-Free and Stack Corruption due to abandoned VMBus 
requests on timeout.
- [Medium] Race condition in wait_for_response() polling loop can falsely 
trigger timeout.
--

commit 18b7ab3e09172b3f0651ad176d0ab6367f1e43bb
Author: Hamza Mahfooz <[email protected]>

PCI: hv: add hard timeout to wait_for_response()

This commit adds a hard 5-second timeout to wait_for_response() in the
hyper-V PCI controller driver. It addresses an issue where the driver could
wait indefinitely for a device to appear if a rescind event is never
received.

> diff --git a/drivers/pci/controller/pci-hyperv.c 
> b/drivers/pci/controller/pci-hyperv.c
> index cfc8fa403dad6..bd63efc4a2103 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c

[ ... ]

> @@ -1045,8 +1048,13 @@ static void put_pcichild(struct hv_pci_dev *hpdev)
>  static int wait_for_response(struct hv_device *hdev,
>                            struct completion *comp)
>  {
> +     unsigned long timeout = get_jiffies_64() + msecs_to_jiffies(TIMEOUT_MS);
> +     unsigned long now;
> +
>       while (true) {
> -             if (hdev->channel->rescind) {
> +             now = get_jiffies_64();
> +             if (hdev->channel->rescind ||
> +                 time_after(now, timeout)) {

[Severity: Medium]
Is there a race condition if the thread is preempted right before updating
the time?

If the thread is preempted just before calling get_jiffies_64() and the
delay crosses the 5-second boundary, this will evaluate to true and
immediately return -ENODEV.

Since there is no final check of the completion status after the timeout
check, could this falsely assume a timeout even if the Hyper-V host
successfully replied during the preemption window?

>                       dev_warn_once(&hdev->device, "The device is gone.\n");
>                       return -ENODEV;

[Severity: Critical]
By adding this timeout, do we risk a use-after-free and stack corruption if
the VMBus request is never explicitly cancelled?

When we return -ENODEV early here, callers like hv_send_resources_allocated()
will free their heap-allocated request packets and return, which pops the
local completion structure off the stack.

However, since the VMBus channel remains active and the pending request is
not removed from the requestor array, what happens if the Hyper-V host
eventually replies?

Wouldn't hv_pci_onchannelcallback() fetch the dangling packet pointer and
invoke comp_packet->completion_func(...) on freed memory?

It seems this could also corrupt the stack when writing to the destroyed
completion structure. Additionally, in hv_pci_query_relations(), returning
on timeout leaves hbus->survey_event pointing to a local stack variable
that will later be completed by survey_child_resources().

Does this also leak a slot in the VMBus requestor array for each timed-out
request?

>               }
> 
>               if (wait_for_completion_timeout(comp, HZ / 10))

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to