On 3/26/26 6:48 PM, Erni Sri Satya Vennela wrote:
> As a part of MANA hardening for CVM, validate that max_num_sq and
> max_num_rq returned by MANA_QUERY_VPORT_CONFIG are not zero. These
> values flow into apc->num_queues, which is used as an allocation count
> and loop bound. A zero value would result in zero-size allocations and
> incorrect driver behavior.
>
> Return -EPROTO if either value is zero.
>
> Signed-off-by: Erni Sri Satya Vennela <[email protected]>
> ---
> drivers/net/ethernet/microsoft/mana/mana_en.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c
> b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index b39e8b920791..a4197b4b0597 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -1249,6 +1249,12 @@ static int mana_query_vport_cfg(struct
> mana_port_context *apc, u32 vport_index,
>
> *max_sq = resp.max_num_sq;
> *max_rq = resp.max_num_rq;
> +
> + if (*max_sq == 0 || *max_rq == 0) {
> + netdev_err(apc->ndev, "Invalid max queues from vPort config\n");
> + return -EPROTO;
AI review says:
Will returning -EPROTO here expose a pre-existing resource leak in the
driver's teardown path?
If mana_query_vport_cfg() returns an error, mana_init_port() fails and
mana_probe_port() frees the ndev, leaving ac->ports[i] as NULL. In
mana_probe(), the port initialization loop breaks upon this error, but
the err variable is then overwritten:
mana_probe() {
...
for (i = 0; i < ac->num_ports; i++) {
err = mana_probe_port(ac, i, &ac->ports[i]);
if (err) {
dev_err(dev, "Probe Failed for port %d\n", i);
break;
}
}
err = add_adev(gd, "eth");
...
}
If add_adev() succeeds, mana_probe() completes successfully instead of
failing, masking the earlier error while leaving ac->ports[0] as NULL.
Later, when the driver is unloaded or if add_adev() fails and triggers
immediate cleanup, mana_remove() is called. It iterates over ac->ports
and, upon encountering the NULL device, immediately executes goto out:
mana_remove() {
...
for (i = 0; i < ac->num_ports; i++) {
ndev = ac->ports[i];
if (!ndev) {
if (i == 0)
...
goto out;
}
...
}
mana_destroy_eq(ac);
out:
...
}
Because the out label in mana_remove() is located after the
mana_destroy_eq(ac) call, jumping there completely skips destroying the
event queues allocated earlier by mana_create_eq(ac).
In a Confidential Virtual Machine context, could an untrusted hypervisor
repeatedly return invalid configs to continuously leak guest memory and
hardware queues?