GitHub user meisenst-dnd edited a discussion: NVIDIA vGPU with A16
Hi all,
I'm trying to enable NVIDIA vGPU with an A16 card, which is essentially
multiple A2 chips on a single board (and, therefore, driver-compatible with the
A2). With SR-IOV enabled, you can use up to 68 individual profiles on the card
at the same time.
CloudStack has built-in support for the A2, but it doesn't seem to recognize
the A16 as multiple A2s. I've set the offering to deploy with the A2-1B
profile; however, the virtual devices list A16 profiles instead:
> root@[redacted]:/sys/class/mdev_bus/000:ce:00.4/mdev_support_types/pci-709#
> cat name
> NVIDIA A16-1B
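For anyone who wants to repeat this check across all mdev-capable devices on a host, here is a minimal sketch. It assumes the standard kernel mdev sysfs layout, `/sys/class/mdev_bus/<device>/mdev_supported_types/<type>/name`. The function name `list_mdev_types` is hypothetical, and the directory names on a given host may differ from this assumption.

```python
from pathlib import Path


def list_mdev_types(root="/sys/class/mdev_bus"):
    """Return {device: {type_id: profile_name}} for each mdev-capable device.

    Assumes the kernel's documented mdev sysfs layout; `root` is a parameter
    so the function can be pointed at a different tree.
    """
    result = {}
    # Each supported type directory holds a 'name' file with the profile name.
    for name_file in sorted(Path(root).glob("*/mdev_supported_types/*/name")):
        type_dir = name_file.parent
        device = type_dir.parent.parent.name  # e.g. the PCI address of the VF
        profile = name_file.read_text().strip()
        result.setdefault(device, {})[type_dir.name] = profile
    return result


if __name__ == "__main__":
    for device, types in list_mdev_types().items():
        for type_id, profile in types.items():
            print(f"{device} {type_id} {profile}")
```

On a host like the one above, the names printed would be expected to be A16 profiles rather than A2 profiles, which would match the mismatch with the A2-1B offering.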
The management log confirms that CloudStack doesn't see the appropriate card
present in any of the hosts (there are 6 hosts, all with an A16 in them):
> 2024-12-10 12:39:27,184 DEBUG [c.c.a.m.a.i.FirstFitRoutingAllocator]
> (API-Job-Executor-1:[ctx-3d64c55e, job-882, ctx-e8fb6f86,
> FirstFitRoutingAllocator]) (logid:4e08e7e3) Adding host
> [{"name":"[Redacted]",uuid":"13ac1a87-d02a-4356-a4fa-584804de1849"}] to avoid
> set, because this host does not have required GPU devices available.
lspci lists the devices as follows:
> d4:02.3 3D controller [0302]: NVIDIA Corporation GA107GL [A2 / A16]
> [10de:25b6] (rev a1)
Has anyone successfully used an A16 with CloudStack? If not, can support for
the A16 be added?
Thanks,
Mark
GitHub link: https://github.com/apache/cloudstack/discussions/10076