There is no issue delivering VMs with a passthrough GPU with CloudStack; we've been doing this using XenServer as the hypervisor. It requires an enterprise license from Citrix to enable the GPU and vGPU support features.

As we don't use KVM, I can't say much about it. There are some limitations when delivering GPU/vGPU to VMs: on XenServer you can only attach a single GPU per VM in passthrough, so if your server has 4 GPU cards, you can have 4 VMs with a passthrough GPU. A single VM can support multiple vGPUs, though I'm not sure of the count. If you plan to deploy vGPU, you also need the NVIDIA vGPU (RTX) driver installed on the hypervisor, plus a licensing service, in order to deliver the vGPUs. vGPU definitions and naming seem standard across hypervisors, depending on the GPU model. I would think that you can deliver the H100 GPU in passthrough; we will know more later this year. What would you like to deliver from a single H100 per VM?

Extending vGPU support in CloudStack is easy:
https://github.com/apache/cloudstack/blob/6dc3d06037c39019f29686281856443c37a3e6c0/api/src/main/java/com/cloud/gpu/GPU.java#L27

Offerings can also be created for passthrough GPUs that are not listed there; the list is mostly used by the CloudStack UI to show the available GPUs during compute-offering creation.
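To make the GPU.java part concrete, here is a minimal sketch of the kind of change involved. It is illustrative only: the real enum name, existing constants and constructor are in the linked GPU.java and may differ from this sketch, and the NVIDIA_H100 entry is hypothetical.

    // Sketch only: see com.cloud.gpu.GPU at the link above for the actual enum.
    // NVIDIA_H100 is a hypothetical new entry for H100 passthrough.
    public enum GPUType {
        GPU_Passthrough("passthrough"),   // existing entries look roughly like this
        // ... existing GRID/vGPU profiles ...
        NVIDIA_H100("NVIDIA H100");

        private final String type;

        GPUType(String type) {
            this.type = type;
        }

        public String getType() {
            return type;
        }
    }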
On Fri, Feb 23, 2024 at 9:04 AM Ivan Kud <kudryavtsev...@bw-sw.com> wrote:

> Another way to deal with it is to use KVM agent hooks (this is my code,
> implemented specifically to deal with GPUs and VM-dedicated drives):
>
> https://github.com/apache/cloudstack/blob/8f6721ed4c4e1b31081a951c62ffbe5331cf16d4/agent/conf/agent.properties#L123
>
> You can implement the logic in Groovy to modify the XML during VM start to
> support extra devices outside of CloudStack management.
>
> On Fri, Feb 23, 2024 at 2:36 PM Jorge Luiz Correa
> <jorge.l.cor...@embrapa.br.invalid> wrote:
>
> > Hi Bryan! We are using GPUs here, but in a different way, customized for
> > our environment and using the features of CloudStack as far as possible.
> > In the documentation we can see support for some GPU models that are a
> > little old today.
> >
> > We are using PCI passthrough. All hosts with GPUs are configured to boot
> > with IOMMU and vfio-pci, without loading the kernel modules for each GPU.
> >
> > Then we create a service offering to describe the VMs that will have a
> > GPU. In this service offering we use the serviceofferingdetails[1].value
> > field to insert a block of configuration related to the GPU. It is
> > something like "<device> ... <hostdev> ... address type=pci", describing
> > the PCI bus of each GPU. We then use tags to force this compute offering
> > to run only on hosts with GPUs.
> >
> > We create a CloudStack cluster with a lot of hosts equipped with GPUs.
> > When a user needs a VM with a GPU, he/she should use the created compute
> > offering. The VM will be instantiated on some host of the cluster and
> > the GPUs are passed through to the VM.
> >
> > There is no control executed by CloudStack. For example, it can try to
> > instantiate a VM on a host where a GPU is already being used (which will
> > fail). Our approach is that the ROOT admin always controls that
> > creation. We launch all VMs using all GPUs from the infrastructure, and
> > then use a queue manager to run jobs in those VMs with GPUs. When a user
> > needs a dedicated VM to develop something, we can shut down a VM that is
> > already running (as a processor node of the queue manager) and then
> > create the dedicated VM, which uses the GPUs in isolation.
> >
> > There are some possibilities when using GPUs. For example, some models
> > support virtualization, where we can divide a GPU. In that case,
> > CloudStack would need to support that, so it would manage the driver,
> > creating the virtual GPUs based on information input from the user, such
> > as memory size. Then, it should manage the hypervisor to pass the
> > virtual GPU through to the VM.
> >
> > Another possibility that would help us in our scenario is some control
> > over PCI buses in hosts. For example, if CloudStack could check whether
> > a PCI device is in use on some host and then use this information in VM
> > scheduling, that would be great. CloudStack could launch VMs on a host
> > that has a free PCI address. This would be useful not only for GPUs, but
> > for any PCI device.
> >
> > I hope this can help in some way, to think of new scenarios etc.
> >
> > Thank you!
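To make Ivan's and Jorge's suggestions a bit more concrete: the edit such a hook performs is essentially appending a <hostdev> block like the one Jorge describes to the domain XML. Below is a rough, untested Groovy sketch of just that edit; the helper name and PCI address are placeholders, the actual hook entry point and its arguments are whatever you wire up through the agent.hooks.* properties in agent.properties, and a real script would also need logic to pick a free GPU and to apply only to the intended VMs.

    // Sketch: append a passthrough GPU <hostdev> to a libvirt domain XML string.
    // On Groovy 3+ XmlParser lives in groovy.xml; on older Groovy it is groovy.util.XmlParser.
    import groovy.xml.XmlParser
    import groovy.xml.XmlUtil

    String addGpuHostdev(String domainXml) {
        def dom = new XmlParser().parseText(domainXml)
        def devices = dom.devices[0]
        // PCI domain/bus/slot/function below are placeholders for a real host GPU address.
        def hostdev = new NodeBuilder().hostdev(mode: 'subsystem', type: 'pci', managed: 'yes') {
            source {
                address(domain: '0x0000', bus: '0x3b', slot: '0x00', function: '0x0')
            }
        }
        devices.append(hostdev)
        return XmlUtil.serialize(dom)
    }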
> > On Thu, Feb 22, 2024 at 07:56, Bryan Tiang <bryantian...@hotmail.com>
> > wrote:
> >
> > > Hi Guys,
> > >
> > > Is anyone running CloudStack with GPU support in production? Say
> > > NVIDIA H100 or AMD MI300X?
> > >
> > > Just want to know if there is any support for this still ongoing, or
> > > if anyone is running a cloud business with GPUs.
> > >
> > > Regards,
> > > Bryan
>
> --
> With best regards, Ivan Kudriavtsev
> BWSoft Management LLC
> Cell AM: +374-43-047-914
> Cell USA: +1-201-257-1512
> WWW: http://bitworks.software/