There is no issue delivering VMs with a passthrough GPU in CloudStack; we've
been doing this using XenServer as the hypervisor. It requires an enterprise
license from Citrix to enable the GPU and vGPU support features.

As we don't use KVM, I can't say much about it.

There are some limitations when delivering GPU/vGPU to VMs. On XenServer you
can only attach a single passthrough GPU per VM, so if your server has 4 GPU
cards, you can have 4 VMs with a passthrough GPU each. A single VM can
support multiple vGPUs, though I'm not sure of the exact count.

If you plan to deploy vGPU, you also need the RTX driver installed on the
hypervisor, plus a licensing service, in order to deliver the vGPUs. The vGPU
definitions and naming seem to be standard across hypervisors, depending on
the GPU model.

I would think that you can deliver the H100 GPU in passthrough; we will know
more later this year. What would you like to deliver from a single H100 per
VM?


Extending vGPU support in CloudStack is easy:
https://github.com/apache/cloudstack/blob/6dc3d06037c39019f29686281856443c37a3e6c0/api/src/main/java/com/cloud/gpu/GPU.java#L27
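
To illustrate, adding a new card or vGPU profile is roughly a matter of
adding entries to the GPU type enum in that file. The snippet below is only a
sketch from memory: the exact enum and field names should be checked against
the linked GPU.java, and the H100 profile strings are placeholders of mine,
not values taken from NVIDIA or CloudStack documentation.

    // Sketch only: mirrors the enum pattern used in com.cloud.gpu.GPU.
    // Verify names and existing entries against the linked file first.
    public class GPU {

        public enum GPUType {
            // examples of existing-style entries
            GRID_K100("GRID K100"),
            GRID_K200("GRID K200"),
            passthrough("passthrough"),

            // hypothetical new entries for H100 profiles (placeholder names)
            H100_10Q("NVIDIA H100-10Q"),
            H100_20Q("NVIDIA H100-20Q");

            private final String type;

            GPUType(String type) {
                this.type = type;
            }

            public String getType() {
                return type;
            }
        }
    }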

Offerings can also be created for passthrough GPUs that are not in that list;
the list is mostly there so the CloudStack UI can show the available GPUs
during compute-offering creation.
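
For reference, the GPU is attached to the compute offering through service
offering details, which can also be set via the API. Something along these
lines should work with CloudMonkey, but I am writing it from memory and have
not tested it; the detail keys (pciDevice, vgpuType) come from GPU.java, and
the card group string is just an example, so check both against your
environment:

    cmk create serviceoffering name=8c-64g-gpu-passthrough \
      displaytext="8 vCPU, 64 GB RAM, 1 GPU in passthrough" \
      cpunumber=8 cpuspeed=2000 memory=65536 hosttags=gpu \
      serviceofferingdetails[0].key=pciDevice \
      serviceofferingdetails[0].value="Group of NVIDIA Corporation GK107GL [GRID K1] GPUs" \
      serviceofferingdetails[1].key=vgpuType \
      serviceofferingdetails[1].value=passthrough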




On Fri, Feb 23, 2024 at 9:04 AM Ivan Kud <kudryavtsev...@bw-sw.com> wrote:

> Another way to deal with it is to use KVM agent hooks (this is my code
> implemented specifically to deal with GPUs and VM-dedicated drives):
>
> https://github.com/apache/cloudstack/blob/8f6721ed4c4e1b31081a951c62ffbe5331cf16d4/agent/conf/agent.properties#L123
>
> You can implement the logic in Groovy to modify the domain XML during VM
> start, to support extra devices outside of CloudStack management.
>
> On Fri, Feb 23, 2024 at 2:36 PM Jorge Luiz Correa
> <jorge.l.cor...@embrapa.br.invalid> wrote:
>
> > Hi Bryan! We are using GPUs here, but in a different way, customized for
> > our environment and using CloudStack's features as far as possible. In
> > the documentation we can see support for some GPU models that are a
> > little bit old today.
> >
> > We are using PCI passthrough. All hosts with GPUs are configured to boot
> > with IOMMU and vfio-pci, without loading the kernel modules for each GPU.
> >
> > Then, we create a serviceoffering to describe the VMs that will have a
> > GPU. In this serviceoffering we use the serviceofferingdetails[1].value
> > field to insert a block of configuration related to the GPU. It is
> > something like "<device> ... <hostdev> ... address type=pci", describing
> > the PCI bus of each GPU. Then, we use tags to force this computeoffering
> > to run only on hosts with GPUs.
> >
> > We create a CloudStack cluster with a lot of hosts equipped with GPUs.
> > When a user needs a VM with a GPU, he/she should use the created
> > computeoffering. The VM will be instantiated on some host of the cluster
> > and the GPUs are passed through to the VM.
> >
> > There is no control enforced by CloudStack. For example, it can try to
> > instantiate a VM on a host whose GPU is already in use (which will fail).
> > Our approach is that the ROOT admin always controls that creation. We
> > launch VMs using all the GPUs of the infrastructure, and then use a queue
> > manager to run jobs on those VMs with GPUs. When a user needs a dedicated
> > VM to develop something, we can shut down a VM that is already running
> > (as a processing node of the queue manager) and then create the dedicated
> > VM, which uses the GPUs in isolation.
> >
> > There are other possibilities when using GPUs. For example, some models
> > support virtualization, where we can divide a GPU. In that case,
> > CloudStack would need to support this: it would manage the driver,
> > creating the virtual GPUs based on input from the user, such as memory
> > size, and then manage the hypervisor to pass the virtual GPU through to
> > the VM.
> >
> > Another possibility that would help in our scenario is some control over
> > PCI buses in hosts. For example, if CloudStack could check whether a PCI
> > address is in use on a host and use this information in VM scheduling,
> > that would be great. CloudStack could then launch VMs on a host that has
> > a free PCI address. This would be useful not only for GPUs, but for any
> > PCI device.
> >
> > I hope this can help in some way, to think of new scenarios etc.
> >
> > Thank you!
> >
> > On Thu, Feb 22, 2024 at 07:56, Bryan Tiang <bryantian...@hotmail.com>
> > wrote:
> >
> > > Hi Guys,
> > >
> > > Anyone running CloudStack with GPU support in production? Say NVIDIA
> > > H100 or AMD MI300X?
> > >
> > > Just want to know if there is still any ongoing support for this, or if
> > > anyone is running a cloud business with GPUs.
> > >
> > > Regards,
> > > Bryan
> > >
> >
> >
>
>
> --
> With best regards, Ivan Kudriavtsev
> BWSoft Management LLC
> Cell AM: +374-43-047-914
> Cell USA: +1-201-257-1512
> WWW: http://bitworks.software/ <http://bw-sw.com/>
>
