Another way to deal with it is to use KVM agent hooks:
https://github.com/apache/cloudstack/blob/8f6721ed4c4e1b31081a951c62ffbe5331cf16d4/agent/conf/agent.properties#L123

You can implement logic in Groovy that modifies the domain XML when a VM
starts, adding extra devices that CloudStack itself does not manage.
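
As a rough illustration of the kind of XML change such a hook would make,
here is a minimal sketch. It is written in Python rather than Groovy, only
to show the manipulation itself. The function name add_pci_hostdev and the
PCI address in the usage note are made up for the example; a real hook
would have to use the script and method names configured in
agent.properties.

```python
import xml.etree.ElementTree as ET


def add_pci_hostdev(domain_xml, domain, bus, slot, function):
    """Return the libvirt domain XML with one PCI <hostdev> appended.

    Hypothetical helper for illustration only; it is not part of
    CloudStack or libvirt.
    """
    root = ET.fromstring(domain_xml)

    # Reuse the existing <devices> element, or create it if missing.
    devices = root.find("devices")
    if devices is None:
        devices = ET.SubElement(root, "devices")

    # Standard libvirt PCI passthrough element.
    hostdev = ET.SubElement(
        devices,
        "hostdev",
        {"mode": "subsystem", "type": "pci", "managed": "yes"},
    )
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        {
            "domain": f"0x{domain:04x}",
            "bus": f"0x{bus:02x}",
            "slot": f"0x{slot:02x}",
            "function": f"0x{function:x}",
        },
    )
    return ET.tostring(root, encoding="unicode")
```

Calling add_pci_hostdev(xml, 0, 0x3b, 0, 0) on a domain definition would
append a hostdev entry for the (assumed) host device 0000:3b:00.0. The hook
would still need its own logic to decide which VMs get which device.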

On Fri, Feb 23, 2024 at 2:36 PM Jorge Luiz Correa
<jorge.l.cor...@embrapa.br.invalid> wrote:

> Hi Bryan! We are using GPUs here, but in a different way, customized for
> our environment and using CloudStack's features as far as they allow. The
> documentation lists support for some GPU models that are a little old
> today.
>
> We are using PCI passthrough. All hosts with GPUs are configured to boot
> with IOMMU and vfio-pci enabled, so the host does not load a kernel module
> for each GPU.
>
> Then we create a service offering to describe VMs that will have a GPU. In
> this service offering we use the serviceofferingdetails[1].value field to
> insert a block of configuration related to the GPU. It is something like
> "<device> ... <hostdev> ... address type=pci" that describes the PCI bus
> of each GPU. Then we use tags to force this compute offering to run only
> on hosts with GPUs.
>
> We create a CloudStack cluster with a lot of hosts equipped with GPUs. When
> a user needs a VM with a GPU, he/she should use that compute offering. The
> VM will be instantiated on some host of the cluster and the GPUs are
> passed through to the VM.
>
> CloudStack does not enforce any control here. For example, it can try to
> instantiate a VM on a host where the GPU is already in use (which will
> fail). Our approach is that the ROOT admin always controls VM creation. We
> launch VMs that together use all GPUs in the infrastructure. Then we use a
> queue manager to run jobs in those VMs. When a user needs a dedicated VM to
> develop something, we can shut down a VM that is already running (one that
> is a processing node of the queue manager) and then create the dedicated
> VM, which gets exclusive use of those GPUs.
>
> There are other possibilities when using GPUs. For example, some models
> support virtualization, in which a GPU can be divided. In that case
> CloudStack would need to support it: it would manage the driver, creating
> the virtual GPUs based on input from the user, such as memory size. Then
> it would have the hypervisor pass the virtual GPU through to the VM.
>
> Another possibility that would help in our scenario is some control over
> PCI devices on hosts. For example, it would be great if CloudStack could
> check whether a PCI device is in use on a host and then use this
> information in VM scheduling. CloudStack could launch VMs on a host that
> has a free PCI address. This would be useful not only for GPUs, but for
> any PCI device.
>
> I hope this helps in some way, for thinking about new scenarios, etc.
>
> Thank you!
>
> On Thu, Feb 22, 2024 at 07:56, Bryan Tiang <
> bryantian...@hotmail.com>
> wrote:
>
> > Hi Guys,
> >
> > Is anyone running CloudStack with GPU support in production? Say NVIDIA
> > H100 or AMD MI300X?
> >
> > Just want to know if there is still any ongoing support for this, or
> > anyone who is running a cloud business with GPUs.
> >
> > Regards,
> > Bryan
> >
>
