GitHub user correajl edited a comment on the discussion: GPU Support for KVM
Hi! I've used Nvidia GPUs with CloudStack in a totally manual way. You will need to do PCI passthrough. Some steps:

- Enable global settings in ACS such as `allow.additional.vm.configuration.list.kvm = devices,hostdev,driver,source,address,alias` and `enable.additional.vm.configuration = true`.
- Enable the IOMMU feature in the hosts' BIOS, such as _Intel VT-d_ or _AMD-Vi_ (AMD IOMMU).
- Use IOMMU to create groups and isolate the PCIe bus:

```
IOMMU Group 34:
	0000:17:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3070] [10de:2484] (rev a1)
	0000:17:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1)
IOMMU Group 74:
	0000:b3:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3070] [10de:2484] (rev a1)
	0000:b3:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1)
```

- Add some parameters to the hosts' kernel command line, like `GRUB_CMDLINE_LINUX="video=efifb:off intel_iommu=on iommu=pt kvm.ignore_msrs=1 rd.driver.pre=vfio-pci"` or `GRUB_CMDLINE_LINUX="video=efifb:off amd_iommu=on iommu=pt kvm.ignore_msrs=1 rd.driver.pre=vfio-pci"`.
- Discover the PCI IDs and change grub again to something like `GRUB_CMDLINE_LINUX="video=efifb:off intel_iommu=on iommu=pt kvm.ignore_msrs=1 rd.driver.pre=vfio-pci vfio-pci.ids=10de:2484,10de:228b"` (so that the host no longer uses the GPUs).
- Create new compute offerings that use "extra-config": `cmk -p profile create serviceoffering name="VM with 2x RTX3070" displaytext="VM with 2x RTX3070" customized=true storagetype=shared provisioningtype=fat cpuspeed=1 mincpunumber=2 maxcpunumber=12 minmemory=2000 maxmemory=30000 offerha=false dynamicscalingenabled=false hosttags=rtx3070-bus17 diskofferingstrictness=false serviceofferingdetails[1].key=extraconfig-1 serviceofferingdetails[1].value="CHANGEINDATABASE"`
- Find it in the database: `select so.name, so.id, sod.service_offering_id, sod.name, sod.value from service_offering as so left join service_offering_details as sod on so.id = sod.service_offering_id where so.state = "Active" and so.name = "VM with 2x RTX3070";`
- Change it to an extra-config that uses your GPU (PCI bus, etc.). Something like this (here I'm using 2 GPUs, on buses 0x17 and 0xb3):

```
<devices>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source> <address domain='0x0000' bus='0x17' slot='0x00' function='0x0'/> </source>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x00' function='0x0'/>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source> <address domain='0x0000' bus='0x17' slot='0x00' function='0x1'/> </source>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x00' function='0x0'/>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source> <address domain='0x0000' bus='0xb3' slot='0x00' function='0x0'/> </source>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x00' function='0x0'/>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source> <address domain='0x0000' bus='0xb3' slot='0x00' function='0x1'/> </source>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x00' function='0x0'/>
  </hostdev>
</devices>
```

Update it using something like:

```
update service_offering_details
set value = "<devices><hostdev mode='subsystem' type='pci' managed='yes'> <source> <address domain='0x0000' bus='0x17' slot='0x00' function='0x0'/> </source> <address type='pci' domain='0x0000' bus='0x00' slot='0x00' function='0x0'/> </hostdev> <hostdev mode='subsystem' type='pci' managed='yes'> <source> <address domain='0x0000' bus='0x17' slot='0x00' function='0x1'/> </source> <address type='pci' domain='0x0000' bus='0x00' slot='0x00' function='0x0'/> </hostdev> <hostdev mode='subsystem' type='pci' managed='yes'> <source> <address domain='0x0000' bus='0xb3' slot='0x00' function='0x0'/> </source> <address type='pci' domain='0x0000' bus='0x00' slot='0x00' function='0x0'/> </hostdev> <hostdev mode='subsystem' type='pci' managed='yes'> <source> <address domain='0x0000' bus='0xb3' slot='0x00' function='0x1'/> </source> <address type='pci' domain='0x0000' bus='0x00' slot='0x00' function='0x0'/></hostdev></devices>"
where service_offering_id = 44
  and name = "extraconfig-1"
  and value = "CHANGEINDATABASE";
```

Well, at this point you "could" pass a GPU through to an instance. However, you will need to launch the instance as ROOT so that you can choose the host (ACS knows nothing about which host has the GPU). If you are using a compute offering that injects an extra-config binding the instance to a specific PCI bus, you should launch it on a host that has that PCI bus ID.

In the end, you should have an instance that can access the PCI device where the GPU is. Install the Nvidia driver in the instance and be happy.

These steps aren't so simple and there are many other details, but this is a quick guideline for one way to get this working. :)

GitHub link: https://github.com/apache/cloudstack/discussions/10638#discussioncomment-12949988
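Writing the extra-config XML by hand for every GPU function is error-prone, so here is a minimal Python sketch that builds the `<devices>` string from PCI addresses in the form shown in the IOMMU group listing (for example `0000:17:00.0`). The function names `hostdev_xml` and `extra_config` are hypothetical and are not part of CloudStack or libvirt. As a simplification, the sketch leaves out the guest-side `<address type='pci' .../>` element, on the assumption that libvirt picks a free guest slot when none is given; add it back if you need fixed guest addresses.

```python
import re

# Matches a full PCI address such as "0000:17:00.0" (domain:bus:slot.function).
PCI_RE = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def hostdev_xml(pci_address: str) -> str:
    """Return one <hostdev> element for a host PCI address (hypothetical helper)."""
    match = PCI_RE.match(pci_address)
    if match is None:
        raise ValueError(f"not a full PCI address: {pci_address!r}")
    domain, bus, slot, function = (part.lower() for part in match.groups())
    return (
        "<hostdev mode='subsystem' type='pci' managed='yes'>"
        "<source>"
        f"<address domain='0x{domain}' bus='0x{bus}' "
        f"slot='0x{slot}' function='0x{function}'/>"
        "</source>"
        "</hostdev>"
    )


def extra_config(pci_addresses) -> str:
    """Wrap one <hostdev> per address in a single <devices> element."""
    return "<devices>" + "".join(hostdev_xml(a) for a in pci_addresses) + "</devices>"


if __name__ == "__main__":
    # Both functions (VGA and audio) of the two GPUs from the listing above.
    print(extra_config(["0000:17:00.0", "0000:17:00.1",
                        "0000:b3:00.0", "0000:b3:00.1"]))
```

The printed string can then be used as the `value` in the `update service_offering_details` statement in place of the hand-written XML.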