Just out of interest: I saw this talk from DK Panda a few months ago, which covers MPI developments, including GPU-Direct support and running in virtualised environments:
https://youtu.be/AsFakPJSplo

Do you know if this means there is a version of MVAPICH2 that supports
GPU-Direct optimised for a virtualised environment, or are they entirely
disjoint efforts?

It might be tricky - I am not sure how virtual PCI BARs map to the
hypervisor's physical PCI BARs. If the physical PCI ranges are hidden from
the VM, it may not be possible to initiate a peer-to-peer transfer. Does
anyone know if it can be done?

Best wishes,
Stig

> On 26 Jul 2016, at 08:09, Blair Bethwaite <[email protected]> wrote:
>
> Hi Joe, Jon -
>
> We seem to be good now on both qemu 2.3 and 2.5 with kernel 3.19 (the
> lowest we've tried). Also, thanks to Jon we had an easy fix for the
> snapshot issues!
>
> Next question - has anyone figured out how to make GPU P2P work? We
> haven't tried very hard yet, but with our current setup we're telling
> Nova to pass through the GK210GL "3D controller", and that results in
> the guest seeing individual GPUs attached to a virtualised PCI bus,
> even when e.g. passing through two K80s on the same board. The next
> obvious step is to try passing through the on-board PLX PCI bridge,
> but we're wondering whether anyone else has been down this path yet?
>
> Cheers,
>
> On 20 July 2016 at 12:57, Blair Bethwaite <[email protected]> wrote:
>> Thanks for the confirmation, Joe!
>>
>> On 20 July 2016 at 12:19, Joe Topjian <[email protected]> wrote:
>>> Hi Blair,
>>>
>>> We only updated qemu. We're running the version of libvirt from the
>>> Kilo cloud archive.
>>>
>>> We've been in production with our K80s for around two weeks now and
>>> have had several users report success.
>>>
>>> Thanks,
>>> Joe
>>>
>>> On Tue, Jul 19, 2016 at 5:06 PM, Blair Bethwaite
>>> <[email protected]> wrote:
>>>>
>>>> Hilariously (or not!) we finally hit the same issue last week once
>>>> folks actually started trying to do something (other than build and
>>>> load drivers) with the K80s we're passing through.
>>>> This:
>>>>
>>>> https://devtalk.nvidia.com/default/topic/850833/pci-passthrough-kvm-for-cuda-usage/
>>>>
>>>> is the best discussion of the issue I've found so far; I haven't
>>>> tracked down an actual bug yet, though. I wonder whether it has
>>>> something to do with the memory size of the device, as we've been
>>>> happy for a long time with other NVIDIA GPUs (GRID K1, K2, M2070,
>>>> ...).
>>>>
>>>> Jon, when you grabbed Mitaka qemu, did you also update libvirt?
>>>> We're just working through this and have tried upgrading both, but
>>>> are hitting some issues with Nova and Neutron on the compute nodes.
>>>> We think it may be libvirt-related, but debugging isn't helping much
>>>> yet.
>>>>
>>>> Cheers,
>>>>
>>>> On 8 July 2016 at 00:54, Jonathan Proulx <[email protected]> wrote:
>>>>> On Thu, Jul 07, 2016 at 11:13:29AM +1000, Blair Bethwaite wrote:
>>>>> :Jon,
>>>>> :
>>>>> :Awesome, thanks for sharing. We've just run into an issue with
>>>>> :SRIOV VF passthrough that sounds like it might be the same problem
>>>>> :(device disappearing after a reboot), but haven't yet investigated
>>>>> :deeply - this will help with somewhere to start!
>>>>> :
>>>>> :By the way, the nouveau mention was because we had missed it on
>>>>> :some K80 hypervisors recently and seen passthrough apparently
>>>>> :work, but then the NVIDIA drivers would not build in the guest as
>>>>> :they claimed they could not find a supported device (despite the
>>>>> :GPU being visible on the PCI bus).
>>>>>
>>>>> Definitely sage advice!
>>>>>
>>>>> :I have also heard passing mention of requiring qemu 2.3+ but don't
>>>>> :have any specific details of the related issue.
>>>>>
>>>>> I didn't do a bisection, but with qemu 2.2 (from ubuntu
>>>>> cloudarchive kilo) I was sad, and with 2.5 (from ubuntu
>>>>> cloudarchive mitaka but installed on a kilo hypervisor) I am
>>>>> working.
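[The qemu-version dependence reported above (2.2 broken, 2.5 working, with 2.3+ mentioned elsewhere as the likely cutoff) is easy to gate on from a deployment script. A minimal sketch; the `qemu-system-x86_64` binary name and the 2.3 threshold are assumptions drawn from this thread, not a confirmed bug boundary:]

```shell
#!/bin/sh
# version_ge A B: succeed if version string A >= B, comparing with sort -V
# (GNU version sort), so e.g. 2.10 correctly compares greater than 2.9.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# On a hypervisor, one might gate passthrough setup like this (the binary
# name is an assumption - adjust for your distro/arch):
if command -v qemu-system-x86_64 >/dev/null 2>&1; then
    qver=$(qemu-system-x86_64 --version |
        sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p')
    if version_ge "$qver" "2.3"; then
        echo "qemu $qver: at or above the 2.3 cutoff reported in this thread"
    else
        echo "qemu $qver: below 2.3 - K80 passthrough may misbehave"
    fi
fi
```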
>>>>>
>>>>> Thanks,
>>>>> -Jon
>>>>>
>>>>> :Cheers,
>>>>> :
>>>>> :On 7 July 2016 at 08:13, Jonathan Proulx <[email protected]> wrote:
>>>>> :> On Wed, Jul 06, 2016 at 12:32:26PM -0400, Jonathan D. Proulx wrote:
>>>>> :> :
>>>>> :> :I do have an odd remaining issue where I can run cuda jobs in
>>>>> :> :the vm but snapshots fail, and after the pause (for
>>>>> :> :snapshotting) the pci device can't be reattached (which is where
>>>>> :> :I think it deletes the snapshot it took). I got the same issue
>>>>> :> :with 3.16 and 4.4 kernels.
>>>>> :> :
>>>>> :> :It's not very well categorized yet, but I'm hoping it's because
>>>>> :> :the VM I was hacking on had its libvirt.xml written out with the
>>>>> :> :older qemu, maybe? It had been through a couple of reboots of
>>>>> :> :the physical system, though.
>>>>> :> :
>>>>> :> :Currently building a fresh instance and bashing more keys...
>>>>> :>
>>>>> :> After an ugly bout of bashing I've solved my failing snapshot
>>>>> :> issue, which I'll post here in hopes of saving someone else.
>>>>> :>
>>>>> :> Short version:
>>>>> :>
>>>>> :>   add "/dev/vfio/vfio rw," to /etc/apparmor.d/abstractions/libvirt-qemu
>>>>> :>   add "ulimit -l unlimited" to /etc/init/libvirt-bin.conf
>>>>> :>
>>>>> :> Longer version:
>>>>> :>
>>>>> :> What was happening:
>>>>> :>
>>>>> :> * send snapshot request
>>>>> :> * instance pauses while snapshot is pending
>>>>> :> * instance attempts to resume
>>>>> :> * fails to reattach pci device
>>>>> :> * nova-compute.log:
>>>>> :>     Exception during message handling: internal error: unable to
>>>>> :>     execute QEMU command 'device_add': Device initialization failed
>>>>> :> * qemu/<id>.log:
>>>>> :>     vfio: failed to open /dev/vfio/vfio: Permission denied
>>>>> :>     vfio: failed to setup container for group 48
>>>>> :>     vfio: failed to get group 48
>>>>> :> * snapshot disappears
>>>>> :> * instance resumes, but without the passed-through device (a hard
>>>>> :>   reboot reattaches it)
>>>>> :>
>>>>> :> Seeing "permission denied" I thought it would be an easy fix, but:
>>>>> :>
>>>>> :> # ls -l /dev/vfio/vfio
>>>>> :> crw-rw-rw- 1 root root 10, 196 Jul  6 14:05 /dev/vfio/vfio
>>>>> :>
>>>>> :> so I'm guessing I'm in apparmor hell. I try adding "/dev/vfio/vfio
>>>>> :> rw," to /etc/apparmor.d/abstractions/libvirt-qemu, rebooting the
>>>>> :> hypervisor, and trying again, which gets me a different libvirt
>>>>> :> error set:
>>>>> :>
>>>>> :>     VFIO_MAP_DMA: -12
>>>>> :>     vfio_dma_map(0x5633a5fa69b0, 0x0, 0xa0000, 0x7f4e7be00000) = -12
>>>>> :>     (Cannot allocate memory)
>>>>> :>
>>>>> :> with kern.log (and thus dmesg) showing:
>>>>> :>
>>>>> :>     vfio_pin_pages: RLIMIT_MEMLOCK (65536) exceeded
>>>>> :>
>>>>> :> Getting rid of this one required inserting 'ulimit -l unlimited'
>>>>> :> into /etc/init/libvirt-bin.conf in the 'script' section:
>>>>> :>
>>>>> :> <previous bits excluded>
>>>>> :> script
>>>>> :>     [ -r /etc/default/libvirt-bin ] && . /etc/default/libvirt-bin
>>>>> :>     ulimit -l unlimited
>>>>> :>     exec /usr/sbin/libvirtd $libvirtd_opts
>>>>> :> end script
>>>>> :>
>>>>> :>
>>>>> :> -Jon
>>>>> :>
>>>>> :> _______________________________________________
>>>>> :> OpenStack-operators mailing list
>>>>> :> [email protected]
>>>>> :> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>>>> :
>>>>> :--
>>>>> :Cheers,
>>>>> :~Blairo
>>>>
>>>> --
>>>> Cheers,
>>>> ~Blairo
>>
>> --
>> Cheers,
>> ~Blairo
>
> --
> Cheers,
> ~Blairo
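[A note on verifying Jon's "ulimit -l unlimited" fix: running `ulimit -l` in an interactive shell says nothing about the limit libvirtd actually inherited from init. One way to check the live process is to read /proc/<pid>/limits. A sketch, demonstrated here against the current shell; substituting `$(pidof libvirtd)` on a real hypervisor is the assumed usage, not something taken verbatim from the thread:]

```shell
#!/bin/sh
# Print the effective max-locked-memory limit of a given process by reading
# its /proc/<pid>/limits entry (Linux-only). After applying the
# "ulimit -l unlimited" fix and restarting libvirt-bin, the libvirtd line
# should read "unlimited" rather than the 65536 seen in the vfio_pin_pages
# error above.
memlock_of() {
    grep -i 'locked memory' "/proc/$1/limits"
}

# Demonstrate on the current shell; on a hypervisor use e.g.:
#   memlock_of "$(pidof libvirtd)"
memlock_of $$
```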
