[ovirt-users] Multiple GPU Passthrough with NVLink (Invalid I/O region)

2020-09-04 Thread Vinícius Ferrão via Users
Hello, here we go again. I’m trying to pass through 4x NVIDIA Tesla V100 GPUs (with NVLink) to a single VM, but things aren’t going well. Only one GPU shows up in the VM. lspci shows all the GPUs, but three of them are unusable: 08:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100

[ovirt-users] Re: Hosted engine install failure: ipv6.gateway: gateway cannot be set if there are no addresses configured

2020-09-04 Thread Dominik Holler
Sverker, is this bug blocking you, or can you work around it? On Thu, Sep 3, 2020 at 8:52 PM Dominik Holler wrote: > Sverker, thanks! > > On Thu, Sep 3, 2020 at 6:50 PM Sverker Abrahamsson < > sver...@abrahamsson.com> wrote: > >> Hi Dominik, >> bug filed at

[ovirt-users] Re: Failed to connect to server (code: 1006) connecting to second host noVNC console

2020-09-04 Thread James Loker-Steele via Users
OK, I have resolved this issue. On the cluster, I had to disable Encrypted VNC and reinstall the host to apply the changes. This would probably not happen if it were a shared cluster. Right now it's 2 local clusters.

[ovirt-users] Re: Storage Domain won't activate

2020-09-04 Thread Vojtech Juranek
On Thursday, 3 September 2020 22:49:17 CEST, Gillingham, Eric J (US 393D) via Users wrote: > I recently removed a host from my cluster to upgrade it to 4.4. After I removed the host from the datacenter, VMs started to pause on the second system they all migrated to. Investigating via the engine

[ovirt-users] Re: [EXTERNAL] Re: Storage Domain won't activate

2020-09-04 Thread Gillingham, Eric J (US 393D) via Users
On 9/4/20, 4:50 AM, "Vojtech Juranek" wrote: On Thursday, 3 September 2020 22:49:17 CEST, Gillingham, Eric J (US 393D) via Users wrote: > I recently removed a host from my cluster to upgrade it to 4.4. After I removed the host from the datacenter, VMs started to pause on the second

[ovirt-users] Re: Multiple GPU Passthrough with NVLink (Invalid I/O region)

2020-09-04 Thread Arman Khalatyan
Hi, with the 2x T4 we haven't seen any trouble. We have no NVLink there. Did you try to disable NVLink? Vinícius Ferrão via Users wrote on Fri, 4 Sept 2020, 08:39: > Hello, here we go again. > > I’m trying to passthrough 4x NVIDIA Tesla V100 GPUs (with NVLink) to a > single VM; but

[ovirt-users] Re: VM HostedEngine is down with error

2020-09-04 Thread souvaliotimaria
Hello, This is what I could gather from the gluster logs around the time frame of the HE shutdown. NODE1: [root@ov-no1 glusterfs]# more bricks/gluster_bricks-vmstore-vmstore.log-20200830 |egrep "( W | E )"|more [2020-08-27 15:35:03.090477] W [glusterfsd.c:1570:cleanup_and_exit]

[ovirt-users] Re: Multiple GPU Passthrough with NVLink (Invalid I/O region)

2020-09-04 Thread Michael Jones
We also use multiple T4s, as well as P4s and Titans, with no issues, but have never used NVLink. On Fri, 4 Sep 2020, 16:02 Arman Khalatyan, wrote: > hi, > with the 2xT4 we haven't seen any trouble. we have no nvlink there. > > did u try to disable the nvlink? > > > > Vinícius Ferrão via Users wrote on Fri, 4.

[ovirt-users] Re: Storage Domain won't activate

2020-09-04 Thread Strahil Nikolov via Users
Is this an HCI setup? If yes, check the gluster status (I prefer the CLI, but the UI works too): gluster pool list; gluster volume status; gluster volume heal <VOLNAME> info summary. Best Regards, Strahil Nikolov. On Friday, 4 September 2020, 00:38:13 GMT+3, Gillingham, Eric J (US 393D) via Users

[ovirt-users] Re: VM HostedEngine is down with error

2020-09-04 Thread Strahil Nikolov via Users
Hi Maria, I am quite puzzled about: >/usr/sbin/glusterfsd(cleanup_and_exit+0x6b) [0x555e3137101b] ) 0-: received >signum (15), shutting down [2020-08-27 15:35:14.890471] E [MSGID: 100018] [glusterfsd.c:2333:glusterfs_pidfile_update] 0-glusterfsd: pidfile /var/run/gluster/vols/data/ov-no1.a

[ovirt-users] Re: [EXTERNAL] Re: Storage Domain won't activate

2020-09-04 Thread Gillingham, Eric J (US 393D) via Users
This is using iSCSI storage. I stopped the oVirt broker/agents/vdsm and used sanlock to remove the locks it was complaining about, but as soon as I started the oVirt tools up and the engine came online again, the same messages reappeared. After spending more than a day trying to resolve this

[ovirt-users] Re: [EXTERNAL] Re: Storage Domain won't activate

2020-09-04 Thread Nir Soffer
On Fri, Sep 4, 2020 at 5:43 PM Gillingham, Eric J (US 393D) via Users wrote: > > On 9/4/20, 4:50 AM, "Vojtech Juranek" wrote: > > On Thursday, 3 September 2020 22:49:17 CEST, Gillingham, Eric J (US 393D) via Users wrote: > > I recently removed a host from my cluster to upgrade it to

[ovirt-users] Re: [EXTERNAL] Re: Storage Domain won't activate

2020-09-04 Thread David Teigland
On Sat, Sep 05, 2020 at 12:25:45AM +0300, Nir Soffer wrote: > > > /var/log/sanlock.log contains a repeating: > > > add_lockspace e1270474-108c-4cae-83d6-51698cffebbf:1:/dev/e1270474-108c-4cae-83d6-51698cffebbf/ids:0 conflicts with name of list1 s1 > > >

[ovirt-users] Re: Multiple GPU Passthrough with NVLink (Invalid I/O region)

2020-09-04 Thread Arman Khalatyan
Same here ☺️, I will check them on Monday. Michael Jones wrote on Fri, 4 Sept 2020, 22:01: > Yea pass through, I think vgpu you have to pay for driver upgrade with > nvidia, I've not tried that and don't know the price, didn't find getting > info on it easy last time I tried. > > Have used in

[ovirt-users] Re: [EXTERNAL] Re: Storage Domain won't activate

2020-09-04 Thread Gillingham, Eric J (US 393D) via Users
On 9/4/20, 2:26 PM, "Nir Soffer" wrote: On Fri, Sep 4, 2020 at 5:43 PM Gillingham, Eric J (US 393D) via Users wrote: > > On 9/4/20, 4:50 AM, "Vojtech Juranek" wrote: > > On Thursday, 3 September 2020 22:49:17 CEST, Gillingham, Eric J (US 393D) via Users wrote:

[ovirt-users] Re: Multiple GPU Passthrough with NVLink (Invalid I/O region)

2020-09-04 Thread Vinícius Ferrão via Users
Thanks Michael and Arman. To make things clear, you guys are using Passthrough, right? It’s not vGPU. The 4x GPUs are added on the “Host Devices” tab of the VM. What I’m trying to achieve is add the 4x V100 directly to one specific VM. And finally can you guys confirm which BIOS type is being

[ovirt-users] Re: Multiple GPU Passthrough with NVLink (Invalid I/O region)

2020-09-04 Thread Michael Jones
Yes, passthrough. I think for vGPU you have to pay NVIDIA for a driver upgrade; I've not tried that and don't know the price, and I didn't find it easy to get information on it last time I tried. I have used passthrough on both legacy and UEFI boot machines; I don't know the chipsets off the top of my head and will look on Monday.

[ovirt-users] Re: Multiple GPU Passthrough with NVLink (Invalid I/O region)

2020-09-04 Thread Michael Jones
The first things I'd check are which driver is on the host and that it's the NVIDIA driver all the way through; make sure nouveau is blacklisted throughout. On Fri, 4 Sep 2020, 21:01 Michael Jones, wrote: > Yea pass through, I think vgpu you have to pay for driver upgrade with > nvidia, I've not tried that