That did it! When I export VGL_DISPLAY= on the system which is NOT
accelerating, it all works!

[me@gpunode-2-0 ~]$ export VGL_DISPLAY=
[me@gpunode-2-0 ~]$ echo $VGL_DISPLAY

[me@gpunode-2-0 ~]$ /opt/VirtualGL/bin/glxspheres64
Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
GLX FB config ID of window: 0x7d (8/8/8/0)
Visual ID of window: 0x288
Context is Direct
OpenGL Renderer: Tesla T4/PCIe/SSE2
158.817812 frames/sec - 177.240678 Mpixels/sec
156.659550 frames/sec - 174.832058 Mpixels/sec
160.199829 frames/sec - 178.783009 Mpixels/sec
158.225050 frames/sec - 176.579156 Mpixels/sec
159.431940 frames/sec - 177.926045 Mpixels/sec
155.437629 frames/sec - 173.468394 Mpixels/sec
168.780167 frames/sec - 188.358666 Mpixels/sec
156.255911 frames/sec - 174.381597 Mpixels/sec
158.956569 frames/sec - 177.395531 Mpixels/sec
156.650750 frames/sec - 174.822237 Mpixels/sec
158.987188 frames/sec - 177.429702 Mpixels/sec
159.112906 frames/sec - 177.570004 Mpixels/sec
162.952582 frames/sec - 181.855082 Mpixels/sec
159.708156 frames/sec - 178.234302 Mpixels/sec

So the question is WHY this var gets set to :1 and what I can do about
that, I guess... :)

On Monday, August 31, 2020 at 6:55:49 AM UTC+10 Jake Carroll wrote:

> Mmm.
>
> So, on the system that is _not_ accelerating:
>
> [me@gpunode-2-0 ~]$ echo $LD_PRELOAD
> libdlfaker.so:libvglfaker.so
> [me@gpunode-2-0 ~]$ echo $VGL_DISPLAY
> :1
>
> When I check the system that IS accelerating correctly:
>
> [me@gpunode-2-0 ~]$ echo $VGL_DISPLAY
>
> [me@gpunode-2-0 ~]$ echo $LD_PRELOAD
> libdlfaker.so:libvglfaker.so
>
> Odd huh?
>
> Does this point to anything specific? I note that on the system that DOES
> NOT have the display set in the variable - things work. What the?
>
> On Monday, August 31, 2020 at 12:16:40 AM UTC+10 DRC wrote:
>
>> What about the environment? Is VGL_DISPLAY set in one session but not the
>> other? What about LD_PRELOAD? If not, then I have no explanation.
>> VirtualGL works properly with unmodified TigerVNC, so if you can verify
>> that that is the case on your systems, that would give you a baseline
>> against which to compare StrudelWeb and determine where the problem is.
>>
>> On Aug 30, 2020, at 12:50 AM, Jake Carroll <[email protected]> wrote:
>>
>> And from the FastX session that is/does accelerate correctly...
>>
>> [me@gpunode-2-0 ~]$ vglrun ldd /opt/VirtualGL/bin/glxspheres64
>> linux-vdso.so.1 => (0x00007fff0a4fa000)
>> libdlfaker.so => /lib64/libdlfaker.so (0x00007fb90b3ad000)
>> libvglfaker.so => /lib64/libvglfaker.so (0x00007fb90b057000)
>> libGL.so.1 => /lib64/libGL.so.1 (0x00007fb90adae000)
>> libX11.so.6 => /lib64/libX11.so.6 (0x00007fb90aa70000)
>> libGLU.so.1 => /lib64/libGLU.so.1 (0x00007fb90a7f0000)
>> libm.so.6 => /lib64/libm.so.6 (0x00007fb90a4ee000)
>> libc.so.6 => /lib64/libc.so.6 (0x00007fb90a120000)
>> libdl.so.2 => /lib64/libdl.so.2 (0x00007fb909f1c000)
>> libXv.so.1 => /lib64/libXv.so.1 (0x00007fb909d17000)
>> libXext.so.6 => /lib64/libXext.so.6 (0x00007fb909b05000)
>> libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fb9098e9000)
>> /lib64/ld-linux-x86-64.so.2 (0x00007fb90b5af000)
>> libGLX.so.0 => /lib64/libGLX.so.0 (0x00007fb9096b9000)
>> libGLdispatch.so.0 => /lib64/libGLdispatch.so.0 (0x00007fb9093e6000)
>> libxcb.so.1 => /lib64/libxcb.so.1 (0x00007fb9091be000)
>> libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fb908eb7000)
>> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fb908ca1000)
>> libXau.so.6 => /lib64/libXau.so.6 (0x00007fb908a9d000)
>>
>> On Sunday, August 30, 2020 at 3:46:02 PM UTC+10 Jake Carroll wrote:
>>
>>> From the StrudelWeb/TigerVNC based session, which is currently not
>>> accelerated:
>>>
>>> [me@gpunode-2-0 ~]$ vglrun ldd /opt/VirtualGL/bin/glxspheres64
>>> linux-vdso.so.1 => (0x00007fffc52b4000)
>>> libdlfaker.so => /lib64/libdlfaker.so (0x00007ffa35f61000)
>>> libvglfaker.so => /lib64/libvglfaker.so (0x00007ffa35c0b000)
>>> libGL.so.1 => /lib64/libGL.so.1 (0x00007ffa35962000)
>>> libX11.so.6 => /lib64/libX11.so.6 (0x00007ffa35624000)
>>> libGLU.so.1 => /lib64/libGLU.so.1 (0x00007ffa353a4000)
>>> libm.so.6 => /lib64/libm.so.6 (0x00007ffa350a2000)
>>> libc.so.6 => /lib64/libc.so.6 (0x00007ffa34cd4000)
>>> libdl.so.2 => /lib64/libdl.so.2 (0x00007ffa34ad0000)
>>> libXv.so.1 => /lib64/libXv.so.1 (0x00007ffa348cb000)
>>> libXext.so.6 => /lib64/libXext.so.6 (0x00007ffa346b9000)
>>> libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ffa3449d000)
>>> /lib64/ld-linux-x86-64.so.2 (0x00007ffa36163000)
>>> libGLX.so.0 => /lib64/libGLX.so.0 (0x00007ffa3426d000)
>>> libGLdispatch.so.0 => /lib64/libGLdispatch.so.0 (0x00007ffa33f9a000)
>>> libxcb.so.1 => /lib64/libxcb.so.1 (0x00007ffa33d72000)
>>> libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007ffa33a6b000)
>>> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ffa33855000)
>>> libXau.so.6 => /lib64/libXau.so.6 (0x00007ffa33651000)
>>>
>>> On Sunday, August 30, 2020 at 3:03:54 PM UTC+10 DRC wrote:
>>>
>>>> If the same 3D X server works with FastX and not with TigerVNC, then
>>>> the problem is not with the 3D X server. That means that anything
>>>> related to xorg.conf and the Xorg modules is probably a red herring.
>>>> I would focus on the environment and the dynamic linker. Compare the
>>>> output of ‘env’ in a FastX vs a TigerVNC session. Compare ‘vglrun ldd
>>>> /opt/VirtualGL/bin/glxspheres’ in both sessions. Try explicitly
>>>> setting VGL_GLLIB=/usr/lib/libGL.so.1 in the environment.
>>>>
>>>> On Aug 29, 2020, at 11:50 PM, Jake Carroll <[email protected]>
>>>> wrote:
>>>>
>>>> Hi.
>>>>
>>>> Thanks for getting back to me. So - for clarity - the TigerVNC
>>>> host/daemon actually does run on the same nodes - but it is only
>>>> TigerVNC that seems to have the problem. FastX (whatever it
>>>> does/however it works!) does not seem to have the issue and it happily
>>>> accelerates OpenGL out of the box just fine.
>>>>
>>>> You mentioned the LD_LIBRARY_PATH before and possibly that Tiger is
>>>> referencing the wrong libs. I found this floating around...
>>>>
>>>> From here and a few other places:
>>>>
>>>> https://gist.github.com/shehzan10/8d36c908af216573a1f0
>>>>
>>>> They recommend the following:
>>>>
>>>> sudo mv /usr/lib/xorg/modules/extensions/libglx.so /usr/lib/xorg/modules/extensions/libglx.so.orig
>>>> sudo ln -s /usr/lib/xorg/modules/extensions/libglx.so.XXX.YY /usr/lib/xorg/modules/extensions/libglx.so
>>>>
>>>> Have you ever seen anything like this before? I have not tried it as
>>>> yet.
>>>>
>>>> Thanks again.
>>>>
>>>> On Sunday, August 30, 2020 at 1:41:13 PM UTC+10 DRC wrote:
>>>>
>>>>> xorg.conf only affects the 3D X server. It isn’t clear from your
>>>>> message whether TigerVNC is running on the same machines as FastX.
>>>>> If it is not, then a bad xorg.conf could be the problem on the
>>>>> TigerVNC machines. The first thing I would try is accessing the GPU
>>>>> through the 3D X server on those machines without using VGL (see the
>>>>> “Sanity Check” section in the User’s Guide.) If you meant that
>>>>> TigerVNC is running on the same machines as FastX, then perhaps, for
>>>>> some reason, the TigerVNC customizations set LD_LIBRARY_PATH to point
>>>>> to a Mesa implementation of libGL rather than the GPU-accelerated
>>>>> version. Also double check that the StrudelWeb environment isn’t
>>>>> doing something stupid like setting VGL_DISPLAY to the 2D X server
>>>>> rather than the 3D X server.
>>>>>
>>>>> On Aug 29, 2020, at 9:56 PM, Jake Carroll <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Hi.
>>>>>
>>>>> I think I need a little bit of VirtualGL help.
>>>>>
>>>>> We've got an installation of FastX running on our SLURM controlled AMD
>>>>> Rome nodes. The systems have 4 * nVidia T4 GPUs contained within.
>>>>>
>>>>> Using FastX + VirtualGL sessions works perfectly with MATE.
>>>>> So well, that users often say how happy they are with it.
>>>>>
>>>>> However - we also run a custom TigerVNC based platform too, called
>>>>> StrudelWeb. This was a local development. The problem we've got is
>>>>> that, despite the same xorg.conf and everything else we can think of -
>>>>> the TigerVNC sessions launched via Strudel do not seem to be able to
>>>>> use anything but the llvmpipe MESA path. We can set some environment
>>>>> variables within, such that VGL_LOGO=1 or similar exports absolutely
>>>>> pop up the "VGL" logo in our X display windows over our Strudel Tiger
>>>>> VNC sessions (glxspheres shows the VGL logo etc) but it is absolutely
>>>>> using the software renderer. What we can't figure out is why
>>>>> VirtualGL + Tiger VNC won't pick up the nvidia hardware or xorg
>>>>> config, but using FastX with an identical xorg.conf seems to work
>>>>> perfectly.
>>>>>
>>>>> I'd post my xorg.conf but I don't want to fill this post with mess
>>>>> until someone advises where I should start/what to look for first.
>>>>>
>>>>> So far I've tried a few things, including this in the xorg.conf:
>>>>>
>>>>> Option "UseDisplayDevice" "none"
>>>>>
>>>>> Which seems to have broken everything entirely (the nVidia T4 is a
>>>>> headless GPU).
>>>>>
>>>>> I also looked at this:
>>>>>
>>>>> https://gist.github.com/shehzan10/8d36c908af216573a1f0
>>>>>
>>>>> And thought it might help - but it assumes no implementation of
>>>>> something like VirtualGL, so I wondered how relevant it was.
>>>>>
>>>>> So - I'm trying to work out what might be wrong with my remote
>>>>> launched remote TigerVNC session via Strudel.
>>>>>
>>>>> For reference on what Strudel actually "is"...
>>>>>
>>>>> https://trac.version.fz-juelich.de/vis/wiki/vnc3d/strudel
>>>>>
>>>>> Thank you for your time.
>>>>>
>>>>> Regards,
>>>>>
>>>>> -jc
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "VirtualGL User Discussion/Support" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/virtualgl-users/d84f788d-3566-4e6f-8dc5-9a31be944d19n%40googlegroups.com

--
You received this message because you are subscribed to the Google Groups
"VirtualGL User Discussion/Support" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/virtualgl-users/3077daf4-7d39-4a32-a775-fec9516fafd4n%40googlegroups.com.
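
On the open question of why VGL_DISPLAY ends up as :1 in the StrudelWeb
session: something in that session's startup path is presumably exporting
it. Below is a minimal sketch for hunting that down, assuming a POSIX shell
and a grep that supports -r. The function name find_setters and the example
paths in the comments are hypothetical (they are not part of VirtualGL or
StrudelWeb), and csh-style "setenv" lines would not be matched.

```shell
#!/bin/sh
# find_setters VAR PATH...
# Hypothetical helper: print "file:line:text" for every line under the
# given files or directories that assigns VAR (for example
# "export VGL_DISPLAY=:1"). Longer names that merely end in VAR, such as
# MY_VGL_DISPLAY, are not reported.
find_setters() {
    var=$1
    shift
    # -r recurse, -n show line numbers, -E extended regular expressions.
    # Errors from unreadable files are discarded.
    grep -rnE "(^|[^A-Za-z0-9_])${var}=" "$@" 2>/dev/null
}

# Example call; these locations are guesses, adjust to wherever the
# session launch scripts actually live on your nodes:
#   find_setters VGL_DISPLAY /etc/profile /etc/profile.d "$HOME/.bashrc" "$HOME/.vnc"
```

Until the culprit is found, clearing the variable inside the session, as
done at the top of this thread with "export VGL_DISPLAY=", remains the
workaround.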
