I have since tested this with a clean install of Ubuntu "Bionic" (as that was the current LTS), and with virtualgl_2.6.3_amd64.deb and turbovnc_2.2.4_amd64.deb from their respective SourceForge pages. I have noticed no change in results (i.e. glxspheres64 shows a massive improvement with vglrun, others less so). This is also with VGL_FORCEALPHA=1, but thank you for the suggestion. I've accepted that this is just how it is and that my HW won't let me do what I want.

You may be interested to know that the default desktop (GNOME?) is broken when I run *vncserver -wm ~/gnome -vgl* with the same script as before (and with no errors in the VNC log). I'm happy to help debug or explore that if you would find it useful.

However, my long-term solution will be to use NoMachine or similar, which lets me access/mirror my host display. That appears to provide the remote performance I'm after in the most straightforward way, despite losing the ability to use virtual displays.

Thank you for all the help and attention!

On Saturday, 18 April 2020 19:33:56 UTC+1, DRC wrote:
>
> Referring to https://virtualgl.org/Documentation/OSSupport and https://turbovnc.org/Documentation/OSSupport, if any part of TurboVNC or VirtualGL doesn’t work on RHEL/CentOS or Ubuntu LTS, then that is considered a bug, and I will fix it ASAP. Those platforms are the ones that are most frequently used for commercial and academic TurboVNC/VGL server deployments, so they are the most thoroughly tested (CentOS 7 and Ubuntu 18.04 in particular.) I also have the ability to test Fedora, SLES, FreeBSD, and Solaris 11 in virtual machines, but those platforms receive less attention. Rolling distros like Arch are difficult to support, because they’re moving targets, so in general, if a problem only exists on a rolling distro, it won’t be fixed until it can be reproduced on a more stable platform.
>
> On Apr 17, 2020, at 5:27 PM, Shak <[email protected]> wrote:
>
> It might be useful to try a known working configuration of Distro+DE using VGL and TVNC with this HW.
>
> Could you advise on the most straightforward recipe starting from scratch that I can use to have the most minimal and foolproof system up and running?
>
> On Friday, 17 April 2020 23:21:39 UTC+1, Shak wrote:
>>
>> So despite my Plasma desktop being broken, I am able to use it "blind". I managed to run both glxspheres64 (120fps) and glmark2 (200) and gputest (300fps), and they all report that they use the IGD (if not with the host performance, apart from glxspheres64). So I guess even if I was able to run the DE via VGL it wouldn't make a difference to my results (which was expected).
>>
>> Here are the results from glreadtest:
>>
>> *==== glreadtest ====*
>> GLreadtest v2.6.3 (Build 20200214)
>>
>> /usr/bin/glreadtest -h for advanced usage.
>> Rendering to Pbuffer (size = 701 x 701 pixels)
>> Using 1-byte row alignment
>>
>> >>>>>>>>>> PIXEL FORMAT: RGB <<<<<<<<<<
>> glDrawPixels(): 107.6 Mpixels/sec
>> glReadPixels(): 148.3 Mpixels/sec (min = 115.4, max = 156.2, sdev = 3.569)
>> glReadPixels() accounted for 100.00% of total readback time
>>
>> >>>>>>>>>> PIXEL FORMAT: RGBA <<<<<<<<<<
>> glDrawPixels(): 124.6 Mpixels/sec
>> glReadPixels(): 181.2 Mpixels/sec (min = 160.9, max = 197.7, sdev = 5.405)
>> glReadPixels() accounted for 100.00% of total readback time
>>
>> >>>>>>>>>> PIXEL FORMAT: BGR <<<<<<<<<<
>> glDrawPixels(): 107.9 Mpixels/sec
>> glReadPixels(): 149.0 Mpixels/sec (min = 135.5, max = 156.1, sdev = 2.881)
>> glReadPixels() accounted for 100.00% of total readback time
>>
>> >>>>>>>>>> PIXEL FORMAT: BGRA <<<<<<<<<<
>> glDrawPixels(): 104.7 Mpixels/sec
>> glReadPixels(): 142.6 Mpixels/sec (min = 125.9, max = 150.4, sdev = 3.375)
>> glReadPixels() accounted for 100.00% of total readback time
>>
>> >>>>>>>>>> PIXEL FORMAT: ABGR <<<<<<<<<<
>> glDrawPixels(): 105.0 Mpixels/sec
>> glReadPixels(): 143.7 Mpixels/sec (min = 130.7, max = 154.0, sdev = 2.879)
>> glReadPixels() accounted for 100.00% of total readback time
>>
>> >>>>>>>>>> PIXEL FORMAT: ARGB <<<<<<<<<<
>> glDrawPixels(): 118.6 Mpixels/sec
>> glReadPixels(): 143.7 Mpixels/sec (min = 112.5, max = 151.9, sdev = 4.444)
>> glReadPixels() accounted for 100.00% of total readback time
>>
>> >>>>>>>>>> PIXEL FORMAT: RED <<<<<<<<<<
>> glDrawPixels(): 110.2 Mpixels/sec
>> glReadPixels(): 157.9 Mpixels/sec (min = 122.3, max = 187.8, sdev = 6.647)
>> glReadPixels() accounted for 100.00% of total readback time
>>
>> FB Config = 0x6a
>>
>> *==== glreadtest -pbo ====*
>>
>> GLreadtest v2.6.3 (Build 20200214)
>> Using PBOs for readback
>> Rendering to Pbuffer (size = 701 x 701 pixels)
>> Using 1-byte row alignment
>>
>> >>>>>>>>>> PIXEL FORMAT: RGB <<<<<<<<<<
>> glDrawPixels(): 112.2 Mpixels/sec
>> glReadPixels(): 172.4 Mpixels/sec (min = 113.4, max = 208.4, sdev = 20.38)
>> glReadPixels() accounted for 96.69% of total readback time
>>
>> >>>>>>>>>> PIXEL FORMAT: RGBA <<<<<<<<<<
>> glDrawPixels(): 124.1 Mpixels/sec
>> glReadPixels(): 241.5 Mpixels/sec (min = 157.6, max = 271.7, sdev = 14.43)
>> glReadPixels() accounted for 0.6267% of total readback time
>>
>> >>>>>>>>>> PIXEL FORMAT: BGR <<<<<<<<<<
>> glDrawPixels(): 107.6 Mpixels/sec
>> glReadPixels(): 143.5 Mpixels/sec (min = 114.4, max = 151.3, sdev = 3.703)
>> glReadPixels() accounted for 97.27% of total readback time
>>
>> >>>>>>>>>> PIXEL FORMAT: BGRA <<<<<<<<<<
>> glDrawPixels(): 104.1 Mpixels/sec
>> glReadPixels(): 247.1 Mpixels/sec (min = 197.5, max = 279.2, sdev = 13.49)
>> glReadPixels() accounted for 0.6108% of total readback time
>>
>> >>>>>>>>>> PIXEL FORMAT: ABGR <<<<<<<<<<
>> glDrawPixels(): 104.9 Mpixels/sec
>> glReadPixels(): 138.8 Mpixels/sec (min = 122.6, max = 145.3, sdev = 3.135)
>> glReadPixels() accounted for 96.54% of total readback time
>>
>> >>>>>>>>>> PIXEL FORMAT: ARGB <<<<<<<<<<
>> glDrawPixels(): 120.9 Mpixels/sec
>> glReadPixels(): 138.8 Mpixels/sec (min = 114.0, max = 147.7, sdev = 3.362)
>> glReadPixels() accounted for 96.49% of total readback time
>>
>> >>>>>>>>>> PIXEL FORMAT: RED <<<<<<<<<<
>> glDrawPixels(): 111.9 Mpixels/sec
>> glReadPixels(): 486.6 Mpixels/sec (min = 236.9, max = 638.9, sdev = 85.43)
>> glReadPixels() accounted for 1.265% of total readback time
>>
>> FB Config = 0x6a
>>
>> On Friday, 17 April 2020 22:59:40 UTC+1, DRC wrote:
>>>
>>> I honestly have no idea. I am successfully able to use your ~/gnome script on my CentOS 7 and 8 machines (one has an nVidia GPU, the other AMD), as long as I make the script executable. The WM launches using VirtualGL, as expected.
>>>
>>> As far as performance, it occurred to me that the Intel GPU might have slow pixel readback. Try running '/opt/VirtualGL/bin/glreadtest' and '/opt/VirtualGL/bin/glreadtest -pbo' on the local display and post the results. If one particular readback mode is slow but others are fast, then we can work around that by using environment variables to tell VirtualGL which mode to use.
>>>
>>> DRC
>>> On 4/17/20 4:48 PM, Shak wrote:
>>>
>>> *echo $LD_PRELOAD* returns empty, so something is up. But my main measure of failure was that glxspheres64 (and glmark2) say that they render with llvmpipe. Given that I am using the default xstartup.turbovnc script, am I supposed to do something other than run *vncserver -wm ~/gnome -vgl* (I use a script as I can't figure out how else to pass "dbus-launch gnome-session" to -wm)?
>>>
>>> Some more benchmarks. I'm quite new to OpenGL so these were just found after some web searches. If there's obvious and useful ones I should do please let me know.
>>>
>>> gputest on host: 2600fps
>>> gputest via VNC: 370fps
>>> vglrun -sp gputest via VNC: 400fps
>>>
>>> gfxbench (car chase) on host: 44fps
>>> gfxbench (car chase) via VNC: won't run on llvmpipe
>>> vglrun gfxbench (car chase) via VNC: 28fps
>>>
>>> On Friday, 17 April 2020 21:37:08 UTC+1, DRC wrote:
>>>>
>>>> Bear in mind that passing -wm and -vgl to the vncserver script does nothing but set environment variables (TVNC_WM and TVNC_VGL) that are picked up by the default xstartup.turbovnc script, so make sure you are using the default xstartup.turbovnc script. It's easy to verify whether the window manager is using VirtualGL. Just open a terminal in the TurboVNC session and echo the value of $LD_PRELOAD. It should contain something like "libdlfaker.so:libvglfaker.so" if VirtualGL is active, and you should be able to run OpenGL applications in the session without vglrun, and those applications should show that they are using the Intel OpenGL renderer.
>>>>
>>>> As far as the performance, you haven't mentioned any other benchmarks you have tested, other than glmark2. I've explained why that benchmark may be demonstrating lackluster performance. If you have other data points, then please share them.
>>>> On 4/17/20 2:34 PM, Shak wrote:
>>>>
>>>> I ran the commands you suggested (I went with -p 1m) and am still seeing a big difference. I just find it strange to see it clearly working with glxspheres64, but not much else.
>>>>
>>>> $ *glxspheres64 -p 1000000*
>>>> Polygons in scene: 999424 (61 spheres * 16384 polys/spheres)
>>>> GLX FB config ID of window: 0xfe (8/8/8/0)
>>>> Visual ID of window: 0x2bf
>>>> Context is Direct
>>>> OpenGL Renderer: llvmpipe (LLVM 9.0.1, 256 bits)
>>>> 3.292760 frames/sec - 2.370366 Mpixels/sec
>>>> 3.317006 frames/sec - 2.387820 Mpixels/sec
>>>> $ *vglrun -sp glxspheres64 -p 1000000*
>>>> Polygons in scene: 999424 (61 spheres * 16384 polys/spheres)
>>>> GLX FB config ID of window: 0x6b (8/8/8/0)
>>>> Visual ID of window: 0x288
>>>> Context is Direct
>>>> OpenGL Renderer: Mesa DRI Intel(R) HD Graphics P4600/P4700 (HSW GT2)
>>>> 62.859812 frames/sec - 45.251019 Mpixels/sec
>>>> 59.975806 frames/sec - 43.174903 Mpixels/sec
>>>>
>>>> BTW, GNOME is now working (where I ran the above). I'm trying to run the whole desktop in VGL, but *vncserver -wm ~/gnome -vgl* doesn't seem to do anything differently than it does without -vgl. Again, my gnome script is:
>>>>
>>>> #!/bin/sh
>>>> dbus-launch gnome-session
>>>>
>>>> That said, the desktop isn't broken now so that's an improvement on KDE. But how can I run the whole of GNOME under VGL?
>>>>
>>>> I think if I can get the desktop running in VGL and still not see the performance in apps that I do locally (apart from in glxspheres!) I will take that as the most I can do with my system over VNC (unless you find it helpful for me to debug further).
>>>>
>>>> Thanks,
>>>>
>>>> On Friday, 17 April 2020 19:04:48 UTC+1, DRC wrote:
>>>>>
>>>>> On 4/17/20 10:36 AM, Shak wrote:
>>>>>
>>>>> I ran glmark on the host display normally and then with software rendering. I've attached the results at the end of this message. I've attached this for completion rather than to contradict your hunch, but they do tie up with the numbers I see via VGL so I don't think this is a CPU/VNC issue.
>>>>>
>>>>> Hmmm... Well, you definitely are seeing a much greater speedup with glmark2 absent VirtualGL, so I can only guess that the benchmark is fine-grained enough that it's being affected by VGL's per-frame overhead.
>>>>>
>>>>> A more realistic way to compare the two drivers would be using '[vglrun -sp] /opt/VirtualGL/bin/glxspheres -p {n}', where {n} is a fairly high number of polygons (at least 100,000.)
>>>>>
>>>>> I've tried repeating my experiments using gnome, in case the issue is with KDE. However I get the following when trying to run vglrun:
>>>>>
>>>>> $ *vglrun glxspheres64*
>>>>> /usr/bin/vglrun: line 191: hostname: command not found
>>>>> [VGL] NOTICE: Automatically setting VGL_CLIENT environment variable to
>>>>> [VGL] 10.10.7.1, the IP address of your SSH client.
>>>>> Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
>>>>> libGL error: failed to authenticate magic 1
>>>>> libGL error: failed to load driver: i965
>>>>> GLX FB config ID of window: 0x6b (8/8/8/0)
>>>>> Visual ID of window: 0x21
>>>>> Context is Direct
>>>>> OpenGL Renderer: llvmpipe (LLVM 9.0.1, 256 bits)
>>>>> 17.228616 frames/sec - 17.859872 Mpixels/sec
>>>>> 16.580449 frames/sec - 17.187957 Mpixels/sec
>>>>>
>>>>> You need to install whatever package provides /usr/bin/hostname for your Linux distribution. That will eliminate the vglrun error, although it's probably unrelated to this problem. (Because of the error, vglrun is falsely detecting an X11-forward SSH environment and setting VGL_CLIENT, which would normally be used for the VGL Transport. However, since VirtualGL auto-detects an X11 proxy environment and enables the X11 Transport, the value of VGL_CLIENT should be ignored in this case.)
>>>>>
>>>>> I honestly have no clue how to proceed. I haven't observed these problems in any of the distributions I officially support, and I have no way to test Arch.
>>>>>
>>>>> I'm not sure what to make of these. I am using *vncserver -wm ~/gnome*, where gnome is the following script.
>>>>>
>>>>> #!/bin/sh
>>>>> dbus-launch gnome-session
>>>>>
>>>>> I feel that I am close but still a way off.
>>>>>
>>>>> FWIW I have previously tried using nomachine which is able to give me the perceived GL acceleration by "mirroring" my host display, but that just feels like the wrong way to achieve this (not least because it requires a monitor attached to use).
>>>>>
>>>>> Thanks,
>>>>>
>>>>> ==== RENDER TESTS ====
>>>>>
>>>>> $ *glmark2*
>>>>> =======================================================
>>>>> glmark2 2017.07
>>>>> =======================================================
>>>>> OpenGL Information
>>>>> GL_VENDOR: Intel Open Source Technology Center
>>>>> GL_RENDERER: Mesa DRI Intel(R) HD Graphics P4600/P4700 (HSW GT2)
>>>>> GL_VERSION: 3.0 Mesa 20.0.4
>>>>> =======================================================
>>>>> [build] use-vbo=false: FPS: 2493 FrameTime: 0.401 ms
>>>>> =======================================================
>>>>> glmark2 Score: 2493
>>>>> =======================================================
>>>>>
>>>>> $ *LIBGL_ALWAYS_SOFTWARE=1 glmark2*
>>>>> ** GLX does not support GLX_EXT_swap_control or GLX_MESA_swap_control!
>>>>> ** Failed to set swap interval. Results may be bounded above by refresh rate.
>>>>> =======================================================
>>>>> glmark2 2017.07
>>>>> =======================================================
>>>>> OpenGL Information
>>>>> GL_VENDOR: VMware, Inc.
>>>>> GL_RENDERER: llvmpipe (LLVM 9.0.1, 256 bits)
>>>>> GL_VERSION: 3.1 Mesa 20.0.4
>>>>> =======================================================
>>>>> ** GLX does not support GLX_EXT_swap_control or GLX_MESA_swap_control!
>>>>> ** Failed to set swap interval. Results may be bounded above by refresh rate.
>>>>> [build] use-vbo=false: FPS: 420 FrameTime: 2.381 ms
>>>>> =======================================================
>>>>> glmark2 Score: 420
>>>>> =======================================================
>>>>>
>>>>> On Thursday, 16 April 2020 23:21:59 UTC+1, DRC wrote:
>>>>>>
>>>>>> On 4/16/20 3:19 PM, Shak wrote:
>>>>>>
>>>>>> Thank you for the quick tips. I have posted some results at the end of this post, but they seem inconsistent. glxspheres64 shows the correct renderer respectively and the performance shows the 6x results I was expecting. However I do not see the same gains in glmark2, even though it also reports the correct renderer in each case. Again, I see a glmark of 2000+ when running it in display :0.
>>>>>>
>>>>>> I don't know much about glmark2, but as with any benchmark, Amdahl's Law applies. That means that the total speedup from any enhancement (such as a GPU) is limited by the percentage of clock time during which that enhancement is used. Not all OpenGL workloads are GPU-bound in terms of performance. If the geometry and window size are both really small, then the performance could very well be CPU-bound. That's why, for instance, GLXgears is a poor OpenGL benchmark. Real-world applications these days assume the presence of a GPU, so they're going to have no qualms about trying to render geometries with hundreds of thousands or even millions of polygons. When you try to do that with software OpenGL, you'll see a big difference vs. GPU acceleration-- a difference that won't necessarily show up with tiny geometries.
>>>>>>
>>>>>> You can confirm that that's the case by running glmark2 on your local display without VirtualGL and forcing the use of the swrast driver. I suspect that the difference between swrast and i965 won't be very great in that scenario, either. (I should also mention that Intel GPUs aren't the fastest in the world, so you're never going to see as much of a speedup-- nor as large of a speedup in as many cases-- as you would see with AMD or nVidia.)
>>>>>>
>>>>>> The other thing is, if the benchmark is attempting to measure unrealistic frame rates-- like hundreds or thousands of frames per second-- then there is a small amount of per-frame overhead introduced by VirtualGL that may be limiting that frame rate. But the reality is that human vision can't usually detect more than 60 fps anyhow, so the difference between, say, 200 fps and 400 fps is not going to matter to an application user. At more realistic frame rates, VGL's overhead won't be noticeable.
>>>>>>
>>>>>> Performance measurement in a VirtualGL environment is more complicated than performance measurement in a local display environment, which is why there's a whole section of the VirtualGL User's Guide dedicated to it. Basically, since VGL introduces a small amount of per-frame overhead but no per-vertex overhead, at realistic frame rates and with modern server and client hardware, it will not appear any slower than a local display. However, some synthetic benchmarks may record slower performance due to the aforementioned overhead.
>>>>>>
>>>>>> In the meantime I have been trying to get the DE as a whole to run under acceleration. I record my findings here as a possible clue to my VGL issues above. In my .vnc/xstartup.turbovnc I use the following command:
>>>>>>
>>>>>> #normal start - works with llvmpipe and vglrun
>>>>>> #exec startplasma-x11
>>>>>>
>>>>>> #VGL start
>>>>>> exec vglrun +wm startplasma-x11
>>>>>>
>>>>>> And I also start tvnc with:
>>>>>>
>>>>>> $vncserver -3dwm
>>>>>>
>>>>>> I'm not sure if vglrun, +wm or -3dwm are redundant or working against each other, but I've also tried various combinations to no avail.
>>>>>>
>>>>>> Just use the default xstartup.turbovnc script ('rm ~/.vnc/xstartup.turbovnc' and re-run /opt/TurboVNC/bin/vncserver to create it) and start TurboVNC with '-wm startplasma-x11 -vgl'.
>>>>>>
>>>>>> * -3dwm is deprecated. Use -vgl instead. -3dwm/-vgl (or setting '$useVGL = 1;' in /etc/turbovncserver.conf or ~/.vnc/turbovncserver.conf) simply instructs xstartup.turbovnc to run the window manager startup script using 'vglrun +wm'.
>>>>>>
>>>>>> * Passing -wm to /opt/TurboVNC/bin/vncserver (or setting '$wm = {script};' in turbovncserver.conf) instructs xstartup.turbovnc to execute the specified window manager startup script rather than /etc/X11/xinit/xinitrc.
>>>>>>
>>>>>> * +wm is a feature of VirtualGL, not TurboVNC. Normally, if VirtualGL detects that an OpenGL application is not monitoring StructureNotify events, VGL will monitor those events on behalf of the application (which allows VGL to be notified when the window changes size, thus allowing VGL to change the size of the corresponding Pbuffer.) This is, however, unnecessary with window managers and interferes with some of them (compiz, specifically), so +wm disables that behavior in VirtualGL.
>>>>>>
>>>>>> It's also a placeholder in case future issues are discovered that are specific to compositing window managers (+wm could easily be extended to handle those issues as well.)
>>>>>>
>>>>>> Interestingly I had to update the vglrun script to have the full paths to /usr/lib/libdlfaker.so and the others otherwise I see the following in the TVNC logs:
>>>>>>
>>>>>> ERROR: ld.so: object 'libdlfaker.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
>>>>>> ERROR: ld.so: object 'libvglfaker.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
>>>>>>
>>>>>> That said, my desktop is still broken even when these errors disappear.
>>>>>>
>>>>>> Could my various issues be to do with KDE?
>>>>>>
>>>>>> The LD_PRELOAD issues can be fixed as described here:
>>>>>>
>>>>>> https://cdn.rawgit.com/VirtualGL/virtualgl/2.6.3/doc/index.html#hd0012
>>>>>>
>>>>>> All
>>>>>>
>>>>>
> --
> You received this message because you are subscribed to the Google Groups "VirtualGL User Discussion/Support" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To view this discussion on the web visit https://groups.google.com/d/msgid/virtualgl-users/ea90fd9d-ec3e-4d42-a048-342939d96614%40googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "VirtualGL User Discussion/Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/virtualgl-users/86997458-d64d-4df9-a1e2-2e018dcdb302%40googlegroups.com.
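
For anyone following the thread later: the check DRC describes (echo $LD_PRELOAD inside the TurboVNC session and look for the VirtualGL faker libraries) can be wrapped in a small shell helper. This is only a rough sketch. The function name vgl_active is made up for illustration and is not part of VirtualGL or TurboVNC, and it only looks for the libvglfaker.so name quoted in the thread, so other faker variants would need their own pattern.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with VirtualGL or TurboVNC.
# Succeeds if the LD_PRELOAD-style string passed as $1 names
# libvglfaker.so, which is what DRC says to look for in a session
# started with "vncserver -wm ... -vgl".
vgl_active() {
    case "$1" in
        *libvglfaker.so*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the environment of the current shell.
if vgl_active "${LD_PRELOAD:-}"; then
    echo "VirtualGL appears to be preloaded: $LD_PRELOAD"
else
    echo "VirtualGL is not preloaded (LD_PRELOAD is '${LD_PRELOAD:-}')"
fi
```

Run from a terminal inside the TurboVNC session, this should print the "not preloaded" line in the situation I reported on 17 April (empty $LD_PRELOAD), and the "appears to be preloaded" line once the window manager really is started under VirtualGL.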
