Two other data points: my VNC client is TightVNC on Windows, and, as mentioned before, my VGL host is actually a virtual machine (this might make a difference due to some pipeline not being used by my "host" display).
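
For reference, the readback workaround that comes up further down the thread amounts to setting a couple of VirtualGL environment variables before launching the application. This is a rough sketch only: VGL_FORCEALPHA=1 comes from DRC's suggestion quoted below, while VGL_READBACK=pbo is my assumption about the variable that selects PBO readback, so please check the VirtualGL User's Guide before relying on it.

# Run inside the TurboVNC session, not on the host display (sketch only).
# VGL_FORCEALPHA=1 should make VirtualGL read back RGBA instead of RGB;
# the glreadtest results quoted below show RGBA/BGRA PBO readback is the
# fastest full-colour path on this Intel GPU.
export VGL_FORCEALPHA=1
# Assumed variable for choosing PBO readback; verify against the VGL docs.
export VGL_READBACK=pbo
vglrun /opt/VirtualGL/bin/glxspheres64 -p 1000000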
On Sunday, 19 April 2020 13:45:13 UTC+1, Shak wrote:
>
> I have since tested this with a clean install of Ubuntu "Bionic" (as that was the current LTS), and with virtualgl_2.6.3_amd64.deb and turbovnc_2.2.4_amd64.deb from their respective SF websites.
>
> I have noticed no change in results (i.e. glxspheres64 shows a massive improvement with vglrun, others less so). This is also with VGL_FORCEALPHA=1, but thank you for the suggestion. I've accepted that that's just how it is and that my HW just won't provide me with what I want to do.
>
> You may be interested in knowing that the default desktop (GNOME?) is broken when I run *vncserver -wm ~/gnome -vgl*, with the same script as before (and no errors in the VNC log). I'm happy to help in debugging/exploring that if you would find that helpful.
>
> However, my final long-term solution will be to use NoMachine or something similar that allows me to access/mirror my host display, as that appears to provide the remote performance I'm after in the most straightforward way, despite losing the ability to use virtual displays.
>
> Thank you for all the help and attention!
>
> On Saturday, 18 April 2020 19:33:56 UTC+1, DRC wrote:
>>
>> Referring to https://virtualgl.org/Documentation/OSSupport and https://turbovnc.org/Documentation/OSSupport, if any part of TurboVNC or VirtualGL doesn’t work on RHEL/CentOS or Ubuntu LTS, then that is considered a bug, and I will fix it ASAP. Those platforms are the ones that are most frequently used for commercial and academic TurboVNC/VGL server deployments, so they are the most thoroughly tested (CentOS 7 and Ubuntu 18.04 in particular.) I also have the ability to test Fedora, SLES, FreeBSD, and Solaris 11 in virtual machines, but those platforms receive less attention. Rolling distros like Arch are difficult to support, because they’re moving targets, so in general, if a problem only exists on a rolling distro, it won’t be fixed until it can be reproduced on a more stable platform.
>>
>> On Apr 17, 2020, at 5:27 PM, Shak <[email protected]> wrote:
>>
>> It might be useful to try a known working configuration of Distro+DE using VGL and TVNC with this HW.
>>
>> Could you advise on the most straightforward recipe, starting from scratch, that I can use to have the most minimal and foolproof system up and running?
>>
>> On Friday, 17 April 2020 23:21:39 UTC+1, Shak wrote:
>>>
>>> So despite my Plasma desktop being broken, I am able to use it "blind". I managed to run glxspheres64 (120 fps), glmark2 (200), and gputest (300 fps), and they all report that they use the IGD (if not with the host performance, apart from glxspheres64). So I guess even if I were able to run the DE via VGL, it wouldn't make a difference to my results (which was expected).
>>>
>>> Here are the results from glreadtest:
>>>
>>> *==== glreadtest ====*
>>> GLreadtest v2.6.3 (Build 20200214)
>>>
>>> /usr/bin/glreadtest -h for advanced usage.
>>> Rendering to Pbuffer (size = 701 x 701 pixels)
>>> Using 1-byte row alignment
>>>
>>> >>>>>>>>>> PIXEL FORMAT: RGB <<<<<<<<<<
>>> glDrawPixels(): 107.6 Mpixels/sec
>>> glReadPixels(): 148.3 Mpixels/sec (min = 115.4, max = 156.2, sdev = 3.569)
>>> glReadPixels() accounted for 100.00% of total readback time
>>>
>>> >>>>>>>>>> PIXEL FORMAT: RGBA <<<<<<<<<<
>>> glDrawPixels(): 124.6 Mpixels/sec
>>> glReadPixels(): 181.2 Mpixels/sec (min = 160.9, max = 197.7, sdev = 5.405)
>>> glReadPixels() accounted for 100.00% of total readback time
>>>
>>> >>>>>>>>>> PIXEL FORMAT: BGR <<<<<<<<<<
>>> glDrawPixels(): 107.9 Mpixels/sec
>>> glReadPixels(): 149.0 Mpixels/sec (min = 135.5, max = 156.1, sdev = 2.881)
>>> glReadPixels() accounted for 100.00% of total readback time
>>>
>>> >>>>>>>>>> PIXEL FORMAT: BGRA <<<<<<<<<<
>>> glDrawPixels(): 104.7 Mpixels/sec
>>> glReadPixels(): 142.6 Mpixels/sec (min = 125.9, max = 150.4, sdev = 3.375)
>>> glReadPixels() accounted for 100.00% of total readback time
>>>
>>> >>>>>>>>>> PIXEL FORMAT: ABGR <<<<<<<<<<
>>> glDrawPixels(): 105.0 Mpixels/sec
>>> glReadPixels(): 143.7 Mpixels/sec (min = 130.7, max = 154.0, sdev = 2.879)
>>> glReadPixels() accounted for 100.00% of total readback time
>>>
>>> >>>>>>>>>> PIXEL FORMAT: ARGB <<<<<<<<<<
>>> glDrawPixels(): 118.6 Mpixels/sec
>>> glReadPixels(): 143.7 Mpixels/sec (min = 112.5, max = 151.9, sdev = 4.444)
>>> glReadPixels() accounted for 100.00% of total readback time
>>>
>>> >>>>>>>>>> PIXEL FORMAT: RED <<<<<<<<<<
>>> glDrawPixels(): 110.2 Mpixels/sec
>>> glReadPixels(): 157.9 Mpixels/sec (min = 122.3, max = 187.8, sdev = 6.647)
>>> glReadPixels() accounted for 100.00% of total readback time
>>>
>>> FB Config = 0x6a
>>>
>>> *==== glreadtest -pbo ====*
>>>
>>> GLreadtest v2.6.3 (Build 20200214)
>>> Using PBOs for readback
>>> Rendering to Pbuffer (size = 701 x 701 pixels)
>>> Using 1-byte row alignment
>>>
>>> >>>>>>>>>> PIXEL FORMAT: RGB <<<<<<<<<<
>>> glDrawPixels(): 112.2 Mpixels/sec
>>> glReadPixels(): 172.4 Mpixels/sec (min = 113.4, max = 208.4, sdev = 20.38)
>>> glReadPixels() accounted for 96.69% of total readback time
>>>
>>> >>>>>>>>>> PIXEL FORMAT: RGBA <<<<<<<<<<
>>> glDrawPixels(): 124.1 Mpixels/sec
>>> glReadPixels(): 241.5 Mpixels/sec (min = 157.6, max = 271.7, sdev = 14.43)
>>> glReadPixels() accounted for 0.6267% of total readback time
>>>
>>> >>>>>>>>>> PIXEL FORMAT: BGR <<<<<<<<<<
>>> glDrawPixels(): 107.6 Mpixels/sec
>>> glReadPixels(): 143.5 Mpixels/sec (min = 114.4, max = 151.3, sdev = 3.703)
>>> glReadPixels() accounted for 97.27% of total readback time
>>>
>>> >>>>>>>>>> PIXEL FORMAT: BGRA <<<<<<<<<<
>>> glDrawPixels(): 104.1 Mpixels/sec
>>> glReadPixels(): 247.1 Mpixels/sec (min = 197.5, max = 279.2, sdev = 13.49)
>>> glReadPixels() accounted for 0.6108% of total readback time
>>>
>>> >>>>>>>>>> PIXEL FORMAT: ABGR <<<<<<<<<<
>>> glDrawPixels(): 104.9 Mpixels/sec
>>> glReadPixels(): 138.8 Mpixels/sec (min = 122.6, max = 145.3, sdev = 3.135)
>>> glReadPixels() accounted for 96.54% of total readback time
>>>
>>> >>>>>>>>>> PIXEL FORMAT: ARGB <<<<<<<<<<
>>> glDrawPixels(): 120.9 Mpixels/sec
>>> glReadPixels(): 138.8 Mpixels/sec (min = 114.0, max = 147.7, sdev = 3.362)
>>> glReadPixels() accounted for 96.49% of total readback time
>>>
>>> >>>>>>>>>> PIXEL FORMAT: RED <<<<<<<<<<
>>> glDrawPixels(): 111.9 Mpixels/sec
>>> glReadPixels(): 486.6 Mpixels/sec (min = 236.9, max = 638.9, sdev = 85.43)
>>> glReadPixels() accounted for 1.265% of total readback time
>>>
>>> FB Config = 0x6a
>>>
>>> On Friday, 17 April 2020 22:59:40 UTC+1, DRC wrote:
>>>>
>>>> I honestly have no idea. I am successfully able to use your ~/gnome script on my CentOS 7 and 8 machines (one has an nVidia GPU, the other AMD), as long as I make the script executable. The WM launches using VirtualGL, as expected.
>>>>
>>>> As far as performance, it occurred to me that the Intel GPU might have slow pixel readback. Try running '/opt/VirtualGL/bin/glreadtest' and '/opt/VirtualGL/bin/glreadtest -pbo' on the local display and post the results. If one particular readback mode is slow but others are fast, then we can work around that by using environment variables to tell VirtualGL which mode to use.
>>>>
>>>> DRC
>>>>
>>>> On 4/17/20 4:48 PM, Shak wrote:
>>>>
>>>> *echo $LD_PRELOAD* returns empty, so something is up. But my main measure of failure was that glxspheres64 (and glmark2) say that they render with llvmpipe. Given that I am using the default xstartup.turbovnc script, am I supposed to do something other than run *vncserver -wm ~/gnome -vgl* (I use a script as I can't figure out how else to pass "dbus-launch gnome-session" to -wm)?
>>>>
>>>> Some more benchmarks. I'm quite new to OpenGL, so these were just found after some web searches. If there are obvious and useful ones I should run, please let me know.
>>>>
>>>> gputest on host: 2600 fps
>>>> gputest via VNC: 370 fps
>>>> vglrun -sp gputest via VNC: 400 fps
>>>>
>>>> gfxbench (car chase) on host: 44 fps
>>>> gfxbench (car chase) via VNC: won't run on llvmpipe
>>>> vglrun gfxbench (car chase) via VNC: 28 fps
>>>>
>>>> On Friday, 17 April 2020 21:37:08 UTC+1, DRC wrote:
>>>>>
>>>>> Bear in mind that passing -wm and -vgl to the vncserver script does nothing but set environment variables (TVNC_WM and TVNC_VGL) that are picked up by the default xstartup.turbovnc script, so make sure you are using the default xstartup.turbovnc script. It's easy to verify whether the window manager is using VirtualGL. Just open a terminal in the TurboVNC session and echo the value of $LD_PRELOAD. It should contain something like "libdlfaker.so:libvglfaker.so" if VirtualGL is active, and you should be able to run OpenGL applications in the session without vglrun, and those applications should show that they are using the Intel OpenGL renderer.
>>>>>
>>>>> As far as the performance, you haven't mentioned any other benchmarks you have tested, other than glmark2. I've explained why that benchmark may be demonstrating lackluster performance. If you have other data points, then please share them.
>>>>>
>>>>> On 4/17/20 2:34 PM, Shak wrote:
>>>>>
>>>>> I ran the commands you suggested (I went with -p 1m) and am still seeing a big difference. I just find it strange to see it clearly working with glxspheres64, but not much else.
>>>>>
>>>>> $ *glxspheres64 -p 1000000*
>>>>> Polygons in scene: 999424 (61 spheres * 16384 polys/spheres)
>>>>> GLX FB config ID of window: 0xfe (8/8/8/0)
>>>>> Visual ID of window: 0x2bf
>>>>> Context is Direct
>>>>> OpenGL Renderer: llvmpipe (LLVM 9.0.1, 256 bits)
>>>>> 3.292760 frames/sec - 2.370366 Mpixels/sec
>>>>> 3.317006 frames/sec - 2.387820 Mpixels/sec
>>>>>
>>>>> $ *vglrun -sp glxspheres64 -p 1000000*
>>>>> Polygons in scene: 999424 (61 spheres * 16384 polys/spheres)
>>>>> GLX FB config ID of window: 0x6b (8/8/8/0)
>>>>> Visual ID of window: 0x288
>>>>> Context is Direct
>>>>> OpenGL Renderer: Mesa DRI Intel(R) HD Graphics P4600/P4700 (HSW GT2)
>>>>> 62.859812 frames/sec - 45.251019 Mpixels/sec
>>>>> 59.975806 frames/sec - 43.174903 Mpixels/sec
>>>>>
>>>>> BTW, GNOME is now working (where I ran the above). I'm trying to run the whole desktop in VGL, but *vncserver -wm ~/gnome -vgl* doesn't seem to do anything differently than it does without -vgl. Again, my gnome script is:
>>>>>
>>>>> #!/bin/sh
>>>>> dbus-launch gnome-session
>>>>>
>>>>> That said, the desktop isn't broken now, so that's an improvement on KDE. But how can I run the whole of GNOME under VGL?
>>>>>
>>>>> I think if I can get the desktop running in VGL and still not see the performance in apps that I do locally (apart from in glxspheres!), I will take that as the most I can do with my system over VNC (unless you find it helpful for me to debug further).
>>>>>
>>>>> Thanks,
>>>>>
>>>>> On Friday, 17 April 2020 19:04:48 UTC+1, DRC wrote:
>>>>>>
>>>>>> On 4/17/20 10:36 AM, Shak wrote:
>>>>>>
>>>>>> I ran glmark2 on the host display normally and then with software rendering. I've attached the results at the end of this message. I've attached this for completeness rather than to contradict your hunch, but they do tie up with the numbers I see via VGL, so I don't think this is a CPU/VNC issue.
>>>>>>
>>>>>> Hmmm... Well, you definitely are seeing a much greater speedup with glmark2 absent VirtualGL, so I can only guess that the benchmark is fine-grained enough that it's being affected by VGL's per-frame overhead. A more realistic way to compare the two drivers would be to use '[vglrun -sp] /opt/VirtualGL/bin/glxspheres -p {n}', where {n} is a fairly high number of polygons (at least 100,000.)
>>>>>>
>>>>>> I've tried repeating my experiments using GNOME, in case the issue is with KDE. However, I get the following when trying to run vglrun:
>>>>>>
>>>>>> $ *vglrun glxspheres64*
>>>>>> /usr/bin/vglrun: line 191: hostname: command not found
>>>>>> [VGL] NOTICE: Automatically setting VGL_CLIENT environment variable to
>>>>>> [VGL]    10.10.7.1, the IP address of your SSH client.
>>>>>> Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
>>>>>> libGL error: failed to authenticate magic 1
>>>>>> libGL error: failed to load driver: i965
>>>>>> GLX FB config ID of window: 0x6b (8/8/8/0)
>>>>>> Visual ID of window: 0x21
>>>>>> Context is Direct
>>>>>> OpenGL Renderer: llvmpipe (LLVM 9.0.1, 256 bits)
>>>>>> 17.228616 frames/sec - 17.859872 Mpixels/sec
>>>>>> 16.580449 frames/sec - 17.187957 Mpixels/sec
>>>>>>
>>>>>> You need to install whatever package provides /usr/bin/hostname for your Linux distribution. That will eliminate the vglrun error, although it's probably unrelated to this problem.
>>>>>> (Because of the error, vglrun is falsely detecting an X11-forwarded SSH environment and setting VGL_CLIENT, which would normally be used for the VGL Transport. However, since VirtualGL auto-detects an X11 proxy environment and enables the X11 Transport, the value of VGL_CLIENT should be ignored in this case.)
>>>>>>
>>>>>> I honestly have no clue how to proceed. I haven't observed these problems in any of the distributions I officially support, and I have no way to test Arch.
>>>>>>
>>>>>> I'm not sure what to make of these. I am using *vncserver -wm ~/gnome*, where gnome is the following script:
>>>>>>
>>>>>> #!/bin/sh
>>>>>> dbus-launch gnome-session
>>>>>>
>>>>>> I feel that I am close but still a way off.
>>>>>>
>>>>>> FWIW, I have previously tried using NoMachine, which is able to give me the perceived GL acceleration by "mirroring" my host display, but that just feels like the wrong way to achieve this (not least because it requires a monitor to be attached).
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> ==== RENDER TESTS ====
>>>>>>
>>>>>> $ *glmark2*
>>>>>> =======================================================
>>>>>>     glmark2 2017.07
>>>>>> =======================================================
>>>>>>     OpenGL Information
>>>>>>     GL_VENDOR:     Intel Open Source Technology Center
>>>>>>     GL_RENDERER:   Mesa DRI Intel(R) HD Graphics P4600/P4700 (HSW GT2)
>>>>>>     GL_VERSION:    3.0 Mesa 20.0.4
>>>>>> =======================================================
>>>>>> [build] use-vbo=false: FPS: 2493 FrameTime: 0.401 ms
>>>>>> =======================================================
>>>>>>                                   glmark2 Score: 2493
>>>>>> =======================================================
>>>>>>
>>>>>> $ *LIBGL_ALWAYS_SOFTWARE=1 glmark2*
>>>>>> ** GLX does not support GLX_EXT_swap_control or GLX_MESA_swap_control!
>>>>>> ** Failed to set swap interval. Results may be bounded above by refresh rate.
>>>>>> =======================================================
>>>>>>     glmark2 2017.07
>>>>>> =======================================================
>>>>>>     OpenGL Information
>>>>>>     GL_VENDOR:     VMware, Inc.
>>>>>>     GL_RENDERER:   llvmpipe (LLVM 9.0.1, 256 bits)
>>>>>>     GL_VERSION:    3.1 Mesa 20.0.4
>>>>>> =======================================================
>>>>>> ** GLX does not support GLX_EXT_swap_control or GLX_MESA_swap_control!
>>>>>> ** Failed to set swap interval. Results may be bounded above by refresh rate.
>>>>>> [build] use-vbo=false: FPS: 420 FrameTime: 2.381 ms
>>>>>> =======================================================
>>>>>>                                   glmark2 Score: 420
>>>>>> =======================================================
>>>>>>
>>>>>> On Thursday, 16 April 2020 23:21:59 UTC+1, DRC wrote:
>>>>>>>
>>>>>>> On 4/16/20 3:19 PM, Shak wrote:
>>>>>>>
>>>>>>> Thank you for the quick tips. I have posted some results at the end of this post, but they seem inconsistent. glxspheres64 shows the correct renderer in each case, and the performance shows the 6x results I was expecting. However, I do not see the same gains in glmark2, even though it also reports the correct renderer in each case. Again, I see a glmark2 score of 2000+ when running it on display :0.
>>>>>>>
>>>>>>> I don't know much about glmark2, but as with any benchmark, Amdahl's Law applies.
>>>>>>> That means that the total speedup from any enhancement (such as a GPU) is limited by the percentage of clock time during which that enhancement is used. Not all OpenGL workloads are GPU-bound in terms of performance. If the geometry and window size are both really small, then the performance could very well be CPU-bound. That's why, for instance, GLXgears is a poor OpenGL benchmark. Real-world applications these days assume the presence of a GPU, so they're going to have no qualms about trying to render geometries with hundreds of thousands or even millions of polygons. When you try to do that with software OpenGL, you'll see a big difference vs. GPU acceleration -- a difference that won't necessarily show up with tiny geometries.
>>>>>>>
>>>>>>> You can confirm that that's the case by running glmark2 on your local display without VirtualGL and forcing the use of the swrast driver. I suspect that the difference between swrast and i965 won't be very great in that scenario, either. (I should also mention that Intel GPUs aren't the fastest in the world, so you're never going to see as much of a speedup -- nor as large of a speedup in as many cases -- as you would see with AMD or nVidia.)
>>>>>>>
>>>>>>> The other thing is, if the benchmark is attempting to measure unrealistic frame rates -- like hundreds or thousands of frames per second -- then there is a small amount of per-frame overhead introduced by VirtualGL that may be limiting that frame rate. But the reality is that human vision can't usually detect more than 60 fps anyhow, so the difference between, say, 200 fps and 400 fps is not going to matter to an application user. At more realistic frame rates, VGL's overhead won't be noticeable.
>>>>>>>
>>>>>>> Performance measurement in a VirtualGL environment is more complicated than performance measurement in a local display environment, which is why there's a whole section of the VirtualGL User's Guide dedicated to it. Basically, since VGL introduces a small amount of per-frame overhead but no per-vertex overhead, at realistic frame rates and with modern server and client hardware, it will not appear any slower than a local display. However, some synthetic benchmarks may record slower performance due to the aforementioned overhead.
>>>>>>>
>>>>>>> In the meantime I have been trying to get the DE as a whole to run under acceleration. I record my findings here as a possible clue to my VGL issues above. In my .vnc/xstartup.turbovnc I use the following command:
>>>>>>>
>>>>>>> #normal start - works with llvmpipe and vglrun
>>>>>>> #exec startplasma-x11
>>>>>>>
>>>>>>> #VGL start
>>>>>>> exec vglrun +wm startplasma-x11
>>>>>>>
>>>>>>> And I also start TVNC with:
>>>>>>>
>>>>>>> $ vncserver -3dwm
>>>>>>>
>>>>>>> I'm not sure if vglrun, +wm, or -3dwm are redundant or working against each other, but I've also tried various combinations to no avail.
>>>>>>>
>>>>>>> Just use the default xstartup.turbovnc script ('rm ~/.vnc/xstartup.turbovnc' and re-run /opt/TurboVNC/bin/vncserver to create it) and start TurboVNC with '-wm startplasma-x11 -vgl'.
>>>>>>>
>>>>>>> * -3dwm is deprecated. Use -vgl instead. -3dwm/-vgl (or setting '$useVGL = 1;' in /etc/turbovncserver.conf or ~/.vnc/turbovncserver.conf) simply instructs xstartup.turbovnc to run the window manager startup script using 'vglrun +wm'.
>>>>>>>
>>>>>>> * Passing -wm to /opt/TurboVNC/bin/vncserver (or setting '$wm = {script};' in turbovncserver.conf) instructs xstartup.turbovnc to execute the specified window manager startup script rather than /etc/X11/xinit/xinitrc.
>>>>>>>
>>>>>>> * +wm is a feature of VirtualGL, not TurboVNC. Normally, if VirtualGL
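
To pull DRC's recipe above together in one place, here is a minimal sketch (the paths and the Plasma startup command are taken from this thread; the turbovncserver.conf lines simply restate the '$wm'/'$useVGL' settings he quotes, so treat them as an illustration rather than authoritative syntax):

# Fall back to the stock startup script and let TurboVNC recreate it:
rm ~/.vnc/xstartup.turbovnc

# Either pass the options on the command line...
/opt/TurboVNC/bin/vncserver -wm startplasma-x11 -vgl

# ...or set the equivalent options in ~/.vnc/turbovncserver.conf, e.g.:
#   $wm = "startplasma-x11";
#   $useVGL = 1;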
