It might be useful to try a known-working Distro+DE configuration using
VGL and TVNC with this HW.
Could you advise on the most straightforward recipe, starting from scratch,
for getting a minimal and foolproof system up and running?
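
For what it's worth, here is the sort of thing I had in mind, pieced together
from this thread and the docs -- a stock CentOS 8 + GNOME install plus the
official VirtualGL and TurboVNC packages -- so please correct anything that is
off (the exact package file names below are just my guess):

# install the downloaded VirtualGL and TurboVNC RPMs
sudo yum install -y ./VirtualGL-*.rpm ./turbovnc-*.rpm

# configure the 3D X server for VirtualGL access, then restart the display manager
sudo /opt/VirtualGL/bin/vglserver_config

# ~/gnome is the two-line wrapper script from below ("#!/bin/sh" + "dbus-launch gnome-session")
/opt/TurboVNC/bin/vncserver -wm ~/gnome -vgl

# sanity checks from a terminal inside the TurboVNC session
echo $LD_PRELOAD                    # should contain libdlfaker.so:libvglfaker.so
/opt/VirtualGL/bin/glxspheres64     # should report the Intel renderer, not llvmpipe
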
On Friday, 17 April 2020 23:21:39 UTC+1, Shak wrote:
>
> So despite my Plasma desktop being broken, I am able to use it "blind". I
> managed to run glxspheres64 (120fps), glmark2 (200), and gputest (300fps),
> and they all report that they use the IGD (though none of them reach host
> performance, apart from glxspheres64). So I guess even if I were able to run
> the DE via VGL it wouldn't make a difference to my results (which was
> expected).
>
> Here are the results from glreadtest:
>
> *==== glreadtest ====*
> GLreadtest v2.6.3 (Build 20200214)
>
> /usr/bin/glreadtest -h for advanced usage.
> Rendering to Pbuffer (size = 701 x 701 pixels)
> Using 1-byte row alignment
>
> >>>>>>>>>> PIXEL FORMAT: RGB <<<<<<<<<<
> glDrawPixels(): 107.6 Mpixels/sec
> glReadPixels(): 148.3 Mpixels/sec (min = 115.4, max = 156.2, sdev = 3.569)
> glReadPixels() accounted for 100.00% of total readback time
>
> >>>>>>>>>> PIXEL FORMAT: RGBA <<<<<<<<<<
> glDrawPixels(): 124.6 Mpixels/sec
> glReadPixels(): 181.2 Mpixels/sec (min = 160.9, max = 197.7, sdev = 5.405)
> glReadPixels() accounted for 100.00% of total readback time
>
> >>>>>>>>>> PIXEL FORMAT: BGR <<<<<<<<<<
> glDrawPixels(): 107.9 Mpixels/sec
> glReadPixels(): 149.0 Mpixels/sec (min = 135.5, max = 156.1, sdev = 2.881)
> glReadPixels() accounted for 100.00% of total readback time
>
> >>>>>>>>>> PIXEL FORMAT: BGRA <<<<<<<<<<
> glDrawPixels(): 104.7 Mpixels/sec
> glReadPixels(): 142.6 Mpixels/sec (min = 125.9, max = 150.4, sdev = 3.375)
> glReadPixels() accounted for 100.00% of total readback time
>
> >>>>>>>>>> PIXEL FORMAT: ABGR <<<<<<<<<<
> glDrawPixels(): 105.0 Mpixels/sec
> glReadPixels(): 143.7 Mpixels/sec (min = 130.7, max = 154.0, sdev = 2.879)
> glReadPixels() accounted for 100.00% of total readback time
>
> >>>>>>>>>> PIXEL FORMAT: ARGB <<<<<<<<<<
> glDrawPixels(): 118.6 Mpixels/sec
> glReadPixels(): 143.7 Mpixels/sec (min = 112.5, max = 151.9, sdev = 4.444)
> glReadPixels() accounted for 100.00% of total readback time
>
> >>>>>>>>>> PIXEL FORMAT: RED <<<<<<<<<<
> glDrawPixels(): 110.2 Mpixels/sec
> glReadPixels(): 157.9 Mpixels/sec (min = 122.3, max = 187.8, sdev = 6.647)
> glReadPixels() accounted for 100.00% of total readback time
>
> FB Config = 0x6a
>
> *==== glreadtest -pbo ====*
>
> GLreadtest v2.6.3 (Build 20200214)
> Using PBOs for readback
> Rendering to Pbuffer (size = 701 x 701 pixels)
> Using 1-byte row alignment
>
> >>>>>>>>>> PIXEL FORMAT: RGB <<<<<<<<<<
> glDrawPixels(): 112.2 Mpixels/sec
> glReadPixels(): 172.4 Mpixels/sec (min = 113.4, max = 208.4, sdev = 20.38)
> glReadPixels() accounted for 96.69% of total readback time
>
> >>>>>>>>>> PIXEL FORMAT: RGBA <<<<<<<<<<
> glDrawPixels(): 124.1 Mpixels/sec
> glReadPixels(): 241.5 Mpixels/sec (min = 157.6, max = 271.7, sdev = 14.43)
> glReadPixels() accounted for 0.6267% of total readback time
>
> >>>>>>>>>> PIXEL FORMAT: BGR <<<<<<<<<<
> glDrawPixels(): 107.6 Mpixels/sec
> glReadPixels(): 143.5 Mpixels/sec (min = 114.4, max = 151.3, sdev = 3.703)
> glReadPixels() accounted for 97.27% of total readback time
>
> >>>>>>>>>> PIXEL FORMAT: BGRA <<<<<<<<<<
> glDrawPixels(): 104.1 Mpixels/sec
> glReadPixels(): 247.1 Mpixels/sec (min = 197.5, max = 279.2, sdev = 13.49)
> glReadPixels() accounted for 0.6108% of total readback time
>
> >>>>>>>>>> PIXEL FORMAT: ABGR <<<<<<<<<<
> glDrawPixels(): 104.9 Mpixels/sec
> glReadPixels(): 138.8 Mpixels/sec (min = 122.6, max = 145.3, sdev = 3.135)
> glReadPixels() accounted for 96.54% of total readback time
>
> >>>>>>>>>> PIXEL FORMAT: ARGB <<<<<<<<<<
> glDrawPixels(): 120.9 Mpixels/sec
> glReadPixels(): 138.8 Mpixels/sec (min = 114.0, max = 147.7, sdev = 3.362)
> glReadPixels() accounted for 96.49% of total readback time
>
> >>>>>>>>>> PIXEL FORMAT: RED <<<<<<<<<<
> glDrawPixels(): 111.9 Mpixels/sec
> glReadPixels(): 486.6 Mpixels/sec (min = 236.9, max = 638.9, sdev = 85.43)
> glReadPixels() accounted for 1.265% of total readback time
>
> FB Config = 0x6a
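>
> Looking at these numbers, the 4-byte formats with PBOs (RGBA/BGRA) seem to be
> the fast path. Is the workaround you had in mind something like exporting the
> following before launching the app? (I'm guessing at the variable names from
> the User's Guide, so please correct me.)
>
> export VGL_READBACK=pbo
> export VGL_FORCEALPHA=1
> vglrun /opt/VirtualGL/bin/glxspheres64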
>
>
>
> On Friday, 17 April 2020 22:59:40 UTC+1, DRC wrote:
>>
>> I honestly have no idea. I am successfully able to use your ~/gnome
>> script on my CentOS 7 and 8 machines (one has an nVidia GPU, the other
>> AMD), as long as I make the script executable. The WM launches using
>> VirtualGL, as expected.
>>
>> As far as performance, it occurred to me that the Intel GPU might have
>> slow pixel readback. Try running '/opt/VirtualGL/bin/glreadtest' and
>> '/opt/VirtualGL/bin/glreadtest -pbo' on the local display and post the
>> results. If one particular readback mode is slow but others are fast, then
>> we can work around that by using environment variables to tell VirtualGL
>> which mode to use.
>>
>> DRC
>> On 4/17/20 4:48 PM, Shak wrote:
>>
>> *echo $LD_PRELOAD* returns empty, so something is up. But my main
>> measure of failure was that glxspheres64 (and glmark2) say that they render
>> with llvmpipe. Given that I am using the default xstartup.turbovnc, am I
>> supposed to do something other than run *vncserver -wm ~/gnome -vgl*? (I use
>> a script because I can't figure out how else to pass "dbus-launch
>> gnome-session" to -wm.)
>>
>> Some more benchmarks. I'm quite new to OpenGL so these were just found
>> after some web searches. If there are obvious and useful ones I should run,
>> please let me know.
>>
>> gputest on host: 2600fps
>> gputest via VNC: 370fps
>> vglrun -sp gputest via VNC: 400fps
>>
>> gfxbench (car chase) on host: 44fps
>> gfxbench (car chase) via VNC: won't run on llvmpipe
>> vglrun gfxbench (car chase) via VNC: 28fps
>>
>> On Friday, 17 April 2020 21:37:08 UTC+1, DRC wrote:
>>>
>>> Bear in mind that passing -wm and -vgl to the vncserver script does
>>> nothing but set environment variables (TVNC_WM and TVNC_VGL) that are
>>> picked up by the default xstartup.turbovnc script, so make sure you are
>>> using the default xstartup.turbovnc script. It's easy to verify whether
>>> the window manager is using VirtualGL. Just open a terminal in the
>>> TurboVNC session and echo the value of $LD_PRELOAD. It should contain
>>> something like "libdlfaker.so:libvglfaker.so" if VirtualGL is active, and
>>> you should be able to run OpenGL applications in the session without
>>> vglrun, and those applications should show that they are using the Intel
>>> OpenGL renderer.
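>>>
>>> For example, from a terminal inside the TurboVNC session (assuming glxinfo is
>>> installed), this is roughly what you would hope to see:
>>>
>>> $ echo $LD_PRELOAD
>>> libdlfaker.so:libvglfaker.so
>>> $ glxinfo | grep "OpenGL renderer"
>>> OpenGL renderer string: Mesa DRI Intel(R) HD Graphics P4600/P4700 (HSW GT2)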
>>>
>>> As far as the performance, you haven't mentioned any other benchmarks
>>> you have tested, other than glmark2. I've explained why that benchmark may
>>> be demonstrating lackluster performance. If you have other data points,
>>> then please share them.
>>> On 4/17/20 2:34 PM, Shak wrote:
>>>
>>> I ran the commands you suggested (I went with -p 1m) and am still seeing
>>> a big difference. I just find it strange to see it clearly working with
>>> glxspheres64, but not much else.
>>>
>>> $ *glxspheres64 -p 1000000*
>>> Polygons in scene: 999424 (61 spheres * 16384 polys/spheres)
>>> GLX FB config ID of window: 0xfe (8/8/8/0)
>>> Visual ID of window: 0x2bf
>>> Context is Direct
>>> OpenGL Renderer: llvmpipe (LLVM 9.0.1, 256 bits)
>>> 3.292760 frames/sec - 2.370366 Mpixels/sec
>>> 3.317006 frames/sec - 2.387820 Mpixels/sec
>>> $ *vglrun -sp glxspheres64 -p 1000000*
>>> Polygons in scene: 999424 (61 spheres * 16384 polys/spheres)
>>> GLX FB config ID of window: 0x6b (8/8/8/0)
>>> Visual ID of window: 0x288
>>> Context is Direct
>>> OpenGL Renderer: Mesa DRI Intel(R) HD Graphics P4600/P4700 (HSW GT2)
>>> 62.859812 frames/sec - 45.251019 Mpixels/sec
>>> 59.975806 frames/sec - 43.174903 Mpixels/sec
>>>
>>> BTW, GNOME is now working (where I ran the above). I'm trying to run the
>>> whole desktop in VGL, but *vncserver -wm ~/gnome -vgl* doesn't seem to
>>> do anything differently than it does without -vgl. Again, my gnome script
>>> is:
>>>
>>> #!/bin/sh
>>> dbus-launch gnome-session
>>>
>>> That said, the desktop isn't broken now so that's an improvement on KDE.
>>> But how can I run the whole of GNOME under VGL?
>>>
>>> I think if I can get the desktop running in VGL and still don't see the
>>> performance in apps that I see locally (apart from in glxspheres!), I will
>>> take that as the most I can do with my system over VNC (unless you find it
>>> helpful for me to debug further).
>>>
>>> Thanks,
>>>
>>>
>>> On Friday, 17 April 2020 19:04:48 UTC+1, DRC wrote:
>>>>
>>>> On 4/17/20 10:36 AM, Shak wrote:
>>>>
>>>> I ran glmark2 on the host display normally and then with software
>>>> rendering; the results are at the end of this message. I include them for
>>>> completeness rather than to contradict your hunch, but they do tie up with
>>>> the numbers I see via VGL, so I don't think this is a CPU/VNC issue.
>>>>
>>>> Hmmm... Well, you definitely are seeing a much greater speedup with
>>>> glmark2 absent VirtualGL, so I can only guess that the benchmark is
>>>> fine-grained enough that it's being affected by VGL's per-frame overhead.
>>>> A more realistic way to compare the two drivers would be using '[vglrun
>>>> -sp] /opt/VirtualGL/bin/glxspheres -p {n}', where {n} is a fairly high
>>>> number of polygons (at least 100,000.)
>>>>
>>>>
>>>> I've tried repeating my experiments using gnome, in case the issue is
>>>> with KDE. However I get the following when trying to run vglrun:
>>>>
>>>> $ *vglrun glxspheres64*
>>>> /usr/bin/vglrun: line 191: hostname: command not found
>>>> [VGL] NOTICE: Automatically setting VGL_CLIENT environment variable to
>>>> [VGL] 10.10.7.1, the IP address of your SSH client.
>>>> Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
>>>> libGL error: failed to authenticate magic 1
>>>> libGL error: failed to load driver: i965
>>>> GLX FB config ID of window: 0x6b (8/8/8/0)
>>>> Visual ID of window: 0x21
>>>> Context is Direct
>>>> OpenGL Renderer: llvmpipe (LLVM 9.0.1, 256 bits)
>>>> 17.228616 frames/sec - 17.859872 Mpixels/sec
>>>> 16.580449 frames/sec - 17.187957 Mpixels/sec
>>>>
>>>> You need to install whatever package provides /usr/bin/hostname for
>>>> your Linux distribution. That will eliminate the vglrun error, although
>>>> it's probably unrelated to this problem. (Because of the error, vglrun is
>>>> falsely detecting an X11-forward SSH environment and setting VGL_CLIENT,
>>>> which would normally be used for the VGL Transport. However, since
>>>> VirtualGL auto-detects an X11 proxy environment and enables the X11
>>>> Transport, the value of VGL_CLIENT should be ignored in this case.)
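>>>>
>>>> (On Arch, I believe /usr/bin/hostname is provided by the inetutils package,
>>>> so something like 'sudo pacman -S inetutils' should take care of it -- but I
>>>> can't test Arch here, so treat that as a guess.)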
>>>>
>>>> I honestly have no clue how to proceed. I haven't observed these
>>>> problems in any of the distributions I officially support, and I have no
>>>> way to test Arch.
>>>>
>>>> I'm not sure what to make of these. I am using *vncserver -wm ~/gnome*,
>>>> where gnome is the following script.
>>>>
>>>> #!/bin/sh
>>>> dbus-launch gnome-session
>>>>
>>>> I feel that I am close but still a way off.
>>>>
>>>> FWIW, I have previously tried using NoMachine, which is able to give me
>>>> the perceived GL acceleration by "mirroring" my host display, but that just
>>>> feels like the wrong way to achieve this (not least because it requires a
>>>> monitor to be attached).
>>>>
>>>> Thanks,
>>>>
>>>> ==== RENDER TESTS ====
>>>>
>>>> $ *glmark2*
>>>> =======================================================
>>>> glmark2 2017.07
>>>> =======================================================
>>>> OpenGL Information
>>>> GL_VENDOR: Intel Open Source Technology Center
>>>> GL_RENDERER: Mesa DRI Intel(R) HD Graphics P4600/P4700 (HSW GT2)
>>>> GL_VERSION: 3.0 Mesa 20.0.4
>>>> =======================================================
>>>> [build] use-vbo=false: FPS: 2493 FrameTime: 0.401 ms
>>>> =======================================================
>>>> glmark2 Score: 2493
>>>> =======================================================
>>>>
>>>> $ *LIBGL_ALWAYS_SOFTWARE=1 glmark2*
>>>> ** GLX does not support GLX_EXT_swap_control or GLX_MESA_swap_control!
>>>> ** Failed to set swap interval. Results may be bounded above by refresh rate.
>>>> =======================================================
>>>> glmark2 2017.07
>>>> =======================================================
>>>> OpenGL Information
>>>> GL_VENDOR: VMware, Inc.
>>>> GL_RENDERER: llvmpipe (LLVM 9.0.1, 256 bits)
>>>> GL_VERSION: 3.1 Mesa 20.0.4
>>>> =======================================================
>>>> ** GLX does not support GLX_EXT_swap_control or GLX_MESA_swap_control!
>>>> ** Failed to set swap interval. Results may be bounded above by refresh rate.
>>>> [build] use-vbo=false: FPS: 420 FrameTime: 2.381 ms
>>>> =======================================================
>>>> glmark2 Score: 420
>>>> =======================================================
>>>>
>>>>
>>>> On Thursday, 16 April 2020 23:21:59 UTC+1, DRC wrote:
>>>>>
>>>>> On 4/16/20 3:19 PM, Shak wrote:
>>>>>
>>>>> Thank you for the quick tips. I have posted some results at the end of
>>>>> this post, but they seem inconsistent. glxspheres64 reports the correct
>>>>> renderer in each case, and its performance shows the 6x speedup I was
>>>>> expecting. However, I do not see the same gains in glmark2, even though it
>>>>> also reports the correct renderer in each case. Again, I see a glmark2
>>>>> score of 2000+ when running it on display :0.
>>>>>
>>>>> I don't know much about glmark2, but as with any benchmark, Amdahl's Law
>>>>> applies. That means that the total speedup from any enhancement (such as a
>>>>> GPU) is limited by the percentage of clock time during which that
>>>>> enhancement is used. Not all OpenGL workloads are GPU-bound in terms of
>>>>> performance. If the geometry and window size are both really small, then
>>>>> the performance could very well be CPU-bound. That's why, for instance,
>>>>> GLXgears is a poor OpenGL benchmark. Real-world applications these days
>>>>> assume the presence of a GPU, so they're going to have no qualms about
>>>>> trying to render geometries with hundreds of thousands or even millions of
>>>>> polygons. When you try to do that with software OpenGL, you'll see a big
>>>>> difference vs. GPU acceleration-- a difference that won't necessarily show
>>>>> up with tiny geometries.
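>>>>>
>>>>> (To put a number on it: Amdahl's Law gives an overall speedup of
>>>>> 1 / ((1 - p) + p/s), where p is the fraction of frame time spent in the part
>>>>> being accelerated and s is how much faster that part becomes. If only half
>>>>> of each frame is GPU-bound rendering work (p = 0.5), then even an infinitely
>>>>> fast GPU tops out at a 2x overall speedup, because the other half of the
>>>>> frame is still CPU work.)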
>>>>>
>>>>> You can confirm that that's the case by running glmark2 on your local
>>>>> display without VirtualGL and forcing the use of the swrast driver. I
>>>>> suspect that the difference between swrast and i965 won't be very great in
>>>>> that scenario, either. (I should also mention that Intel GPUs aren't the
>>>>> fastest in the world, so you're never going to see as much of a speedup--
>>>>> nor as large of a speedup in as many cases-- as you would see with AMD or
>>>>> nVidia.)
>>>>>
>>>>> The other thing is, if the benchmark is attempting to measure unrealistic
>>>>> frame rates-- like hundreds or thousands of frames per second-- then there
>>>>> is a small amount of per-frame overhead introduced by VirtualGL that may be
>>>>> limiting that frame rate. But the reality is that human vision can't usually
>>>>> detect more than 60 fps anyhow, so the difference between, say, 200 fps and
>>>>> 400 fps is not going to matter to an application user. At more realistic
>>>>> frame rates, VGL's overhead won't be noticeable.
>>>>>
>>>>> Performance measurement in a VirtualGL environment is more complicated
>>>>> than performance measurement in a local display environment, which is why
>>>>> there's a whole section of the VirtualGL User's Guide dedicated to it.
>>>>> Basically, since VGL introduces a small amount of per-frame overhead but no
>>>>> per-vertex overhead, at realistic frame rates and with modern server and
>>>>> client hardware, it will not appear any slower than a local display.
>>>>> However, some synthetic benchmarks may record slower performance due to the
>>>>> aforementioned overhead.
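>>>>>
>>>>> (If you want to see where the time is going, running the application with
>>>>> 'vglrun +pr' should print VirtualGL's readback/compression profile, and the
>>>>> tcbench utility in /opt/VirtualGL/bin can measure the frame rate that
>>>>> actually reaches the TurboVNC viewer. The User's Guide covers both.)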
>>>>>
>>>>>
>>>>> In the meantime I have been trying to get the DE as a whole to run
>>>>> under acceleration. I record my findings here as a possible clue to my VGL
>>>>> issues above. In my .vnc/xstartup.turbovnc I use the following command:
>>>>>
>>>>> #normal start - works with llvmpipe and vglrun
>>>>> #exec startplasma-x11
>>>>>
>>>>> #VGL start
>>>>> exec vglrun +wm startplasma-x11
>>>>>
>>>>> And I also start tvnc with:
>>>>>
>>>>> $vncserver -3dwm
>>>>>
>>>>> I'm not sure if vglrun, +wm or -3dwm are redundant or working against
>>>>> each other, but I've also tried various combinations to no avail.
>>>>>
>>>>> Just use the default xstartup.turbovnc script ('rm ~/.vnc/xstartup.turbovnc'
>>>>> and re-run /opt/TurboVNC/bin/vncserver to create it) and start TurboVNC with
>>>>> '-wm startplasma-x11 -vgl'.
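>>>>>
>>>>> In other words, something along these lines (assuming the existing session
>>>>> is :1):
>>>>>
>>>>> /opt/TurboVNC/bin/vncserver -kill :1
>>>>> rm ~/.vnc/xstartup.turbovnc
>>>>> /opt/TurboVNC/bin/vncserver -wm startplasma-x11 -vgl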
>>>>>
>>>>> * -3dwm is deprecated. Use -vgl instead. -3dwm/-vgl (or setting
>>>>> '$useVGL = 1;' in /etc/turbovncserver.conf or ~/.vnc/turbovncserver.conf)
>>>>> simply instructs xstartup.turbovnc to run the window manager startup script
>>>>> using 'vglrun +wm'.
>>>>>
>>>>> * Passing -wm to /opt/TurboVNC/bin/vncserver (or setting '$wm = {script};'
>>>>> in turbovncserver.conf) instructs xstartup.turbovnc to execute the specified
>>>>> window manager startup script rather than /etc/X11/xinit/xinitrc (see the
>>>>> config snippet after this list).
>>>>>
>>>>> * +wm is a feature of VirtualGL, not TurboVNC. Normally, if VirtualGL
>>>>> detects that an OpenGL application is not monitoring StructureNotify
>>>>> events, VGL will monitor those events on behalf of the application (which
>>>>> allows VGL to be notified when the window changes size, thus allowing VGL
>>>>> to change the size of the corresponding Pbuffer.) This is, however,
>>>>> unnecessary with window managers and interferes with some of them (compiz,
>>>>> specifically), so +wm disables that behavior in VirtualGL. It's also a
>>>>> placeholder in case future issues are discovered that are specific to
>>>>> compositing window managers (+wm could easily be extended to handle those
>>>>> issues as well.)
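>>>>>
>>>>> So, instead of passing the command-line options every time, the equivalent
>>>>> one-time setup would presumably be to put the following in
>>>>> ~/.vnc/turbovncserver.conf:
>>>>>
>>>>> $wm = "startplasma-x11";
>>>>> $useVGL = 1;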
>>>>>
>>>>> Interestingly, I had to update the vglrun script to use the full paths to
>>>>> /usr/lib/libdlfaker.so and the other faker libraries; otherwise I see the
>>>>> following in the TVNC logs:
>>>>>
>>>>> ERROR: ld.so: object 'libdlfaker.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
>>>>> ERROR: ld.so: object 'libvglfaker.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
>>>>>
>>>>> That said, my desktop is still broken even when these errors disappear.
>>>>>
>>>>> Could my various issues be to do with KDE?
>>>>>
>>>>> The LD_PRELOAD issues can be fixed as described here:
>>>>>
>>>>> https://cdn.rawgit.com/VirtualGL/virtualgl/2.6.3/doc/index.html#hd0012
>>>>>
>>>>> All
>>>>>
>>>>