To clarify and tie my conclusions below to the comment I made previously
about rectangle sizes:
One reason why the 2D datasets, which mostly represent legacy
(primitive-based and single-buffered, as opposed to image-based and
double-buffered) X11 rendering workloads, perform best with the TurboVNC
encoder is that they have relatively small framebuffer updates. To put
numbers on this, here are the average framebuffer update rectangle sizes
(in pixels) of the various datasets:
slashdot-24: 3467
photos-24: 2962
kde-hearts-24: 1854
3dsmax-04-24: 998105
catia-02-24: 1005257
ensight-03-24: 837883
light-08-24: 681317
maya-02-24: 985356
proe-04-24: 843144
sw-01-24: 915438
tcvis-01-24: 786859
ugnx-01-24: 793831
glxspheres-24: 478242
googleearth-24: 8898
q3demo-24: 14875
The smaller the rectangle, the greater the chance that it will have a
low enough number of unique colors to qualify for indexed color
subencoding. (See the sketch below.) More modern image-based workloads
usually double buffer, so the rendering occurs off-screen, and the
entire back buffer is swapped to the X11 display in one throw. Such
workloads generally have large framebuffer update rectangles. That type
of workload is very similar to what VirtualGL does, so the 3D datasets
(which consist of OpenGL applications and Viewperf datasets running
with VirtualGL) are more reflective of modern X11 applications.

However, some of those Viewperf datasets simulate wireframe modes in
certain CAD applications, so they have relatively low numbers of unique
colors and still benefit from the TurboVNC encoder (relative to pure
JPEG or modern video codecs such as H.264.) Other Viewperf datasets
have large areas of solid color and also benefit from the TurboVNC
encoder. Applications such as games or Google Earth that fill the whole
screen, render a large number of unique colors, and render few areas of
solid color are the best candidates for pure JPEG or video codecs.

The small rectangle size in the Google Earth and Quake 3 datasets is
likely a result of the ancient session capture infrastructure mentioned
in the caveat below. However, even with those small rectangles, both
datasets generally benefited from pure JPEG encoding because of their
high color counts. On the flip side, some of the 3D datasets (Catia and
Teamcenter Vis, for instance) never benefited from pure JPEG, despite
having large rectangle sizes, because of their low color counts.

It is also worth mentioning that JPEG is designed to compress
continuous-tone images, so it does a relatively poor job of compressing
sharp features, such as those generated by wireframe modes in CAD
applications. Wireframe modes were once more common, because they
provided a way to smoothly interact with models that couldn't otherwise
be rendered in real time by the slow 3D accelerators available at the
time. (The first 3D accelerators I worked with in the mid-1990s, based
on the 3Dlabs GLINT chip, could render about 300k polys/sec.) Those
modes are less common these days, but they still exist.
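
To make the indexed color qualification concrete, here is a rough
sketch of the kind of color-counting test involved. This is an
illustration only, not the actual tight.c code; the function name is
made up, and the threshold is hard-coded at 96 for the example (the
real threshold is 24 or 96, depending on the Tight compression level):

#include <stdint.h>

#define PALETTE_THRESHOLD 96  /* Tight uses 24 or 96, depending on level */

/* Return 1 if the rectangle has few enough unique colors to be sent with
   indexed color subencoding, 0 if it should fall through to JPEG. */
static int qualifies_for_indexed_color(const uint32_t *pixels, int npixels)
{
  uint32_t palette[PALETTE_THRESHOLD];
  int ncolors = 0, i, j;

  for (i = 0; i < npixels; i++) {
    int found = 0;
    for (j = 0; j < ncolors; j++)
      if (palette[j] == pixels[i]) { found = 1;  break; }
    if (!found) {
      if (ncolors == PALETTE_THRESHOLD)
        return 0;  /* too many unique colors; encode the rectangle as JPEG */
      palette[ncolors++] = pixels[i];
    }
  }
  return 1;
}

The larger the rectangle, the more likely it is that the color count
blows past that threshold before the scan finishes, which is why the
small 2D rectangles qualify so often and the megapixel 3D rectangles
rarely do.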
tl;dr: The TurboVNC encoder is a compromise that maximizes performance
across all of those application categories as best it can, but there are
specific application categories for which a more video-like encoder is a
better solution. The 2D datasets are the same datasets that Constantin
used when designing the TightVNC encoder, so one of the goals of the
TurboVNC encoder overhaul in 2008 was to provide compression ratios on
those datasets similar to those of TightVNC 1.3.x (to convince
TightVNC users that they could switch to TurboVNC without losing any
performance on low-bandwidth networks) while providing optimal
compression ratios and performance for 3D applications running with
VirtualGL.
DRC
On 8/16/23 5:11 PM, DRC wrote:
I did some low-level experiments with the TurboVNC Benchmark Tools
(https://github.com/TurboVNC/vncbenchtools), comparing the existing
TurboVNC encoder, accelerated with the Intel zlib implementation and
using the "Tight + Perceptually Lossless JPEG" and "Tight +
Medium-Quality JPEG" presets, against pure JPEG encoding with the same
JPEG quality and subsampling levels. The results were interesting.
Perceptually Lossless JPEG:
As expected based on prior research
(https://turbovnc.org/pmwiki/uploads/About/tighttoturbo.pdf), pure
JPEG with no other modifications produced worse (often much worse)
compression with all datasets except Lightscape, GLXSpheres, and Quake
3, which compressed about 4-5% better. (Catia, Teamcenter Vis,
Unigraphics NX, and Google Earth regressed by less than 10%.) Pure
JPEG with no other modifications also produced worse (often much
worse) performance with all datasets except Pro/E (+8%) and Quake 3
(+4%). (3D Studio Max, Ensight, Lightscape, and Google Earth
regressed by less than 10%.)
However, modifying the TurboJPEG API library so that it generates
"abbreviated image datastreams", i.e. JPEG images with no embedded
tables (the equivalent of motion-JPEG frames), improved the
performance of pure JPEG somewhat. Now Lightscape (+5%), GLXSpheres
(+6%), Google Earth (+9%), and Quake 3 (+16%) compressed better with
pure JPEG than with the TurboVNC encoder. (photos-24, kde-hearts-24,
Catia, Teamcenter Vis, and Unigraphics NX regressed by less than
10%.) As above, only Pro/E (+8%) and Quake 3 (+5%) were faster with
pure JPEG. (3D Studio Max, Ensight, Lightscape, and Google Earth
regressed by less than 10%.)
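For reference, abbreviated datastreams are something the underlying
libjpeg API already supports, so the general mechanism looks roughly
like the sketch below. This is just an illustration, not the actual
TurboJPEG modification; the helper function and the quality setting are
made up for the example.

#include <stdio.h>
#include <jpeglib.h>

/* Compress one frame as an abbreviated image datastream (no embedded
   quantization/Huffman tables), assuming the decoder has already received
   the tables out of band, e.g. via a one-time tables-only datastream
   written with jpeg_write_tables().  cinfo must already have been set up
   with jpeg_create_compress() and an error manager. */
static void compress_abbreviated_frame(struct jpeg_compress_struct *cinfo,
                                       const unsigned char *rgb, int width,
                                       int height, unsigned char **jpegBuf,
                                       unsigned long *jpegSize)
{
  jpeg_mem_dest(cinfo, jpegBuf, jpegSize);  /* allocates if *jpegBuf == NULL */
  cinfo->image_width = width;
  cinfo->image_height = height;
  cinfo->input_components = 3;
  cinfo->in_color_space = JCS_RGB;
  jpeg_set_defaults(cinfo);
  jpeg_set_quality(cinfo, 80, TRUE);  /* arbitrary quality for the example */

  /* Mark all tables as already transmitted ... */
  jpeg_suppress_tables(cinfo, TRUE);
  /* ... and pass FALSE so that only unsent tables (i.e. none) are emitted. */
  jpeg_start_compress(cinfo, FALSE);

  while (cinfo->next_scanline < cinfo->image_height) {
    JSAMPROW row = (JSAMPROW)&rgb[cinfo->next_scanline * width * 3];
    jpeg_write_scanlines(cinfo, &row, 1);
  }
  jpeg_finish_compress(cinfo);
}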
I then repeated the same tests with the Interframe Comparison Engine
enabled. Interframe comparison significantly improves compression
with the 2D datasets and produces mixed results with the 3D datasets, but the
relative differences between the TurboVNC encoder and pure JPEG were
pretty similar with interframe comparison enabled vs. disabled. With
interframe comparison enabled, Pro/E now compressed better with pure
JPEG than with the TurboVNC encoder, and Unigraphics NX compressed
about the same. However, all of the other datasets were generally in
the same relative ranges as above (give or take.)
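For anyone unfamiliar with it, the Interframe Comparison Engine roughly
amounts to keeping a copy of the framebuffer as the viewer last saw it
and culling rectangles whose pixels haven't actually changed. A greatly
simplified sketch of the idea (illustration only; the function name is
made up, and the real engine also deals with pixel formats, partial
changes, etc.):

#include <stdint.h>
#include <string.h>

/* Compare a rectangle of the current framebuffer against a shadow copy of
   what the viewer last received.  Returns 0 if nothing changed, in which
   case the rectangle can be dropped from the update. */
static int rect_changed(uint32_t *shadow, const uint32_t *fb, int fbWidth,
                        int x, int y, int w, int h)
{
  int row, changed = 0;

  for (row = y; row < y + h; row++) {
    uint32_t *prev = &shadow[row * fbWidth + x];
    const uint32_t *cur = &fb[row * fbWidth + x];

    if (memcmp(prev, cur, w * sizeof(uint32_t))) {
      memcpy(prev, cur, w * sizeof(uint32_t));  /* refresh the shadow copy */
      changed = 1;
    }
  }
  return changed;
}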
Medium-Quality JPEG:
When you reduce the JPEG quality, the size of JPEG rectangles and
subrectangles decreases, but the size of indexed-color rectangles
stays the same. Thus, pure JPEG was more advantageous with
medium-quality JPEG than it was with perceptually lossless JPEG. In
this case, I tested only with interframe comparison enabled (since the
"Tight + Medium-Quality JPEG" preset always enables it) and only with
abbreviated image datastreams.
kde-hearts-16 (+4%), photos-24 (+3%), kde-hearts-24 (+11%), Lightscape
(+5%), Pro/E (+6%), Unigraphics NX (+4%), GLXSpheres (+12%), Google
Earth (+24%), and Quake 3 (+39%) compressed better with pure JPEG than
with the TurboVNC encoder. (Catia and Teamcenter Vis regressed by
less than 10%.) 3D Studio Max (+13%), Ensight (+38%), and Pro/E
(+13%) were faster with pure JPEG. (Lightscape, SolidWorks, Google
Earth, and Quake 3 regressed by less than 10%.)
AVX2 Instructions:
The initial tests were conducted on an older machine that lacks AVX2
instructions, so I re-ran the tests on a newer machine. This gave
pure JPEG more of a performance advantage, since the Intel zlib
implementation cannot use AVX2 instructions but libjpeg-turbo can.
With perceptually lossless JPEG: Ensight (+3%), Pro/E (+11%), and
Quake 3 (+7%) were faster with pure JPEG than with the TurboVNC
encoder, and 3D Studio Max was about the same. (Lightscape and Google
Earth regressed by less than 10%.)
With medium-quality JPEG: 3D Studio Max (+26%), Ensight (+57%), Pro/E
(+18%), SolidWorks (+8%), and Quake 3 (+3%) were faster with pure
JPEG, and Maya was about the same. (kde-hearts-24, Catia, Lightscape,
and Google Earth regressed by less than 10%.)
Caveat:
The datasets in question were captured in the early 2000s (the 3D
datasets in 2008 and the 2D datasets years earlier), so many of them
represent outdated workloads. (Most modern X11 applications use some
form of image-based rendering rather than X11 primitives.) Also,
because of limitations in the benchmark tools (which were inherited
from TightVNC), the datasets had to be generated using a very old VNC
server (TightVNC 1.3.9) and viewer (RealVNC 3.3.6) and an RFB proxy
that sat between the two. That infrastructure was slow and effectively
dropped a lot of frames, so the session captures and benchmark tools
are not the best simulation of the TurboVNC Server. It would not
surprise me if pure JPEG performs better in real-world usage than is
reflected above.
General Conclusions:
Unsurprisingly, the TurboVNC encoder is the most advantageous,
relative to pure JPEG, on older (X11-primitive-based) workloads,
workloads with fewer unique colors, and workloads with large areas of
solid color. Pure JPEG is the most advantageous on image-based
workloads, workloads with more unique colors, and workloads with few
areas of solid color. Pure JPEG also has more of an advantage when
the JPEG quality is decreased and when AVX2 instructions are available.
It seems as if pure JPEG encoding is advantageous enough in enough
cases to justify its existence. I will look at including it in the
next major release of TurboVNC, along with GUI modifications
(https://github.com/TurboVNC/turbovnc/issues/70, as well as exposing
the CompatGUI parameter in the GUI) that will make it more
straightforward to enable non-Tight encodings.
I suspect that, if I were to completely revisit my analysis from 2008
and develop entirely new datasets, I would find little justification
for indexed color subencoding with modern applications. That would
mean that most of the advantage of the TurboVNC encoder these days
comes from its ability to send large areas of solid color using only a
few bytes. Both X11 and RFB were designed around the limitations of
1980s systems (including the need to support single-buffered graphics
systems.) Wayland jettisons the X11 legacy, but there is also a
burning need for a more modern open source/open standard remote
display protocol that is not beholden to the RFB legacy, preferably a
protocol that is a better fit for image-based workloads, Wayland,
GPU-resident framebuffers, and modern video codecs. See
https://www.reddit.com/r/linux_gaming/comments/yvjqby/comment/jvricah/?utm_source=reddit&utm_medium=web2x&context=3,
https://github.com/TurboVNC/turbovnc/issues/18,
https://github.com/TurboVNC/turbovnc/issues/19, and
https://github.com/TurboVNC/turbovnc/issues/373 for more of my musings
on that topic. Do I think that anyone will ever fund that kind of
blue-sky research in an open source project such as this? Probably
not. TurboVNC is innovative compared to other VNC solutions and maybe
compared to most (but not all) open source remote display solutions,
but there are proprietary solutions these days that do a lot of things
that VNC will never be able to do. (Let's start with streaming over
UDP, which the RFB protocol could never support.) People mostly use
TurboVNC because it's free and good enough, so I don't foresee being
able to do much more with the protocol other than minor tweaks like
this that allow it to get out of the way of certain use cases.
On 7/7/23 6:26 AM, RG wrote:
Thanks for this in-depth answer,
I have been playing with Raw and found that it has performance similar
to a single Tight rectangle + JPEG 80, but I see more latency variation
from the network/communication (likely because my system also has other
activity on localhost and there is a lot more data to transfer).

I am trying to implement pure JPEG, as, based on your explanation, I
think it will be a cleaner solution than increasing the max rect size.

Also, RealVNC already seems to have a pure JPEG rectangle type (as
encoding 21), if you ever decide to implement it.
Rémi
On Thursday, June 15, 2023 at 5:34:08 PM UTC+2, DRC wrote:
The Tight encoding specification requires rectangles to be <=
2048 pixels in width, but there isn't any documented limit on the
rectangle size. I don't think that you're playing with fire
necessarily, although the Tight encoder has never been tested
with rectangles > 64k, so I can't guarantee that there aren't
hidden bugs. However, I question whether Tight encoding is the
most appropriate way to transfer pixels in a loopback
environment. It seems like you might be better served by
transferring the pixels using Raw encoding, which would require
more bus bandwidth but less CPU time.
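(Back-of-the-envelope, assuming 32 bits per pixel: a 1000x1000
update at 30 Hz is about 1000 x 1000 x 4 x 30 = 120 MB/s of raw
pixel data, which is trivial over loopback but would be a
non-starter on most real networks.)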
Referring to
https://turbovnc.org/pmwiki/uploads/About/tighttoturbo.pdf, a big
reason why TurboVNC itself doesn't use larger rectangles is that
there's a tradeoff in terms of encoding efficiency. The larger
the rectangle, the more likely it is that the number of unique
colors in the rectangle will exceed the Tight palette threshold.
Thus, as the rectangle size increases, you will reach a point at
which only JPEG is used to encode the non-solid subrectangles.
At that point, there isn't much benefit to the complexity of
Tight encoding, and you'd probably get better performance by
simply encoding every rectangle as a pure JPEG image. I am not
sure whether 1 megapixel is beyond that point, but given that the
palette threshold is low (24 or 96 colors, depending on the Tight
compression level), it wouldn't surprise me if 1-megapixel
rectangles are almost always encoded as JPEG. In that case, the
only real benefit you'd get from Tight encoding is a slight
reduction in the bitstream size if there are huge areas of solid
color, since the Tight encoder can encode those as a bounding box
and fill color (whereas JPEG has a not-insignificant amount of
overhead, both in terms of compute time and bitstream size, when
encoding a single-color image.) However, I don't know whether
that benefit is worth the additional computational overhead of
analyzing the rectangle, nor whether it is worth the additional
bitstream size overhead of dividing non-solid areas of the
rectangle into multiple JPEG subrectangles (as opposed to sending
the whole rectangle as a single JPEG image.)
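To illustrate the solid color case, the basic test is dirt cheap.
(Again, just a sketch with a made-up function name, not the actual
encoder code, which grows solid areas into maximal bounding boxes
rather than testing a fixed rectangle.)

#include <stdint.h>

/* If every pixel in the rectangle matches the first one, the whole area
   can be sent as a fill subencoding -- just a header and one color value. */
static int is_solid(const uint32_t *fb, int fbWidth, int x, int y, int w,
                    int h, uint32_t *fillColor)
{
  uint32_t first = fb[y * fbWidth + x];
  int row, col;

  for (row = y; row < y + h; row++)
    for (col = x; col < x + w; col++)
      if (fb[row * fbWidth + col] != first)
        return 0;

  *fillColor = first;
  return 1;
}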
That also explains why I wouldn't
be willing to add a Tight compression level with a rectangle size
of 1048576. I would, however, be willing to support the pure RFB
JPEG encoding type in TurboVNC, if it proves to be of any
benefit. That encoding type is dead simple and would involve
merely passing every RFB rectangle directly to libjpeg-turbo.
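In other words, something along the lines of the sketch below, using
the stock TurboJPEG API and assuming 32-bit BGRX framebuffer pixels.
(The function name is made up, and the actual server integration would
involve more plumbing.)

#include <turbojpeg.h>

/* Compress an entire RFB rectangle as a single JPEG image.  The resulting
   buffer would be sent as the payload of the JPEG rectangle; the caller
   frees it with tjFree(). */
static unsigned char *encode_rect_as_jpeg(const unsigned char *fb,
                                          int fbWidth, int x, int y, int w,
                                          int h, int quality, int subsamp,
                                          unsigned long *jpegSize)
{
  tjhandle tj = tjInitCompress();
  unsigned char *jpegBuf = NULL;  /* NULL = let TurboJPEG allocate it */
  const unsigned char *srcPtr = &fb[(y * fbWidth + x) * 4];

  if (!tj) return NULL;
  *jpegSize = 0;
  if (tjCompress2(tj, srcPtr, w, fbWidth * 4 /* pitch in bytes */, h,
                  TJPF_BGRX, &jpegBuf, jpegSize, subsamp, quality,
                  TJFLAG_FASTDCT) < 0) {
    tjDestroy(tj);
    return NULL;
  }
  tjDestroy(tj);
  return jpegBuf;
}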
DRC
On 6/15/23 10:17 AM, RG wrote:
Hi,
I have been trying to improve the performance of TVNC + noVNC in a
loopback environment. I have 1000x1000 image updates at 30 fps and
found that Tight encoding splits each update into ~15 rectangles of
1000x64. This in turn makes noVNC take some time to read and write
each image and also causes garbage collection issues due to too many
image creations.

I changed maxRectSize from 65536 to 1048576 in tightConf (in
tight.c), which sends the full image to noVNC and improves both the
timing and the garbage collection issue.

I was wondering if I was playing with fire and risking some
unintended effects? Is there another solution to force TVNC to send
bigger chunks?
Regards,
Rémi