To clarify and tie my conclusions below to the comment I made previously
about rectangle sizes:
One reason why the 2D datasets, which mostly represent legacy
(primitive-based and single-buffered, as opposed to image-based and
double-buffered) X11 rendering workloads, perform best with the TurboVNC
encoder is that they have relatively small framebuffer updates. To put
numbers on this, here are the average framebuffer update rectangle sizes
(in pixels) of the various datasets:
slashdot-24: 3467
photos-24: 2962
kde-hearts-24: 1854
3dsmax-04-24: 998105
catia-02-24: 1005257
ensight-03-24: 837883
light-08-24: 681317
maya-02-24: 985356
proe-04-24: 843144
sw-01-24: 915438
tcvis-01-24: 786859
ugnx-01-24: 793831
glxspheres-24: 478242
googleearth-24: 8898
q3demo-24: 14875
The smaller the rectangle, the greater the chance that it will have a
low enough number of unique colors to qualify for indexed color
subencoding. (See the sketch below.) More modern image-based workloads
usually double buffer, so the rendering occurs off-screen, and the
entire back buffer is swapped to the X11 display in one throw. Such
workloads generally have large framebuffer update rectangles. That type
of workload is very similar to what VirtualGL does, so the 3D datasets
(which consist of OpenGL applications and Viewperf datasets running
with VirtualGL) are more reflective of modern X11 applications.

However, some of those Viewperf datasets simulate wireframe modes in
certain CAD applications, so they have relatively low numbers of unique
colors and still benefit from the TurboVNC encoder (relative to pure
JPEG or modern video codecs such as H.264.) Other Viewperf datasets
have large areas of solid color and also benefit from the TurboVNC
encoder. Applications such as games or Google Earth that fill the whole
screen, render a large number of unique colors, and render few areas of
solid color are the best candidates for pure JPEG or video codecs.

The small rectangle size in the Google Earth and Quake 3 datasets is
likely a result of the ancient session capture infrastructure mentioned
in the caveat below. However, even with those small rectangles, both
datasets generally benefited from pure JPEG encoding because of their
high color counts. On the flip side, some of the 3D datasets (Catia and
Teamcenter Vis, for instance) never benefited from pure JPEG, despite
having large rectangle sizes, because of their low color counts.

It is also worth mentioning that JPEG is designed to compress
continuous-tone images, so it does a relatively poor job of compressing
sharp features, such as those generated by wireframe modes in CAD
applications. Wireframe modes were once more common, because they
provided a way to smoothly interact with models that couldn't otherwise
be rendered in real time by the slow 3D accelerators available at the
time. (The first 3D accelerators I worked with in the mid-1990s, based
on the 3Dlabs GLINT chip, could render about 300k polys/sec.) Those
modes are less common these days, but they still exist.
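
To make the indexed color qualification concrete, here is a rough
sketch of the kind of color-counting test involved. This is an
illustration only, not the actual tight.c code; the function name is
made up, and the threshold is hard-coded at 96 for the example (the
real threshold is 24 or 96, depending on the Tight compression level):

#include <stdint.h>

#define PALETTE_THRESHOLD 96  /* Tight uses 24 or 96, depending on level */

/* Return 1 if the rectangle has few enough unique colors to be sent with
   indexed color subencoding, 0 if it should fall through to JPEG. */
static int qualifies_for_indexed_color(const uint32_t *pixels, int npixels)
{
  uint32_t palette[PALETTE_THRESHOLD];
  int ncolors = 0, i, j;

  for (i = 0; i < npixels; i++) {
    int found = 0;
    for (j = 0; j < ncolors; j++)
      if (palette[j] == pixels[i]) { found = 1;  break; }
    if (!found) {
      if (ncolors == PALETTE_THRESHOLD)
        return 0;  /* too many unique colors; encode the rectangle as JPEG */
      palette[ncolors++] = pixels[i];
    }
  }
  return 1;
}

The larger the rectangle, the more likely it is that the color count
blows past that threshold before the scan finishes, which is why the
small 2D rectangles qualify so often and the megapixel 3D rectangles
rarely do.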
tl;dr: The TurboVNC encoder is a compromise that maximizes performance
across all of those application categories as best it can, but there are
specific application categories for which a more video-like encoder is a
better solution. The 2D datasets are the same datasets that Constantin
used when designing the TightVNC encoder, so one of the goals of the
TurboVNC encoder overhaul in 2008 was to provide compression ratios on
those datasets similar to those of TightVNC 1.3.x (to convince
TightVNC users that they could switch to TurboVNC without losing any
performance on low-bandwidth networks) while providing optimal
compression ratios and performance for 3D applications running with
VirtualGL.
DRC
On 8/16/23 5:11 PM, DRC wrote:
I did some low-level experiments with the TurboVNC Benchmark Tools
(https://github.com/TurboVNC/vncbenchtools), comparing the existing
TurboVNC encoder, accelerated with the Intel zlib implementation and
using the "Tight + Perceptually Lossless JPEG" and "Tight +
Medium-Quality JPEG" presets, against pure JPEG encoding with the same
JPEG quality and subsampling levels. The results were interesting.
Perceptually Lossless JPEG:
As expected based on prior research
(https://turbovnc.org/pmwiki/uploads/About/tighttoturbo.pdf), pure
JPEG with no other modifications produced worse (often much worse)
compression with all datasets except Lightscape, GLXSpheres, and Quake
3, which compressed about 4-5% better. (Catia, Teamcenter Vis,
Unigraphics NX, and Google Earth regressed by less than 10%.) Pure
JPEG with no other modifications also produced worse (often much
worse) performance with all datasets except Pro/E (+8%) and Quake 3
(+4%). (3D Studio Max, Ensight, Lightscape, and Google Earth
regressed by less than 10%.)
However, modifying the TurboJPEG API library so that it generates
"abbreviated image datastreams", i.e. JPEG images with no embedded
tables (the equivalent of motion-JPEG frames), improved the
performance of pure JPEG somewhat. Now Lightscape (+5%), GLXSpheres
(+6%), Google Earth (+9%), and Quake 3 (+16%) compressed better with
pure JPEG than with the TurboVNC encoder. (photos-24, kde-hearts-24,
Catia, Teamcenter Vis, and Unigraphics NX regressed by less than
10%.) As above, only Pro/E (+8%) and Quake 3 (+5%) were faster with
pure JPEG. (3D Studio Max, Ensight, Lightscape, and Google Earth
regressed by less than 10%.)
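For reference, abbreviated datastreams are something the underlying
libjpeg API already supports, so the general mechanism looks roughly
like the sketch below. This is just an illustration, not the actual
TurboJPEG modification; the helper function and the quality setting are
made up for the example.

#include <stdio.h>
#include <jpeglib.h>

/* Compress one frame as an abbreviated image datastream (no embedded
   quantization/Huffman tables), assuming the decoder has already received
   the tables out of band, e.g. via a one-time tables-only datastream
   written with jpeg_write_tables().  cinfo must already have been set up
   with jpeg_create_compress() and an error manager. */
static void compress_abbreviated_frame(struct jpeg_compress_struct *cinfo,
                                       const unsigned char *rgb, int width,
                                       int height, unsigned char **jpegBuf,
                                       unsigned long *jpegSize)
{
  jpeg_mem_dest(cinfo, jpegBuf, jpegSize);  /* allocates if *jpegBuf == NULL */
  cinfo->image_width = width;
  cinfo->image_height = height;
  cinfo->input_components = 3;
  cinfo->in_color_space = JCS_RGB;
  jpeg_set_defaults(cinfo);
  jpeg_set_quality(cinfo, 80, TRUE);  /* arbitrary quality for the example */

  /* Mark all tables as already transmitted ... */
  jpeg_suppress_tables(cinfo, TRUE);
  /* ... and pass FALSE so that only unsent tables (i.e. none) are emitted. */
  jpeg_start_compress(cinfo, FALSE);

  while (cinfo->next_scanline < cinfo->image_height) {
    JSAMPROW row = (JSAMPROW)&rgb[cinfo->next_scanline * width * 3];
    jpeg_write_scanlines(cinfo, &row, 1);
  }
  jpeg_finish_compress(cinfo);
}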
I then repeated the same tests with the Interframe Comparison Engine
enabled. Interframe comparison significantly improves compression
with the 2D datasets and produces mixed results with the 3D datasets, but the
relative differences between the TurboVNC encoder and pure JPEG were
pretty similar with interframe comparison enabled vs. disabled. With
interframe comparison enabled, Pro/E now compressed better with pure
JPEG than with the TurboVNC encoder, and Unigraphics NX compressed
about the same. However, all of the other datasets were generally in
the same relative ranges as above (give or take.)
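For anyone unfamiliar with it, the Interframe Comparison Engine roughly
amounts to keeping a copy of the framebuffer as the viewer last saw it
and culling rectangles whose pixels haven't actually changed. A greatly
simplified sketch of the idea (illustration only; the function name is
made up, and the real engine also deals with pixel formats, partial
changes, etc.):

#include <stdint.h>
#include <string.h>

/* Compare a rectangle of the current framebuffer against a shadow copy of
   what the viewer last received.  Returns 0 if nothing changed, in which
   case the rectangle can be dropped from the update. */
static int rect_changed(uint32_t *shadow, const uint32_t *fb, int fbWidth,
                        int x, int y, int w, int h)
{
  int row, changed = 0;

  for (row = y; row < y + h; row++) {
    uint32_t *prev = &shadow[row * fbWidth + x];
    const uint32_t *cur = &fb[row * fbWidth + x];

    if (memcmp(prev, cur, w * sizeof(uint32_t))) {
      memcpy(prev, cur, w * sizeof(uint32_t));  /* refresh the shadow copy */
      changed = 1;
    }
  }
  return changed;
}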
Medium-Quality JPEG:
When you reduce the JPEG quality, the size of JPEG rectangles and
subrectangles decreases, but the size of indexed-color rectangles
stays the same. Thus, pure JPEG was more advantageous with
medium-quality JPEG than it was with perceptually lossless JPEG. In
this case, I tested only with interframe comparison enabled (since the
"Tight + Medium-Quality JPEG" preset always enables it) and only with
abbreviated image datastreams.
kde-hearts-16 (+4%), photos-24 (+3%), kde-hearts-24 (+11%), Lightscape
(+5%), Pro/E (+6%), Unigraphics NX (+4%), GLXSpheres (+12%), Google
Earth (+24%), and Quake 3 (+39%) compressed better with pure JPEG than
with the TurboVNC encoder. (Catia and Teamcenter Vis regressed by
less than 10%.) 3D Studio Max (+13%), Ensight (+38%), and Pro/E
(+13%) were faster with pure JPEG. (Lightscape, SolidWorks, Google
Earth, and Quake 3 regressed by less than 10%.)
AVX2 Instructions:
The initial tests were conducted on an older machine that lacks AVX2
instructions, so I re-ran the tests on a newer machine. This gave
pure JPEG more of a performance advantage, since the Intel zlib
implementation cannot use AVX2 instructions but libjpeg-turbo can.
With perceptually lossless JPEG: Ensight (+3%), Pro/E (+11%), and
Quake 3 (+7%) were faster with pure JPEG than with the TurboVNC
encoder, and 3D Studio Max was about the same. (Lightscape and Google
Earth regressed by less than 10%.)
With medium-quality JPEG: 3D Studio Max (+26%), Ensight (+57%), Pro/E
(+18%), SolidWorks (+8%), and Quake 3 (+3%) were faster with pure
JPEG, and Maya was about the same. (kde-hearts-24, Catia, Lightscape,
and Google Earth regressed by less than 10%.)
Caveat:
The datasets in question were captured in the early 2000s (the 3D
datasets in 2008 and the 2D datasets years earlier), so many of them
represent outdated workloads. (Most modern X11 applications use some
form of image-based rendering rather than X11 primitives.) Also,
because of limitations in the benchmark tools (which were inherited
from TightVNC), the datasets had to be generated using a very old VNC
server (TightVNC 1.3.9) and viewer (RealVNC 3.3.6) and an RFB proxy
that sat between the two. That infrastructure was slow and effectively
dropped a lot of frames, so the session captures and benchmark tools
are not the best simulation of the TurboVNC Server. It would not
surprise me if pure JPEG performs better in real-world usage than is
reflected above.
General Conclusions:
Unsurprisingly, the TurboVNC encoder is the most advantageous,
relative to pure JPEG, on older (X11-primitive-based) workloads,
workloads with fewer unique colors, and workloads with large areas of
solid color. Pure JPEG is the most advantageous on image-based
workloads, workloads with more unique colors, and workloads with few
areas of solid color. Pure JPEG also has more of an advantage when
the JPEG quality is decreased and when AVX2 instructions are available.
It seems as if pure JPEG encoding is advantageous enough in enough
cases to justify its existence. I will look at including it in the
next major release of TurboVNC, along with GUI modifications
(https://github.com/TurboVNC/turbovnc/issues/70, as well as exposing
the CompatGUI parameter in the GUI) that will make it more
straightforward to enable non-Tight encodings.
I suspect that, if I were to completely revisit my analysis from 2008
and develop entirely new datasets, I would find little justification
for indexed color subencoding with modern applications. That would
mean that most of the advantage of the TurboVNC encoder these days
comes from its ability to send large areas of solid color using only a
few bytes. Both X11 and RFB were designed around the limitations of
1980s systems (including the need to support single-buffered graphics
systems.) Wayland jettisons the X11 legacy, but there is also a
burning need for a more modern open source/open standard remote
display protocol that is not beholden to the RFB legacy, preferably a
protocol that is a better fit for image-based workloads, Wayland,
GPU-resident framebuffers, and modern video codecs. See
https://www.reddit.com/r/linux_gaming/comments/yvjqby/comment/jvricah/?utm_source=reddit&utm_medium=web2x&context=3,
https://github.com/TurboVNC/turbovnc/issues/18,
https://github.com/TurboVNC/turbovnc/issues/19, and
https://github.com/TurboVNC/turbovnc/issues/373 for more of my musings
on that topic. Do I think that anyone will ever fund that kind of
blue-sky research in an open source project such as this? Probably
not. TurboVNC is innovative compared to other VNC solutions and maybe
compared to most (but not all) open source remote display solutions,
but there are proprietary solutions these days that do a lot of things
that VNC will never be able to do. (Let's start with streaming over
UDP, which the RFB protocol could never support.) People mostly use
TurboVNC because it's free and good enough, so I don't foresee being
able to do much more with the protocol other than minor tweaks like
this that allow it to get out of the way of certain use cases.
On 7/7/23 6:26 AM, RG wrote:
Thanks for this in-depth answer,
I have been playing with Raw and found that it has performance similar
to a single Tight rectangle + JPEG 80, but I see more latency variation
from the network/communication (likely because my system also has other
activity on localhost and there is a lot more data to transfer).

I am trying to implement pure JPEG, as, based on your explanation, I
think it will be a cleaner solution than increasing the max rect size.

Also, RealVNC already seems to have a pure JPEG rectangle type (as
encoding 21), if you ever decide to implement it.
Rémi
On Thursday, June 15, 2023 at 5:34:08 PM UTC+2, DRC wrote:
The Tight encoding specification requires rectangles to be <=
2048 pixels in width, but there isn't any documented limit on the
rectangle size. I don't think that you're playing with fire
necessarily, although the Tight encoder has never been tested
with rectangles > 64k, so I can't guarantee that there aren't
hidden bugs. However, I question whether Tight encoding is the
most appropriate way to transfer pixels in a loopback
environment. It seems like you might be better served by
transferring the pixels using Raw encoding, which would require
more bus bandwidth but less CPU time.
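(Back-of-the-envelope, assuming 32 bits per pixel: a 1000x1000
update at 30 Hz is about 1000 x 1000 x 4 x 30 = 120 MB/s of raw
pixel data, which is trivial over loopback but would be a
non-starter on most real networks.)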
Referring to
https://turbovnc.org/pmwiki/uploads/About/tighttoturbo.pdf, a big
reason why TurboVNC itself doesn't use larger rectangles is that
there's a tradeoff in terms of encoding efficiency. The larger
the rectangle, the more likely it is that the number of unique
colors in the rectangle will exceed the Tight palette threshold.
Thus, as the rectangle size increases, you will reach a point at
which only JPEG is used to encode the non-solid subrectangles.
At that point, there isn't much benefit to the complexity of
Tight encoding, and you'd probably get better performance by
simply encoding every rectangle as a pure JPEG image. I am not
sure whether 1 megapixel is beyond that point, but given that the
palette threshold is low (24 or 96 colors, depending on the Tight
compression level), it wouldn't surprise me if 1-megapixel
rectangles are almost always encoded as JPEG. In that case, the
only real benefit you'd get from Tight encoding is a slight
reduction in the bitstream size if there are huge areas of solid
color, since the Tight encoder can encode those as a bounding box
and fill color (whereas JPEG has a not-insignificant amount of
overhead, both in terms of compute time and bitstream size, when
encoding a single-color image.) However, I don't know whether
that benefit is worth the additional computational overhead of
analyzing the rectangle, nor whether it is worth the additional
bitstream size overhead of dividing non-solid areas of the
rectangle into multiple JPEG subrectangles (as opposed to sending
the whole rectangle as a single JPEG image.)
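To illustrate the solid color case, the basic test is dirt cheap.
(Again, just a sketch with a made-up function name, not the actual
encoder code, which grows solid areas into maximal bounding boxes
rather than testing a fixed rectangle.)

#include <stdint.h>

/* If every pixel in the rectangle matches the first one, the whole area
   can be sent as a fill subencoding -- just a header and one color value. */
static int is_solid(const uint32_t *fb, int fbWidth, int x, int y, int w,
                    int h, uint32_t *fillColor)
{
  uint32_t first = fb[y * fbWidth + x];
  int row, col;

  for (row = y; row < y + h; row++)
    for (col = x; col < x + w; col++)
      if (fb[row * fbWidth + col] != first)
        return 0;

  *fillColor = first;
  return 1;
}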
That also explains why I wouldn't
be willing to add a Tight compression level with a rectangle size
of 1048576. I would, however, be willing to support the pure RFB
JPEG encoding type in TurboVNC, if it proves to be of any
benefit. That encoding type is dead simple and would involve
merely passing every RFB rectangle directly to libjpeg-turbo.
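In other words, something along the lines of the sketch below, using
the stock TurboJPEG API and assuming 32-bit BGRX framebuffer pixels.
(The function name is made up, and the actual server integration would
involve more plumbing.)

#include <turbojpeg.h>

/* Compress an entire RFB rectangle as a single JPEG image.  The resulting
   buffer would be sent as the payload of the JPEG rectangle; the caller
   frees it with tjFree(). */
static unsigned char *encode_rect_as_jpeg(const unsigned char *fb,
                                          int fbWidth, int x, int y, int w,
                                          int h, int quality, int subsamp,
                                          unsigned long *jpegSize)
{
  tjhandle tj = tjInitCompress();
  unsigned char *jpegBuf = NULL;  /* NULL = let TurboJPEG allocate it */
  const unsigned char *srcPtr = &fb[(y * fbWidth + x) * 4];

  if (!tj) return NULL;
  *jpegSize = 0;
  if (tjCompress2(tj, srcPtr, w, fbWidth * 4 /* pitch in bytes */, h,
                  TJPF_BGRX, &jpegBuf, jpegSize, subsamp, quality,
                  TJFLAG_FASTDCT) < 0) {
    tjDestroy(tj);
    return NULL;
  }
  tjDestroy(tj);
  return jpegBuf;
}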
DRC
On 6/15/23 10:17 AM, RG wrote:
Hi,
I have been trying to improve the performance of TVNC + noVNC in a
loopback environment. I have 1000x1000 image updates at 30 fps and
found that Tight encoding splits each update into ~15 rectangles of
1000x64. This in turn makes noVNC take some time to read and write
each image and also causes garbage collection issues due to too many
image creations.

I changed maxRectSize from 65536 to 1048576 in tightConf (in
tight.c), which sends the full image to noVNC and improves both the
timing and the garbage collection issue.

I was wondering if I was playing with fire and risking some
unintended effects? Is there another solution to force TVNC to send
bigger chunks?
Regards,
Rémi