Major correction to the below. I was measuring the performance of the TigerVNC Viewer incorrectly. When benchmarking vncviewer, it's necessary to synchronize the decode and the blitting in order to accurately break down the time spent in each. Both of the TurboVNC viewers do that, so I had to do likewise with the TigerVNC viewer in order to obtain an apples-to-apples comparison. Here are the amended results:

Dell Precision T3500, quad-core 2.8 GHz Intel Xeon W3530, nVidia Quadro
600 (310.14), CentOS 5.8:

X11 TurboVNC Viewer 1.2 beta1:
Decode / Total = 10.6 / 15.6 s
Java TurboVNC Viewer 1.2 beta1:
Decode / Total = 12.5 / 23.2 s
Native TigerVNC Viewer r5051:
Decode / Total = 12.2 / 23.7 s

HP Pavilion Slimline s5100z, dual-core 2.6 GHz AMD Athlon 64 X2 5050e,
nVidia GeForce 6150SE (304.64), CentOS 6.3:

X11 TurboVNC Viewer 1.2 beta1:
Decode / Total = 17.2 / 24.0 s
Java TurboVNC Viewer 1.2 beta1:
Decode / Total = 25.3 / 64.4 s
Native TigerVNC Viewer r5051:
Decode / Total = 20.6 / 36.7 s

Mac Mini 2.0 GHz Intel Core 2 Duo, nVidia GeForce 9400, OS X 10.8.2:

X11 TurboVNC Viewer 1.2 beta1:
Decode / Total = 16.2 / 48.8 s
Java TurboVNC Viewer 1.2 beta1:
Decode / Total = 19.0 / 39.7 s
Native TigerVNC Viewer r5051:
Decode / Total = 25.0 / 381 s

NOTES:

-- The FLTK performance on OS X still seems like a bug. Hard to believe that it would really be 10x slower than Java. If any TigerVNC developers are reading this, you may want to look into it.

-- As you can see, our Java viewer actually performs almost the same as the TigerVNC native viewer on the Dell machine. The disparity in performance between those two viewers and the X11 TurboVNC Viewer is entirely due to differing double buffering strategies. As explained below, the X11 TurboVNC Viewer doesn't do true double buffering. It instead waits until all rectangles in a framebuffer update have been decoded and draws them all in rapid succession to the screen. TigerVNC and the Java TurboVNC Viewer, on the other hand, draw a bounding box containing all of the updated rectangles. In some cases, this causes those two viewers to draw more pixels. For instance, if the user is typing text into a console application, like Emacs, the X11 TurboVNC Viewer could get away with drawing only the tiny rectangles representing the character being typed, the previous character, and the updated status bar text at the bottom.
TigerVNC (and Java TurboVNC), however, will redraw most of the window in that situation, because the bounding box of the status bar (with updated text at the bottom right) and the text being typed at the upper left encompasses most of the window. I don't think this represents a noticeable performance issue from the point of view of the user, unless the user is able to type 100 characters per second. :) When you look at just the 3D datasets, the blitting performance under Java is very much in line with the X11 TurboVNC Viewer, because the framebuffer updates are typically large and monolithic.

-- The remaining issue is that, for whatever reason, Java2D is not being accelerated on my HP machine. We're looking into that, and I will post amended results once we find a solution.

There are a couple of other minor usability tweaks that need to happen with the Java viewer before it could fully replace the X11 viewer on Linux (for instance, figuring out how to implement keyboard grabbing, extending the -via/-tunnel feature to allow using an external SSH command rather than the built-in Java SSH client, etc.), but it seems like the performance is there, assuming we can figure out the Java2D acceleration issue. When I accelerated the TigerVNC codecs in 2011, I did extensive work to convince myself that, from an end user point of view, the TigerVNC Viewer would appear as fast as the TurboVNC Viewer, so I'm confident that if our Java viewer can hit that same baseline, we're golden.

On 2/26/13 8:16 PM, DRC wrote:
> Out of curiosity, I ported the benchmarking system from the TurboVNC
> Viewer into the TigerVNC native (FLTK) Viewer. This was partly done to
> get a better idea of how the new JNI-accelerated Java TurboVNC Viewer
> compared -- since that viewer is architecturally more similar to
> TigerVNC, it's a bit more of an apples to apples comparison than
> comparing it to the TurboVNC native viewers.
> Also, this research served as a baseline for
> an upcoming project to multi-thread the TurboVNC decoder.
>
> For those who aren't familiar with this benchmarking system, basically
> what it does is take the set of 20 session captures that were originally
> used to design the TurboVNC encoding methods
> (http://www.virtualgl.org/pmwiki/uploads/About/tighttoturbo.pdf),
> pre-encode them using the TurboVNC Benchmark Tools
> (http://virtualgl.svn.sourceforge.net/viewvc/virtualgl/vncbenchtools/trunk/)
> using the turbo-1.1 encoder with "Perceptually Lossless JPEG", and run
> the encoded sessions through the viewer unimpeded (basically replacing
> socket reads with reads from a session capture file.) The benchmarking
> system can make multiple runs of the same dataset and report the average
> of the total time, decoding time, and blitting time for each iteration
> (I typically take the average of 5 runs with 2 "throw-away" runs at the
> beginning.) The system also subtracts out any time spent reading the
> session capture from disk. This was the same system we used to figure
> out that the new Java TurboVNC Viewer is actually faster than the X11
> TurboVNC Viewer on OS X, which led to replacing that viewer in TurboVNC
> 1.2.
>
> The results comparing with TigerVNC were interesting. Prior studies
> with the X11 TurboVNC Viewer had revealed that the decoding performance
> of that viewer on Linux was higher than that of the new Java TurboVNC
> Viewer, and I suspected that it was due to the different way in which
> the two solutions handle solid-colored subrectangles. When decoding the
> Tight stream, the X11 TurboVNC Viewer uses an XImage to store the
> decoded non-solid subrectangles, but it stores just the coordinates and
> fill color of the solid-colored rectangles. Thus, when it comes time to
> draw the frame, the viewer uses a series of XShmPutImage() calls to draw
> the non-solid subrectangles and a series of XFillRectangle() calls to
> draw the solid subrectangles.
> This is not technically double buffering,
> but those X11 calls are processed so fast that effectively it appears
> double-buffered. The TigerVNC Viewer and the new Java TurboVNC Viewer
> use true double buffering, and as such, solid-colored subrectangles have
> to be rendered to the back buffer whenever they are decoded. Thus, the
> Java TurboVNC Viewer turns in slower decoding performance than the X11
> TurboVNC Viewer on Linux, but in fact, on that platform, the decoding
> performance of the Java TurboVNC Viewer is about equal with the native
> TigerVNC Viewer.
>
> Results follow. "Total" is the total (wall) time taken to fully
> decode/draw all 20 datasets, averaged over 5 runs with 2 "warmup" runs.
> "Decode" is the portion of that wall time spent in the Tight decoder.
> 64-bit code (or a 64-bit JVM) was used in all cases. J2SE or OpenJDK
> 1.6.0 was used for all Java cases. -O3 with GCC 4 was used for the
> C/C++ code.
>
> Dell Precision T3500, quad-core 2.8 GHz Intel Xeon W3530, nVidia Quadro
> 600 (310.14), CentOS 5.8:
>
> X11 TurboVNC Viewer 1.2 beta1:
> Decode / Total = 10.6 / 15.6 s
> Java TurboVNC Viewer 1.2 beta1:
> Decode / Total = 12.5 / 23.2 s
> Native TigerVNC Viewer r5051:
> Decode / Total = 12.4 / 16.6 s
>
> HP Pavilion Slimline s5100z, dual-core 2.6 GHz AMD Athlon 64 X2 5050e,
> nVidia GeForce 6150SE (304.64), CentOS 6.3:
>
> X11 TurboVNC Viewer 1.2 beta1:
> Decode / Total = 17.2 / 24.0 s
> Java TurboVNC Viewer 1.2 beta1:
> Decode / Total = 25.3 / 64.4 s
> Native TigerVNC Viewer r5051:
> Decode / Total = 24.9 / 30.4 s
>
> Mac Mini 2.0 GHz Intel Core 2 Duo, nVidia GeForce 9400, OS X 10.8.2:
>
> X11 TurboVNC Viewer 1.2 beta1:
> Decode / Total = 16.2 / 48.8 s
> Java TurboVNC Viewer 1.2 beta1:
> Decode / Total = 19.0 / 39.7 s
> Native TigerVNC Viewer r5051:
> Decode / Total = 24.9 / 219 s
>
> NOTES:
>
> -- I am interested, in the long term, in replacing the X11 TurboVNC
> Viewer on Linux with the Java viewer, but we need to first figure out
> how to
> get the drawing performance up to snuff on that platform. Note
> that the drawing performance (Total - Decode) on Linux is similar for
> the native TurboVNC and TigerVNC viewers. The Java viewer's drawing
> performance, compared to native, is about half on the Dell and about 1/6
> on the HP, for reasons unknown.
>
> -- The drawing performance of FLTK is awful on OS X. Not sure why. But
> assuming this is a legitimate result (it seems to be-- I mean, I watched
> it, and it was visibly very slow), it further validates the notion that
> Java may be the fastest solution for Mac, at least at the moment.
>
> -- As mentioned above, the decoding performance of the Java TurboVNC
> Viewer is about the same as the native TigerVNC Viewer on Linux, and
> it's actually better than TigerVNC on OS X.
>
>
> I'll be doing similar comparisons with the Windows Viewer over the next
> couple of months and will keep everyone posted on that.
>
> If anyone wants to peer review this work, I'm happy to provide the
> TigerVNC patches that implement the benchmark functionality.
>
> DRC

_______________________________________________
VirtualGL-Users mailing list
VirtualGL-Users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtualgl-users
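
P.S. The bounding-box point in the notes at the top of this message can be made concrete with a minimal Python sketch. It only counts pixels; it does not model how any of the viewers actually draw. The window size, the character cell size, and the rectangle coordinates are made up for illustration and do not come from the benchmark datasets or from either viewer's source.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rect:
    """An updated screen region: top-left corner plus size, in pixels."""
    x: int
    y: int
    w: int
    h: int

    @property
    def area(self) -> int:
        return self.w * self.h


def bounding_box(rects: List[Rect]) -> Rect:
    """Smallest rectangle that contains every rectangle in the update."""
    x0 = min(r.x for r in rects)
    y0 = min(r.y for r in rects)
    x1 = max(r.x + r.w for r in rects)
    y1 = max(r.y + r.h for r in rects)
    return Rect(x0, y0, x1 - x0, y1 - y0)


def pixels_per_rectangle(rects: List[Rect]) -> int:
    """Pixels touched when each updated rectangle is drawn on its own.

    Assumes the rectangles do not overlap, so the areas can simply be summed.
    """
    return sum(r.area for r in rects)


def pixels_bounding_box(rects: List[Rect]) -> int:
    """Pixels touched when one bounding box around the update is drawn."""
    return bounding_box(rects).area


# HYPOTHETICAL update: a 640x480 terminal window with 8x16 character cells.
WINDOW = Rect(0, 0, 640, 480)
UPDATE = [
    Rect(8, 0, 8, 16),         # previous character, upper left
    Rect(16, 0, 8, 16),        # character just typed, upper left
    Rect(480, 464, 160, 16),   # status bar text, bottom right
]

if __name__ == "__main__":
    separate = pixels_per_rectangle(UPDATE)
    boxed = pixels_bounding_box(UPDATE)
    print(f"per-rectangle: {separate} pixels")
    print(f"bounding box:  {boxed} pixels "
          f"({100.0 * boxed / WINDOW.area:.1f}% of the window)")
```

With these made-up numbers, the per-rectangle strategy touches 2,816 pixels and the bounding box touches 303,360, which is about 99% of the 640x480 window. For a single large update rectangle, the two counts are identical, which fits the observation above about the 3D datasets.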