On 2/10/15 7:52 AM, Sergey Bylokhov wrote:
You can run this test on jdk 8u31 and 8u40 to see a difference:
http://cr.openjdk.java.net/~serb/8029253/webrev.04/test/java/awt/image/DrawImage/UnmanagedDrawImagePerformance.java.html

And the test from this bug report:
https://bugs.openjdk.java.net/browse/JDK-8017247

After looking at those tests, I can say that they are definitely not related to the issue I'm seeing here. Although the TurboVNC Viewer (my application) does use bilinear interpolation if desktop scaling is enabled, that is not the "common" use case. Normally, it's just going to be drawing a BufferedImage with no interpolation, so that at least clarifies that I shouldn't be expecting any different behavior with Java 9.

The question now becomes how to optimally take advantage of the OpenGL pipeline. As you pointed out (and I agree, based on my research), reducing the software-to-surface blits is key, although I don't have a firm grasp on how to do that. My code is basically just doing the following:

  public void paintComponent(Graphics g) {
    Graphics2D g2 = (Graphics2D) g;
    if (scalingEnabled) {
      g2.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                          RenderingHints.VALUE_INTERPOLATION_BILINEAR);
      g2.drawImage(im.getImage(), 0, 0, scaledWidth, scaledHeight, null);
    } else {
      g2.drawImage(im.getImage(), 0, 0, null);
    }
    g2.dispose();
  }

  public void updateWindow() {
    Rect r = damage;
    if (!r.isEmpty()) {
      if (scalingEnabled) {
        // ... (adjust coordinates, mainly)
        paintImmediately(x, y, width, height);
      } else {
        paintImmediately(x, y, width, height);
      }
      damage.clear();
    }
  }

As VNC rectangles from the server are decoded, the "damage" rectangle gets updated to reflect the extent of the "damaged" pixels, and that extent is passed into paintImmediately(). In examining the OpenJDK source, however, it appears that glDrawPixels() is always called with the full extent of the BufferedImage, regardless of whether only a small portion of that image has actually changed.

If there is something else I can do to help debug this, please let me know. I have a working JDK build. I fully admit that I may be doing something wrong or suboptimally, but bear in mind that I've spent probably over 100 hours on this, so it's not as if I'm a naive n00b here. If there's something I'm missing, then trust me that it isn't obvious!
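
To make the damage-extent idea above concrete, here is a minimal sketch of a draw call that names the damaged sub-rectangle explicitly, using the ten-argument Graphics.drawImage() overload. The class DamageBlitSketch and the helper drawDamaged() are hypothetical names for illustration, not part of the TurboVNC source, and the sketch draws into an ordinary BufferedImage so that it runs anywhere. Whether this changes how much data the OpenGL pipeline actually uploads is not something I have verified.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

public class DamageBlitSketch {
  // Hypothetical helper: draws only the part of src that overlaps the
  // damage rectangle, at the same coordinates in the destination.
  // Returns the rectangle that was actually drawn (possibly empty).
  static Rectangle drawDamaged(Graphics2D g2, BufferedImage src,
                               Rectangle damage) {
    Rectangle r = damage.intersection(
        new Rectangle(0, 0, src.getWidth(), src.getHeight()));
    if (r.isEmpty()) {
      return r;
    }
    int x2 = r.x + r.width;
    int y2 = r.y + r.height;
    // Destination corners first, then the matching source corners.
    g2.drawImage(src, r.x, r.y, x2, y2, r.x, r.y, x2, y2, null);
    return r;
  }

  public static void main(String[] args) {
    // Stand-in for the decoded framebuffer image (assumed ARGB_PRE).
    BufferedImage src =
        new BufferedImage(64, 64, BufferedImage.TYPE_INT_ARGB_PRE);
    Graphics2D gs = src.createGraphics();
    gs.setColor(Color.RED);
    gs.fillRect(0, 0, 64, 64);
    gs.dispose();

    // Stand-in for the component surface.
    BufferedImage dst =
        new BufferedImage(64, 64, BufferedImage.TYPE_INT_ARGB_PRE);
    Graphics2D gd = dst.createGraphics();
    Rectangle drawn = drawDamaged(gd, src, new Rectangle(8, 8, 16, 16));
    gd.dispose();

    System.out.println("Drew " + drawn.width + "x" + drawn.height
        + " at (" + drawn.x + ", " + drawn.y + ")");
  }
}
```

In a real paintComponent(), the damage rectangle could presumably be taken from g.getClipBounds(), since paintImmediately() sets the clip to the region it was given.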


Can you share standalone jar file of this workload?

Here is everything you need to reproduce the issue:
http://www.turbovnc.org/turbovnc_mac_performance_stuff.tar.gz

Untar, then do
cd turbovnc_mac_performance_stuff
java -server -d64 -Dsun.java2d.trace=count -cp VncViewer.jar com.turbovnc.vncviewer.ImageDrawTest
  (let it run for 20 seconds or so, then CTRL-C it.)
java -server -d64 -jar VncViewer.jar -bench compilation-16.rfb -benchiter 3 -benchwarmup 2
  (let it run to completion.)

Results from Java 6u51 on my Mac Mini (2009 vintage, 2 GHz Intel Core Duo, nVidia GeForce 9400):
  ImageDrawTest:   ~100 Mpixels/sec
    (all calls are to sun.java2d.loops.Blit::Blit(IntRgb, SrcNoEa, IntArgbPre))
  compilation-16: Average 1.392763 s (Decode = 0.198173 s, Blit = 1.005974 s)

Results from Java 8u31 on my Mac Mini:
  ImageDrawTest:   ~70 Mpixels/sec
    (Calls are split between
    sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL Surface (render-to-texture)", AnyAlpha, "OpenGL Surface")
    and sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntArgbPre, AnyAlpha, "OpenGL Surface"))
  compilation-16: Average 6.216550 s (Decode = 0.194989 s, Blit = 5.534781 s)

Results from Java 8u31 on my Mac Mini without alpha-enabled image (-Dturbovnc.forcealpha=false):
  ImageDrawTest:   ~18 Mpixels/sec
    (Calls are split between
    sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL Surface (render-to-texture)", AnyAlpha, "OpenGL Surface")
    and sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntRgb, AnyAlpha, "OpenGL Surface"))
  compilation-16: Average 27.153480 s (Decode = 0.200333 s, Blit = 26.523137 s)

So, as you can see, using an alpha-enabled image improved the performance under Java 7/8 by about 4x, both when drawing large images (ImageDrawTest) and when doing smaller image updates (compilation-16). However, the blitting performance under Java 7/8 for small image workloads is still about 5x slower than it was under Java 6.

Results from a different machine:

Results from Java 6u51 on my Macbook Pro (2011 vintage, 2.4 GHz Intel Core i5, Intel HD Graphics 3000):
  ImageDrawTest:   ~100 Mpixels/sec
    (all calls to sun.java2d.loops.Blit::Blit(IntRgb, SrcNoEa, IntArgbPre))
  compilation-16: Average 0.592772 s (Decode = 0.113879 s, Blit = 0.351596 s)

Results from Java 8u31 on my Macbook Pro:
  ImageDrawTest:   ~66 Mpixels/sec
    (Calls are split between
    sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL Surface (render-to-texture)", AnyAlpha, "OpenGL Surface")
    and sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntArgbPre, AnyAlpha, "OpenGL Surface"))
  compilation-16: Average 6.806324 s (Decode = 0.188252 s, Blit = 6.457852 s)

Results from Java 8u31 on my Macbook Pro without alpha-enabled image (-Dturbovnc.forcealpha=false):
  ImageDrawTest:   ~50 Mpixels/sec
    (Calls are split between
    sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL Surface (render-to-texture)", AnyAlpha, "OpenGL Surface")
    and sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntRgb, AnyAlpha, "OpenGL Surface"))
  compilation-16: Average 10.272508 s (Decode = 0.147805 s, Blit = 9.955666 s)

Using an ARGB_PRE BufferedImage didn't help out nearly as much on this machine, and whereas the large image performance looks similar to that of the Mac Mini, the small image blitting performance still suffers by nearly a factor of 20 (although it is improved-- before the use of ARGB_PRE images, it was about a factor of 30 slower.)
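
For reference, the slowdown factors quoted above follow directly from the compilation-16 Blit timings. The short sketch below just redoes that arithmetic; the class name BlitSlowdown and the method slowdown() are made up for illustration, and the numbers are copied from the results in this message.

```java
public class BlitSlowdown {
  // Ratio of two Blit times in seconds; a value > 1 means the first
  // configuration spent that many times longer blitting than the second.
  static double slowdown(double slowerSec, double fasterSec) {
    return slowerSec / fasterSec;
  }

  public static void main(String[] args) {
    // compilation-16 Blit times (seconds), Mac Mini.
    double miniJava6 = 1.005974;
    double miniJava8ArgbPre = 5.534781;
    double miniJava8Rgb = 26.523137;

    // compilation-16 Blit times (seconds), Macbook Pro.
    double mbpJava6 = 0.351596;
    double mbpJava8ArgbPre = 6.457852;
    double mbpJava8Rgb = 9.955666;

    System.out.printf("Mac Mini, 8u31 ARGB_PRE vs. 6u51: %.1f%n",
        slowdown(miniJava8ArgbPre, miniJava6));
    System.out.printf("Mac Mini, 8u31 RGB vs. 8u31 ARGB_PRE: %.1f%n",
        slowdown(miniJava8Rgb, miniJava8ArgbPre));
    System.out.printf("Macbook Pro, 8u31 ARGB_PRE vs. 6u51: %.1f%n",
        slowdown(mbpJava8ArgbPre, mbpJava6));
    System.out.printf("Macbook Pro, 8u31 RGB vs. 6u51: %.1f%n",
        slowdown(mbpJava8Rgb, mbpJava6));
  }
}
```

These ratios cover only the Blit portion of each run; the Decode times are roughly comparable across JVMs on each machine, so they are left out.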

The architecture of this solution makes the use of VolatileImages impractical-- basically, I have to decode the VNC rectangles in real time as they arrive, so if the VolatileImage were to go away, I would have no way of rebuilding it.
