Clarification:

"Quartz is -7% to +24% (avg. 4%) as fast as OpenGL"

should read:

"Quartz is -7% to +24% (avg. 4%) faster than OpenGL"

That is, on the 2009 Mini, the two drawing methods perform similarly
when displaying the images generated by 3D applications (bear in mind
that my application is a high-speed remote display/remote access tool).
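
To spell out the difference between the two phrasings, here is a minimal sketch. SpeedPhrasing and its method names are made up for illustration; the only inputs are the percentages quoted above.

```java
/** Hypothetical helper: converts between "p% faster" and "k times as fast". */
final class SpeedPhrasing {
  /** "p% faster" means k = 1 + p/100 times as fast. */
  static double timesAsFast(double percentFaster) {
    return 1.0 + percentFaster / 100.0;
  }

  /** "k times as fast" means p = (k - 1) * 100 percent faster. */
  static double percentFaster(double timesAsFast) {
    return (timesAsFast - 1.0) * 100.0;
  }

  public static void main(String[] args) {
    // The 3D workload range on the 2009 Mini: -7% to +24% (avg. 4%) faster.
    double[] percents = { -7.0, 24.0, 4.0 };
    for (double p : percents) {
      System.out.println(p + "% faster = " + timesAsFast(p) + "x as fast");
    }
  }
}
```

Read this way, "-7% to +24% faster" corresponds to roughly 0.93x to 1.24x as fast, which is why the original "as fast as" wording was misleading.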


On 10/28/16 11:36 AM, DRC wrote:
> I've been able to optimize my code somewhat, such that it restricts
> OpenGL drawing only to the changed regions of the framebuffer.  That at
> least made OpenGL Java 2D blitting usable for my application.  However,
> the Quartz Java 2D implementation under Java 6 is still much faster in
> many cases.  I have three machines I can test against:
> 
> - a 2009 Mac Mini, Intel Core 2 Duo, nVidia GeForce 9400, Mountain Lion
> - a 2011 Macbook Pro, Intel Core i5, Intel HD Graphics 3000, Mavericks
> - a 2014 Mac Mini, Intel Core i7, Intel Iris Graphics, Yosemite
> 
> On the 2009 Mini, Quartz is 2-11x (avg. 4.7x) as fast as OpenGL on the
> eight 2D application workloads that I'm testing (these workloads draw a
> lot of very small regions of the framebuffer.)  On the twelve 3D
> application workloads (which tend to mostly draw large areas of the
> framebuffer), Quartz is -7% to +24% (avg. 4%) as fast as OpenGL.
> 
> On the 2011 MB Pro, Quartz is 4-25x (avg. 10x) as fast as OpenGL on the
> eight 2D application workloads and 1.5-2.3x (avg. 1.9x) as fast as
> OpenGL on the twelve 3D application workloads.
> 
> Here's the kicker, though-- it appears that the Quartz-accelerated Java
> 2D blitter is disabled under Yosemite and later, so on my 2014 Mini
> (which requires at least Yosemite), Java 8 (which always uses OpenGL) is
> always much faster than Java 6 (which appears to use an unaccelerated
> Java 2D blitter under OS X 10.10+.)  I verified with virtual machines
> that this phenomenon is O/S-related and not hardware-related.  It seems
> that Java 6 always disables Quartz blitting on Yosemite and later,
> regardless of the machine.
> 
> Unfortunately, because this Quartz-accelerated Java 2D blitter never
> made it into OpenJDK, because Apple discontinued Java for OS X, and
> because-- even on older hardware-- you can't use the Quartz-accelerated
> blitter on newer macOS releases, our only choice now is OpenGL.  That
> isn't always the fastest drawing method on Macs.  For instance,
> comparing the two Mac Mini models with their fastest drawing method, I
> observe that the 2009 Mini (Quartz, Java 6) is about twice as fast as
> the 2014 Mini (OpenGL, Java 8) on the 2D application workloads.  On the
> 3D application workloads, the 2014 Mini (OpenGL, Java 8) is about 40%
> faster than the 2009 Mini (Quartz, Java 6.)
> 
> In short, this is still an issue.  Under certain workloads, my modern
> machine is performing half as fast as a 2009 machine, because of the
> inability to use Quartz for blitting.
> 
> 
> On 10/28/16 4:54 AM, Tobi wrote:
>> Any news here Sergey?
>>
>>> On 17.02.2015 at 15:01, Sergey Bylokhov <sergey.bylok...@oracle.com> wrote:
>>>
>>> Hello,
>>> Thanks for the info! I am able to reproduce this bug even on
>>> Windows (GDI vs. OpenGL). I will take a look at it.
>>>
>>> On 12.02.2015 8:28, DRC wrote:
>>>> On 2/10/15 7:52 AM, Sergey Bylokhov wrote:
>>>>> You can run this test on jdk 8u31 and 8u40 to see a difference:
>>>>> http://cr.openjdk.java.net/~serb/8029253/webrev.04/test/java/awt/image/DrawImage/UnmanagedDrawImagePerformance.java.html
>>>>>  
>>>>>
>>>>> And the test from this bug report:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8017247
>>>>
>>>> After looking at those tests, they are definitely not related to the issue 
>>>> I'm seeing here.  Although the TurboVNC Viewer (my application) does use 
>>>> bilinear interpolation if desktop scaling is enabled, that is not the 
>>>> "common" usage case.  Normally, it's just going to be drawing a 
>>>> BufferedImage with no interpolation, so that at least clarifies that I 
>>>> shouldn't be expecting any different behavior with Java 9.  The question 
>>>> now becomes:  how to optimally take advantage of the OpenGL pipeline. As 
>>>> you pointed out (and I agree, based on my research) reducing the 
>>>> software-to-surface blits is key, although I don't have a firm grasp on 
>>>> how to do that.  My code is basically just doing the following:
>>>>
>>>>  public void paintComponent(Graphics g) {
>>>>    Graphics2D g2 = (Graphics2D) g;
>>>>    if (scalingEnabled) {
>>>>      // Scaled path: bilinear interpolation
>>>>      g2.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
>>>>        RenderingHints.VALUE_INTERPOLATION_BILINEAR);
>>>>      g2.drawImage(im.getImage(), 0, 0, scaledWidth, scaledHeight, null);
>>>>    } else {
>>>>      // Common path: unscaled, no interpolation
>>>>      g2.drawImage(im.getImage(), 0, 0, null);
>>>>    }
>>>>    g2.dispose();
>>>>  }
>>>>
>>>>  public void updateWindow() {
>>>>    Rect r = damage;
>>>>    if (!r.isEmpty()) {
>>>>      if (scalingEnabled) {
>>>>        // ... adjust x, y, width, height for the scale factor ...
>>>>        paintImmediately(x, y, width, height);
>>>>      } else {
>>>>        paintImmediately(x, y, width, height);
>>>>      }
>>>>      damage.clear();
>>>>    }
>>>>  }
>>>>
>>>> As VNC rectangles from the server are decoded, the "damage" rectangle gets 
>>>> updated to reflect the extent of the "damaged" pixels, and that extent is 
>>>> passed into paintImmediately().  In examining the OpenJDK source, however, 
>>>> it appears that glDrawPixels() is always called with the full extent of 
>>>> the BufferedImage, regardless of whether only a small portion of that 
>>>> image has actually changed.  If there is something else I can do to help 
>>>> debug this, please let me know.  I have a working JDK build.  I fully 
>>>> admit that I may be doing something wrong or suboptimally, but bear in 
>>>> mind that I've spent probably over 100 hours on this, so it's not as if 
>>>> I'm a naive n00b here.  If there's something I'm missing, then trust me 
>>>> that it isn't obvious!
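
To make the damage-extent idea above concrete, here is a minimal sketch, assuming the image is a plain BufferedImage. DamageTracker, add(), take(), and drawRegion() are hypothetical names, not TurboVNC code. It only shows how a damage extent can be accumulated and how the sub-rectangle variant of drawImage() copies just that extent. Whether the OpenGL pipeline then uploads only that sub-rectangle is not something this sketch establishes.

```java
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

/** Hypothetical helper: accumulates damage and draws only that sub-region. */
final class DamageTracker {
  // Starts out empty (0 x 0) until the first rectangle is added.
  private final Rectangle damage = new Rectangle();

  /** Grow the damage extent to include a newly decoded rectangle. */
  synchronized void add(int x, int y, int w, int h) {
    Rectangle r = new Rectangle(x, y, w, h);
    if (damage.isEmpty()) {
      // Avoid unioning with the empty (0,0,0,0) rectangle, which would
      // drag the extent out to the origin.
      damage.setBounds(r);
    } else {
      damage.setBounds(damage.union(r));
    }
  }

  /** Return a copy of the accumulated extent and reset it to empty. */
  synchronized Rectangle take() {
    Rectangle r = new Rectangle(damage);
    damage.setBounds(0, 0, 0, 0);
    return r;
  }

  /** Copy only the damaged source pixels to the same destination location. */
  static void drawRegion(Graphics2D g2, BufferedImage src, Rectangle r) {
    if (r.isEmpty()) {
      return;
    }
    int x2 = r.x + r.width;
    int y2 = r.y + r.height;
    // Destination corners first, then source corners; identical here, so
    // the copy is unscaled.
    g2.drawImage(src, r.x, r.y, x2, y2, r.x, r.y, x2, y2, null);
  }
}
```

In a Swing component, the rectangle returned by take() could be passed to paintImmediately(), with paintComponent() calling drawRegion() on the clip bounds rather than drawing the whole image.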
>>>>
>>>>
>>>>> Can you share standalone jar file of this workload?
>>>>
>>>> Here is everything you need to reproduce the issue:
>>>> http://www.turbovnc.org/turbovnc_mac_performance_stuff.tar.gz
>>>>
>>>> Untar, then run:
>>>>   $ cd turbovnc_mac_performance_stuff
>>>>   $ java -server -d64 -Dsun.java2d.trace=count -cp VncViewer.jar com.turbovnc.vncviewer.ImageDrawTest
>>>>  (let it run for 20 seconds or so, then CTRL-C it.)
>>>>   $ java -server -d64 -jar VncViewer.jar -bench compilation-16.rfb -benchiter 3 -benchwarmup 2
>>>>  (let it run to completion.)
>>>>
>>>>  Results from Java 6u51 on my Mac Mini (2009 vintage, 2 GHz Intel
>>>> Core 2 Duo, nVidia GeForce 9400):
>>>>  ImageDrawTest:   ~100 Mpixels/sec
>>>>    (all calls are to sun.java2d.loops.Blit::Blit(IntRgb, SrcNoEa, 
>>>> IntArgbPre))
>>>>  compilation-16:  Average 1.392763 s (Decode = 0.198173 s, Blit = 1.005974 
>>>> s)
>>>>
>>>>  Results from Java 8u31 on my Mac Mini:
>>>>  ImageDrawTest:   ~70 Mpixels/sec
>>>>    (Calls are split between
>>>>     sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL Surface 
>>>> (render-to-texture)", AnyAlpha, "OpenGL Surface") and
>>>>     sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntArgbPre, AnyAlpha, 
>>>> "OpenGL Surface"))
>>>>  compilation-16:  Average 6.216550 s (Decode = 0.194989 s, Blit = 5.534781 
>>>> s)
>>>>
>>>>  Results from Java 8u31 on my Mac Mini without alpha-enabled image 
>>>> (-Dturbovnc.forcealpha=false):
>>>>  ImageDrawTest:   ~18 Mpixels/sec
>>>>    (Calls are split between:
>>>>     sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL Surface 
>>>> (render-to-texture)", AnyAlpha, "OpenGL Surface") and
>>>>     sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntRgb, AnyAlpha, "OpenGL 
>>>> Surface"))
>>>>  compilation-16:  Average 27.153480 s (Decode = 0.200333 s, Blit = 
>>>> 26.523137 s)
>>>>
>>>> So, as you can see, using an alpha-enabled image improved the performance 
>>>> under Java 7/8 by about 4x, both when drawing large images (ImageDrawTest) 
>>>> and when doing smaller image updates (compilation-16.) However, the 
>>>> blitting performance under Java 7/8 for small image workloads is still 
>>>> about 5x slower than it was under Java 6.  Results from a different 
>>>> machine:
>>>>
>>>>  Results from Java 6u51 on my Macbook Pro (2011 vintage, 2.4 GHz Intel 
>>>> Core i5, Intel HD Graphics 3000):
>>>>  ImageDrawTest:   ~100 Mpixels/sec
>>>>    (all calls to sun.java2d.loops.Blit::Blit(IntRgb, SrcNoEa, IntArgbPre))
>>>>  compilation-16:  Average 0.592772 s (Decode = 0.113879 s, Blit = 0.351596 
>>>> s)
>>>>
>>>>  Results from Java 8u31 on my Macbook Pro:
>>>>  ImageDrawTest:   ~66 Mpixels/sec
>>>>    (Calls split between
>>>>     sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL Surface 
>>>> (render-to-texture)", AnyAlpha, "OpenGL Surface") and
>>>>     sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntArgbPre, AnyAlpha, 
>>>> "OpenGL Surface"))
>>>>  compilation-16:  Average 6.806324 s (Decode = 0.188252 s, Blit = 6.457852 
>>>> s)
>>>>
>>>>  Results from Java 8u31 on my Macbook Pro without alpha-enabled image 
>>>> (-Dturbovnc.forcealpha=false):
>>>>  ImageDrawTest:   ~50 Mpixels/sec
>>>>    (Calls split between
>>>>     sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL Surface 
>>>> (render-to-texture)", AnyAlpha, "OpenGL Surface") and
>>>>     sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntRgb, AnyAlpha, "OpenGL 
>>>> Surface"))
>>>>  compilation-16:  Average 10.272508 s (Decode = 0.147805 s, Blit = 
>>>> 9.955666 s)
>>>>
>>>> Using an ARGB_PRE BufferedImage didn't help out nearly as much on this 
>>>> machine, and whereas the large image performance looks similar to that of 
>>>> the Mac Mini, the small image blitting performance still suffers by nearly 
>>>> a factor of 20 (although it is improved-- before the use of ARGB_PRE 
>>>> images, it was about a factor of 30 slower.)
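
The slowdown factors quoted above can be checked directly against the compilation-16 blit times. This is a minimal sketch; BlitRatios and slowdown() are names made up for illustration, and the constants are copied from the results listed above.

```java
/** Hypothetical helper: recomputes slowdown factors from the quoted blit times. */
final class BlitRatios {
  // compilation-16 blit times in seconds, as reported above.
  static final double MINI_JAVA6 = 1.005974;
  static final double MINI_JAVA8_ALPHA = 5.534781;
  static final double MINI_JAVA8_NO_ALPHA = 26.523137;
  static final double MBP_JAVA6 = 0.351596;
  static final double MBP_JAVA8_ALPHA = 6.457852;
  static final double MBP_JAVA8_NO_ALPHA = 9.955666;

  /** How many times longer the slow run took than the fast run. */
  static double slowdown(double slowSeconds, double fastSeconds) {
    return slowSeconds / fastSeconds;
  }

  public static void main(String[] args) {
    System.out.println("Mini, Java 8 (alpha) vs. Java 6:      "
        + slowdown(MINI_JAVA8_ALPHA, MINI_JAVA6));
    System.out.println("Mini, Java 8 no alpha vs. alpha:      "
        + slowdown(MINI_JAVA8_NO_ALPHA, MINI_JAVA8_ALPHA));
    System.out.println("MB Pro, Java 8 (alpha) vs. Java 6:    "
        + slowdown(MBP_JAVA8_ALPHA, MBP_JAVA6));
    System.out.println("MB Pro, Java 8 (no alpha) vs. Java 6: "
        + slowdown(MBP_JAVA8_NO_ALPHA, MBP_JAVA6));
  }
}
```

These come out to roughly 5.5x and 4.8x on the Mac Mini and roughly 18x and 28x on the Macbook Pro, consistent with the "about 5x", "nearly a factor of 20", and "about a factor of 30" figures in the text.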
>>>>
>>>> The architecture of this solution makes the use of VolatileImages 
>>>> impractical-- basically, I have to decode the VNC rectangles in real time 
>>>> as they arrive, so if the VolatileImage were to go away, I would have no 
>>>> way of rebuilding it.
>>>
>>>
>>> -- 
>>> Best regards, Sergey.
>>>
>>
