J2DBench runs using the test option files attached to 
[JDK-8288948](https://bugs.openjdk.org/browse/JDK-8288948) show lower 
drawing performance on macOS with the Metal rendering pipeline than with the 
OpenGL rendering pipeline.

**Analysis :** 
The current implementation of 2D primitives (Line, Rectangle, Parallelogram - 
Draw/Fill operations) in the Metal rendering pipeline follows this structure:
1) The end points (vertices) required to draw the primitive are put in a buffer.
2) The data prepared in the previous step is sent to the GPU using the 
MTLRenderCommandEncoder `setVertexBytes()` call.
3) A draw command is issued using the MTLRenderCommandEncoder `drawPrimitives()` 
call.
4) The primitive color is set using the MTLRenderCommandEncoder 
`setFragmentBytes()` call as part of the MTLRenderCommandEncoder state update. 
This is repeated only when the encoder or the color changes.

**Root Cause of slower performance :**
Issuing many MTLRenderCommandEncoder `drawPrimitives()` calls, each preceded by 
a MTLRenderCommandEncoder `setVertexBytes()` call that sends only a tiny amount 
of data, was found to slow down rendering.

**Fix :** 
MTLRenderCommandEncoder `setVertexBytes()` can accept up to 4KB of data per call.
The primitive drawing logic is modified to collate adjacent draw calls as 
follows:
1) A buffer of approximately 4KB is created. It is treated as a common vertex 
buffer and is reused for every batch.
2) For each primitive draw call, the vertices needed for that draw call are 
appended to this buffer.
3) When the buffer is full OR some other condition occurs (e.g. the sequence of 
primitive draws is interrupted by another operation, or the color changes) - 
     a) The vertex data buffer is sent to the GPU using the MTLRenderCommandEncoder 
`setVertexBytes()` call.
     b) One or more draw commands are issued using the 
MTLRenderCommandEncoder `drawPrimitives()` call.
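
The collation steps above can be sketched as follows. This is an illustrative Java model only, not the native code changed in this PR: `FakeEncoder` and `LineBatcher` are hypothetical names, the encoder merely counts calls instead of talking to a GPU, and the 4096-byte capacity and the two-floats-per-vertex layout are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the render command encoder; it only records calls. */
final class FakeEncoder {
    int setVertexBytesCalls = 0;
    int drawPrimitivesCalls = 0;
    final List<Integer> uploadedByteCounts = new ArrayList<>();

    void setVertexBytes(float[] data, int floatCount) {
        setVertexBytesCalls++;
        uploadedByteCounts.add(floatCount * Float.BYTES);
    }

    void drawPrimitives(int vertexStart, int vertexCount) {
        drawPrimitivesCalls++;
    }
}

/** Collates adjacent line draws into one shared, reused vertex buffer. */
final class LineBatcher {
    static final int MAX_BYTES = 4096;                     // assumed buffer size
    static final int FLOATS_PER_VERTEX = 2;                // assumed layout: x, y
    static final int MAX_FLOATS = MAX_BYTES / Float.BYTES; // 1024 floats

    private final float[] vertices = new float[MAX_FLOATS];
    private final FakeEncoder encoder;
    private int used = 0; // number of floats currently queued

    LineBatcher(FakeEncoder encoder) {
        this.encoder = encoder;
    }

    /** Step 2: append the two end points; flush first if they would not fit. */
    void drawLine(float x1, float y1, float x2, float y2) {
        if (used + 2 * FLOATS_PER_VERTEX > MAX_FLOATS) {
            flush();
        }
        vertices[used++] = x1;
        vertices[used++] = y1;
        vertices[used++] = x2;
        vertices[used++] = y2;
    }

    /** Step 3: one upload and one draw for everything queued so far. */
    void flush() {
        if (used == 0) {
            return; // nothing queued
        }
        encoder.setVertexBytes(vertices, used);
        encoder.drawPrimitives(0, used / FLOATS_PER_VERTEX);
        used = 0; // the same buffer is reused for the next batch
    }
}
```

Under these assumptions a line occupies 16 bytes, so 256 lines fit in one batch. Drawing 1000 lines followed by a final `flush()` results in 4 uploads and 4 draw calls in this model, where an unbatched approach would issue 1000 of each. A caller would also invoke `flush()` whenever the color or another piece of encoder state changes.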


**More insight :** 
In general, an application draws a mix of 2D shapes, images and text of 
different colors and sizes.
The performance tests that we have measure rendering performance for extreme 
cases:
1) J2DBench - tests repeated drawing of the same primitive type in the same 
color over a time period, e.g. how many 2D Line draw operations complete in 
X ms.
2) RenderPerf test - tests the drawing of N primitives of the same type, each 
instance with a different color, and captures the FPS.

This PR optimizes the Java2D Metal rendering pipeline implementation for the 
first case, where the same primitive is drawn repeatedly without changing its 
color. Our current architecture needs to be tweaked to address the slower 
performance shown by the RenderPerf tests. If needed, that will be done 
separately. 
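
To illustrate why collation helps the first case but not the second, here is a small Java model that counts how many times a batch would be flushed for a given sequence of per-primitive colors. `FlushCounter` and `countFlushes` are hypothetical names, and the rule "flush when the color changes or the batch is full" is a simplification of the conditions listed under **Fix**, not the actual pipeline logic.

```java
/** Hypothetical model: counts batch flushes for a sequence of primitive colors. */
final class FlushCounter {
    /**
     * @param colors             color of each primitive, in draw order
     * @param primitivesPerBatch how many primitives fit in one batch (assumed)
     * @return number of upload-and-draw flushes needed
     */
    static int countFlushes(int[] colors, int primitivesPerBatch) {
        int flushes = 0;
        int pending = 0;      // primitives queued in the current batch
        int currentColor = 0; // color of the queued primitives
        for (int color : colors) {
            boolean colorChanged = pending > 0 && color != currentColor;
            if (colorChanged || pending == primitivesPerBatch) {
                flushes++;    // send the queued batch before starting a new one
                pending = 0;
            }
            currentColor = color;
            pending++;
        }
        if (pending > 0) {
            flushes++;        // final partial batch
        }
        return flushes;
    }
}
```

With an assumed batch capacity of 256 primitives, 1000 primitives of one color need 4 flushes in this model, while 1000 primitives that each use a different color need 1000 flushes, so the second test pattern gains nothing from collation.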

**Results :**
The performance results are attached to the JBS.

-------------

Commit messages:
 - add a comment
 - collate adjacent draw calls

Changes: https://git.openjdk.org/jdk/pull/9245/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=9245&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8288948
  Stats: 257 lines in 4 files changed: 149 ins; 27 del; 81 mod
  Patch: https://git.openjdk.org/jdk/pull/9245.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/9245/head:pull/9245

PR: https://git.openjdk.org/jdk/pull/9245
