On Wed, 22 Jun 2022 16:32:11 GMT, Ajit Ghaisas <[email protected]> wrote:
> J2DBench test option files attached to
> [JDK-8288948](https://bugs.openjdk.org/browse/JDK-8288948) indicate lower
> drawing performance on macOS with Metal rendering pipeline as compared to
> OpenGL rendering pipeline.
>
> **Analysis :**
> Current implementation of 2D primitives (Line, Rectangle, Parallelogram -
> Draw/Fill operations) in Metal rendering pipeline follow below structure-
> 1) End points (vertices) required for the primitive drawing are put in a
> buffer.
> 2) The data prepared in above step is sent to GPU using
> MTLRenderCommandEncoder `setVertexBytes()` call
> 3) A draw command is issued using MTLRenderCommandEncoder `drawPrimitives()`
> call
> 4) Primitive Color is set (repeated when encoder or color changes) using
> MTLRenderCommandEncoder `setFragmentBytes()` call in MTLRenderCommandEncoder
> state update.
>
> **Root Cause of slower performance :**
> It is found that the multiple calls to MTLRenderCommandEncoder
> `drawPrimitives()` by using MTLRenderCommandEncoder `setVertexBytes()` to
> send a tiny amount of data each time slows down the rendering.
>
> **Fix :**
> MTLRenderCommandEncoder `setVertexBytes()` can accept 4KB of buffer at a time.
> The primitive drawing logic is modified to collate adjacent draw calls as
> below -
> 1) A buffer of size approximately equal to 4KB is created - this is treated
> as common vertex buffer which is reused again and again
> 2) For each primitive draw call - the vertices needed for that draw call are
> added to the above buffer
> 3) When the buffer is full OR some other condition occurs ( e.g. breakage of
> draw primitive sequence, some other operation as change of color etc) -
> a) Vertex data buffer is sent to the GPU using MTLRenderCommandEncoder
> `setVertexBytes()` call.
> b) A single (or multiple) draw command(s) are issued using
> MTLRenderCommandEncoder `drawPrimitives()` call.
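
For readers following the thread, the collation scheme in the Fix steps quoted above can be sketched roughly as below. This is only an illustration in plain C, not the code in MTLRenderer.m: `Batch`, `batch_init`, `batch_add_line` and `batch_flush` are hypothetical names, a vertex is assumed to be two floats, and the GPU submission is replaced by counters where the real pipeline would call `setVertexBytes()` followed by `drawPrimitives()`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch; none of these names come from MTLRenderer.m. */
#define VERTEX_BUFFER_BYTES 4096  /* the 4KB setVertexBytes() size quoted above */

typedef struct { float x, y; } Vertex;  /* assumed layout: 2 floats per vertex */

#define MAX_VERTICES (VERTEX_BUFFER_BYTES / sizeof(Vertex))

typedef struct {
    Vertex vertices[MAX_VERTICES];  /* common vertex buffer, reused after each flush */
    size_t count;                   /* vertices currently queued */
    unsigned int color;             /* color shared by all queued primitives */
    int flushes;                    /* simulated setVertexBytes + drawPrimitives pairs */
    size_t submitted;               /* total vertices handed to the simulated GPU */
} Batch;

static void batch_init(Batch *b) {
    b->count = 0;
    b->color = 0;
    b->flushes = 0;
    b->submitted = 0;
}

/* Stand-in for step 3: send the buffer and issue the draw, then reuse it. */
static void batch_flush(Batch *b) {
    if (b->count == 0) {
        return;  /* nothing queued, so no GPU call is needed */
    }
    b->flushes++;
    b->submitted += b->count;
    b->count = 0;
}

/* Step 2: queue the two end points of a line, flushing first if the
 * buffer is full or the color differs from the queued primitives. */
static void batch_add_line(Batch *b, unsigned int color,
                           float x1, float y1, float x2, float y2) {
    if (color != b->color || b->count + 2 > MAX_VERTICES) {
        batch_flush(b);
        b->color = color;
    }
    assert(b->count + 2 <= MAX_VERTICES);
    b->vertices[b->count++] = (Vertex){ x1, y1 };
    b->vertices[b->count++] = (Vertex){ x2, y2 };
}
```

Under these assumptions, queuing 1000 same-colored lines and then calling `batch_flush()` once more results in 4 simulated submissions instead of 1000, while alternating the color on every line falls back to one submission per line.
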
>
> **More insight :**
> In general, an application requires a mix of 2D shapes, images and text of
> different color and sizes.
> The performance tests that we have measure rendering performance of extreme
> cases such as -
> 1) J2DBench - tests the repeated drawing of the same type and same color in a
> time period - e.g. Find the rendering speed of repeated 2D Line draw
> operation in X mSec?
> 2) RenderPerf test - tests the drawing of N primitives of the same type but
> each instance with a different color and capture FPS.
>
> This PR optimizes the Java2D Metal rendering pipeline implementation for the
> first case where the same primitive is drawn repeatedly without changing its
> color. Our current architecture needs to be tweaked to address slower
> performance shown by RenderPerf tests. If needed, that needs to be done
> separately.
>
> **Results :**
> The performance results are attached to the JBS.

Overall looks good

src/java.desktop/macosx/native/libawt_lwawt/java2d/metal/MTLRenderer.m line 261:

> 259:             // Translate each vertex by a fraction so
> 260:             // that we hit pixel centers.
> 261:             //const int verticesCount = 5;

I suppose we can remove this commented code

-------------

Marked as reviewed by avu (Committer).

PR: https://git.openjdk.org/jdk/pull/9245
