Hi Laurent,

BTW, I wanted to point out that this is something we can look at after we get the current set of fixes in. I just have to find an hour to look over the last webrev and review all of the static methods; after that it looks like it could go in.

I'm not necessarily trying to solve everyone's caching concerns, though that would be nice. I'm mainly looking at making the caching solution used here easy to verify, easy to work on without making a mistake in the future, and easy to tune without having to juggle a bunch of variants of static methods. In the end, all of them boil down to 3 actions - allocate, grow, and dispose - but since the code uses static methods, that means an explosion of variants to deal with things like array type, dirtiness and, potentially in the future, different growth metrics. Luckily, the current difference in metrics happens to align with whether the arrays are dirty or not, so the matrix is smaller than it might otherwise be, but there would be exponential growth in the number of "very similarly named" static methods should we introduce any other reason for one particular growable array instance to be different...
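For illustration, here is a minimal sketch (hypothetical names, not the webrev code) of how the three actions could become three methods on a single per-flavor cache object, instead of a family of similarly named static helpers:

```java
import java.util.ArrayDeque;

// Hypothetical sketch: allocate/grow/dispose collapse into get/grow/put
// on one small cache object per array flavor, replacing a family of
// static helpers (getDirtyIntArray, growDirtyIntArray, ...).
final class DirtyIntArrayCache {
    private final ArrayDeque<int[]> pool = new ArrayDeque<>();
    private final int minSize;

    DirtyIntArrayCache(int minSize) {
        this.minSize = minSize;
    }

    int[] get(int size) {
        final int[] a = pool.peek();
        // reuse a pooled array only if it is large enough
        return (a != null && a.length >= size)
                ? pool.poll()
                : new int[Math.max(size, minSize)];
    }

    int[] grow(int[] old, int newSize) {
        final int[] a = get(newSize);
        System.arraycopy(old, 0, a, 0, old.length);
        put(old);  // recycle the old storage
        return a;
    }

    void put(int[] a) {
        pool.offer(a);  // "dirty" cache: no clearing on release
    }
}
```

Adding a clean variant, or a different growth metric, then only touches this one class rather than multiplying static method names.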

                        ...jim

On 6/9/2015 1:36 PM, Laurent Bourgès wrote:
Hi Jim,

Your proposal is interesting: I would have loved to build such a smart array
cache for general use!
Do you think such a general ArrayCache would be helpful for other java2d
or jdk algorithms that need intensive array allocation?

However, I focused my work mostly on getting the maximum performance and
also on not changing the original pisces code too much (to let you review
it more easily).

    Here's a suggestion on how I would set up the array caching code,
    mainly with regard to making it easier to make sure we are managing
    them consistently, but also to allow for more flexible tuning later.


I agree the array management code may look complicated (but less so than
C/C++).
As I am using NetBeans, I am making heavy use of its 'Find Usages' feature:
Usages of Renderer.edgeBuckets [11 occurrences]
sun.java2d.marlin.Renderer
addLine
   394:  final int[] _edgeBuckets      = edgeBuckets;
Renderer
   472:  edgeBuckets = edgeBuckets_initial;
init
*  523:  edgeBuckets = rdrCtx.getIntArray(edgeBucketsLength);
*dispose
   589:  if (edgeBuckets == edgeBuckets_initial) {
   591:  IntArrayCache.fill(edgeBuckets,      buckets_minY,
*  597:  rdrCtx.putIntArray(edgeBuckets,      buckets_minY,
*  599:  edgeBuckets = edgeBuckets_initial;
   605:  } else if (edgeBuckets != edgeBuckets_initial) {
*  607:  rdrCtx.putIntArray(edgeBuckets, 0, 0);
*  608:  edgeBuckets = edgeBuckets_initial;
_endRendering
   708:  final int[] _edgeBuckets = edgeBuckets;

So I can quickly check whether I am using the appropriate IntArrayCache
(clean variant)!

    First, make all interactions go through the cache object itself -
    get, grow, put.  This gets rid of a number of static methods with a
    bunch of similar names and groups the management methods in a class
    hierarchy.


It seems like a good approach but requires a big refactoring on my side: a
Cache_V2 would gather many Cache_V1 instances (including bucket / size
management).
Moreover, it raises another issue: for now all caches are gathered into
the ArrayCachesHolder to wrap them in a single WeakReference => small
memory footprint + early GC cleanup.

That's the aim of the RendererContext.getArrayCachesHolder() method,
which permits dynamic ArrayCachesHolder retrieval or creation. Actually
a Cache is needed both to get/grow and also to release an array, and in
the meantime it is possible that the GC has freed the Cache memory!
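As a minimal sketch of that pattern (hypothetical names, not the actual Marlin code): all caches live in one holder object behind a single WeakReference, and the holder is lazily recreated if the GC has already reclaimed it:

```java
import java.lang.ref.WeakReference;

// Illustrative sketch of the holder pattern described above: the GC may
// reclaim all cached arrays at once (small footprint, early cleanup),
// and the next caller transparently gets a fresh, empty holder.
final class RendererContextSketch {

    static final class ArrayCachesHolder {
        // placeholder for the real per-type cache instances
        final int[][] intCaches = new int[4][];
    }

    private WeakReference<ArrayCachesHolder> refHolder =
            new WeakReference<>(new ArrayCachesHolder());

    ArrayCachesHolder getArrayCachesHolder() {
        ArrayCachesHolder holder = refHolder.get();
        if (holder == null) {
            // GC freed the caches: recreate an empty holder
            holder = new ArrayCachesHolder();
            refHolder = new WeakReference<>(holder);
        }
        return holder;
    }
}
```

Callers must hold the returned strong reference for the whole get/grow/put sequence, otherwise the put may target a holder the GC has already cleared.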

But this point is more problematic below:

    Second, have each consumer of arrays use its own cache object.  Some
    of them may be the same object reference between different "users",
    and some of them may be different objects but share their underlying
    arrays with each other, but a given piece of code could be traced to
    always get/grow/put its arrays into the same cache object.  That way
    it's easier to see that you aren't mixing/matching clean/dirty arrays.


It seems a bit verbose: 1 cache field per array = adding a lot of class
fields! Maybe the same cache reference could still be used for "friend"
arrays like: edgeBuckets/edgeBucketCounts, crossings/aux_crossings, ...
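A minimal sketch of that compromise (hypothetical names): two "friend" arrays with the same type and lifecycle share one cache field, so each array still always returns to the same cache while the field count stays small:

```java
import java.util.ArrayDeque;

// Hypothetical sketch: edgeBuckets and edgeBucketCounts share a single
// cache field; a given array is always acquired from and released to
// that same cache, which keeps the usage easy to verify.
final class EdgeArrays {
    // simple pool standing in for a real int-array cache
    private final ArrayDeque<int[]> bucketCache = new ArrayDeque<>();

    int[] edgeBuckets;
    int[] edgeBucketCounts;

    void init(int length) {
        edgeBuckets      = acquire(length);
        edgeBucketCounts = acquire(length);
    }

    void dispose() {
        bucketCache.offer(edgeBuckets);
        bucketCache.offer(edgeBucketCounts);
        edgeBuckets = edgeBucketCounts = null;
    }

    private int[] acquire(int length) {
        final int[] a = bucketCache.peek();
        return (a != null && a.length >= length)
                ? bucketCache.poll()
                : new int[length];
    }
}
```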

    DirtyFooCache
         foo[] get(size)
         foo[] grow(foo[], newsize)  // or widen?
         void put(foo[])

    CleanFooCache
         foo[] get(size)
         foo[] grow(foo[], newsize, <used parts>)
         void put(foo[], <used parts>)


Looks like a good design.
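As a concrete illustration of the clean variant of that interface (a sketch, not the webrev code), where put() restores the zero-filled invariant over the used range only:

```java
import java.util.ArrayDeque;
import java.util.Arrays;

// Illustrative CleanIntArrayCache along the lines sketched above:
// "clean" means every array handed out is zero-filled, so release
// only needs to clear the range that was actually used.
final class CleanIntArrayCache {
    private final ArrayDeque<int[]> pool = new ArrayDeque<>();

    int[] get(int size) {
        final int[] a = pool.peek();
        // pooled arrays were cleaned on put; new arrays are zero by spec
        return (a != null && a.length >= size) ? pool.poll() : new int[size];
    }

    int[] grow(int[] old, int newSize, int usedFrom, int usedTo) {
        final int[] a = get(newSize);
        System.arraycopy(old, usedFrom, a, usedFrom, usedTo - usedFrom);
        put(old, usedFrom, usedTo);
        return a;
    }

    void put(int[] a, int usedFrom, int usedTo) {
        Arrays.fill(a, usedFrom, usedTo, 0);  // restore the clean invariant
        pool.offer(a);
    }
}
```

The dirty variant would be identical minus the fill, which is exactly why a per-instance class beats a matrix of static methods here.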


    Then code that manages an array named bar would be managed as:
    (It may be the same instance as another FooCache
      or it may share its arrays with another FooCache
      but the "barCache" reference is only ever used for the bar array...)
         [Clean|Dirty]FooCache barCache = make/get/share a Cache instance;
         foo[] bar;
         ...
         bar = barCache.get(initSize);
         ...
         bar = barCache.grow/widen(bar, newSize[, usage if clean]);
         ...
         barCache.put(bar[, usage if clean]);
         bar = null;

    It's easier to verify that bar is being managed consistently if it
    always goes back to its own barCache object.  You can then decide if
    two different uses of foo[] arrays should share the same arrays or
    not based on profiling.  You can also potentially have different
    growth algorithms for 2 different arrays of the same type, but
    possibly even sharing the same underlying arrays.  Any degree of
    customization is possible underneath the covers behind a FooCache
    basic interface.


I like your proposal, but I maximized the sharing of arrays (i.e. Cache
instances) for the following reasons:
- maximize array reuse (bucket approach) = reduce the memory footprint
=> lower GC overhead because less memory is wasted by cached arrays
- each RendererContext has its own Cache instances: in a multithreaded
environment (web server), it becomes critical to have the smallest memory
footprint and to avoid concurrency issues = no cache shared between threads
- all that work can be seen as a Thread Local Array Allocation (TLAA)
that could be part of the GC algorithm itself to benefit the whole VM:
dirty vs clean arrays, reusing vs allocation + GC, partial cleaning
(Arrays 2.0?) ...

    I'd also either make the growth heuristics an argument to
    creating/fetching a FooCache instance or make it subject to a helper
    "CacheSizeManager" or "CacheSizeStrategy" object in such a way that
    different uses of IntCache might be able to have different growth
    strategies - and to possibly even manage different sizing strategies
    on top of the same set of underlying arrays (i.e. one has buckets
    that grow by <<2 and another by <<1 and for the even buckets, they
    share arrays, for instance).


It led me to imagine a nice Array Cache framework at the JDK or JVM
level... at least not exclusive to Marlin!
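A minimal sketch of such a pluggable sizing strategy (hypothetical names, assuming a framework-level API along the lines Jim describes): the growth policy is passed to the cache at construction, so two caches of the same type can grow by <<1 or <<2 independently while a shared pool could still sit underneath:

```java
// Hypothetical sketch of a pluggable CacheSizeStrategy: the sizing
// policy is an argument to the cache rather than baked into static
// method names, so each cache instance can tune its growth separately.
@FunctionalInterface
interface CacheSizeStrategy {
    int nextCapacity(int needed);
}

final class SizedIntCache {
    // round needed up to the next power of two ...
    static final CacheSizeStrategy GROW_X2 =
            n -> Math.max(n, Integer.highestOneBit(n - 1) << 1);
    // ... or to twice the next power of two
    static final CacheSizeStrategy GROW_X4 =
            n -> Math.max(n, Integer.highestOneBit(n - 1) << 2);

    private final CacheSizeStrategy sizing;

    SizedIntCache(CacheSizeStrategy sizing) {
        this.sizing = sizing;
    }

    int[] get(int size) {
        // a real cache would consult a bucket pool here; this sketch
        // only shows where the strategy plugs into the allocation path
        return new int[sizing.nextCapacity(size)];
    }
}
```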

    The main thing for the near term, is that it would be nice to have
    each array use its own cache object so that we can easily verify
    that it is being consistent with its handling of arrays...


If you are not happy with the latest webrev (better naming convention),
we should discuss again how to do all these changes in a progressive manner.

Cheers,
Laurent
