Hi Jim,

Below is the webrev I prepared last Saturday night.
However, I have made progress since then: I inlined a few methods and now use Unsafe for the rowAAChunk storage (to save a few percent by avoiding bounds checks).

*So please, just have a look to see the new hybrid approach but do not make a full review!*

I will try to send another patch (4.2) asap...

Webrev 4.1: http://cr.openjdk.java.net/~lbourges/marlin/marlin-s4.1/

1. I simplified the previous patch to have only 2 variants (raw, or RLE with blockFlags), but as a hybrid approach: each pixel row can use either encoding=raw or encoding=rle depending on its complexity (heuristics).

2. I fixed the aforementioned bugs related to crossing array resizing (ptrEnd) and also added a simple overflow check to the edge array: indices (pointer-like, in edgeBuckets and edge.next) are only integers, so it only works if the edges array is smaller than 2 GB.

It works very well on both jdk8 and openjdk9:

Marlin 0.7.1 OpenJDK9 (with Sergey's gcc hack):

Test                                      Threads   Ops       Med     Pct95       Avg  StdDev       Min       Max  TotalOps
CircleTests.ser                                 1   162    64.827    65.118    64.855   0.123    64.597    65.474       162
*EllipseTests-fill-false.ser*                   1    36   289.896   290.251   289.941   0.157   289.759   290.487        36
*EllipseTests-fill-true.ser*                    1    25   442.567   442.784   442.601   0.197   442.373   443.356        25
dc_boulder_2013-13-30-06-13-17.ser              1   116    90.328    90.849    90.371   0.278    89.904    91.448       116
dc_boulder_2013-13-30-06-13-20.ser              1   222    46.882     47.23    46.897   0.197    46.377    47.689       222
dc_shp_alllayers_2013-00-30-07-00-43.ser        1   268    39.101    39.307    39.116   0.121    38.913    40.088       268
dc_shp_alllayers_2013-00-30-07-00-47.ser        1    25   772.936   774.375   773.059   0.736   771.858   774.596        25
dc_spearfish_2013-11-30-06-11-15.ser            1   823    12.676    12.807    12.705   0.076    12.653    13.285       823
dc_spearfish_2013-11-30-06-11-19.ser            1  1640     6.401     6.467      6.41   0.036     6.385      6.74      1640
dc_topp:states_2013-11-30-06-11-06.ser          1   853    12.299    12.382    12.314   0.033    12.278    12.453       853
dc_topp:states_2013-11-30-06-11-07.ser          1  1402     7.502      7.57     7.507   0.038     7.445     7.755      1402
test_z_625k.ser                                 1    68   152.561   153.037   152.582   0.261   152.179   153.549        68

Ductus JDK8:

Test                                      Threads   Ops       Med     Pct95       Avg  StdDev       Min       Max  TotalOps
CircleTests.ser                                 1   148    69.971    71.418    70.068   0.719    68.369    72.031       148
*EllipseTests-fill-false.ser*                   1    35    297.56   299.328    297.48   1.093   295.417    299.59        35
*EllipseTests-fill-true.ser*                    1    25   453.612    456.29   453.589   1.813   448.936   456.817        25
dc_boulder_2013-13-30-06-13-17.ser              1    93   112.865   113.419    112.88   0.277   112.377   113.459        93
dc_boulder_2013-13-30-06-13-20.ser              1   183    56.944    57.521    56.987    0.26    56.528    58.187       183
dc_shp_alllayers_2013-00-30-07-00-43.ser        1   220    47.955    48.555    47.975   0.346    47.223    49.203       220
dc_shp_alllayers_2013-00-30-07-00-47.ser        1    25  1056.025  1058.306  1056.215   1.079  1054.813  1058.515        25
dc_spearfish_2013-11-30-06-11-15.ser            1   628    16.798    17.095    16.837   0.125    16.633    17.343       628
dc_spearfish_2013-11-30-06-11-19.ser            1  1354     7.605     7.896     7.663   0.104     7.553     8.217      1354
dc_topp:states_2013-11-30-06-11-06.ser          1   616    16.988    17.097     16.98   0.086    16.737    17.513       616
dc_topp:states_2013-11-30-06-11-07.ser          1   931    11.319    11.397    11.304   0.066    11.052    11.479       931
test_z_625k.ser                                 1    50   208.874   209.563    208.85   0.439    206.91     209.9        50

I tested the new patch with J2DBench (with my warmup patch applied) using my default profile (size=1 to 1000, strokes=1,5, dash=off/on) in single-threaded tests:
- pisces vs marlin: http://cr.openjdk.java.net/~lbourges/marlin/j2dBench_reports/html_pisces_marlin_071/Testcase_Summary_Report.html
- ductus vs marlin: http://cr.openjdk.java.net/~lbourges/marlin/j2dBench_reports/html_ductus_marlin_071/Testcase_Summary_Report.html

Marlin is always much faster than pisces and a bit faster than ductus, except for some tests with dash1_5, as you can see in the complete report:
http://cr.openjdk.java.net/~lbourges/marlin/j2dBench_reports/html_ductus_marlin_071/J2DBench_Complete_Report.html

Overall, it is very promising and worth the effort I have put in over the last few weeks.
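To illustrate the per-row hybrid idea from item 1, here is a minimal standalone sketch. It is not the Marlin code: the class name, the one-byte tag, the (count, value) run format and the size-based heuristic are all assumptions made for illustration, and blockFlags and the Unsafe-based storage are left out.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical sketch: each pixel row independently picks RAW or RLE.
final class HybridRowSketch {
    enum Encoding { RAW, RLE }

    // Count runs of identical consecutive alpha values in one row.
    static int countRuns(byte[] row) {
        if (row.length == 0) {
            return 0;
        }
        int runs = 1;
        for (int i = 1; i < row.length; i++) {
            if (row[i] != row[i - 1]) {
                runs++;
            }
        }
        return runs;
    }

    // Assumed heuristic: RLE costs about 2 bytes per run, RAW costs 1 byte
    // per pixel, so use RLE only when it is expected to be smaller.
    static Encoding chooseEncoding(byte[] row) {
        return (2 * countRuns(row) < row.length) ? Encoding.RLE : Encoding.RAW;
    }

    // Encode one row as: tag byte (0 = RAW, 1 = RLE) followed by the payload.
    static byte[] encodeRow(byte[] row) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (chooseEncoding(row) == Encoding.RAW) {
            out.write(0);
            out.write(row, 0, row.length);
        } else {
            out.write(1);
            int i = 0;
            while (i < row.length) {
                int j = i + 1;
                // Runs are capped at 255 so the count fits in one byte.
                while (j < row.length && row[j] == row[i] && j - i < 255) {
                    j++;
                }
                out.write(j - i);   // run length
                out.write(row[i]);  // alpha value (low 8 bits)
                i = j;
            }
        }
        return out.toByteArray();
    }

    // Decode a row produced by encodeRow; width is the row length in pixels.
    static byte[] decodeRow(byte[] enc, int width) {
        byte[] row = new byte[width];
        if (enc[0] == 0) {
            System.arraycopy(enc, 1, row, 0, width);
        } else {
            int pos = 0;
            for (int i = 1; i < enc.length; i += 2) {
                int len = enc[i] & 0xFF;
                Arrays.fill(row, pos, pos + len, enc[i + 1]);
                pos += len;
            }
        }
        return row;
    }

    public static void main(String[] args) {
        byte[] simple = new byte[100];            // long runs: cheap with RLE
        Arrays.fill(simple, 20, 80, (byte) 255);
        byte[] complex = new byte[100];           // every pixel differs
        for (int i = 0; i < complex.length; i++) {
            complex[i] = (byte) (i * 37);
        }
        System.out.println(chooseEncoding(simple) + " " + encodeRow(simple).length);
        System.out.println(chooseEncoding(complex) + " " + encodeRow(complex).length);
    }
}
```

Because every row carries its own tag, the same shape can mix both encodings in one shared buffer, which is the point of the hybrid approach; the real heuristic and storage layout may differ.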
> One thing that occurred to me is that the 2 strategies - RLE vs
> uncompressed - might be easier to follow and manage if they were broken out
> into separate classes:
>
> MarlinCache
> +--- MarlinRLECache
> +--- MarlinUncompressedCache
>

It was a good idea, but I finally adopted a hybrid approach (sharing the same data storage): the same shape can use both strategies (mixed, no longer exclusive).

Cheers,
Laurent