Hi, As OpenJDK12 RDP1 is coming soon, I propose this plan: - integrate this basic fix in ShapeSpanIterator.c code to use stdlib sort (mergesort on linux) - integrate a very simple patch in Marlin renderer to disable insertion sort for large arrays: 0.5s to 0.25s, few LOC - postpone my changes to Marlin sort & Marlin nonAA renderer integration in OpenJDK 13
Will you have time to review 2 small patchs on time ? Cheers, Laurent Le mar. 23 oct. 2018 à 22:37, Laurent Bourgès <bourges.laur...@gmail.com> a écrit : > Phil, > I quickly modified the final update & sort loop to: > - move sort in another block > - use qsort() using a new comparator sortSegmentsByCurX > > This improves performance in PolyLineTest by 3 times: ~1s vs 3.5s ! > Apparently qsort() is not optimal (comparator can not be inlined by c) so > it may explain why Marlin (0x0 sampling) is still 2 times faster with its > custom merge-sort (in-place). > > Any idea to improve C sort ? > Is it good enough ? > > - USE_QSORT_X: 1 > oct. 23, 2018 10:15:29 PM polylinetest.Canvas paintComponent > INFOS: Paint Time: 1,081s > INFOS: Paint Time: 1,058s > INFOS: Paint Time: 1,067s > > - USE_QSORT_X: 0 > oct. 23, 2018 10:18:50 PM polylinetest.Canvas paintComponent > INFOS: Paint Time: 3,318s > INFOS: Paint Time: 3,258s > INFOS: Paint Time: 3,273s > > Patch: > diff -r 297450fcab26 > src/java.desktop/share/native/libawt/java2d/pipe/ShapeSpanIterator.c > --- > a/src/java.desktop/share/native/libawt/java2d/pipe/ShapeSpanIterator.c > Tue Oct 16 23:21:05 2018 +0530 > +++ > b/src/java.desktop/share/native/libawt/java2d/pipe/ShapeSpanIterator.c > Tue Oct 23 22:31:00 2018 +0200 > @@ -1243,6 +1243,18 @@ > } > } > > +/* LBO: enable (1) / disable (0) qsort on curx */ > +#define USE_QSORT_X (0) > + > +static int CDECL > +sortSegmentsByCurX(const void *elem1, const void *elem2) > +{ > + jint x1 = (*(segmentData **)elem1)->curx; > + jint x2 = (*(segmentData **)elem2)->curx; > + > + return (x1 - x2); > +} > + > static jboolean > ShapeSINextSpan(void *state, jint spanbox[]) > { > @@ -1378,16 +1390,28 @@ > seg->curx = x0; > seg->cury = y0; > seg->error = err; > + } > > - /* Then make sure the segment is sorted by x0 */ > - for (new = cur; new > lo; new--) { > - segmentData *seg2 = segmentTable[new - 1]; > - if (seg2->curx <= x0) { > - break; > + if (USE_QSORT_X && (hi - lo) > 100) > + { > + /* use quick sort on [lo - hi] range */ > + qsort(&(segmentTable[lo]), (hi - lo), sizeof(segmentData *), > + sortSegmentsByCurX); > + } else { > + for (cur = lo; cur < hi; cur++) { > + seg = segmentTable[cur]; > + x0 = seg->curx; > + > + /* Then make sure the segment is sorted by x0 */ > + for (new = cur; new > lo; new--) { > + segmentData *seg2 = segmentTable[new - 1]; > + if (seg2->curx <= x0) { > + break; > + } > + segmentTable[new] = seg2; > } > - segmentTable[new] = seg2; > + segmentTable[new] = seg; > } > - segmentTable[new] = seg; > } > cur = lo; > } > > Cheers, > Laurent > > Le mar. 23 oct. 2018 à 08:30, Laurent Bourgès <bourges.laur...@gmail.com> > a écrit : > >> Phil, >> Yesterday I started hacking ShapeSpanIterator.c to add stats: the last >> stage (sort by x0) is the bottleneck. >> In this case every sort takes up to 15ms per pixel row ! >> >> I will see if I can adapt Marlin's MergeSort.java to C to have an >> efficient sort in-place. >> Do you know if libawt has already an efficient sort instead of porting >> mine ? >> >> PS: "To save the planet, make software more efficient" is my quote of the >> day ! >> >> Cheers, >> Laurent >> >