Hi Andy,

On 27/08/2023 10:36, Andy Seaborne wrote:

> On 25/08/2023 15:18, Dave Reynolds wrote: [1]
> > We've been testing some of our troublesome queries on 4.9.0 on java
> > 11 vs java 17 and see a 10-15% performance hit on java 17 (even after
> > we take control of the GC by forcing both to use the old parallel GC
> > instead of G1). No idea why, seems wrong! Makes us inclined to stick
> > with java 11 and thus jena 4.x series as long as we can.
>
> Dave,
>
> Is this 4.9.0 specific or across multiple Jena versions?

Seems to be multiple versions (at least 4.8.0 and 4.9.0), but not tested exhaustively.

> Is G1 worse than the old parallel GC on Java17?

It is definitely worse on Java11 for a particular narrow type of query that is an issue for us. Believe the same is true on Java17 but haven't collected definitive data on this.

It may be possible to tune G1 to better match our particular test case but the testing and tuning is time consuming and the parallel GC does the trick.

Our aim was to replace a system running on 3.x-era Fuseki with a 4.x-era one without significant loss of performance. Out of the box there was a 20% hit. Switching GC recovered much of that, and switching to Java 11 instead of 17 brought us basically to parity, for this special case. This is a case where legitimate queries get close to the timeout threshold we run at, so a 20% performance drop is particularly visible: queries that currently work time out on the newer version.

The query itself is trivial - return large numbers of resources (10k-1m) found by a simple lucene query along with a few (~15) properties of each. Performance in this case seems to be dominated by the time to render the large results stream rather than lucene or TDB query performance. So it makes some sense that in this specific case a GC tuned for throughput rather than pause time would help.
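When comparing collectors like this, it helps to confirm which GC a given JVM actually ended up with and how much time it has spent collecting. The following is a minimal sketch using the standard java.lang.management API; the class name GcReport and its method names are made up for illustration and are not part of Jena or Fuseki.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Jena/Fuseki): reports the active collectors.
public class GcReport {

    /** Names of the garbage collectors registered in the running JVM. */
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Approximate accumulated GC time in milliseconds, summed over all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("Collectors:   " + collectorNames());
        System.out.println("GC time (ms): " + totalGcMillis());
    }
}
```

Running it as `java -XX:+UseParallelGC GcReport.java` and then as `java -XX:+UseG1GC GcReport.java` should print different collector names. The same flags would go wherever your deployment sets JVM options for the server.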

No suggestion that our case is representative of any broader pattern.

Dave
