Re: Performance regressions in Jena and TDB2

Osma Suominen Mon, 07 Dec 2020 06:47:04 -0800

Replying to myself, as I did some follow-up tests.

Osma Suominen wrote on 4.12.2020 at 18.42:

Now this turned into a rather interesting exercise in using git bisect. I was able to track down the change that caused the slowdown. It's this merge commit:
[f93fdbad7aa8d6ddb46693395e3bfb5ea487bf16] JENA-1648: Merge commit 'refs/pull/507/head' of https://github.com/apache/jena
which refers to this pull request:

https://github.com/apache/jena/pull/507
I don't have time for very deep analysis right now, but it doesn't surprise me that a substantial change to the query result serialization slows down the queries.
Things to check: (mostly as a TODO list for myself)
1. Does this depend on the query result format? For example, is only the text format (default) slower than before?
2. Is there something suspicious in the PR 507 code that would explain why it's so much slower?

This affects at least the CSV format too, so it's not just the text output format.

But I figured out that the real change here is simply that the warmup performed when using the --repeat parameter with two arguments has become less effective starting with Jena 3.10.0. When no warmup is used, the performance is the same across the different Jena versions.
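
For readers unfamiliar with the warmup-then-measure pattern referred to above, here is a minimal Python sketch. It is not Jena code: the function name time_with_warmup and its parameters are hypothetical, and it only illustrates why a less effective warmup shows up as slower timed runs even when the work being timed is unchanged.

```python
import time


def time_with_warmup(run_query, warmup, repeat):
    """Call run_query `warmup` times untimed, then `repeat` times timed.

    Hypothetical helper for illustration only; not part of Jena.
    """
    for _ in range(warmup):
        # Results are discarded; these runs only exist to warm up
        # caches, the JIT compiler, and similar one-off costs.
        run_query()

    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings
```

If the warmup runs do not exercise the same code path as the timed runs, the one-off costs land in the timed runs instead, which would look like a regression when comparing versions with warmup but not without it.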

And now that Andy has implemented JENA-2007, which improves the warmup, I think the problem has already been solved.


Case closed.

-Osma


--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 15 (Unioninkatu 36)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
[email protected]
http://www.nationallibrary.fi
