Measuring cold-boot query execution times. Unexplained 2 seconds difference with official release.

Thomas Krijnen Wed, 04 Jan 2017 06:50:50 -0800

Dear all,

I am trying to benchmark cold-boot query times, so to speak: not only the
actual execution of the query, but also parsing the model, and perhaps even
the time for the JVM and related components to boot up and load the JARs.


For that purpose I have the following trivial code base [1], which I
believe is more or less equivalent to the arq.sparql package, except that
it can also work on top of HDT [0]. The steps are: load model; query;
execute; format results. As far as I can tell, nothing surprising. This
works rather well, and the ease of use of both Jena and HDT is really
encouraging.

I compile this using this script [2] (sorry, I am not very familiar with
the Java ecosystem and build tools).

The query times reported by my script (tq1-tq0) and by arq.sparql (official
apache-jena-3.1.1 release; measured between modTime.startTimer() and
modTime.endTimer(), it seems) match rather closely on my test model
(~0.4s). Yet I get a 2-second difference in the overall runtime of the
entire application, with arq.sparql being the faster one! I cannot tell
whether this is due to differences in parsing the model, as that time is
not reported by arq.sparql. I also tried running strace with timestamps to
see whether the differences in class path entries make a significant
difference, but that seems to be on the order of milliseconds. Is there
some sort of optimization applied to the official release JARs?
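
One way to narrow this down might be to time each phase separately and to
record how long the JVM had already been running when main() was entered.
The following is only a minimal sketch using the standard library: the
class name PhaseTimer, the phase names, and the empty placeholder phases
are hypothetical and are not taken from the code in [1]. The real
model-loading, query-execution, and formatting calls would go inside the
lambdas.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Jena or HDT): records wall-clock time per phase.
final class PhaseTimer {
    // Insertion-ordered so phases are reported in the order they ran.
    private final Map<String, Long> millis = new LinkedHashMap<>();

    /** Milliseconds elapsed since the JVM reported its (approximate) start time. */
    static long sinceJvmStart() {
        long jvmStart = ManagementFactory.getRuntimeMXBean().getStartTime();
        return System.currentTimeMillis() - jvmStart;
    }

    /** Runs one phase and stores its duration in milliseconds under the given name. */
    void time(String name, Runnable phase) {
        long t0 = System.nanoTime();
        phase.run();
        millis.put(name, (System.nanoTime() - t0) / 1_000_000);
    }

    Map<String, Long> phases() {
        return millis;
    }

    public static void main(String[] args) {
        // Time spent before main(): JVM boot and initial class loading.
        long boot = sinceJvmStart();

        PhaseTimer timer = new PhaseTimer();
        // Placeholders: replace the empty bodies with the real work of each step.
        timer.time("load model", () -> { });
        timer.time("execute query", () -> { });
        timer.time("format results", () -> { });

        // Report on stderr so the timings do not mix with query results on stdout.
        System.err.println("JVM start -> main: " + boot + " ms");
        timer.phases().forEach((name, ms) -> System.err.println(name + ": " + ms + " ms"));
        System.err.println("JVM start -> end of main: " + sinceJvmStart() + " ms");
    }
}
```

Comparing the "JVM start -> end of main" figure with the externally
measured runtime would indicate whether the missing time is spent inside
main() or outside it, for example in JVM startup or shutdown.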

Does anybody know what could account for this two-second difference? Any
help is greatly appreciated.

[0] https://github.com/rdfhdt/hdt-java
[1] https://gist.github.com/anonymous/8e73584121c043b2e2914e2eaf314942
[2] https://gist.github.com/anonymous/d6de2cdaf05100a2b76894901c170a43

Kind regards,
Thomas
