Dear all,

I am trying to benchmark cold-boot query times, so to speak: not only the actual execution of the query, but also parsing the model, and possibly even the time it takes the JVM and related components to boot up and load the JARs.
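A minimal sketch of how such a per-phase breakdown could be measured, using only the JDK. The class name PhaseTimer and the phase names are hypothetical, and the phase bodies are empty placeholders rather than real Jena or HDT calls. The JVM start time reported by RuntimeMXBean is only an approximation and does not cover process launch before the VM initialises, so the total wall-clock time would still have to be measured from outside (for example with the shell's time command).

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: records how long each named phase takes, in milliseconds.
final class PhaseTimer {
    private final Map<String, Long> phases = new LinkedHashMap<>();

    // Run one phase and store its elapsed time under the given name.
    void time(String name, Runnable phase) {
        long t0 = System.nanoTime();
        phase.run();
        phases.put(name, (System.nanoTime() - t0) / 1_000_000);
    }

    // Approximate milliseconds elapsed since the JVM reported its own start.
    static long millisSinceJvmStart() {
        long jvmStart = ManagementFactory.getRuntimeMXBean().getStartTime();
        return System.currentTimeMillis() - jvmStart;
    }

    Map<String, Long> phases() {
        return phases;
    }

    public static void main(String[] args) {
        // Time spent before main() was reached (JVM init, class loading so far).
        long beforeMain = millisSinceJvmStart();

        PhaseTimer timer = new PhaseTimer();
        // Placeholder phases: replace the empty bodies with the real work.
        timer.time("load model", () -> { });
        timer.time("execute query", () -> { });
        timer.time("format results", () -> { });

        System.err.println("JVM start to main: " + beforeMain + " ms");
        timer.phases().forEach((name, ms) ->
                System.err.println(name + ": " + ms + " ms"));
        System.err.println("JVM start to exit: " + millisSinceJvmStart() + " ms");
    }
}
```

Printing to stderr keeps the timings out of the query results on stdout. Comparing "JVM start to exit" with the externally measured wall-clock time gives a rough idea of how much is spent outside the application code.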
For that purpose I have the following trivial code base [1], which I believed to be more or less equivalent to the arq.sparql package, except that it can also work on top of HDT [0]. Load model; query; execute; format results. As far as I can tell, nothing surprising. This works rather well, and the ease of use of both Jena and HDT is really encouraging. I compile it using this script [2] (sorry, I am not very familiar with the Java ecosystem and its build tools).

The query times reported by my script (tq1-tq0) and by arq.sparql (official apache-jena-3.1.1 release; measured between modTime.startTimer() and modTime.endTimer(), it seems) match rather closely on my test model (~0.4s). Yet I get a 2 second difference in the overall runtime of the entire application, with arq.sparql being the faster one!

I cannot tell whether this is due to differences in parsing the model, as that time is not reported by arq.sparql. I also ran strace with timestamps to see whether the differences in class path entries make a significant difference, but that seems to be on the order of milliseconds.

Is there some sort of optimization applied to the official release JARs? Does anybody know what could account for this two second difference?

Any help is greatly appreciated.

[0] https://github.com/rdfhdt/hdt-java
[1] https://gist.github.com/anonymous/8e73584121c043b2e2914e2eaf314942
[2] https://gist.github.com/anonymous/d6de2cdaf05100a2b76894901c170a43

Kind regards,
Thomas
