Hi Andy, I think that you got the cause of the problem;
> It looks like Graph::getPrefixes, which is called in the Model constructor, > is being a bit expensive. if I remove the prefixes from my dataset (the one I have very slow times with), I then get the same times as with the datasets I created for the test, which do not have prefixes defined (and, basically, the same times as the ones returned by your test in execGraph(ds1, 1000 * 1000, false, true)) fps > Le 24 oct. 2016 à 22:42, Andy Seaborne <[email protected]> a écrit : > > Hi, > > > There is a relevant change - the model wrappers in DatasetImpl aren't cached > any more (it broke with multiple threads and transactions). A model and the > graph get created each time. > > So far ... > > It looks like Graph::getPrefixes, which is called in the Model constructor, > is being a bit expensive. > > Andy > > package dev; > > import org.apache.jena.graph.Graph; > import org.apache.jena.query.Dataset; > import org.apache.jena.query.ReadWrite; > import org.apache.jena.shared.PrefixMapping; > import org.apache.jena.sparql.core.DatasetGraph; > import org.apache.jena.tdb.TDBFactory; > > public class ReportPerf { > > public static void main(String ...x) { > Dataset ds1 = TDBFactory.createDataset(); > execGraph(ds1, 1000 * 1000, false, false); > execGraph(ds1, 1000 * 1000, false, false); > execGraph(ds1, 1000 * 1000, false, true); > execGraph(ds1, 1000 * 1000, false, true); > execGraph(ds1, 1000 * 1000, true, true); > execGraph(ds1, 1000 * 1000, true, true); > } > > private static void execGraph(Dataset ds, int n, > boolean txn, boolean prefixes) { > System.out.printf > ("** Graph(n=%d,txn=%s,prefixes=%s)\n",n,txn,prefixes) ; > DatasetGraph dsg = ds.asDatasetGraph() ; > if ( txn ) > dsg.begin(ReadWrite.READ); > > long start = System.currentTimeMillis(); > for (int i = 0 ; i < n; i++) { > Graph g = dsg.getDefaultGraph() ; > if ( prefixes ) { > PrefixMapping pmap = g.getPrefixMapping() ; > } > } > long time = System.currentTimeMillis() - start; > double usPerCall = (1000.0*time)/n ; > System.out.printf > ("%.3fus/call : %dms for %d calls\n", usPerCall, time, n) ; > if ( txn ) > ds.end() ; > } > } > > > > > On 24/10/16 16:25, François-Paul Servant wrote: >> Hi, >> >> first, apologies for an inaccurate report. I was trying to understand why I >> have some slowdowns in my app with 3.1.1-SNAPSHOT vs 3.1.0, and I had a bug >> in my recording of durations: Model.listStatements seems to be out of >> concerns. >> >> However, it seems that there is something strange with >> Dataset.getDefaultModel(). >> >> Find below a junit test that simply calls Dataset.getDefaultModel on a given >> dataset. I’ve run it on my machine, from within eclipse, on 3 different TDB >> datasets (the one I’m using in my app, and 2 that I created for the purpose >> of the test) >> >> With 3.1.0, durations for Dataset.getDefaultModel doesn’t depend on the >> tested dataset, and they are fast: 5ms to call getDefaultModel 1.000.000 >> times in a loop. >> >> With 3.1.1 : in any case, times are not so good. With the 2 datasets >> constructed for the purpose of the test times are similar: >> between 1500 and 2000 ms to call getDefaultModel 1.000.000 times >> (both datasets have 5 named graphs. In one of them, there is 1 triple in >> each named graph, and also 1 in the default graph. In the other dataset, >> there are 10.000.000 triples in the default graph, and 1.000.000 in each of >> the named graphs). >> >> That’s significantly slower than with 3.1.0 (between 300 and 400 times). >> >> What I find more annoying is the time that I get with my real dataset (30 >> millions triples in the default graph, 3 named graphs with a few millions >> triples each): >> ~180.000 ms for 1.000.000 calls >> Now, this is rather slow… >> >> I hope that I don’t make any mistake this time. You can check what you get >> with you own datasets, or with the one that can be created for the purpose >> of the test (just change the tdb dir names in the code) >> >> Best, >> >> fps >> >> Here the junit test code, and the outputs that I get >> >> ————————— >> >> package testperfs; >> >> import java.io.File; >> import java.util.Iterator; >> >> import org.apache.jena.query.Dataset; >> import org.apache.jena.rdf.model.Model; >> import org.apache.jena.rdf.model.ModelFactory; >> import org.apache.jena.rdf.model.Property; >> import org.apache.jena.rdf.model.Resource; >> import org.apache.jena.tdb.TDBFactory; >> import org.junit.Test; >> >> public class GetDefaultModelPerfTest { >> >> @Test >> public final void testGetDefaultModel() { >> File dir = new File("/Users/fps/Desktop/jenatest"); >> doIt(dir); >> } >> >> @Test >> public final void testGetDefaultModel_MyTDB() { >> File dir = new File("/Users/fps/z-no-svg/Rasse-Data/bin/tdb"); >> doIt(dir); >> } >> >> // @Test >> /** create and write to disk a new TDB store */ >> public final void createDS() { >> createDS(new File("/Users/fps/Desktop/jenatest"), 1, 5, 1); >> // createDS(new File("/Users/fps/Desktop/jenatest"), 10*1000*1000, 5, >> 1000*1000); >> } >> >> private void doIt(File dir) { >> Dataset ds = TDBFactory.createDataset(dir.getAbsolutePath()); >> // print infos about ds >> dsInfos(ds); >> test(ds, 1000*1000); >> ds.close(); >> } >> >> /** >> * @param ds >> * @param n nb of calls >> */ >> private void test(Dataset ds, int n) { >> // warmup >> System.out.println("warmup..."); >> for (int i = 0 ; i < 1000; i++) { >> Model m = ds.getDefaultModel(); >> } >> // doit >> System.out.println("please wait..."); >> long start = System.currentTimeMillis(); >> for (int i = 0 ; i < n; i++) { >> Model m = ds.getDefaultModel(); >> } >> long time = System.currentTimeMillis() - start; >> System.out.println(time + " ms for " + n + " calls"); >> } >> >> private void dsInfos(Dataset ds) { >> Model m = ds.getDefaultModel(); >> System.out.println("Default model size: " + m.size()); >> Iterator<String> gns = ds.listNames(); >> for (;gns.hasNext();) { >> String gn = gns.next(); >> System.out.println("Named graph " + gn + " size: " + >> ds.getNamedModel(gn).size()); >> } >> } >> >> // >> // to create a TDB store >> // >> >> // create a TDB store with n triples in default model, kNamedGraphs named >> graphs, and n2 triples in each named graph >> private void createDS(File dir, int n, int kNamedGraphs, int n2) { >> if (dir.exists()) { >> throw new RuntimeException("choose a non-existent dir"); >> } >> boolean ok = dir.mkdirs(); >> if (!ok) throw new RuntimeException(); >> Dataset ds = TDBFactory.createDataset(dir.getAbsolutePath()); >> Model m = ds.getDefaultModel(); >> populateModel(m, n); >> for (int i = 0 ; i < kNamedGraphs ; i++) { >> m = ModelFactory.createDefaultModel(); >> populateModel(m, n2); >> ds.addNamedModel("http://ex.com/g" + Integer.toString(i), m); >> } >> ds.close(); >> } >> >> private void populateModel(Model m, int n) { >> Property p = m.createProperty("http://ex.com/p"); >> for (int i = 0; i < n; i++) { >> Resource s = m.createResource("http://ex.com/s" + >> Integer.toString(i)); >> Resource o = m.createResource("http://ex.com/o" + >> Integer.toString(i)); >> m.add(s,p,o); >> } >> } >> } >> >> >> ————————— >> # >> # 311-SNAPSHOT >> # >> >> # ds created with 3.1.1 >> >> Default model size: 1 >> Named graph http://ex.com/g0 size: 1 >> Named graph http://ex.com/g1 size: 1 >> Named graph http://ex.com/g2 size: 1 >> Named graph http://ex.com/g3 size: 1 >> Named graph http://ex.com/g4 size: 1 >> warmup... >> please wait... >> 1926 ms for 1000000 calls >> >> # ds created with 3.1.1 >> >> Default model size: 10000000 >> Named graph http://ex.com/g0 size: 1000000 >> Named graph http://ex.com/g1 size: 1000000 >> Named graph http://ex.com/g2 size: 1000000 >> Named graph http://ex.com/g3 size: 1000000 >> Named graph http://ex.com/g4 size: 1000000 >> warmup... >> please wait... >> 1511 ms for 1000000 calls >> >> # my TDB >> >> Default model size: 31769838 >> Named graph http://www.renault.com/rasse/docArboLeafindex size: 6843619 >> Named graph http://www.renault.com/rasse/elementfamilyindex size: 2956557 >> Named graph http://www.renault.com/rasse/graph/xappl size: 4271172 >> warmup... >> please wait... >> 189920 ms for 1000000 calls >> >> # >> # 3.1.0 >> # >> >> # ds created with 3.1.0 >> >> Default model size: 1 >> Named graph http://ex.com/g0 size: 1 >> Named graph http://ex.com/g1 size: 1 >> Named graph http://ex.com/g2 size: 1 >> Named graph http://ex.com/g3 size: 1 >> Named graph http://ex.com/g4 size: 1 >> warmup... >> please wait... >> 5 ms for 1000000 calls >> >> # ds created with 3.1.1 >> >> Default model size: 10000000 >> Named graph http://ex.com/g0 size: 1000000 >> Named graph http://ex.com/g1 size: 1000000 >> Named graph http://ex.com/g2 size: 1000000 >> Named graph http://ex.com/g3 size: 1000000 >> Named graph http://ex.com/g4 size: 1000000 >> warmup... >> please wait... >> 5 ms for 1000000 calls >> >> # my TDB >> >> Default model size: 31769838 >> Named graph http://www.renault.com/rasse/docArboLeafindex size: 6843619 >> Named graph http://www.renault.com/rasse/elementfamilyindex size: 2956557 >> Named graph http://www.renault.com/rasse/graph/xappl size: 4271172 >> warmup... >> please wait... >> 5 ms for 1000000 calls >> >> >>
