Andy,

> On 7 Sep 2015, at 18:49, Andy Seaborne <[email protected]> wrote:
>
> Hi there,
>
> Thanks for the jvisual info - a CSV file only goes so far though. It seems a
> lot of time is spent waiting, but that might be an artifact of profiling.
> There are a number of formats jvisual can load (see the file dialog under
> "load").
>
> I did see one possible oddity - the latest development build (build
> 20150907.115322-22) has a fix for the form of storage of the in-memory data.
> That might help.

It does help indeed. Times with Fuseki's latest dev build [1] are in the same range as those with the simple servlet (maybe a little slower, but reasonably so: same order of magnitude, see below).

Thanks!

fps

[1] https://repository.apache.org/content/repositories/snapshots/org/apache/jena/apache-jena-fuseki/2.3.1-SNAPSHOT/apache-jena-fuseki-2.3.1-20150907.115415-22.zip

-----
PREFIX tag: <http://127.0.0.1:8080/fuseki/ds/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?tag WHERE {
  ?tag skos:broader tag:semantic_web.
}
SIMPLE FIRST CALL: 0.013
FUSEKI FIRST CALL: 0.037
SIMPLE MEAN: 0.0089
FUSEKI MEAN: 0.0153

PREFIX tag: <http://127.0.0.1:8080/fuseki/ds/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
DESCRIBE ?tag WHERE {
  ?tag skos:broader tag:afrique.
}
MODEL SIZE: 140
SIMPLE FIRST CALL: 0.012
FUSEKI FIRST CALL: 0.025
SIMPLE MEAN: 0.019
FUSEKI MEAN: 0.02

PREFIX tag: <http://127.0.0.1:8080/fuseki/ds/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?tag WHERE {
  ?tag skos:broader* tag:science.
}
SIMPLE FIRST CALL: 0.025
FUSEKI FIRST CALL: 0.097
SIMPLE MEAN: 0.0137
FUSEKI MEAN: 0.029699999999999997

PREFIX tag: <http://127.0.0.1:8080/fuseki/ds/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
DESCRIBE ?tag WHERE {
  ?tag skos:broader* tag:linked_data.
}
MODEL SIZE: 919
SIMPLE FIRST CALL: 0.019
FUSEKI FIRST CALL: 0.096
SIMPLE MEAN: 0.0195
FUSEKI MEAN: 0.039400000000000004

PREFIX tag: <http://127.0.0.1:8080/fuseki/ds/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?tag WHERE {
  ?tag a <http://www.semanlink.net/2001/00/semanlink-schema#Tag>.
}
LIMIT 1000
SIMPLE FIRST CALL: 0.014
FUSEKI FIRST CALL: 0.078
SIMPLE MEAN: 0.016800000000000002
FUSEKI MEAN: 0.0257

PREFIX tag: <http://127.0.0.1:8080/fuseki/ds/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
DESCRIBE ?tag WHERE {
  ?tag a <http://www.semanlink.net/2001/00/semanlink-schema#Tag>.
}
LIMIT 1000
MODEL SIZE: 4746
SIMPLE FIRST CALL: 0.082
FUSEKI FIRST CALL: 0.156
SIMPLE MEAN: 0.0801
FUSEKI MEAN: 0.1193

>
> Andy
>
> On 07/09/15 10:49, François-Paul Servant wrote:
>> Andy,
>>
>>
>>> On 5 Sep 2015, at 18:18, Andy Seaborne <[email protected]> wrote:
>>>
>>> On 05/09/15 16:19, François-Paul Servant wrote:
>>>>> On 4 Sep 2015, at 10:21, Rob Vesse <[email protected]> wrote:
>>>>>
>>>>> You haven't shown your code so I can only guess at what may/may not be
>>>>> going on
>>>>
>>>> Hi Rob,
>>>>
>>>> note that while the difference in performance is surprising, and that the
>>>> most plausible cause is an error on my side, I’m still concerned with
>>>> fuseki’s performance: if it can’t do better with the queries I made, it
>>>> doesn’t seem to be a viable solution for me. One of these queries is just
>>>> part of what is displayed at:
>>>> http://www.semanlink.net/tag/linked_data.html
>>>> (developed years ago with jena)
>>>> and I won’t be able to use fuseki if the response time for such a query is
>>>> in the range of seconds.
>>>> So I hope that the final answer will be: “here is how to use fuseki
>>>> correctly, and then it will be fast” :-)
>>>
>>> It is DESCRIBE that your figures point to not SELECT so let's focus on
>>> those.
>>
>> OK.
>> Note however that, depending on the query, we may also have significant
>> differences with select, cf.:
>> SELECT ?tag WHERE {
>>   ?tag skos:broader* tag:science.
>> }
>> SIMPLE FIRST CALL: 0.172
>> SIMPLE MEAN: 0.0225
>> FUSEKI FIRST CALL: 3.981
>> FUSEKI MEAN: 3.1274
>>
>>>
>>> Could you please run a profiler on fuseki and run some DESCRIBE tests?
>>
>> yes I can, and I did, using jvisualvm (but it doesn’t work with too long
>> queries). What do you want me to do exactly? I am sending some output in
>> another message to your address
>>
>>>
>>> Also - Rob had some questions about the client-side handling of results
>>> that are important here.
>>
>> if I understood correctly, the point is to be sure that the client does read
>> a complete answer. Here is the code that I use to read the data from one URI
>> (I can send the complete test class if you want).
>>
>> /**
>>  * get uri and return the result as a string.
>>  * Increment time in chrono */
>> public static String getIt(String uri, Client client, MediaType mediaType,
>>         Chrono ch) {
>>     if (ch != null) ch.start();
>>     WebTarget webTarget = client.target(uri);
>>     Invocation.Builder invocationBuilder =
>>             webTarget.request(mediaType);
>>     invocationBuilder.header("Cache-Control", "no-cache");
>>     invocationBuilder.header("Pragma", "no-cache");
>>
>>     Response response = invocationBuilder.get();
>>     int status = response.getStatus();
>>     if (status != 200) {
>>         throw new RuntimeException("Unexpected status: " +
>>                 status + " getting " + uri); // TODO
>>     }
>>     String s = response.readEntity(String.class);
>>     if (ch != null) ch.stop();
>>     return s;
>> }
>>
>> When it is a rdf query, I then convert the string to a jena model, I check
>> that it contains a decent number of triples, and I check that I get the same
>> number of triples returned by my servlet and by fuseki (when the query is
>> supposed to: not when it contains a limit clause)
>>
>> fps
>>
>>
>>>
>>> Andy
>>>
>>>>
>>>> fps
>>>>
>>>>
>>>>>
>>>>> Firstly did you actually consume the result set in your servlet?
>>>>>
>>>>> A ResultSet is typically streamed so the fact that execSelect() returned
>>>>> doesn't mean the actual query was fully evaluated simply that the first
>>>>> result is available.
>>>>> So if you did something like the following:
>>>>>
>>>>> long start = System.currentTimeMillis();
>>>>> qe.execSelect();
>>>>> long elapsed = System.currentTimeMillis() - start;
>>>>>
>>>>> Then all you have measured is the time to first solution, not the time to
>>>>> get all results. If this is the case, you need to ensure you fully
>>>>> consume the ResultSet somehow (whether by iterating over it, passing it to
>>>>> some IO method that writes it out, calling ResultSetFormatter.consume() on
>>>>> it, etc.), thus forcing ARQ to fully evaluate the query.
>>>>>
>>>>> On the point of IO, did your servlet actually write the results back to
>>>>> the client? Depending on the size of the results, that can add
>>>>> significant overhead relative to the actual query execution, and Fuseki is
>>>>> always going to do this.
>>>>>
>>>>> Finally, most of the queries exhibiting large differences are DESCRIBE
>>>>> queries, which are evaluated in two passes: first the WHERE clause is
>>>>> evaluated (via execSelect() internally) and then the description is built.
>>>>> If your servlet is only calling execSelect() for those queries then it is
>>>>> only timing the first pass of the WHERE clause (and possibly subject to
>>>>> timing only the first result as noted above) rather than timing the full
>>>>> query evaluation which Fuseki will be doing.
>>>>>
>>>>> Rob
>>>>>
>>>>> On 03/09/2015 23:19, "François-Paul Servant"
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> shouldn’t we have the same level of performance with Fuseki and with a
>>>>>> simple servlet that calls ARQ?
>>>>>>
>>>>>> I hadn’t tried fuseki until now. Yesterday, I downloaded the 2.3.0 release,
>>>>>> started the server in a terminal window of my mac (osx 10.10.5) with:
>>>>>> ./fuseki-server --mem /ds
>>>>>> I uploaded a rdf file (skos-like data, 21K triples), and I began to make
>>>>>> some queries. I’m used to playing with that data in jena memory models,
>>>>>> and to querying it.
>>>>>> Getting results in the Fuseki GUI seemed slow to me, so I decided
>>>>>> to compare with a simple servlet that loads a memory model with the same
>>>>>> data on init, and calls ARQ in its doGet method.
>>>>>>
>>>>>> I loaded both fuseki and my simple servlet in an instance of tomcat 8,
>>>>>> both loaded with the same data (default graph, memory model), and I
>>>>>> measured the time for some GET queries as seen by a client I wrote using
>>>>>> jersey.
>>>>>>
>>>>>> Here are the results. For each sparql query, times with the simple
>>>>>> servlet, and with fuseki: the time for the first call, and the mean when
>>>>>> calling it 10 times (with the simple servlet, it is generally much faster
>>>>>> after the first call, but this is not related to HTTP caching: I paid
>>>>>> attention to it, and I verified, in the case of the simple servlet, that
>>>>>> its doGet method actually gets called)
>>>>>> Depending on the query, differences are small, or huge.
>>>>>>
>>>>>> PREFIX tag: <http://127.0.0.1:8080/fuseki/ds/>
>>>>>> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
>>>>>> SELECT ?tag WHERE {
>>>>>>   ?tag skos:broader tag:semantic_web.
>>>>>> }
>>>>>> SIMPLE FIRST CALL: 0.039
>>>>>> SIMPLE MEAN: 0.0213
>>>>>> FUSEKI FIRST CALL: 0.025
>>>>>> FUSEKI MEAN: 0.0215
>>>>>>
>>>>>> PREFIX tag: <http://127.0.0.1:8080/fuseki/ds/>
>>>>>> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
>>>>>> DESCRIBE ?tag WHERE {
>>>>>>   ?tag skos:broader tag:afrique.
>>>>>> }
>>>>>> SIMPLE FIRST CALL: 0.039
>>>>>> SIMPLE MEAN: 0.0216
>>>>>> FUSEKI FIRST CALL: 0.485
>>>>>> FUSEKI MEAN: 0.2284
>>>>>>
>>>>>> PREFIX tag: <http://127.0.0.1:8080/fuseki/ds/>
>>>>>> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
>>>>>> SELECT ?tag WHERE {
>>>>>>   ?tag skos:broader* tag:science.
>>>>>> }
>>>>>> SIMPLE FIRST CALL: 0.172
>>>>>> SIMPLE MEAN: 0.0225
>>>>>> FUSEKI FIRST CALL: 3.981
>>>>>> FUSEKI MEAN: 3.1274
>>>>>>
>>>>>> PREFIX tag: <http://127.0.0.1:8080/fuseki/ds/>
>>>>>> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
>>>>>> DESCRIBE ?tag WHERE {
>>>>>>   ?tag skos:broader* tag:linked_data.
>>>>>> }
>>>>>> SIMPLE FIRST CALL: 0.131
>>>>>> SIMPLE MEAN: 0.0417
>>>>>> FUSEKI FIRST CALL: 1.46
>>>>>> FUSEKI MEAN: 1.3244
>>>>>>
>>>>>> PREFIX tag: <http://127.0.0.1:8080/fuseki/ds/>
>>>>>> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
>>>>>> SELECT ?tag WHERE {
>>>>>>   ?tag a <http://www.semanlink.net/2001/00/semanlink-schema#Tag>.
>>>>>> }
>>>>>> LIMIT 1000
>>>>>> SIMPLE FIRST CALL: 0.07
>>>>>> SIMPLE MEAN: 0.0269
>>>>>> FUSEKI FIRST CALL: 0.037
>>>>>> FUSEKI MEAN: 0.024399999999999998
>>>>>>
>>>>>> PREFIX tag: <http://127.0.0.1:8080/fuseki/ds/>
>>>>>> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
>>>>>> DESCRIBE ?tag WHERE {
>>>>>>   ?tag a <http://www.semanlink.net/2001/00/semanlink-schema#Tag>.
>>>>>> }
>>>>>> LIMIT 1000
>>>>>> SIMPLE FIRST CALL: 0.181
>>>>>> SIMPLE MEAN: 0.13440000000000002
>>>>>> FUSEKI FIRST CALL: 6.471
>>>>>> FUSEKI MEAN: 5.497999999999999
>>>>>>
>>>>>> Do you have an explanation?
>>>>>>
>>>>>> Best Regards,
>>>>>>
>>>>>> fps
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
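
Rob's point about streamed results, and the "first call / mean over 10 calls" measurement used in this thread, can be illustrated without Jena. What follows is only a minimal sketch, not the test class used for the figures above. The names TimingSketch, lazyRows, consume and firstAndMean are hypothetical, and the lazy iterator merely stands in for a streamed ResultSet. Excluding the first call from the mean is also an assumption, since the thread does not say whether the reported mean includes it.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

public class TimingSketch {

    /**
     * Hypothetical stand-in for a streamed result set: no row is produced
     * until next() is called. produced[0] counts the rows actually built.
     */
    static Iterator<Integer> lazyRows(int count, int[] produced) {
        return new Iterator<Integer>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < count;
            }

            @Override
            public Integer next() {
                if (index >= count) throw new NoSuchElementException();
                produced[0]++;          // the per-row work happens here, not at creation
                return index++;
            }
        };
    }

    /** Drains the iterator completely and returns the number of rows seen. */
    static int consume(Iterator<?> rows) {
        int n = 0;
        while (rows.hasNext()) {
            rows.next();
            n++;
        }
        return n;
    }

    /**
     * Times one initial call, then the mean of 'runs' further calls.
     * Returns { firstCallSeconds, meanSeconds }.
     */
    static double[] firstAndMean(Supplier<Integer> call, int runs) {
        long t0 = System.nanoTime();
        call.get();
        double first = (System.nanoTime() - t0) / 1e9;

        long total = 0;
        for (int i = 0; i < runs; i++) {
            long t = System.nanoTime();
            call.get();
            total += System.nanoTime() - t;
        }
        return new double[] { first, total / 1e9 / runs };
    }

    public static void main(String[] args) {
        int[] produced = { 0 };
        Iterator<Integer> rows = lazyRows(1000, produced);
        // Obtaining the iterator has produced nothing yet.
        System.out.println("rows produced before consuming: " + produced[0]);
        System.out.println("rows consumed: " + consume(rows));

        // Time the full consumption, not just obtaining the iterator.
        double[] t = firstAndMean(() -> consume(lazyRows(1000, new int[1])), 10);
        System.out.println("FIRST CALL: " + t[0]);
        System.out.println("MEAN: " + t[1]);
    }
}
```

The timed call drains the iterator, so the measurement covers every row and not only the moment the iterator becomes available. With Jena, the equivalent would be to fully consume the ResultSet (for instance with ResultSetFormatter.consume(), as Rob mentions) inside the timed section.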
