Hello Stian,

when I first wrote the query I was not familiar with VALUES, but I have now changed it to use VALUES. UNION is not optimal because getting N counts with UNION returns N rows of N columns (N*N values), with only N of them bound.
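A minimal sketch of how such a VALUES query could be generated from a fixed list of URIs, following the query shape suggested in the quoted message below. The function names `build_count_query` and `union_cells` are hypothetical and not part of Jena or Fuseki; the second one only illustrates the N*N arithmetic for the UNION variant.

```python
def build_count_query(topic_uris, per_graph=False):
    """Build one SPARQL query that counts triples having each URI as object.

    Hypothetical helper for illustration only. With per_graph=True the
    counts are additionally grouped by the graph variable ?g.
    """
    if not topic_uris:
        raise ValueError("need at least one topic URI")
    forbidden = set('<>"{}|^`\\')
    for uri in topic_uris:
        # Reject characters that are not allowed inside <...> in SPARQL.
        if any(ch in forbidden or ch <= " " for ch in uri):
            raise ValueError(f"not usable as an IRI: {uri!r}")
    values = "\n".join(f"    <{uri}>" for uri in topic_uris)
    group_vars = "?g ?topic" if per_graph else "?topic"
    return (
        f"SELECT {group_vars} (COUNT(*) AS ?count) WHERE {{\n"
        f"  VALUES ?topic {{\n{values}\n  }}\n"
        "  GRAPH ?g { ?s ?p ?topic . }\n"
        "}\n"
        f"GROUP BY {group_vars}"
    )


def union_cells(n):
    """Result size of a UNION of n single-count subselects.

    Each of the n rows has n columns and exactly one bound value,
    so the result holds n*n cells of which n are bound.
    """
    return n * n, n


if __name__ == "__main__":
    print(build_count_query([
        "http://dbpedia.org/resource/Thomas_Mills_Wood",
        "http://sws.geonames.org/2633653/",
    ]))
    print(union_cells(1200))
```

One difference to keep in mind: with VALUES and GROUP BY, a URI that matches no triple produces no row at all, whereas a count subselect without GROUP BY returns a row with 0. A client that needs the zeros would have to fill them in itself.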
Regards,

Michael Brunnbauer

On Mon, Feb 16, 2015 at 04:59:29PM +0000, Stian Soiland-Reyes wrote:
> Can you not select that graph ?gCount ?topic programmatically so each
> row is another graph URI and its count? There should be no need for
> subqueries, and that could in theory stream back - presumably you
> probably don't really care about the order here?
>
> You can use VALUES for matching the ?topic to your fixed list of 1200 URIs.
> http://www.w3.org/TR/sparql11-query/#inline-data
>
> I've not tested if this would be faster or slower..
>
> Something like:
>
> SELECT ?topic (count(*) as ?count) WHERE {
>   VALUES ?topic { <http://dbpedia.org/resource/Thomas_Mills_Wood>
>                   <http://sws.geonames.org/2633653/> }
>   GRAPH ?g { ?s ?p ?topic . }
> }
> GROUP BY ?topic
>
> (this seems slower at the virtuoso at http://dbpedia.org/sparql but I
> have found Virtuoso doesn't do the streaming back as nicely as Fuseki)
>
> BTW -- do you want out a count of total number of triples across all
> graphs that have the ?topic as an object - or a count pr graph that
> has ?topic as an object? The query above is the first, but you can
> SELECT and GROUP BY ?g as well if you want the second.
>
> On 16 February 2015 at 13:09, Andy Seaborne <[email protected]> wrote:
> > On 16/02/15 12:42, Rob Vesse wrote:
> >>
> >> Michael
> >>
> >> Was about to say roughly the same as Andy, btw an incomplete fragment is
> >> hard to say much about.
> >>
> >> Are the sub-queries literally just joined together I.e. no UNION in which
> >> cases Andy's comment are accurate
> >>
> >> If so then you are calculating a cross product of all your counts so with
> >> 1200 sub-queries you are going to get 1200! which is a number so big it
> >> overflows the OS X calculator and the Google calculator reports infinity
> >
> > So as not to overload Google as it's calculates that number, a cross product
> > of N and M rows is N*M.
> >
> > As a count without group returns one row, that's 1*1*1... = 1^^1200 which is
> > a bit more manageable.
> >
> > Just don't add a GROUP BY to the subqueries!
> >
> > But it also means the query does all the work before returning a single row
> > and I would not be surprised if some intermediary timed it out.
> >
> > A UNION will result in 1200 rows, with a single different variable in each
> > row. You could even label the rows as to what the variable means.
> >
> > { select (count(*) as ?s1091) "Thomas_Mills_Wood" { ... } }
> >
> > The query speed will be affected by the number triples in each graph
> > matching the inner WHERE clause.
> >
> > Andy
> >
> >>
> >> Either do separate queries or use UNION to combine the sub-queries which
> >> will yield you one row with 1200 columns
> >>
> >> Rob
> >>
> >> On 16/02/2015 12:11, "Andy Seaborne" <[email protected]> wrote:
> >>
> >>> On 16/02/15 12:00, Michael Brunnbauer wrote:
> >>>>
> >>>> re
> >>>>
> >>>> thanks Rob and Andy - that makes sense!
> >>>>
> >>>> What is a bit strange is that the query takes so long. It consists of
> >>>> ca. 1200 subselects - each defining a separate binding with a simple
> >>>> query of the form { graph ?g { ?s ?g <uri> }}. Like this:
> >>>>
> >>>> { select (count(*) as ?s1091) where { graph ?g { ?s ?p
> >>>> <http://dbpedia.org/resource/Thomas_Mills_Wood>. } }}
> >>>> { select (count(*) as ?s1092) where { graph ?g { ?s ?p
> >>>> <http://dbpedia.org/resource/John_Wood_(explorer)>. } }}
> >>>> { select (count(*) as ?s1093) where { graph ?g { ?s ?p
> >>>> <http://sws.geonames.org/2633653/>. } }}
> >>>>
> >>>> There are not many named graphs (ca. 7000) and not many triples that
> >>>> would match (I guess < 200). Do you think it would be faster to make
> >>>> 1200 separate queries instead?
> >>>
> >>> That's a cross-product (of one item) for what is essential a bunch of
> >>> separate queries. I'd look at the optimized algebra to see what execute
> >>> plan it decided on.
> >>>
> >>> And a stream of queries is worth trying out.
> >>>
> >>> Andy
> >>>
> >>>>
> >>>> Regards,
> >>>>
> >>>> Michael Brunnbauer
> >>>>
> >>>> On Mon, Feb 16, 2015 at 11:39:20AM +0000, Rob Vesse wrote:
> >>>>>
> >>>>> Michael
> >>>>>
> >>>>> The error is coming from Jetty, it simply means that one end of the
> >>>>> connection was closed. I assume you see this in the Fuseki log?
> >>>>>
> >>>>> Most likely your client is timing out and closing the connection since
> >>>>> 15 minutes for a query to complete is longer than the default timeouts
> >>>>> of most browsers and HTTP clients/API.
> >>>>>
> >>>>> Try upping the timeout significantly in your client/browser/API to
> >>>>> something higher than the expected runtime of the query.
> >>>>>
> >>>>> Rob
> >>>>>
> >>>>> On 16/02/2015 11:01, "Michael Brunnbauer" <[email protected]> wrote:
> >>>>>
> >>>>>>
> >>>>>> hi all,
> >>>>>>
> >>>>>> I see this Exception with a very big SPARQL select query (182825
> >>>>>> bytes) on a TDB with jena-fuseki-1.0.2-20140315.080253-36. After an
> >>>>>> upgrade to jena-fuseki-1.1.1, the problem is still there. The query
> >>>>>> used to work before but took very long (ca. 900s).
> >>>>>>
> >>>>>> What does the error mean? Is my TDB corrupted?
> >>>>>> > >>>>>> 02:50:58 INFO [1] exec/select > >>>>>> org.eclipse.jetty.io.EofException > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.http.HttpGenerator.flushBuffer(HttpGenerator.java:914 > >>>>>> ) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.http.AbstractGenerator.flush(AbstractGenerator.java:4 > >>>>>> 43) > >>>>>> at > >>>>>> org.eclipse.jetty.server.HttpOutput.flush(HttpOutput.java:100) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.AbstractHttpConnection$Output.flush(AbstractHt > >>>>>> tpC > >>>>>> onnection.java:1123) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.ResponseResultSet.output(ResponseResult > >>>>>> Set > >>>>>> .java:303) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.ResponseResultSet.textOutput(ResponseRe > >>>>>> sul > >>>>>> tSet.java:249) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.ResponseResultSet.doResponseResultSet$( > >>>>>> Res > >>>>>> ponseResultSet.java:148) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.ResponseResultSet.doResponseResultSet(R > >>>>>> esp > >>>>>> onseResultSet.java:89) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.SPARQL_Query.sendResults(SPARQL_Query.j > >>>>>> ava > >>>>>> :345) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.SPARQL_Query.execute(SPARQL_Query.java: > >>>>>> 242 > >>>>>> ) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.SPARQL_Query.executeWithParameter(SPARQ > >>>>>> L_Q > >>>>>> uery.java:195) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.SPARQL_Query.perform(SPARQL_Query.java: > >>>>>> 96) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.SPARQL_ServletBase.executeLifecycle(SPA > >>>>>> RQL > >>>>>> _ServletBase.java:171) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.SPARQL_ServletBase.executeAction(SPARQL > 
>>>>>> _Se > >>>>>> rvletBase.java:152) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.SPARQL_ServletBase.execCommonWorker(SPA > >>>>>> RQL > >>>>>> _ServletBase.java:140) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_Serv > >>>>>> let > >>>>>> Base.java:69) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.SPARQL_Query.doPost(SPARQL_Query.java:5 > >>>>>> 7) > >>>>>> at > >>>>>> javax.servlet.http.HttpServlet.service(HttpServlet.java:755) > >>>>>> at > >>>>>> javax.servlet.http.HttpServlet.service(HttpServlet.java:848) > >>>>>> at > >>>>>> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHa > >>>>>> ndl > >>>>>> er.java:1496) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.jav > >>>>>> a:8 > >>>>>> 2) > >>>>>> at > >>>>>> org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:256) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHa > >>>>>> ndl > >>>>>> er.java:1467) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:4 > >>>>>> 99) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler > >>>>>> .ja > >>>>>> va:229) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler > >>>>>> .ja > >>>>>> va:1086) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:42 > >>>>>> 8) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler. 
> >>>>>> jav > >>>>>> a:193) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler. > >>>>>> jav > >>>>>> a:1020) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.jav > >>>>>> a:1 > >>>>>> 35) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.j > >>>>>> ava > >>>>>> :116) > >>>>>> at org.eclipse.jetty.server.Server.handle(Server.java:370) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractH > >>>>>> ttp > >>>>>> Connection.java:494) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpCon > >>>>>> nec > >>>>>> tion.java:982) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content( > >>>>>> Abs > >>>>>> tractHttpConnection.java:1043) > >>>>>> at > >>>>>> org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865) > >>>>>> at > >>>>>> org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection > >>>>>> .ja > >>>>>> va:82) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndP > >>>>>> oin > >>>>>> t.java:667) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPo > >>>>>> int > >>>>>> .java:52) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool. 
> >>>>>> jav > >>>>>> a:608) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.j > >>>>>> ava > >>>>>> :543) > >>>>>> at java.lang.Thread.run(Thread.java:745) > >>>>>> Caused by: java.io.IOException: Broken pipe > >>>>>> at sun.nio.ch.FileDispatcherImpl.write0(Native Method) > >>>>>> at > >>>>>> sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) > >>>>>> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) > >>>>>> at sun.nio.ch.IOUtil.write(IOUtil.java:51) > >>>>>> at > >>>>>> sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:293 > >>>>>> ) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPo > >>>>>> int > >>>>>> .java:401) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.http.HttpGenerator.flushBuffer(HttpGenerator.java:850 > >>>>>> ) > >>>>>> ... 43 more > >>>>>> 02:51:01 ERROR Internal error > >>>>>> java.lang.IllegalStateException: Committed > >>>>>> at > >>>>>> org.eclipse.jetty.server.Response.resetBuffer(Response.java:1154) > >>>>>> at > >>>>>> org.eclipse.jetty.server.Response.sendError(Response.java:317) > >>>>>> at > >>>>>> org.eclipse.jetty.server.Response.sendError(Response.java:419) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> javax.servlet.http.HttpServletResponseWrapper.sendError(HttpServletResp > >>>>>> ons > >>>>>> eWrapper.java:164) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.HttpServletResponseTracker.sendError(Ht > >>>>>> tpS > >>>>>> ervletResponseTracker.java:53) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.ServletBase.responseSendError(ServletBa > >>>>>> se. 
> >>>>>> java:73) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_Serv > >>>>>> let > >>>>>> Base.java:82) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.apache.jena.fuseki.servlets.SPARQL_Query.doPost(SPARQL_Query.java:5 > >>>>>> 7) > >>>>>> at > >>>>>> javax.servlet.http.HttpServlet.service(HttpServlet.java:755) > >>>>>> at > >>>>>> javax.servlet.http.HttpServlet.service(HttpServlet.java:848) > >>>>>> at > >>>>>> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHa > >>>>>> ndl > >>>>>> er.java:1496) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.jav > >>>>>> a:8 > >>>>>> 2) > >>>>>> at > >>>>>> org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:256) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHa > >>>>>> ndl > >>>>>> er.java:1467) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:4 > >>>>>> 99) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler > >>>>>> .ja > >>>>>> va:229) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler > >>>>>> .ja > >>>>>> va:1086) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:42 > >>>>>> 8) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler. > >>>>>> jav > >>>>>> a:193) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler. 
> >>>>>> jav > >>>>>> a:1020) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.jav > >>>>>> a:1 > >>>>>> 35) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.j > >>>>>> ava > >>>>>> :116) > >>>>>> at org.eclipse.jetty.server.Server.handle(Server.java:370) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractH > >>>>>> ttp > >>>>>> Connection.java:494) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpCon > >>>>>> nec > >>>>>> tion.java:982) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content( > >>>>>> Abs > >>>>>> tractHttpConnection.java:1043) > >>>>>> at > >>>>>> org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865) > >>>>>> at > >>>>>> org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection > >>>>>> .ja > >>>>>> va:82) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndP > >>>>>> oin > >>>>>> t.java:667) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPo > >>>>>> int > >>>>>> .java:52) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool. 
> >>>>>> jav > >>>>>> a:608) > >>>>>> at > >>>>>> > >>>>>> > >>>>>> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.j > >>>>>> ava > >>>>>> :543) > >>>>>> at java.lang.Thread.run(Thread.java:745) > >>>>>> > >>>>>> Regards, > >>>>>> > >>>>>> Michael Brunnbauer > >>>>>> > >>>>>> -- > >>>>>> ++ Michael Brunnbauer > >>>>>> ++ netEstate GmbH > >>>>>> ++ Geisenhausener Straße 11a > >>>>>> ++ 81379 München > >>>>>> ++ Tel +49 89 32 19 77 80 > >>>>>> ++ Fax +49 89 32 19 77 89 > >>>>>> ++ E-Mail [email protected] > >>>>>> ++ http://www.netestate.de/ > >>>>>> ++ > >>>>>> ++ Sitz: München, HRB Nr.142452 (Handelsregister B München) > >>>>>> ++ USt-IdNr. DE221033342 > >>>>>> ++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer > >>>>>> ++ Prokurist: Dipl. Kfm. (Univ.) Markus Hendel > >>>>> > >>>>> > >>>>> > >>>>> > >>>> > >>> > >> > >> > >> > >> > > > > > > -- > Stian Soiland-Reyes > Apache Taverna (incubating) > http://orcid.org/0000-0001-9842-9718 -- ++ Michael Brunnbauer ++ netEstate GmbH ++ Geisenhausener Straße 11a ++ 81379 München ++ Tel +49 89 32 19 77 80 ++ Fax +49 89 32 19 77 89 ++ E-Mail [email protected] ++ http://www.netestate.de/ ++ ++ Sitz: München, HRB Nr.142452 (Handelsregister B München) ++ USt-IdNr. DE221033342 ++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer ++ Prokurist: Dipl. Kfm. (Univ.) Markus Hendel
