On 1 Dec 2017 at 8:08 PM, "Andy Seaborne" <[email protected]> wrote:
On 01/12/17 11:49, Jean-Marc Vanel wrote:
> Hi
>
> A SPARQL + Lucene query now takes more than 1 minute.
> It used to take around one second, and then the database grew.
> Here is a typical query run for a lookup service similar to dbpedia lookup:
>
> PREFIX text: <http://jena.apache.org/text#>
> PREFIX form: <http://raw.githubusercontent.com/jmvanel/semantic_forms/master/vocabulary/forms.owl.ttl#>
>
> SELECT DISTINCT ?thing ?COUNT WHERE {
>   graph ?g {
>     ?thing text:query ( 'Jean-Marc' ) .
>   }

This is going to loop on each graph and make a text:query for each one. Is that what you intended?

Remove the "graph ?g {". (and then remove the DISTINCT)

When I do this, the result is empty. The way the text index is initialized must be wrong:
https://github.com/jmvanel/semantic_forms/blob/master/scala/forms/src/main/scala/deductions/runtime/jena/lucene/LuceneIndex.scala

In fact, the graph to which a literal triple belongs is not relevant in this application.

>   graph ?g1 {
>     ?thing a ?CLASS .

Unnecessary?

>   }
>   OPTIONAL {
>     graph ?grCount {
>       ?thing form:linksCount ?COUNT.
>     }
>   }
> }
> ORDER BY DESC(?COUNT)
> LIMIT 10
>
> Here is a simpler query that is also slow:
>
> PREFIX text: <http://jena.apache.org/text#>
> SELECT DISTINCT ?thing ?COUNT WHERE {
>   graph ?g {
>     ?thing text:query ( 'Jean-Marc' ) .
>   }
> }
> ORDER BY DESC(?COUNT)
> LIMIT 10
>
> Run it with:
> time wget -O semantic-forms.cc_select-ui.txt http://semantic-forms.cc:9112/select-ui?query=PREFIX+text%3A+%3Chttp%3A%2F%2Fjena.apache.org%2Ftext%23%3E+%0D%0ASELECT+DISTINCT+%3Fthing+%3FCOUNT+WHERE+%7B%0D%0A++graph+%3Fg+%7B%0D%0A++++%3Fthing+text%3Aquery+%28+%27Jean-Marc%27+%29+.%0D%0A++%7D%0D%0A%7D%0D%0AORDER+BY+DESC%28%3FCOUNT%29%0D%0ALIMIT+10
>
> Or, if you want to use YasGUI, the endpoint is
> http://semantic-forms.cc:9112/sparql
>
> *Statistics on the database*
>
> 268 graphs and 588 864 triples.
>
> # Count graphs and triples
> SELECT (COUNT(?s) AS ?trc) (COUNT(?GR) AS ?grc)
> WHERE {
>   { GRAPH ?GR { } }
>   UNION
>   { GRAPH ?GR1 { ?s ?p ?o . } }
> }
>
> Result: 2 rows
> "grc" "trc"
> "268"^^http://www.w3.org/2001/XMLSchema#integer "588864"^^http://www.w3.org/2001/XMLSchema#integer
>
> (I'm not sure this is the right way to count, but it gives figures :) )
>
> You can reproduce the query with this UI:
>
> http://semantic-forms.cc:9112/select-ui?query=%23+Count+graphs%0D%0ASELECT+%28COUNT%28%3Fs%29+AS+%3Ftrc%29+%28COUNT%28%3FGR%29+AS+%3Fgrc%29%0D%0A++++WHERE+%7B%0D%0A+++++%7B+GRAPH+%3FGR+%7B+%7D+%7D%0D%0A++UNION%0D%0A++++%7B+GRAPH+%3FGR1+%7B+%3Fs+%3Fp+%3Fo+.+%7D+%7D%0D%0A%7D
>
> This is using Jena 3.5.0 with TDB 1.
> Here is a stack trace made with kill -3 when the app was working hard;
> I put a suspect line in bold.
>
> "application-akka.actor.default-dispatcher-1351" #2683 prio=5 os_prio=0 tid=0x00007f07a801c000 nid=0x9b7 runnable [0x00007f06f25ab000]
>    java.lang.Thread.State: RUNNABLE
> at sun.nio.ch.NativeThread.current(Native Method)
> at sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
> at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:737)
> at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
> at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:179)
> at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
> at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
> at org.apache.lucene.store.DataInput.readInt(DataInput.java:101)
> at org.apache.lucene.store.BufferedIndexInput.readInt(BufferedIndexInput.java:183)
> at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:194)
> at org.apache.lucene.util.fst.FST.<init>(FST.java:327)
> at org.apache.lucene.util.fst.FST.<init>(FST.java:313)
> at org.apache.lucene.codecs.blocktree.FieldReader.<init>(FieldReader.java:91)
> at org.apache.lucene.codecs.blocktree.BlockTreeTermsReader.<init>(BlockTreeTermsReader.java:234)
> at org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat.fieldsProducer(Lucene50PostingsFormat.java:445)
> at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:292)
> at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:372)
> at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:112)
> at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:74)
> at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:62)
> at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:54)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:692)
> at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:77)
> at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
> at org.apache.jena.query.text.TextIndexLucene.query(TextIndexLucene.java:370)
> at org.apache.jena.query.text.TextQueryPF.performQuery(TextQueryPF.java:290)
> at org.apache.jena.query.text.TextQueryPF.lambda$query$1(TextQueryPF.java:267)
> at org.apache.jena.query.text.TextQueryPF$$Lambda$66/2108167189.call(Unknown Source)
> at org.apache.jena.ext.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:5065)
> at org.apache.jena.ext.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3716)
> at org.apache.jena.ext.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2424)
> at org.apache.jena.ext.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2298)
> * - locked <0x00000000ef9d34f8> (a org.apache.jena.ext.com.google.common.cache.LocalCache$StrongAccessEntry)*
> at org.apache.jena.ext.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2211)
> at org.apache.jena.ext.com.google.common.cache.LocalCache.get(LocalCache.java:4154)
> at org.apache.jena.ext.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:5060)
> at org.apache.jena.atlas.lib.cache.CacheGuava.getOrFill(CacheGuava.java:58)
> at org.apache.jena.query.text.TextQueryPF.query(TextQueryPF.java:267)
> at org.apache.jena.query.text.TextQueryPF.variableSubject(TextQueryPF.java:227)
> at org.apache.jena.query.text.TextQueryPF.exec(TextQueryPF.java:196)
> at org.apache.jena.sparql.pfunction.PropertyFunctionBase$RepeatApplyIteratorPF.nextStage(PropertyFunctionBase.java:106)
> at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:108)
> at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
> at org.apache.jena.sparql.engine.iterator.QueryIterProcedure.hasNextBinding(QueryIterProcedure.java:73)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
> at org.apache.jena.sparql.engine.iterator.QueryIterProcessBinding.hasNextBinding(QueryIterProcessBinding.java:66)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
> at org.apache.jena.sparql.engine.main.iterator.QueryIterGraph$QueryIterGraphInner.hasNextBinding(QueryIterGraph.java:121)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
> at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:74)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
> at org.apache.jena.atlas.iterator.Iter$2.hasNext(Iter.java:265)
> at org.apache.jena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:45)
> at org.apache.jena.tdb.solver.SolverLib$IterAbortable.hasNext(SolverLib.java:195)
> at org.apache.jena.atlas.iterator.Iter$2.hasNext(Iter.java:265)
> at org.apache.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:53)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
> at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:101)
> at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
> at org.apache.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:58)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
> at org.apache.jena.sparql.engine.iterator.QueryIterTopN$1.initializeIterator(QueryIterTopN.java:98)
> at org.apache.jena.atlas.iterator.IteratorDelayedInitialization.init(IteratorDelayedInitialization.java:40)
> at org.apache.jena.atlas.iterator.IteratorDelayedInitialization.hasNext(IteratorDelayedInitialization.java:50)
> at org.apache.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:53)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:39)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:39)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:39)
> at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
> at org.apache.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:74)
> at org.apache.jena.sparql.engine.ResultSetCheckCondition.hasNext(ResultSetCheckCondition.java:55)
> at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
> at scala.collection.Iterator$class.toStream(Iterator.scala:1320)
> at scala.collection.AbstractIterator.toStream(Iterator.scala:1334)
> at scala.collection.TraversableOnce$class.toIterable(TraversableOnce.scala:296)
> at scala.collection.AbstractIterator.toIterable(Iterator.scala:1334)
> at deductions.runtime.sparql_cache.SPARQLHelpers$$anonfun$sparqlSelectQueryVariablesNT$1.apply(SPARQLHelpers.scala:382)
> at deductions.runtime.sparql_cache.SPARQLHelpers$$anonfun$sparqlSelectQueryVariablesNT$1.apply(SPARQLHelpers.scala:370)
> at deductions.runtime.utils.Timer$class.time(Timer.scala:18)
> at controllers.Application$.time(Application.scala:8)
> at deductions.runtime.sparql_cache.SPARQLHelpers$class.sparqlSelectQueryVariablesNT(SPARQLHelpers.scala:370)
> at controllers.Application$.sparqlSelectQueryVariablesNT(Application.scala:8)
> at deductions.runtime.sparql_cache.SPARQLHelpers$$anonfun$8.apply(SPARQLHelpers.scala:361)
> at deductions.runtime.sparql_cache.SPARQLHelpers$$anonfun$8.apply(SPARQLHelpers.scala:361)
> at org.w3.banana.jena.JenaDatasetStore$$anonfun$r$1.apply(JenaDatasetStore.scala:17)
> at scala.util.Try$.apply(Try.scala:192)
> at org.w3.banana.jena.JenaDatasetStore.r(JenaDatasetStore.scala:14)
> at org.w3.banana.jena.JenaDatasetStore.r(JenaDatasetStore.scala:10)
> at deductions.runtime.sparql_cache.SPARQLHelpers$class.sparqlSelectQueryVariables(SPARQLHelpers.scala:359)
> at controllers.Application$.sparqlSelectQueryVariables(Application.scala:8)
> at deductions.runtime.services.Lookup$class.searchStringOrClass(Lookup.scala:76)
> at deductions.runtime.services.Lookup$class.lookup(Lookup.scala:44)
> at controllers.Application$.lookup(Application.scala:8)
> at controllers.Services$$anonfun$lookupService$1.apply(Services.scala:199)
> at controllers.Services$$anonfun$lookupService$1.apply(Services.scala:193)
> at play.api.mvc.ActionBuilder$$anonfun$apply$13.apply(Action.scala:371)
> at play.api.mvc.ActionBuilder$$anonfun$apply$13.apply(Action.scala:370)
> at play.api.mvc.Action$.invokeBlock(Action.scala:498)
> at play.api.mvc.Action$.invokeBlock(Action.scala:495)
> at play.api.mvc.ActionBuilder$$anon$2.apply(Action.scala:458)
> at play.api.mvc.Action$$anonfun$apply$2$$anonfun$apply$5$$anonfun$apply$6.apply(Action.scala:112)
> at play.api.mvc.Action$$anonfun$apply$2$$anonfun$apply$5$$anonfun$apply$6.apply(Action.scala:112)
> at play.utils.Threads$.withContextClassLoader(Threads.scala:21)
> at play.api.mvc.Action$$anonfun$apply$2$$anonfun$apply$5.apply(Action.scala:111)
> at play.api.mvc.Action$$anonfun$apply$2$$anonfun$apply$5.apply(Action.scala:110)
> at scala.Option.map(Option.scala:146)
> at play.api.mvc.Action$$anonfun$apply$2.apply(Action.scala:110)
> at play.api.mvc.Action$$anonfun$apply$2.apply(Action.scala:103)
> at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:253)
> at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251)
> at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
> at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
> at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
> at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
> at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
> at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
> at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
> at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
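
For reference, here is a minimal sketch of the first query with the suggestions from this thread applied: the "graph ?g { ... }" wrapper around text:query and the DISTINCT are removed, and the "?thing a ?CLASS" pattern flagged as possibly unnecessary is dropped. It is an illustration rather than a tested fix. As reported in this thread, removing the graph wrapper currently returns no rows on this dataset, so the text index setup probably needs attention first. Without DISTINCT, a ?thing may also appear more than once if it has form:linksCount values in several graphs.

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX form: <http://raw.githubusercontent.com/jmvanel/semantic_forms/master/vocabulary/forms.owl.ttl#>

SELECT ?thing ?COUNT WHERE {
  # A single text:query call, no longer nested inside "graph ?g { ... }".
  ?thing text:query ( 'Jean-Marc' ) .
  # Unchanged from the original query: optional link count used for ranking.
  OPTIONAL {
    graph ?grCount {
      ?thing form:linksCount ?COUNT .
    }
  }
}
ORDER BY DESC(?COUNT)
LIMIT 10
```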
