Hi,

Please find the requested details below:

Dataset - TDB2 dataset

Fuseki configuration - I am using the same index config file to start the Fuseki server. Sorry, I am not sure what you mean by Fuseki configuration.

Number of results of the query - 11 triples are returned by the above query.

Thanks and Regards,
Deepali

On Tue, Jan 5, 2021 at 5:02 PM Lorenz Buehmann <[email protected]> wrote:

> Ok, thanks for sharing the spreadsheet.
>
> We need more configuration info: dataset, Fuseki configuration, number
> of results of the query.
>
> We didn't get the attachment of the assembler config.
>
> With no optimizer used, the text:query triple pattern should be
> evaluated first - and depending on the number of matching literals,
> faster than a scan with filter. But it depends. Also not sure if
> text:query is preferred in query optimization, but I think so. Andy
> knows better indeed.
>
> On 04.01.21 12:11, Deepali Singhavi wrote:
> > Hi,
> >
> > Sample size means number of triples?
> >
> > I have tried with 6000, 40000, 50000 and even with 1,00,000 triples.
> > Please find the performance report attached with this email.
> >
> > Regards,
> > Deepali
> >
> > On Mon, Jan 4, 2021 at 1:03 PM Lorenz Buehmann
> > <[email protected]> wrote:
> >
> >     What is the sample size here? I mean, for a low number of literals
> >     it's obvious that String containment check in Java isn't that slow.
> >     The difference will most likely come from a large scan over literals
> >     with containment check whereas with a Lucene index - which is
> >     basically an inverted index - it's obviously more efficient to look
> >     up terms for the documents.
> >
> >     On 04.01.21 05:56, Deepali Singhavi wrote:
> > > Hi,
> > >
> > > I am trying to implement indexing for Fuseki using
> > > Lucene/ElasticSearch using an assembler configuration file (attaching
> > > file for reference) but there is no improvement in performance
> > > (performance without index is better than with index).
> > >
> > > I am using sample data from *films.ttl* file.
> > >
> > > *Sample Query*
> > > PREFIX text: <http://jena.apache.org/text#>
> > > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> > > select ?subject ?object
> > > WHERE {
> > >   # Without Index
> > >   #?subject rdfs:label ?object .
> > >   #FILTER contains(?object,"City")
> > >   # With Index
> > >   ?subject text:query (rdfs:label "city").
> > >   ?subject rdfs:label ?object .
> > > }
> > >
> > > *Performance:*
> > >
> > > No of Triples | No of Runs | Without Index | Lucene Index | ElasticSearch Index
> > > 6918          | 1          | 16ms          | 18ms         | 19ms
> > >               | 2          | 29ms          | 32ms         | 32ms
> > >               | 3          | 22ms          | 23ms         | 21ms
> > >               | 4          | 22ms          | 14ms         | 53ms
> > >               | 5          | 15ms          | 19ms         | 18ms
> > >
> > > Please let me know if any other information is required from my side
> > > and please suggest how I can improve performance.
> > >
> > > Regards,
> > > Deepali
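
To illustrate the point quoted in this thread about a scan with a containment check versus a lookup in an inverted index, here is a minimal Python sketch. It is not how Jena or Lucene are implemented; the labels, the subject names (ex:film1 etc.) and the function names are all made up for illustration, and tokenization is assumed to be plain lowercasing plus whitespace splitting.

```python
from collections import defaultdict

def scan_contains(labels, needle):
    """Linear scan: test every literal for a case-sensitive substring match."""
    return [subject for subject, label in labels if needle in label]

def build_inverted_index(labels):
    """Map each lowercased whitespace token to the subjects whose label has it."""
    index = defaultdict(set)
    for subject, label in labels:
        for token in label.lower().split():
            index[token].add(subject)
    return index

def index_lookup(index, term):
    """Single dictionary lookup instead of touching every literal."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical data, not taken from films.ttl.
LABELS = [
    ("ex:film1", "Sin City"),
    ("ex:film2", "City of God"),
    ("ex:film3", "The Godfather"),
    ("ex:film4", "Electricity"),
]

INDEX = build_inverted_index(LABELS)

if __name__ == "__main__":
    print(scan_contains(LABELS, "City"))   # visits all four labels
    print(index_lookup(INDEX, "city"))     # one lookup on the token "city"
```

With only a handful of literals both approaches finish almost instantly, so the cost of the scan only shows up as the number of literals grows. Note also that in this sketch the two are not the same match: the scan is a case-sensitive substring test, while the index lookup matches whole lowercased tokens.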
