Hi Lorenz,

Please find the content of my configuration file below; I hope this is what you are looking for.
But I am using the same index.ttl file to start my Fuseki server, using the command below.

java -Xmx1200M -jar fuseki-server.jar --config=LunceneIndex.ttl

# Licensed under the terms of http://www.apache.org/licenses/LICENSE-2.0

## Fuseki Server configuration file.

@prefix :        <#> .
@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .

[] rdf:type fuseki:Server ;
   # Example::
   # Server-wide query timeout.
   #
   # Timeout - server-wide default: milliseconds.
   # Format 1: "1000" -- 1 second timeout
   # Format 2: "10000,60000" -- 10s timeout to first result,
   #                            then 60s timeout for the rest of query.
   #
   # See javadoc for ARQ.queryTimeout for details.
   # This can also be set on a per dataset basis in the dataset assembler.
   #
   # ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "30000" ] ;

   # Add any custom classes you want to load.
   # Must have a "public static void init()" method.
   # ja:loadClass "your.code.Class" ;

   # End triples.
   .

Regards,
Deepali

On Wed, Jan 6, 2021 at 6:49 PM Lorenz Buehmann <[email protected]> wrote:

>
> On 06.01.21 13:33, Deepali Singhavi wrote:
> > Hi,
> >
> > Please find the requested details as below:
> >
> > Dataset - TDB2 Dataset
> > Fuseki configuration- I am using the same index config file to start fuseki
> > server. What do you mean by fuseki configuration sorry I am not getting it.
> The config file for Fuseki which contains your text index config. In a
> first glance this is the Fuseki config, not a Lucene config. The
> App-Assembler file. Please post it here as content if the attachment
> doesn't work.
> > number of results of the query - There are 11 triples getting returned from
> > above query
> >
> > Thanks and Regards,
> > Deepali
> >
> > On Tue, Jan 5, 2021 at 5:02 PM Lorenz Buehmann <
> > [email protected]> wrote:
> >
> >> Ok, thanks for sharing the spreadsheet.
> >>
> >> We need more configuration infos: dataset, Fuseki configuration, number
> >> of results of the query.
> >>
> >> We didn't get the attachment of the assembler config.
> >>
> >> With no optimizer used, the text:query triple pattern should be
> >> evaluated first - and depending on the number of matching literals,
> >> faster than a scan with filter. But it depends. Also not sure if
> >> text:query is preferred in query optimization, but I think so. Andy
> >> knows better indeed
> >>
> >> On 04.01.21 12:11, Deepali Singhavi wrote:
> >>> Hi,
> >>>
> >>> Sample size means number of triples?
> >>>
> >>> I have tried with 6000,40000,50000 and even with 1,00,000 triples.
> >>> Please find the performance report attached with this email.
> >>>
> >>> Regards,
> >>> Deepali
> >>>
> >>> On Mon, Jan 4, 2021 at 1:03 PM Lorenz Buehmann
> >>> <[email protected]> wrote:
> >>>
> >>> What is the sample size here? I mean, for a low number of literals it's
> >>> obvious that String containment check in Java isn't that slow. The
> >>> difference will most likely come from a large scan over literals with
> >>> containment check whereas with a Lucene index - which is basically an
> >>> inverted index - it's obviously more efficient to lookup terms for the
> >>> documents.
> >>>
> >>> On 04.01.21 05:56, Deepali Singhavi wrote:
> >>> > Hi,
> >>> >
> >>> > I am trying to implement indexing for Fuseki using
> >>> > Lucene/ElasticSearch using an assembler configuration file (attaching
> >>> > file for reference) but there is no improvement in performance
> >>> > (performance without index is better than with index).
> >>> >
> >>> > I am using sample data from *films.ttl* file.
> >>> >
> >>> > *Sample Query *
> >>> > PREFIX text: <http://jena.apache.org/text#>
> >>> > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> >>> > select ?subject ?object
> >>> > WHERE {
> >>> > # Without Index
> >>> > #?subject rdfs:label ?object .
> >>> > #FILTER contains(?object,"City")
> >>> > #With Index
> >>> > ?subject text:query (rdfs:label "city").
> >>> > ?subject rdfs:label ?object .
> >>> > }
> >>> >
> >>> > *Performance:*
> >>> >
> >>> > No of Triples  No of Runs  Without Index  Lucene Index  ElasticSearch Index
> >>> > 6918           1           16ms           18ms          19ms
> >>> >                2           29ms           32ms          32ms
> >>> >                3           22ms           23ms          21ms
> >>> >                4           22ms           14ms          53ms
> >>> >                5           15ms           19ms          18ms
> >>> >
> >>> > Please let me know if any other information is required from my side
> >>> > and please suggest how I can improve performance.
> >>> >
> >>> > Regards,
> >>> > Deepali
> >>> >
> >>> >
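
For context on the request for the assembler file: the configuration pasted at the top of this message declares only the fuseki:Server node and no text index. A text-indexed TDB2 dataset is normally described by additional assembler resources, roughly as sketched here. This is a minimal illustration based on the Jena text-index vocabulary, not the configuration actually used in this thread; the service name, the resource names (:service, :text_dataset, :tdb_dataset, :lucene_index, :entity_map) and both paths are placeholders.

```turtle
# Sketch only: every name and path below is a placeholder, not taken from the thread.
@prefix :       <#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb2:   <http://jena.apache.org/2016/tdb#> .
@prefix text:   <http://jena.apache.org/text#> .

# The service exposes the text dataset, not the raw TDB2 dataset.
:service rdf:type fuseki:Service ;
    fuseki:name         "films" ;
    fuseki:serviceQuery "query" , "sparql" ;
    fuseki:dataset      :text_dataset .

# The text dataset wraps the storage dataset and the Lucene index.
:text_dataset rdf:type text:TextDataset ;
    text:dataset :tdb_dataset ;
    text:index   :lucene_index .

:tdb_dataset rdf:type tdb2:DatasetTDB2 ;
    tdb2:location "databases/films" .

:lucene_index rdf:type text:TextIndexLucene ;
    text:directory <file:lucene/films> ;
    text:entityMap :entity_map .

# Index the literals of rdfs:label in a Lucene field called "label".
:entity_map rdf:type text:EntityMap ;
    text:entityField  "uri" ;
    text:defaultField "label" ;
    text:map (
        [ text:field "label" ; text:predicate rdfs:label ]
    ) .
```

With a configuration of this shape, the text:query pattern in the sample query should be answered from the Lucene index on rdfs:label, provided the data was loaded through the text dataset so that the index is populated.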
