Re: No Improvement In Performance with indexing in Jena Fuseki

Deepali Singhavi Wed, 06 Jan 2021 06:28:40 -0800

Hi Lorenz,

Please find the content of my configuration file below; I hope this is
what you are looking for.

I am using the same index .ttl file to start my Fuseki server, using the
command below.

java -Xmx1200M -jar fuseki-server.jar --config=LunceneIndex.ttl

# Licensed under the terms of http://www.apache.org/licenses/LICENSE-2.0

## Fuseki Server configuration file.

@prefix :        <#> .
@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .

[] rdf:type fuseki:Server ;
   # Example::
   # Server-wide query timeout.
   #
   # Timeout - server-wide default: milliseconds.
   # Format 1: "1000" -- 1 second timeout
   # Format 2: "10000,60000" -- 10s timeout to first result,
   #                            then 60s timeout for the rest of query.
   #
   # See javadoc for ARQ.queryTimeout for details.
   # This can also be set on a per dataset basis in the dataset assembler.
   #
   # ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "30000" ] ;

   # Add any custom classes you want to load.
   # Must have a "public static void init()" method.
   # ja:loadClass "your.code.Class" ;

   # End triples.
   .

Regards,
Deepali
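
The file as posted above contains only the fuseki:Server block, so no
text index definition is visible in it. For comparison, a minimal sketch
of an assembler file that wires a Lucene text index over rdfs:label to a
TDB2 dataset might look like the following. The service name "ds" and the
locations "DB2" and "lucene" are placeholders assumed for illustration,
not values taken from this thread.

```turtle
@prefix :        <#> .
@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb2:    <http://jena.apache.org/2016/tdb#> .
@prefix text:    <http://jena.apache.org/text#> .

# Service exposing the text-indexed dataset ("ds" is an assumed name).
:service rdf:type fuseki:Service ;
    fuseki:name          "ds" ;
    fuseki:serviceQuery  "query" ;
    fuseki:dataset       :text_dataset .

# Text dataset = underlying RDF dataset + text index.
:text_dataset rdf:type text:TextDataset ;
    text:dataset  :tdb_dataset ;
    text:index    :index_lucene .

# Underlying TDB2 storage (location is an assumed placeholder).
:tdb_dataset rdf:type tdb2:DatasetTDB2 ;
    tdb2:location "DB2" .

# Lucene index on disk (directory is an assumed placeholder).
:index_lucene rdf:type text:TextIndexLucene ;
    text:directory  <file:lucene> ;
    text:entityMap  :entity_map .

# Which predicates get indexed: here only rdfs:label.
:entity_map rdf:type text:EntityMap ;
    text:entityField   "uri" ;
    text:defaultField  "label" ;
    text:map (
        [ text:field "label" ; text:predicate rdfs:label ]
    ) .
```

With a layout like this, text:query only returns matches for data that
was loaded through the text dataset, or for which the index was built
separately afterwards.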


On Wed, Jan 6, 2021 at 6:49 PM Lorenz Buehmann <
[email protected]> wrote:

>
> On 06.01.21 13:33, Deepali Singhavi wrote:
> > Hi,
> >
> > Please find the requested details as below:
> >
> > Dataset - TDB2 Dataset
> > Fuseki configuration - I am using the same index config file to start the
> > Fuseki server. What do you mean by Fuseki configuration? Sorry, I am not
> > getting it.
> The config file for Fuseki which contains your text index config. At
> first glance this is the Fuseki config, not a Lucene config. The
> App-Assembler file. Please post it here as content if the attachment
> doesn't work.
> > number of results of the query - There are 11 triples getting returned from
> > the above query
> >
> > Thanks and Regards,
> > Deepali
> >
> > On Tue, Jan 5, 2021 at 5:02 PM Lorenz Buehmann <
> > [email protected]> wrote:
> >
> >> Ok, thanks for sharing the spreadsheet.
> >>
> >> We need more configuration information: dataset, Fuseki configuration,
> >> number of results of the query.
> >>
> >> We didn't get the attachment of the assembler config.
> >>
> >> With no optimizer used, the text:query triple pattern should be
> >> evaluated first - and depending on the number of matching literals,
> >> faster than a scan with filter. But it depends. Also not sure if
> >> text:query is preferred in query optimization, but I think so. Andy
> >> knows better, though.
> >>
> >> On 04.01.21 12:11, Deepali Singhavi wrote:
> >>> Hi,
> >>>
> >>> Sample size means number of triples?
> >>>
> >>> I have tried with 6,000, 40,000, 50,000 and even with 100,000 triples.
> >>> Please find the performance report attached with this email.
> >>>
> >>> Regards,
> >>> Deepali
> >>>
> >>> On Mon, Jan 4, 2021 at 1:03 PM Lorenz Buehmann <[email protected]> wrote:
> >>>
> >>>     What is the sample size here? I mean, for a low number of literals
> >>>     it's obvious that a String containment check in Java isn't that
> >>>     slow. The difference will most likely come from a large scan over
> >>>     literals with a containment check, whereas with a Lucene index -
> >>>     which is basically an inverted index - it's obviously more
> >>>     efficient to look up terms for the documents.
> >>>
> >>>     On 04.01.21 05:56, Deepali Singhavi wrote:
> >>>     > Hi,
> >>>     >
> >>>     > I am trying to implement indexing for Fuseki using
> >>>     > Lucene/ElasticSearch using an assembler configuration file
> >>>     > (attaching file for reference) but there is no improvement in
> >>>     > performance (performance without index is better than with index).
> >>>     >
> >>>     > I am using sample data from *films.ttl* file.
> >>>     >
> >>>     > *Sample Query *
> >>>     > PREFIX text: <http://jena.apache.org/text#>
> >>>     > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> >>>     > select ?subject ?object
> >>>     > WHERE {
> >>>     > # Without Index
> >>>     > #?subject rdfs:label ?object .
> >>>     > #FILTER contains(?object,"City")
> >>>     > #With Index
> >>>     > ?subject text:query (rdfs:label "city").
> >>>     > ?subject rdfs:label ?object .
> >>>     > }
> >>>     >
> >>>     > *Performance:*
> >>>     >
> >>>     > No of Triples  No of Runs  Without Index  Lucene Index  ElasticSearch Index
> >>>     > 6918           1           16ms           18ms          19ms
> >>>     >                2           29ms           32ms          32ms
> >>>     >                3           22ms           23ms          21ms
> >>>     >                4           22ms           14ms          53ms
> >>>     >                5           15ms           19ms          18ms
> >>>     >
> >>>     > Please let me know if any other information is required from my side
> >>>     > and please suggest how I can improve performance.
> >>>     >
> >>>     > Regards,
> >>>     > Deepali
> >>>     >
> >>>
>
