Ok, thanks for sharing the spreadsheet.

We need more configuration details: the dataset, the Fuseki configuration,
and the number of results the query returns.

The attachment with the assembler config didn't come through.

With no optimizer in use, the text:query triple pattern should be
evaluated first and, depending on the number of matching literals,
should be faster than a scan with a filter. But it depends. I'm also
not sure whether text:query is given priority during query
optimization, though I think it is. Andy will know better.
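
To illustrate the difference between the two approaches, here is a rough
Python sketch. It is not Jena or Lucene code: the labels and all function
names are made up for illustration, and the tokenization is a naive
lowercase whitespace split, which is only an assumption about what an
analyzer roughly does. The scan has to touch every literal, while the
inverted index goes straight from the term to the matching subjects.

```python
from collections import defaultdict

# Hypothetical labels standing in for rdfs:label literals (not from films.ttl).
labels = {
    "s1": "New York City",
    "s2": "Gotham City",
    "s3": "The Matrix",
}

def scan_contains(labels, needle):
    """Check every literal, like a full scan with FILTER contains(...)."""
    return sorted(s for s, text in labels.items() if needle in text)

def build_inverted_index(labels):
    """Map each lowercased token to the set of subjects whose label has it."""
    index = defaultdict(set)
    for subject, text in labels.items():
        for token in text.lower().split():
            index[token].add(subject)
    return index

def index_lookup(index, term):
    """Look the term up directly instead of visiting every literal."""
    return sorted(index.get(term.lower(), set()))

index = build_inverted_index(labels)
print(scan_contains(labels, "City"))
print(index_lookup(index, "city"))
```

With only a handful of literals both variants are effectively instant, so
any advantage of the index would only be expected to show up once the scan
has to cover a large number of literals.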

On 04.01.21 12:11, Deepali Singhavi wrote:
> Hi,
>
> Sample size means number of triples? 
>
> I have tried with 6000,40000,50000 and even with 1,00,000 triples.
> Please find the performance report attached with this email.
>
> Regards,
> Deepali
>
> On Mon, Jan 4, 2021 at 1:03 PM Lorenz Buehmann
> <[email protected]
> <mailto:[email protected]>> wrote:
>
>     What is the sample size here? I mean, for a low number of literals
>     it's obvious that String containment check in Java isn't that slow.
>     The difference will most likely come from a large scan over literals
>     with containment check whereas with a Lucene index - which is
>     basically an inverted index - it's obviously more efficient to
>     lookup terms for the documents.
>
>     On 04.01.21 05:56, Deepali Singhavi wrote:
>     > Hi,
>     >
>     > I am trying to implement indexing for Fuseki using
>     > Lucene/ElasticSearch using an assembler configuration file
>     > (attaching file for reference) but there is no improvement in
>     > performance (performance without index is better than with index).
>     >
>     > I am using sample data from *films.ttl* file.
>     >
>     > *Sample Query *
>     > PREFIX text: <http://jena.apache.org/text#>
>     > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>     > select ?subject ?object
>     > WHERE {
>     > # Without Index
>     > #?subject rdfs:label ?object .
>     > #FILTER contains(?object,"City")
>     > #With Index
>     > ?subject text:query (rdfs:label "city").
>     > ?subject rdfs:label ?object .
>     > }
>     >
>     > *Performance:*
>     >
>     > No of Triples | Run | Without Index | Lucene Index | ElasticSearch Index
>     > 6918          | 1   | 16ms          | 18ms         | 19ms
>     >               | 2   | 29ms          | 32ms         | 32ms
>     >               | 3   | 22ms          | 23ms         | 21ms
>     >               | 4   | 22ms          | 14ms         | 53ms
>     >               | 5   | 15ms          | 19ms         | 18ms
>     >
>     > Please let me know if any other information is required from my side
>     > and please suggest how I can improve performance.
>     >
>     > Regards,
>     > Deepali
>     >
>
