Re: No Improvement In Performance with indexing in Jena Fuseki

Lorenz Buehmann Wed, 06 Jan 2021 05:19:52 -0800

On 06.01.21 13:33, Deepali Singhavi wrote:
> Hi,
>
> Please find the requested details as below:
>
> Dataset - TDB2 Dataset
> Fuseki configuration - I am using the same index config file to start the
> Fuseki server. Sorry, I don't understand what you mean by Fuseki configuration.
I mean the config file for Fuseki that contains your text index config. At
first glance, this is the Fuseki config, not a Lucene config: it is the
application assembler file. If the attachment doesn't work, please paste its
content directly into the email.
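
For orientation, a Fuseki config with a Lucene text index over a TDB2 dataset
usually looks roughly like the sketch below. It is modeled on the example in
the Jena text-index documentation; the service name "ds", the locations "DB2"
and <file:Lucene>, and the field names are placeholders, not your actual
settings.

```turtle
PREFIX :       <http://localhost/jena_example/#>
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:   <http://www.w3.org/2000/01/rdf-schema#>
PREFIX fuseki: <http://jena.apache.org/fuseki#>
PREFIX tdb2:   <http://jena.apache.org/2016/tdb#>
PREFIX text:   <http://jena.apache.org/text#>

# Fuseki service exposing the text-enabled dataset (name is a placeholder)
:service rdf:type fuseki:Service ;
    fuseki:name         "ds" ;
    fuseki:serviceQuery "query" ;
    fuseki:dataset      :text_dataset .

# Wraps the storage dataset together with the text index
:text_dataset rdf:type text:TextDataset ;
    text:dataset :dataset ;
    text:index   :indexLucene .

# TDB2 storage (location is a placeholder)
:dataset rdf:type tdb2:DatasetTDB2 ;
    tdb2:location "DB2" .

# Lucene index (directory is a placeholder)
:indexLucene rdf:type text:TextIndexLucene ;
    text:directory <file:Lucene> ;
    text:entityMap :entMap .

# Which predicates are indexed; rdfs:label matches the sample query
:entMap rdf:type text:EntityMap ;
    text:entityField  "uri" ;
    text:defaultField "label" ;
    text:map (
        [ text:field "label" ; text:predicate rdfs:label ]
    ) .
```

The part that matters for text:query is the entity map: only literals of the
predicates listed there end up in the index.
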
> Number of results of the query - the above query returns 11 triples.
>
> Thanks and Regards,
> Deepali
>
> On Tue, Jan 5, 2021 at 5:02 PM Lorenz Buehmann <
> [email protected]> wrote:
>
>> Ok, thanks for sharing the spreadsheet.
>>
>> We need more configuration information: dataset, Fuseki configuration,
>> number of results of the query.
>>
>> We didn't get the attachment with the assembler config.
>>
>> With no optimizer used, the text:query triple pattern should be
>> evaluated first and, depending on the number of matching literals, should
>> be faster than a scan with a filter. But it depends. I am also not sure
>> whether text:query is preferred in query optimization, but I think so.
>> Andy knows better.
>>
>> On 04.01.21 12:11, Deepali Singhavi wrote:
>>> Hi,
>>>
>>> Sample size means number of triples?
>>>
>>> I have tried with 6000, 40000, 50000 and even 100000 triples.
>>> Please find the performance report attached with this email.
>>>
>>> Regards,
>>> Deepali
>>>
>>> On Mon, Jan 4, 2021 at 1:03 PM Lorenz Buehmann
>>> <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>>     What is the sample size here? For a small number of literals, a
>>>     string containment check in Java isn't that slow. The difference
>>>     will most likely only show up with a large scan over literals with
>>>     a containment check, whereas a Lucene index, which is basically an
>>>     inverted index, makes it more efficient to look up the documents
>>>     for the terms.
>>>
>>>     On 04.01.21 05:56, Deepali Singhavi wrote:
>>>     > Hi,
>>>     >
>>>     > I am trying to implement indexing for Fuseki using
>>>     > Lucene/ElasticSearch with an assembler configuration file
>>>     > (attaching the file for reference), but there is no improvement in
>>>     > performance (performance without the index is better than with it).
>>>     >
>>>     > I am using sample data from *films.ttl* file.
>>>     >
>>>     > *Sample Query *
>>>     > PREFIX text: <http://jena.apache.org/text#>
>>>     > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>>>     > select ?subject ?object
>>>     > WHERE {
>>>     > # Without Index
>>>     > #?subject rdfs:label ?object .
>>>     > #FILTER contains(?object,"City")
>>>     > #With Index
>>>     > ?subject text:query (rdfs:label "city").
>>>     > ?subject rdfs:label ?object .
>>>     > }
>>>     >
>>>     > *Performance:*
>>>     >
>>>     > No of Triples | No of Runs | Without Index | Lucene Index | ElasticSearch Index
>>>     > 6918          | 1          | 16ms          | 18ms         | 19ms
>>>     >               | 2          | 29ms          | 32ms         | 32ms
>>>     >               | 3          | 22ms          | 23ms         | 21ms
>>>     >               | 4          | 22ms          | 14ms         | 53ms
>>>     >               | 5          | 15ms          | 19ms         | 18ms
>>>     >
>>>     >
>>>     > Please let me know if any other information is required from my
>>>     > side, and please suggest how I can improve performance.
>>>     >
>>>     > Regards,
>>>     > Deepali
>>>     >
>>>