You haven’t specified how your data is stored, but assuming you are using Jena’s 
TDB/TDB2, the triples/quads themselves are already indexed for efficient 
access.  TDB also inlines some value types, which speeds up some comparisons and 
filters, including those used in simple ORDER BY expressions as in your example.

This assumes that the objects of your relations:hasUserCount triples are properly 
typed as xsd:integer or another well-known XSD numeric type.  If not, Jena is 
forced to fall back to simpler lexical string sorting, which can be more 
expensive.
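
As an aside, string ordering can also give a different order from numeric 
ordering, not just a slower sort.  The following is a small Python sketch of 
that effect, not Jena code; the sample counts are made up purely for 
illustration:

```python
# Hypothetical user counts, as they would look if stored as plain strings
# rather than as xsd:integer values (sample data, not from the real dataset).
counts_as_strings = ["9", "10", "100", "25"]

# Lexical (string) ordering compares character by character.
lexical_desc = sorted(counts_as_strings, reverse=True)

# Numeric ordering compares the integer values.
numeric_desc = sorted(counts_as_strings, key=int, reverse=True)

print(lexical_desc)  # ['9', '25', '100', '10']
print(numeric_desc)  # ['100', '25', '10', '9']
```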

However, there is no indexing available for sorting because SPARQL allows for 
arbitrarily complex sort expressions, and the inputs to those expressions may 
themselves be dynamically computed values that don’t exist in the underlying 
dataset directly.
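
To make the consequence concrete, here is a rough Python sketch (a generic 
illustration, not how Jena is implemented) of why a bare LIMIT can stop early 
while ORDER BY plus LIMIT has to look at every candidate row.  The matches() 
generator, the row layout and the userCount values are all hypothetical 
stand-ins for a lazily evaluated stream of query results:

```python
import heapq
from itertools import islice


def matches(n, counter):
    """Hypothetical lazy stream of n query results; counts rows produced."""
    for i in range(n):
        counter["produced"] += 1
        # Made-up sort key, only known once the row has been produced.
        yield {"title": f"title-{i}", "userCount": (i * 37) % 101}


# LIMIT 10 on its own: evaluation stops after the first 10 rows.
lazy_counter = {"produced": 0}
first_ten = list(islice(matches(10_000, lazy_counter), 10))

# ORDER BY DESC(userCount) LIMIT 10: every row must be examined before the
# top 10 are known, even though only 10 rows are kept in memory at a time.
full_counter = {"produced": 0}
top_ten = heapq.nlargest(
    10, matches(10_000, full_counter), key=lambda row: row["userCount"]
)

print(lazy_counter["produced"])  # 10
print(full_counter["produced"])  # 10000
```

Constraining the pattern so that fewer candidate rows are produced in the 
first place reduces the amount of work the sort step has to do.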

Rob

From: Chirag Ratra <[email protected]>
Date: Tuesday, 19 March 2024 at 10:39
To: [email protected] <[email protected]>, Andy Seaborne 
<[email protected]>, [email protected] <[email protected]>
Subject: Re: [EXTERNAL] Re: Query Performance Degrade With Sorting In Subquery
Is there any way to create an index or something?

On Tue, Mar 19, 2024 at 3:46 PM Rob @ DNR <[email protected]> wrote:

> This is due to Jena’s lazy evaluation in its query engine.
>
> When you include a LIMIT clause on its own, Jena only needs to find the
> first N results (10 in your example), at which point it can abort any
> further processing and return results.  In this case evaluation is lazy.
>
> When you include LIMIT and ORDER BY clauses Jena has to find all possible
> results, sort them, and then return only the first N results.  In this case
> full evaluation is required.
>
> One possible approach might be to split this into multiple queries, i.e. do
> one query to get your main set of results, and then separately issue the
> related-item sub-queries with concrete values substituted for your ?concept
> and ?titleSkosxl variables.  Jena will still need to do full evaluation, but
> injecting a concrete value will constrain the query evaluation further.
>
> Hope this helps,
>
> Rob
>
> From: Chirag Ratra <[email protected]>
> Date: Tuesday, 19 March 2024 at 07:46
> To: [email protected] <[email protected]>
> Subject: Query Performance Degrade With Sorting In Subquery
> Hi,
>
> I am seeing a big performance degradation when sorting inside a subquery.
> If I run the query without sorting, the response time is around 200 ms,
> but when I add ORDER BY, it goes up to around 4-5 seconds.
>
> Here is my query :
>
> PREFIX text: <http://jena.apache.org/text#>
> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
> PREFIX skosxl: <http://www.w3.org/2008/05/skos-xl#>
> PREFIX relations: <https://cxdata.bold.com/ontologies/myDomain#>
>
> SELECT ?concept ?titleSkosxl ?title ?languageCode
>        (GROUP_CONCAT(DISTINCT ?relatedTitle; separator=", ") AS ?relatedTitles)
>        (GROUP_CONCAT(DISTINCT ?alternate; separator=", ") AS ?alternates)
> WHERE
> {
>   (?titleSkosxl ?score) text:query ('cashier').
>
>   ?concept skosxl:prefLabel ?titleSkosxl.
>   ?titleSkosxl skosxl:literalForm ?title.
>   ?titleSkosxl relations:usedInLocale ?controlledList.
>   ?controlledList relations:languageMarketCode ?languageCode
>   FILTER(?languageCode = 'en-US').
>
>   # get alternate title
>   OPTIONAL
>   {
>     SELECT ?alternate
>     {
>       ?concept skosxl:altLabel ?alternateSkosxl.
>       ?alternateSkosxl skosxl:literalForm ?alternate;
>                        relations:hasUserCount ?alternateUserCount.
>     }
>     ORDER BY DESC(?alternateUserCount) LIMIT 10
>   }
>
>   # get related titles
>   OPTIONAL
>   {
>     SELECT ?relatedTitle
>     {
>       ?titleSkosxl relations:isRelatedTo ?relatedSkosxl.
>       ?relatedSkosxl skosxl:literalForm ?relatedTitle;
>                      relations:hasUserCount ?relatedUserCount.
>     }
>     ORDER BY DESC(?relatedUserCount) LIMIT 10
>   }
> }
> GROUP BY ?concept ?titleSkosxl ?title ?languageCode ?alternateJobTitle ?notation
> ORDER BY DESC(?jobtitleWeight) DESC(?score)
> LIMIT 10
>
> The two sorting clauses below cause a huge performance degradation:
> ORDER BY DESC (?alternateUserCount) and ORDER BY DESC (?relatedUserCount)
>
> How can this be improved? This sorting will be used in each and every query
> in my application.
>
> --
> This email may contain material that is confidential, privileged,
> or for the sole use of the intended recipient.  Any review, disclosure,
> reliance, or distribution by others or forwarding without express
> permission is strictly prohibited.  If you are not the intended recipient,
> please contact the sender and delete all copies, including attachments.
>
