Re: How does Jena-ARQ execute the queries containing ORDER BY + LIMIT clause

Rob Vesse Tue, 10 Mar 2015 04:07:22 -0700

Yes, that is pretty much what happens.

Note, though, that most query evaluation in ARQ is done in a streaming
fashion, so the full set of solutions is typically never held in memory for
any query unless an operator that requires full or partial materialisation
(e.g. DISTINCT) is encountered.
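
To illustrate the ORDER BY + LIMIT case discussed in this thread, here is
a minimal sketch of the bounded priority queue idea in plain Java. It is
not ARQ's actual QueryIterTopN code: the class TopNSketch and the method
topN are hypothetical names, and plain integers stand in for query
solutions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class TopNSketch {

    // Hypothetical helper (not part of the ARQ API): drains the inner
    // iterator while keeping at most `limit` items in memory.
    static <T> List<T> topN(Iterator<T> inner, int limit, Comparator<T> order) {
        if (limit <= 0) {
            return new ArrayList<>();
        }
        // Reversed comparator: the head of the heap is the worst item kept so far.
        PriorityQueue<T> heap = new PriorityQueue<>(limit, order.reversed());
        while (inner.hasNext()) {
            T item = inner.next();
            if (heap.size() < limit) {
                heap.add(item);
            } else if (order.compare(item, heap.peek()) < 0) {
                // The new item beats the current worst, so it takes its place.
                heap.poll();
                heap.add(item);
            }
            // Otherwise the item is discarded without ever being stored.
        }
        // Only the surviving `limit` items are sorted for output.
        List<T> result = new ArrayList<>(heap);
        result.sort(order);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(42, 7, 19, 3, 88, 15, 1);
        // Prints [1, 3, 7]
        System.out.println(topN(values.iterator(), 3, Comparator.<Integer>naturalOrder()));
    }
}
```

Every item from the inner iterator is still compared once against the
head of the heap, but the heap itself never grows beyond the limit.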


Rob

On 10/03/2015 10:24, "Rose Beck" <[email protected]> wrote:

>Dear Rob,
>
>First and foremost thank you for such a wonderful explanation.
>
>Just to clarify, say my example query is:
>
>select ?a?b?c where{?a <pred1> ?c. ?a <pred2> ?b} order by ?c limit 10
>
>Then for the query above, all the solutions are generated just as they
>would be for the SPARQL query: select ?a?b?c where{?a <pred1>
>?c. ?a <pred2> ?b}. But within the priority queue (which is the last
>operation applied before results are output to the user), at
>any point at most 10 solutions ordered by ?c are held. Please correct
>me if I am wrong.
>
>Cheers,
>Rose
>
>
>On Tue, Mar 10, 2015 at 3:38 PM, Rob Vesse <[email protected]> wrote:
>> Query execution in ARQ is based on nested iterators, so QueryIterTopN
>> will always apply over another iterator.
>>
>> A PriorityQueue is used internally as temporary storage within
>> QueryIterTopN while it exhausts the inner iterator, allowing it to use
>> at most the limit amount of storage in the priority queue plus whatever
>> temporary storage the inner iterator(s) may need.
>>
>> There is still a "total sort" in the sense that every possible solution
>> has to be compared to see if it should be placed into the priority queue;
>> however, there is not a "total sort" in the sense of needing to
>> materialise all possible solutions into memory and then sort.
>>
>> Rob
>>
>> p.s. Please don't post identical questions to both users@ and dev@ - one
>> list is sufficient as the developers are on both lists.  As a general
>> rule, general support questions should go to users@ and
>> technical/architecture questions like this should go to dev@.
>>
>>
>>
>> On 10/03/2015 05:54, "Rose Beck" <[email protected]> wrote:
>>
>>>Hi,
>>>
>>>I saw the following issue posted on the Jena website (which has been
>>>recently resolved):
>>>Avoid a total sort for ORDER BY + LIMIT queries
>>>(https://issues.apache.org/jira/browse/JENA-89).
>>>
>>>I am very interested in understanding how Jena-ARQ avoids a
>>>total sort for ORDER BY + LIMIT queries. The issue mentions
>>>that Jena-ARQ uses a priority queue to avoid a final sort; however,
>>>it also mentions that "ARQ's algebra package contains already a
>>>OpTopN [3] operator. The OpExecutor [4] will need to use a new
>>>QueryIterTopN instead of QueryIterSort + QueryIterSlice." It is not
>>>clear how the priority queue benefits from the OpTopN operator and
>>>QueryIterTopN, as the links [3] and [4] given on the website do
>>>not work, so I am not able to understand their operation or how
>>>they help in avoiding a total sort.
>>>
>>>Can someone please explain how Jena-ARQ executes queries
>>>containing ORDER BY + LIMIT clauses?
>>>
>>>With Warm Regards,
>>>Rose
>>
>>
>>
>>
>
>
>
>-- 
>With Warm Regards,
>Rose
