[ 
https://issues.apache.org/jira/browse/JENA-587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820026#comment-13820026
 ] 

Andy Seaborne edited comment on JENA-587 at 11/12/13 11:31 AM:
---------------------------------------------------------------

We should not rely on the ARQ join strategy.

* It may change and it does not apply to all storage systems.
* The join order is only sufficiently predictable for some cases like BGPs - 
even adding union default graph may scramble the order.  The order coming out 
of the {{WHERE}} clause depends on the pattern (e.g. some inner SELECTs mixed 
with other things).  We use hash tables for {{MINUS}}.

It is a legal optimization if the {{DISTINCT}} is over variables that are in 
sorted order due to the {{ORDER BY}}.

Legal: 
1. {{DISTINCT ?v ORDER BY ?v}}
2. {{DISTINCT ?v ORDER BY ?v ?w}}
3. {{DISTINCT ?v ?w ORDER BY ?v ?w}}
4. {{DISTINCT ?v ?w ORDER BY ?w ?v}}

{{DISTINCT * ORDER BY ...}} is possible only if the ORDER BY is a total 
ordering of the underlying pattern.

Not legal:
1. {{DISTINCT ?v ORDER BY ?w}}
2. {{DISTINCT ?v ORDER BY ?w ?v}} because not sorted by ?v first.
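
One way to read the two lists above is that the {{DISTINCT}} variables must be exactly the leading {{ORDER BY}} keys, in any order among themselves. The following is a minimal sketch of that check, not ARQ code: the class and method names are hypothetical, and it assumes every sort key is a plain variable (no expressions) and that the sort only treats two terms as equal when they are the same term.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of ARQ: encodes the rule suggested by the
// "Legal" / "Not legal" lists above.
final class DistinctOrderCheck {
    // distinctVars: variables projected by DISTINCT (names without '?').
    // orderVars:    ORDER BY keys in sort order; assumed to be plain variables.
    static boolean isLegal(List<String> distinctVars, List<String> orderVars) {
        Set<String> distinct = new HashSet<>(distinctVars);
        // Drop repeated sort keys; a repeat adds no ordering information.
        List<String> keys = new ArrayList<>(new LinkedHashSet<>(orderVars));
        if (distinct.isEmpty() || keys.size() < distinct.size()) {
            return false;
        }
        // The leading sort keys must be exactly the DISTINCT variables.
        Set<String> prefix = new HashSet<>(keys.subList(0, distinct.size()));
        return prefix.equals(distinct);
    }
}
```

Under these assumptions, {{isLegal(["v"], ["v", "w"])}} holds because {{?v}} is the first sort key, while {{isLegal(["v"], ["w", "v"])}} does not because the results are grouped by {{?w}} before {{?v}}.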

Maybe the first step is to just do some simple cases such as {{ORDER BY}} 
exactly the variables of the projection of the {{DISTINCT}}, then expand the 
intelligence of the transformation.
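
To illustrate why the sort keys matter, here is a small sketch. It assumes the optimization replaces a full {{DISTINCT}} with a pass that only compares each row to the one before it; that mechanism is an assumption, since this comment does not spell it out. The class and method names are hypothetical, and rows are reduced to the projected {{?v}} value as a string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not ARQ code.
final class AdjacentDistinct {
    // Drops a value only when it equals the value immediately before it,
    // so it is a correct DISTINCT only if equal values are adjacent.
    static List<String> dedupeAdjacent(List<String> values) {
        List<String> out = new ArrayList<>();
        String previous = null;
        for (String value : values) {
            if (!value.equals(previous)) {
                out.add(value);
            }
            previous = value;
        }
        return out;
    }
}
```

With {{ORDER BY ?v}}, projected values such as a, a, b arrive grouped and reduce to a, b. With {{ORDER BY ?w ?v}}, the projected {{?v}} values can arrive as a, b, a (the first two rows share one {{?w}}, the third has another), and the adjacent-only pass leaves the second a in place, which is a duplicate result.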



was (Author: andy.seaborne):
We should not rely on the ARQ join strategy.

* It may change.
* The join order is only predictable for BGPs - even adding union default graph 
may 
The order coming out of the {{WHERE}} clause depends on the pattern (e.g. sub 
SELECTs mixed with other things).
* 

It is a legal optimization if the {{DISTINCT}} is of variables that are in 
order due to {{ORDER BY}}

Legal: 
1. {{DISTINCT ?v ORDER BY ?v}}
2. {{DISTINCT ?v ORDER BY ?v ?w}}
3. {{DISTINCT ?v ?w ORDER BY ?v ?w}}
4. {{DISTINCT ?v ?w ORDER BY ?w ?v}}

Not legal:
1. {{DISTINCT ?v ORDER BY ?w}}
2. {{DISTINCT ?v ORDER BY ?w ?v}} because not sorted by ?v first.



> SELECT DISTINCT returns duplicate results
> -----------------------------------------
>
>                 Key: JENA-587
>                 URL: https://issues.apache.org/jira/browse/JENA-587
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: ARQ
>    Affects Versions: Jena 2.11.0
>            Reporter: Veyriere
>         Attachments: D.ttl, Q.rq, bug Jena2.11.0.zip, jena-587.zip
>
>
> SELECT DISTINCT returns duplicate results. Attaching a small quads dump and 
> the query to reproduce with TDB
> Reproduced with Jena 2.11.0 and Jena 2.10.1 (was working with 2.7.4)



--
This message was sent by Atlassian JIRA
(v6.1#6144)
