[ 
https://issues.apache.org/jira/browse/OAK-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu updated OAK-1654:
---------------------------------

    Attachment: OAK-1654.patch

Attaching the proposed patch.

There are a few issues this patch introduces. First, it breaks any query 
containing more than a single token into multiple queries and, depending on the 
type of the constraint ('and', 'or'), merges the resulting sets. The 'or' case 
was easy enough: you only need to chain the iterators. The 'and' case turned 
out to be a bit more difficult: you need to consume the entire result sets and 
join them afterwards.
And of course, running more queries means the overall query is slower.
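
To illustrate the two merge strategies, here is a minimal sketch. It is not the 
code from the patch: the class and method names (MergeSketch, union, intersect) 
are hypothetical, and results are modelled as plain path strings. Note that 
simply chaining iterators, as done here, does not remove duplicates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical illustration only; not taken from OAK-1654.patch.
public class MergeSketch {

    // 'or': chain the per-token iterators lazily, one after the other.
    static <T> Iterator<T> union(final List<Iterator<T>> parts) {
        return new Iterator<T>() {
            private int index = 0;

            public boolean hasNext() {
                // Skip over exhausted iterators.
                while (index < parts.size()) {
                    if (parts.get(index).hasNext()) {
                        return true;
                    }
                    index++;
                }
                return false;
            }

            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return parts.get(index).next();
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    // 'and': every result set has to be consumed fully before joining.
    static <T> Set<T> intersect(List<Iterator<T>> parts) {
        Set<T> result = null;
        for (Iterator<T> part : parts) {
            Set<T> current = new LinkedHashSet<T>();
            while (part.hasNext()) {
                current.add(part.next());
            }
            if (result == null) {
                result = current;
            } else {
                result.retainAll(current);
            }
        }
        return result == null ? new LinkedHashSet<T>() : result;
    }

    // Builds two sample per-token result sets sharing the path "/b".
    static List<Iterator<String>> sample() {
        List<Iterator<String>> parts = new ArrayList<Iterator<String>>();
        parts.add(Arrays.asList("/a", "/b").iterator());
        parts.add(Arrays.asList("/b", "/c").iterator());
        return parts;
    }

    public static void main(String[] args) {
        Iterator<String> or = union(sample());
        while (or.hasNext()) {
            System.out.println("or:  " + or.next());
        }
        System.out.println("and: " + intersect(sample()));
    }
}
```

The 'or' side stays lazy, while the 'and' side has to hold every partial result 
in memory, which is where the extra cost comes from.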

Also, this approach turned out to be a bit hairy when dealing with negative 
tokens, so I decided not to include a fix for that case yet (see the 
#hasNegativeContains method).

The last weak part is the cost. The approach was to sum the costs generated by 
the parts of the query. I think this is a good approach, but it can fail when 
both the lucene and the aggregation index are deployed on the same instance. 
E.g. for a double token query, 'one' and 'two', lucene will return 10 and the 
aggregator will return double that, so the lucene index will be picked and 
the aggregation won't work. This can be fixed by making sure the 2 are not 
present at the same time, which means potentially removing the OSGi annotations 
from the lucene index.
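
The arithmetic behind that example, as a rough sketch. The class and method 
names (CostSketch, summedCost, cheapest) are hypothetical, the value 10 is the 
one from the example above, and it assumes the index with the lowest cost 
gets picked:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the Oak query engine API.
public class CostSketch {

    // The aggregator sums the cost of each single-token sub-query.
    static double summedCost(double perTokenCost, int tokens) {
        return perTokenCost * tokens;
    }

    // Assumption: the index reporting the lowest cost wins.
    static String cheapest(Map<String, Double> costs) {
        String best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Double> e : costs.entrySet()) {
            if (e.getValue() < bestCost) {
                bestCost = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    // Costs for the double token query 'one' and 'two' from the example.
    static Map<String, Double> exampleCosts() {
        Map<String, Double> costs = new LinkedHashMap<String, Double>();
        costs.put("lucene", 10.0);                     // one query for both tokens
        costs.put("aggregate", summedCost(10.0, 2));   // 10 + 10 = 20
        return costs;
    }

    public static void main(String[] args) {
        System.out.println(cheapest(exampleCosts())); // prints "lucene"
    }
}
```

With both indexes present, the summed cost can never undercut the plain lucene 
cost in this scenario, so the aggregation index is never selected.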




> Composite index aggregates
> --------------------------
>
>                 Key: OAK-1654
>                 URL: https://issues.apache.org/jira/browse/OAK-1654
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: oak-lucene, query
>            Reporter: Alex Parvulescu
>            Assignee: Alex Parvulescu
>             Fix For: 0.20
>
>         Attachments: OAK-1654.patch
>
>
> This is a follow-up for what is still missing from OAK-828: composite 
> aggregates.
> This covers 2 parts:
>  - when searching for 2 tokens that can be present on 2 different nodes, the 
> common aggregated parent doesn't show up
>  - when adding together multiple contains clauses on different hierarchy 
> levels: now the lucene index simply returns an infinite cost.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
