[ 
https://issues.apache.org/jira/browse/OAK-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dirk Rudolph updated OAK-7109:
------------------------------
    Description: 
Complex queries in this case are queries that are passed to Lucene without 
all of their original constraints, for example queries with multiple path 
restrictions like:

{code}
select [rep:facet(simple/tags)] from [nt:base] as a where contains(a.[*], 
'ipsum') and (isdescendantnode(a,'/content1') or 
isdescendantnode(a,'/content2'))
{code}

In that particular case the index planner passes only ":fulltext:ipsum" to 
Lucene, even though the index supports evaluating path constraints. 

Because the facets are counted on the raw Lucene result, the returned facets 
are incorrect. For example, given the following content 

{code}
/content1/test/foo
 + text = lorem ipsum
 - simple/
  + tags = tag1, tag2
/content2/test/bar
 + text = lorem ipsum
 - simple/
  + tags = tag1, tag2
/content3/test/bar
 + text = lorem ipsum
 - simple/
  + tags = tag1, tag2
{code}

the expected result for the dimensions of simple/tags and the query above is 
- tag1: 2
- tag2: 2

as the result set contains 2 results and all documents carry the same tags. 
The actual facet result is 
- tag1: 3
- tag2: 3

as the path constraint is not handled by Lucene.
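
The mismatch can be reproduced outside of Oak with a small simulation. The 
following is only a sketch in plain Python, not Oak or Lucene code: the DOCS 
dictionary and the helper functions are hypothetical stand-ins for the index 
content and the evaluation steps described above.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the example content above.
DOCS = {
    "/content1/test/foo": {"text": "lorem ipsum", "simple/tags": ["tag1", "tag2"]},
    "/content2/test/bar": {"text": "lorem ipsum", "simple/tags": ["tag1", "tag2"]},
    "/content3/test/bar": {"text": "lorem ipsum", "simple/tags": ["tag1", "tag2"]},
}


def fulltext_hits(term):
    """Paths matching only the full-text constraint (all that reaches the index here)."""
    return [path for path, doc in DOCS.items() if term in doc["text"].split()]


def is_descendant(path, ancestor):
    """Rough equivalent of isdescendantnode for plain path strings."""
    return path.startswith(ancestor.rstrip("/") + "/")


def count_facets(paths, dimension="simple/tags"):
    """Count facet values over the given set of paths."""
    counts = Counter()
    for path in paths:
        counts.update(DOCS[path][dimension])
    return dict(counts)


raw = fulltext_hits("ipsum")
filtered = [
    p for p in raw
    if is_descendant(p, "/content1") or is_descendant(p, "/content2")
]

facets_on_raw = count_facets(raw)            # counted before the path filter
facets_on_filtered = count_facets(filtered)  # counted on the final result set

print(facets_on_raw)       # {'tag1': 3, 'tag2': 3}
print(facets_on_filtered)  # {'tag1': 2, 'tag2': 2}
```

Counting on the unfiltered hits yields the reported numbers, counting after 
the path filter yields the expected ones.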

To work around that, the only solution that came to my mind is building the 
DNF (disjunctive normal form) of my complex query and executing a query for 
each of the disjunctive statements. As this expands exponentially, it's only 
a theoretical solution, nothing for production. 
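
To illustrate that workaround and its cost, here is a minimal Python sketch. 
It assumes the constraints are already available as a conjunction of 
OR-clauses; to_dnf and build_queries are hypothetical helpers and not part of 
Oak. Merging the facet counts of the resulting queries is not shown and would 
need care when two disjuncts can match the same node.

```python
from itertools import product


def to_dnf(cnf):
    """Expand a conjunction of OR-clauses into a list of AND-only conjunctions."""
    return [list(combo) for combo in product(*cnf)]


def build_queries(cnf, facet="simple/tags"):
    """Build one JCR-SQL2 query string per disjunct of the DNF."""
    prefix = f"select [rep:facet({facet})] from [nt:base] as a where "
    return [prefix + " and ".join(terms) for terms in to_dnf(cnf)]


# The query from the description, written as a list of OR-clauses.
cnf = [
    ["contains(a.[*], 'ipsum')"],
    ["isdescendantnode(a,'/content1')", "isdescendantnode(a,'/content2')"],
]

queries = build_queries(cnf)
for query in queries:
    print(query)

# Number of queries for n clauses with two alternatives each: 2, 4, 8, 16, 32
sizes = [len(to_dnf([["x", "y"]] * n)) for n in range(1, 6)]
print(sizes)
```

The example query expands to two queries, but every additional clause with 
two alternatives doubles that number.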

  was:
Complex queries in this case are queries that are passed to Lucene without 
all of their original constraints, for example queries with multiple path 
restrictions like:

{code}
select [rep:facet(simple/tags)] from [nt:base] as a where contains(a.[*], 
'ipsum') and (isdescendantnode(a,'/content1') or 
isdescendantnode(a,'/content2'))
{code}

In that particular case the index planner passes only ":fulltext:ipsum" to 
Lucene, even though the index supports evaluating path constraints. 

Because the facets are counted on the raw Lucene result, the returned facets 
are incorrect. For example, given the following content 

{code}
/content1/test/foo
 + text = lorem ipsum
 - simple/
  + tags = tag1, tag2
/content2/test/bar
 + text = lorem ipsum
 - simple/
  + tags = tag1, tag2
/content1/test/bar
 + text = lorem ipsum
 - simple/
  + tags = tag1, tag2
{code}

the expected result for the dimensions of simple/tags and the query above is 
- tag1: 2
- tag2: 2

as the result set contains 2 results and all documents carry the same tags. 
The actual facet result is 
- tag1: 3
- tag2: 3

as the path constraint is not handled by Lucene.

To work around that, the only solution that came to my mind is building the 
DNF (disjunctive normal form) of my complex query and executing a query for 
each of the disjunctive statements. As this expands exponentially, it's only 
a theoretical solution, nothing for production. 


> rep:facet returns wrong results for complex queries
> ---------------------------------------------------
>
>                 Key: OAK-7109
>                 URL: https://issues.apache.org/jira/browse/OAK-7109
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene
>    Affects Versions: 1.6.7
>            Reporter: Dirk Rudolph
>         Attachments: facetsInMultipleRoots.patch
>
>
> Complex queries in this case are queries that are passed to Lucene without 
> all of their original constraints, for example queries with multiple path 
> restrictions like:
> {code}
> select [rep:facet(simple/tags)] from [nt:base] as a where contains(a.[*], 
> 'ipsum') and (isdescendantnode(a,'/content1') or 
> isdescendantnode(a,'/content2'))
> {code}
> In that particular case the index planner passes only ":fulltext:ipsum" to 
> Lucene, even though the index supports evaluating path constraints. 
> Because the facets are counted on the raw Lucene result, the returned 
> facets are incorrect. For example, given the following content 
> {code}
> /content1/test/foo
>  + text = lorem ipsum
>  - simple/
>   + tags = tag1, tag2
> /content2/test/bar
>  + text = lorem ipsum
>  - simple/
>   + tags = tag1, tag2
> /content3/test/bar
>  + text = lorem ipsum
>  - simple/
>   + tags = tag1, tag2
> {code}
> the expected result for the dimensions of simple/tags and the query above is 
> - tag1: 2
> - tag2: 2
> as the result set contains 2 results and all documents carry the same tags. 
> The actual facet result is 
> - tag1: 3
> - tag2: 3
> as the path constraint is not handled by Lucene.
> To work around that, the only solution that came to my mind is building the 
> DNF (disjunctive normal form) of my complex query and executing a query for 
> each of the disjunctive statements. As this expands exponentially, it's only 
> a theoretical solution, nothing for production. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
