[ https://issues.apache.org/jira/browse/OAK-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312675#comment-16312675 ]
Thomas Mueller commented on OAK-7109:
-------------------------------------

I don't fully know how facets work. Could you help me a bit with this, please?

The query

{noformat}
select [rep:facet(simple/tags)] from [nt:base] as a where contains(a.[*], 'ipsum') and (isdescendantnode(a,'/content1') or isdescendantnode(a,'/content2'))
{noformat}

converted to "regular SQL" would be this, right?

{noformat}
select [simple/tags], count(*) from [nt:base] as a where contains(a.[*], 'ipsum') and (isdescendantnode(a,'/content1') or isdescendantnode(a,'/content2')) group by [simple/tags]
{noformat}

(I know that "group by" and "count" are not currently supported by Oak.) Or are there other aspects I missed? What do you mean by "scoring"?

If it's the same, then I guess we might want to support the "group by" and "count" features in Oak, or add custom logic to combine the results of

{noformat}
select [rep:facet(...)] ...
UNION
select [rep:facet(...)] ...
{noformat}

> passing all constraints to lucene

What if Lucene doesn't index all the constraints?


> rep:facet returns wrong results for complex queries
> ---------------------------------------------------
>
>                 Key: OAK-7109
>                 URL: https://issues.apache.org/jira/browse/OAK-7109
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene
>    Affects Versions: 1.6.7
>            Reporter: Dirk Rudolph
>              Labels: facet
>         Attachments: facetsInMultipleRoots.patch, restrictionPropagationTest.patch
>
>
> Complex queries in this case are queries that are passed to lucene without all of their original constraints, for example queries with multiple path restrictions like:
> {code}
> select [rep:facet(simple/tags)] from [nt:base] as a where contains(a.[*], 'ipsum') and (isdescendantnode(a,'/content1') or isdescendantnode(a,'/content2'))
> {code}
> In that particular case the index planner gives ":fulltext:ipsum" to lucene even though the index supports evaluating path constraints.
> As counting the facets happens on the raw result of lucene, the returned facets are incorrect. For example, given the following content
> {code}
> /content1/test/foo
>  + text = lorem ipsum
>  - simple/
>    + tags = tag1, tag2
> /content2/test/bar
>  + text = lorem ipsum
>  - simple/
>    + tags = tag1, tag2
> /content3/test/bar
>  + text = lorem ipsum
>  - simple/
>    + tags = tag1, tag2
> {code}
> the expected result for the dimensions of simple/tags and the query above is
> - tag1: 2
> - tag2: 2
> as the result set is 2 results long and all documents are equal. The actual result set is
> - tag1: 3
> - tag2: 3
> as the path constraint is not handled by lucene.
> To work around that, the only solution that came to my mind is building the [disjunctive normal form|https://en.wikipedia.org/wiki/Disjunctive_normal_form] of my complex query and executing a query for each of the disjunctive statements. As this expands exponentially, it's only a theoretical solution, nothing for production.
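
The discrepancy described in the issue can be reproduced outside Oak with a small, self-contained sketch. It is only an illustration under stated assumptions: the three nodes are the sample content from the description, and `DOCS`, `is_descendant` and `count_facets` are hypothetical helpers, not Oak or Lucene APIs. Counting facets over the raw fulltext hits gives 3 per tag, while counting after the path restriction has been applied gives the expected 2.

```python
from collections import Counter

# Sample content from the issue: node path -> values of simple/tags.
# Every node has text = "lorem ipsum", so all three match the fulltext term 'ipsum'.
DOCS = {
    "/content1/test/foo": ["tag1", "tag2"],
    "/content2/test/bar": ["tag1", "tag2"],
    "/content3/test/bar": ["tag1", "tag2"],
}


def is_descendant(path, root):
    """Hypothetical stand-in for isdescendantnode(a, root)."""
    return path.startswith(root.rstrip("/") + "/")


def count_facets(paths):
    """Count how many of the given nodes carry each tag value."""
    counts = Counter()
    for path in paths:
        counts.update(DOCS[path])
    return dict(counts)


# What the index returns when it only receives ":fulltext:ipsum".
raw_hits = list(DOCS)

# What the query actually asks for: the fulltext hits under /content1 or /content2.
roots = ["/content1", "/content2"]
filtered_hits = [p for p in raw_hits if any(is_descendant(p, r) for r in roots)]

raw_facets = count_facets(raw_hits)            # facets counted on the raw result
expected_facets = count_facets(filtered_hits)  # facets counted on the real result set

print(raw_facets)       # {'tag1': 3, 'tag2': 3}
print(expected_facets)  # {'tag1': 2, 'tag2': 2}
```

The sketch only shows where the counts diverge. It does not say where in Oak the path restriction should be applied, which is the open question in this issue.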