[jira] [Commented] (OAK-4788) Fulltext parser sorts and unique-s parsed terms

Thomas Mueller (JIRA) Wed, 02 Nov 2016 08:05:39 -0700

    [ 
https://issues.apache.org/jira/browse/OAK-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629224#comment-15629224
 ]


Thomas Mueller commented on OAK-4788:
-------------------------------------

I didn't find any information on whether repeating the same word affects the 
search result ([~teofili], do you know?). I would prefer that we keep the 
current algorithm (make terms unique), to avoid compatibility issues, and 
improve performance of the "aggregate at query time" case (which is still 
supported in order to support very fine-grained security restrictions).

But I assume "query level boosting", which is already supported and 
documented, can be used instead. So I would say we are fine right now.
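
For reference, the behavior under discussion (sorting the parsed terms and 
making them unique) can be sketched roughly as below. This is only an 
illustration and not the actual Oak parser code: the class TermNormalizer and 
the method sortedUniqueTerms are hypothetical names, and splitting on 
whitespace is an assumption made to keep the example small.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; this is not Oak's fulltext parser.
public class TermNormalizer {

    // Splits the text on whitespace (an assumption), then sorts the terms
    // and drops duplicates by passing them through a TreeSet.
    static List<String> sortedUniqueTerms(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        List<String> terms = Arrays.asList(trimmed.split("\\s+"));
        return new ArrayList<>(new TreeSet<>(terms));
    }

    public static void main(String[] args) {
        // "hello - world" becomes "- hello world"
        System.out.println(String.join(" ", sortedUniqueTerms("hello - world")));
        // "test test" becomes "test"
        System.out.println(String.join(" ", sortedUniqueTerms("test test")));
    }
}
```

With a sorted set, ordering and uniqueness come together, which is why a 
repeated word cannot carry any extra weight after parsing.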

> Fulltext parser sorts and unique-s parsed terms
> -----------------------------------------------
>
>                 Key: OAK-4788
>                 URL: https://issues.apache.org/jira/browse/OAK-4788
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query
>            Reporter: Vikas Saurabh
>            Assignee: Thomas Mueller
>            Priority: Minor
>             Fix For: 1.6
>
>
> Pasting a bit of discussion from OAK-4705:
> {quote}
> bq. whether it's a good idea to sort entries ("hello - world" becomes "- 
> hello world") and make them unique ("test test" becomes "test").
> I think the parser shouldn't play with ordering .. but I can see the 
> rationale that it allows consumers of the parsed output to potentially have 
> forward seeks in their dictionaries. OTOH, I think making unique or not 
> shouldn't be the parser's concern at all.
> I'd open a new issue to follow up on these aspects.
> {quote}
> /cc [~tmueller]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
