[jira] [Commented] (OAK-9781) Lucene Index MBean getFieldTerms Excludes Results for Unique Fields

Thomas Mueller (Jira) Tue, 24 May 2022 08:37:08 -0700


    [ https://issues.apache.org/jira/browse/OAK-9781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541580#comment-17541580 ]


Thomas Mueller commented on OAK-9781:
-------------------------------------

The code says "> 1" 
https://github.com/apache/jackrabbit-oak/blame/bd4b690561fb6456ed9f42beefd47f93919917f2/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndexMBeanImpl.java#L506

I don't remember the reason for this condition. Maybe it was there because, 
without it, too many entries were easily added? If we change it, it might be 
good to add a parameter (min count), which might mean we have to add one more 
method. But I'm not sure; it would need to be tested whether that is really 
necessary.
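
To make the "min count" idea concrete, here is a minimal sketch. It does not
use the actual LuceneIndexMBeanImpl code or the Lucene API: the class
FieldTermFilter, the method filterByMinCount, and the plain map of term to
document count are all hypothetical names chosen for illustration. With a min
count of 2 it behaves like the existing "> 1" condition; with a min count of 1
it only drops terms that have no associated documents.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Oak: filters terms by a minimum document count. */
final class FieldTermFilter {

    private FieldTermFilter() {
    }

    /**
     * Returns the entries whose document count is at least minCount,
     * preserving the iteration order of the input map.
     */
    static Map<String, Integer> filterByMinCount(Map<String, Integer> docCounts, int minCount) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : docCounts.entrySet()) {
            if (e.getValue() >= minCount) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: two unique values, one shared value, one unused term.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("uuid-a", 1);
        counts.put("uuid-b", 1);
        counts.put("image/png", 5);
        counts.put("orphan", 0);

        // minCount = 2 mirrors the existing "> 1" condition: unique values are dropped.
        System.out.println(filterByMinCount(counts, 2));
        // minCount = 1 keeps unique values and only drops terms without documents.
        System.out.println(filterByMinCount(counts, 1));
    }
}
```

Whether such a parameter would go on the existing method or on a new one is
left open here, as in the comment above.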


> Lucene Index MBean getFieldTerms Excludes Results for Unique Fields
> -------------------------------------------------------------------
>
>                 Key: OAK-9781
>                 URL: https://issues.apache.org/jira/browse/OAK-9781
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: indexing
>    Affects Versions: 1.8.0
>            Reporter: Dan Klco
>            Priority: Minor
>             Fix For: 1.44.0
>
>
> The getFieldTerms method in the Lucene Index MBean only includes terms that 
> occur in more than 1 document. This means that fields with unique or very 
> well distributed values, such as UUIDs, paths or even file sizes, will return 
> few or no results from this method. 
> Instead, it should only exclude terms that have no associated documents.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
