[
https://issues.apache.org/jira/browse/SOLR-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15984133#comment-15984133
]
Kensho Hirasawa commented on SOLR-10528:
----------------------------------------
Thank you Yonik. I overlooked that issue...
I read SOLR-9868 and its pull request.
The goal is the same: using docvalues to execute range facets efficiently.
An important difference is that my implementation tries to reduce memory
consumption by allocating only as many buckets as needed. When the number of
buckets is large but many of them have a count of 0, this can lead to a big
performance improvement. In contrast, I think SOLR-9868's implementation
consumes a lot of memory in such a situation, since slots for all ranges are
allocated even if mincount > 0.
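To illustrate the memory difference described above, here is a minimal sketch
in plain Java. It is not code from either patch: the class and method names
are hypothetical, and a long[] stands in for the per-document docvalues that
a real implementation would scan. It only contrasts allocating a slot for
every range with allocating a slot when a value actually falls into a range.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not taken from SOLR-10528 or SOLR-9868.
class SparseRangeCounts {

    // Dense approach: one counter per bucket in [start, end),
    // allocated up front even if most buckets stay at 0.
    public static int[] countDense(long[] values, long start, long end, long gap) {
        int numBuckets = (int) ((end - start + gap - 1) / gap);
        int[] counts = new int[numBuckets];
        for (long v : values) {
            if (v >= start && v < end) {
                counts[(int) ((v - start) / gap)]++;
            }
        }
        return counts;
    }

    // Sparse approach: a counter is created only when a value hits the bucket.
    // Keys are bucket lower bounds. Only usable when mincount > 0, because
    // empty buckets never appear in the result.
    public static Map<Long, Integer> countSparse(long[] values, long start, long end, long gap) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long v : values) {
            if (v >= start && v < end) {
                long bucketStart = start + ((v - start) / gap) * gap;
                counts.merge(bucketStart, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] values = {3, 7, 999_995, 999_999};
        int[] dense = countDense(values, 0, 1_000_000, 10);
        Map<Long, Integer> sparse = countSparse(values, 0, 1_000_000, 10);
        System.out.println(dense.length);  // 100000 slots allocated
        System.out.println(sparse);        // {0=2, 999990=2}
    }
}
```

With a wide range and a narrow gap, the dense array grows with the number of
buckets, while the sparse map grows only with the number of non-empty buckets.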
However, SOLR-9868 has fewer limitations than my implementation.
* can handle TrieDate
* can handle include/others
* can handle mincount == 0
* can handle subfacets/substats
There are also a few other small differences:
* The patch in this issue is for 6.x; the pull request for SOLR-9868 is for 7 (master)
* The patch in this issue supports ranges with open ends (e.g. [0, inf), (-inf, inf))
I wonder whether I should continue developing based on the patch in this issue
or instead make changes to SOLR-9868's pull request.
I would appreciate any advice.
> Use docvalue for range faceting in JSON facet API
> -------------------------------------------------
>
> Key: SOLR-10528
> URL: https://issues.apache.org/jira/browse/SOLR-10528
> Project: Solr
> Issue Type: Improvement
> Components: Facet Module
> Affects Versions: 6.5
> Reporter: Kensho Hirasawa
> Priority: Minor
> Attachments: SOLR-10528.patch
>
>
> Range faceting in the JSON facet API has only one implementation. In that
> implementation, all buckets are allocated and then range queries are executed
> for all the buckets. Therefore, the memory usage and computational cost of a
> range facet can be very high if the range is wide and the gap is narrow.
> I think range faceting in the JSON facet API should have an implementation
> that uses DocValues instead of inverted indices. By scanning DocValues, we can
> execute range facets much more efficiently, especially when the number of
> buckets is large.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)