[ https://issues.apache.org/jira/browse/LUCENE-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888445#comment-13888445 ]
Shai Erera commented on LUCENE-5425:
------------------------------------
If we need to make any change to the API, it has to be a DocIdSet and not an
iterator, because exposing only the iterator takes away one layer that could be
useful (such as a specialized implementation which uses an instanceof
FixedBitSet check, which is what I think Rob suggests).
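
To illustrate the point about keeping the set layer, here is a minimal sketch.
It does not use the Lucene API: DocSet, BitDocSet, ArrayDocSet and sumValues are
hypothetical names, and java.util.BitSet stands in for FixedBitSet. It only
shows that a consumer handed the set can take a bitset fast path via
instanceof, which it could not do if handed only an iterator.

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

final class DocSetDemo {
    // Sums values[doc] over all matching docs. Because the caller passes the
    // set (not just an iterator), a bitset-backed set can be special-cased.
    static long sumValues(DocSet set, int[] values) {
        long sum = 0;
        if (set instanceof BitDocSet) {
            // Specialized path: walk the bits directly.
            BitSet bits = ((BitDocSet) set).bits;
            for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
                sum += values[doc];
            }
            return sum;
        }
        // Generic path: fall back to the iterator.
        PrimitiveIterator.OfInt it = set.iterator();
        while (it.hasNext()) {
            sum += values[it.nextInt()];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] values = {10, 20, 30, 40, 50};
        BitSet bits = new BitSet(values.length);
        bits.set(1);
        bits.set(3);
        System.out.println(sumValues(new BitDocSet(bits), values));   // bitset path
        System.out.println(sumValues(new ArrayDocSet(1, 3), values)); // iterator path
    }
}

// Hypothetical stand-in for a DocIdSet-like abstraction (assumption, not Lucene).
interface DocSet {
    PrimitiveIterator.OfInt iterator();
}

// Bitset-backed set, playing the role FixedBitSet plays in this discussion.
final class BitDocSet implements DocSet {
    final BitSet bits;

    BitDocSet(BitSet bits) {
        this.bits = bits;
    }

    @Override
    public PrimitiveIterator.OfInt iterator() {
        return bits.stream().iterator();
    }
}

// Array-backed set that has no bitset to expose.
final class ArrayDocSet implements DocSet {
    private final int[] docs;

    ArrayDocSet(int... docs) {
        this.docs = docs;
    }

    @Override
    public PrimitiveIterator.OfInt iterator() {
        return IntStream.of(docs).iterator();
    }
}
```

Whether the extra layer costs anything measurable in the real collection and
aggregation code is exactly what would need benchmarking.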
But, John, didn't we say we should explore the move to a DocIdSet-based API in
a separate issue where we also benchmark the implications of using all of those
abstractions (both at collection and aggregation phases)? This issue was
supposed to be about letting you cache the FBS instance.
I don't think we should commit this patch. This issue should allow you to reuse
a FixedBitSet. A separate issue should benchmark the move to a more general
API. I want to be sure that whatever abstractions we add do not hurt faceted
search, or that we at least note by how much and why they are worth it. For
instance, if we move to a DocIdSet API where none of the Lucene sets improves
faceted search over FixedBitSet, I don't think we should do it...
So John, please go back to the simple patch with the protected method on
FacetsCollector (and base it on trunk code) so we can do a final review and
commit it.
And please open a separate issue to explore using a DocIdSet in MatchingDocs,
instead of FixedBitSet. Thanks!
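
For reference, the "protected method" approach could look roughly like the
sketch below. This is not the attached patch and not the real FacetsCollector:
SketchCollector, CachingCollector, createBits and collectSegment are
hypothetical names, and java.util.BitSet stands in for FixedBitSet. The default
method keeps the allocate-per-call behavior; a subclass overrides it to reuse a
cached instance.

```java
import java.util.BitSet;

class ReusableBitsDemo {
    public static void main(String[] args) {
        CachingCollector collector = new CachingCollector();
        BitSet first = collector.collectSegment(8, 1, 5);
        BitSet second = collector.collectSegment(8, 2);
        // Prints whether the same instance was handed out twice.
        System.out.println(first == second);
        System.out.println(second);
    }
}

// Hypothetical, simplified collector (assumption, not the Lucene class).
class SketchCollector {
    // Extension point: the default allocates a fresh bitset on every call,
    // which preserves the existing behavior.
    protected BitSet createBits(int maxDoc) {
        return new BitSet(maxDoc);
    }

    // Marks the matching docs of one segment in the bitset from createBits.
    BitSet collectSegment(int maxDoc, int... matchingDocs) {
        BitSet bits = createBits(maxDoc);
        for (int doc : matchingDocs) {
            bits.set(doc);
        }
        return bits;
    }
}

// Subclass that reuses one bitset instead of allocating per call.
class CachingCollector extends SketchCollector {
    private BitSet cached;

    @Override
    protected BitSet createBits(int maxDoc) {
        if (cached == null || cached.size() < maxDoc) {
            cached = new BitSet(maxDoc); // first use, or cached set too small
        } else {
            cached.clear();              // reuse: wipe the previous results
        }
        return cached;
    }
}
```

Note that in this sketch reuse overwrites the previous result, so the caller
must be done with one bitset before the next collection starts, and the cache
is not safe to share across threads.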
> Make creation of FixedBitSet in FacetsCollector overridable
> -----------------------------------------------------------
>
> Key: LUCENE-5425
> URL: https://issues.apache.org/jira/browse/LUCENE-5425
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Affects Versions: 4.6
> Reporter: John Wang
> Attachments: facetscollector.patch, facetscollector.patch
>
>
> In FacetsCollector, the bits in MatchingDocs are allocated per query.
> For large indexes where maxDoc is large, creating a bitset of maxDoc bits
> is expensive and creates a lot of garbage.
> The attached patch makes this allocation customizable while maintaining
> current behavior.