[
https://issues.apache.org/jira/browse/SOLR-7296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381651#comment-14381651
]
Shai Erera commented on SOLR-7296:
----------------------------------
bq. Is Lucene faceting with taxonomy index capable of merging facet results
across installations (needed for SolrCloud) and if so, do they each have their
own independent taxonomy indexes or do they need to share a single one?
Yes, definitely, though the code isn't part of the Lucene facet module. There are
two options:
* Share a single taxonomy index for all shards. This was done in cases where
the indexes were created via MapReduce, and then left read-only. The taxonomy
index was still collocated with each shard (copied), but it was never updated.
This I admit is not the common case, but it's doable.
* Have each shard manage its own taxonomy index. When you ask for top-K values
of the facet "Author" you get the top-K values from each shard and merge them.
Obviously this is simpler than what really happens, since you need to make sure
that you return the true global top-K values, but I believe Solr already
handles that. I don't know the full details of the Solr implementation, but I
believe it involves two phases, where in the first phase each shard returns its
top-K (or top-cK) values, and then the merger decides if it needs to go back to
some shards, since they may contribute to some of the facet values that didn't
make it into the top-K but should have. I read your blogs on Solr faceting, so
I'm sure you know the details better than me :).
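To make the two-phase idea concrete, here is a minimal sketch of how I picture it. It is not Solr's actual code and uses nothing from lucene-facet: each shard is modelled as a plain in-memory Map of label to count, and the names TwoPhaseFacetMerge, localTopK, mergeTopK and overRequest are all made up for illustration. Phase 1 collects each shard's top-cK, and phase 2 goes back to the shards that did not report a candidate and asks for their exact count.

```java
import java.util.*;

// Hypothetical sketch: shards are in-memory maps, not real indexes.
final class TwoPhaseFacetMerge {

    // Order by count descending, then by label so that ties are deterministic.
    static final Comparator<Map.Entry<String, Integer>> BY_COUNT_DESC = (a, b) -> {
        int c = Integer.compare(b.getValue(), a.getValue());
        return c != 0 ? c : a.getKey().compareTo(b.getKey());
    };

    // Phase 1 on one shard: report only the local top-n labels and counts.
    static Map<String, Integer> localTopK(Map<String, Integer> shardCounts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(shardCounts.entrySet());
        entries.sort(BY_COUNT_DESC);
        Map<String, Integer> top = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries.subList(0, Math.min(n, entries.size()))) {
            top.put(e.getKey(), e.getValue());
        }
        return top;
    }

    // Merger: gather top-(k * overRequest) from every shard, refine, return global top-k.
    static LinkedHashMap<String, Integer> mergeTopK(
            List<Map<String, Integer>> shards, int k, int overRequest) {
        List<Map<String, Integer>> partial = new ArrayList<>();
        Set<String> candidates = new TreeSet<>();
        for (Map<String, Integer> shard : shards) {
            Map<String, Integer> top = localTopK(shard, k * overRequest);
            partial.add(top);
            candidates.addAll(top.keySet());
        }

        // Phase 2: for each candidate, ask the shards that did not report it
        // for their exact count (the map lookup stands in for a refinement request).
        Map<String, Integer> totals = new HashMap<>();
        for (String label : candidates) {
            int sum = 0;
            for (int i = 0; i < shards.size(); i++) {
                Integer reported = partial.get(i).get(label);
                sum += (reported != null) ? reported : shards.get(i).getOrDefault(label, 0);
            }
            totals.put(label, sum);
        }

        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(totals.entrySet());
        ranked.sort(BY_COUNT_DESC);
        LinkedHashMap<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : ranked.subList(0, Math.min(k, ranked.size()))) {
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

Note that refinement only fixes the counts of labels that some shard reported in phase 1. With shards {x=10, y=9, z=1} and {z=8, w=7, y=6}, asking for top-1 with overRequest=1 returns x=10 even though y totals 15, while overRequest=2 returns y=15. That is why the shards are asked for top-cK rather than top-K.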
I would like to assume that the second phase of distributed faceted search is
rather generic and shouldn't depend on one facet implementation or another.
I.e. if it receives a ranked list of facet values and counts/weights (String +
Integer/Float), it shouldn't care which facet impl generated these, correct?
So to answer your question, it is doable, but lucene-facet currently doesn't
offer tools to do that. However, I hope the Solr implementation can be
ported/reused straightforwardly. If you know which code in Solr does that, I'd
be happy to take a look.
> Reconcile facetting implementations
> -----------------------------------
>
> Key: SOLR-7296
> URL: https://issues.apache.org/jira/browse/SOLR-7296
> Project: Solr
> Issue Type: Task
> Components: faceting
> Reporter: Steve Molloy
>
> SOLR-7214 introduced a new way of controlling faceting, and the umbrella
> SOLR-6348 brings a lot of improvements in facet functionality, namely around
> pivots. Both make a lot of sense from a user perspective, but currently have
> completely different implementations. With the analytics components, this
> makes 3 implementations of the same logic, which is bound to behave
> differently as time goes by. We should reconcile all implementations to ease
> maintenance and offer consistent behaviour no matter how parameters are
> passed to the API.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)