On Fri, Dec 17, 2021 at 10:02 AM Greg Miller <[email protected]> wrote:
>
> I suppose the last thing I'd say is that there are valid use-cases for
> wanting the "top" dims along with their "top" children, and getAllDims
> provides a reasonable way to do this. For example, in Amazon's product
> search, we have a large number of different dims but only want to show
> a small sub-set to customers on a search page. One way to go about
> this would be to determine the "top" dims for the match set along with
> the "top n" values under each; getAllDims is helpful for this but has
> a bit of an unpleasant side-effect that it unnecessarily resolves the
> paths for all children for all dims. As I think about this, I wonder
> if a getTopDims method would be more useful that lets the user specify
> the number of dims they want back along with the number of children
> for each? I'll open a Jira for that.

getTopDims() seems much more reasonable than getAllDims() for this
use-case!

But still, I feel like facets is "storing" stuff in a
lucene-3.x-style-term-dictionary here. Background: before Lucene 4.0,
all the terms across all the indexed fields were stored in a single
massive dictionary, but each one coded, very similar to what facets is
doing. We found it was better to keep fields separate.

I really think it might be the same for facets. If I have a field
"color", I index it with DV so that I can both sort and facet on it. If
I have another field "size", I do the same thing. If you want to facet
on both fields, facet on both fields. And you get two single-valued
fields instead of one big multi-valued field... so I'm not convinced
that "dim mixing" is typically a good thing.
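
To make the getTopDims() idea concrete, here is a minimal sketch in plain Java. It is not the Lucene Facets API: the class TopDimsSketch, the method topDims, and the map-of-maps input are all hypothetical, and ranking a dim by the sum of its child counts is an assumption that only holds when each field is single-valued. Counts are kept in one map per field, as in the "keep fields separate" layout, and only the children of the selected dims are sorted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Lucene's Facets API: counts live in one map per
// field (dim), and only the dims that make the cut have their children sorted.
class TopDimsSketch {

    // Highest count first; ties are broken by label so results are deterministic.
    static final Comparator<Map.Entry<String, Integer>> BY_COUNT_DESC = (a, b) -> {
        int c = Integer.compare(b.getValue(), a.getValue());
        return c != 0 ? c : a.getKey().compareTo(b.getKey());
    };

    /** Returns up to topNDims dims, each with up to topNChildren of its values. */
    static Map<String, List<Map.Entry<String, Integer>>> topDims(
            Map<String, Map<String, Integer>> countsByDim, int topNDims, int topNChildren) {

        // Step 1: rank dims by total count (assumption: single-valued fields,
        // so the sum of child counts equals the number of matching docs).
        List<Map.Entry<String, Integer>> dimTotals = new ArrayList<>();
        for (Map.Entry<String, Map<String, Integer>> dim : countsByDim.entrySet()) {
            int total = 0;
            for (int count : dim.getValue().values()) {
                total += count;
            }
            dimTotals.add(Map.entry(dim.getKey(), total));
        }
        dimTotals.sort(BY_COUNT_DESC);

        // Step 2: sort and truncate children for the selected dims only.
        Map<String, List<Map.Entry<String, Integer>>> result = new LinkedHashMap<>();
        int dimLimit = Math.min(topNDims, dimTotals.size());
        for (Map.Entry<String, Integer> dim : dimTotals.subList(0, dimLimit)) {
            List<Map.Entry<String, Integer>> children = new ArrayList<>();
            for (Map.Entry<String, Integer> child : countsByDim.get(dim.getKey()).entrySet()) {
                children.add(Map.entry(child.getKey(), child.getValue()));
            }
            children.sort(BY_COUNT_DESC);
            int childLimit = Math.min(topNChildren, children.size());
            result.put(dim.getKey(), new ArrayList<>(children.subList(0, childLimit)));
        }
        return result;
    }
}
```

Calling TopDimsSketch.topDims(counts, 2, 2) with counts for "color", "size" and "brand" returns only the two dims with the largest totals, each with its two most frequent values. The children of the dim that did not make the cut are never sorted or copied.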
