: This works, but I'm concerned about how many terms we could end up
: with as the size grows.
:
: Another possibility could be a Filter that iterates through FieldCache
: and checks if each value is in the Set<String>.
:
: Any thoughts/directions on things to look at?
It really all depends on what kind of orders of magnitude you're talking
about: the number of filters, the cardinality of those filters, and the
likelihood of reuse (ie: will the same Set<String> be used many times?
will the strings in that Set typically be reused, but in various
permutations?)
You might want to consider ways you could apply the concepts
from Field Faceting (particularly the tradeoffs between the fc and enum
methods, good values for enum.cache.minDf, fieldValueCache's use of
bigTerms, etc...) since you're facing roughly the same questions --
except instead of computing a bunch of distinct facet counts, you want to
compute the intersection of a bunch of filters ... but you need to
decide when to cache those filters independently, when to not bother
caching them at all, when to cache them as a reusable unit, etc...
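
To make that tradeoff concrete, here is a rough sketch in plain Java. It
deliberately does not use the Lucene/Solr APIs: the String[] is only a
hypothetical stand-in for a FieldCache, and the names SetFilterSketch,
scanFilter, TermFilters and minDf are made up for illustration. scanFilter
is the "iterate every doc and check the Set" approach (roughly fc-like);
TermFilters builds the result from per-term doc sets and only caches a
term's bits when its doc frequency reaches minDf (roughly what
enum.cache.minDf does for faceting).

```java
import java.util.*;

public class SetFilterSketch {

    // fc-like: walk every document once and test its value against the Set.
    // valueByDoc is a hypothetical stand-in for a FieldCache (index = doc id).
    static BitSet scanFilter(String[] valueByDoc, Set<String> wanted) {
        BitSet bits = new BitSet(valueByDoc.length);
        for (int doc = 0; doc < valueByDoc.length; doc++) {
            String v = valueByDoc[doc];
            if (v != null && wanted.contains(v)) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // enum-like: one doc list per term, OR'ed together per request.
    // Terms matching at least minDf docs get a cached BitSet for reuse;
    // rarer terms are applied directly and never cached.
    static final class TermFilters {
        private final Map<String, List<Integer>> postings = new HashMap<>();
        private final Map<String, BitSet> cache = new HashMap<>();
        private final int minDf;

        TermFilters(String[] valueByDoc, int minDf) {
            this.minDf = minDf;
            for (int doc = 0; doc < valueByDoc.length; doc++) {
                if (valueByDoc[doc] != null) {
                    postings.computeIfAbsent(valueByDoc[doc], t -> new ArrayList<>()).add(doc);
                }
            }
        }

        BitSet union(Set<String> wanted) {
            BitSet result = new BitSet();
            for (String term : wanted) {
                List<Integer> docs = postings.get(term);
                if (docs == null) {
                    continue; // term does not occur in the field
                }
                if (docs.size() >= minDf) {
                    result.or(cache.computeIfAbsent(term, t -> toBits(docs)));
                } else {
                    for (int d : docs) {
                        result.set(d);
                    }
                }
            }
            return result;
        }

        int cachedTerms() {
            return cache.size();
        }

        private static BitSet toBits(List<Integer> docs) {
            BitSet b = new BitSet();
            for (int d : docs) {
                b.set(d);
            }
            return b;
        }
    }

    public static void main(String[] args) {
        String[] values = {"a", "b", "a", "c", "a", "b"};
        Set<String> wanted = new HashSet<>(Arrays.asList("a", "c"));

        System.out.println(scanFilter(values, wanted)); // {0, 2, 3, 4}

        TermFilters filters = new TermFilters(values, 2);
        System.out.println(filters.union(wanted));      // {0, 2, 3, 4}
        System.out.println(filters.cachedTerms());      // 1 (only "a" has df >= 2)
    }
}
```

The scan costs one pass over all docs no matter how big the Set is; the
per-term version costs roughly the sum of the doc frequencies of the
requested terms, plus whatever memory you let the cache hold. Which one
wins depends on exactly the orders of magnitude mentioned above.
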
-Hoss