On 5/11/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:
A couple of questions about DocSets though, so that I'm confident
I'll be able to get the same functionality...
Along with a BitSet for each term in selected fields, I also store a
"catchall" BitSet that is an OR'd BitSet of all term BitSets
An efficient union isn't implemented yet. The current union() method
creates a new DocSet, and it isn't optimized for speed with
HashDocSets.
I think we'd want to do one of the following:
- create a mutatingUnion(DocSet other) to prevent repeated creation
of a new DocSet,
- create a union(Collection<DocSet>), or
- create an addTo(BitSet target)
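A rough sketch of how the last two options might look, using plain
java.util.BitSet as a stand-in for DocSet. The class name UnionSketch and
both method bodies are hypothetical illustrations, not existing Solr code:

```java
import java.util.BitSet;
import java.util.Collection;

// Hypothetical helper for illustration only; java.util.BitSet stands in for DocSet.
final class UnionSketch {
    // addTo-style option: OR the source bits into a caller-supplied target,
    // so repeated unions reuse one BitSet instead of allocating a new set each time.
    static void addTo(BitSet source, BitSet target) {
        target.or(source);
    }

    // union(Collection)-style option: a single allocation for the whole collection.
    static BitSet union(Collection<BitSet> sets) {
        BitSet result = new BitSet();
        for (BitSet s : sets) {
            result.or(s); // inputs are read, never modified
        }
        return result;
    }
}
```

Either way the per-term sets stay untouched and only one accumulator is
written to, which is the point of avoiding the current union().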
How can I flip a DocSet or
achieve the same sort of thing?
Currently not implemented... we could either implement it (the result of
a flip on a HashDocSet will be big though), or implement something like
ChainedFilter (have a NotDocSet that wraps a DocSet). If memory is a
concern, the latter sounds like the right way to implement that one.
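A minimal sketch of the wrapper idea, assuming a cut-down interface with
only exists() and size(). SimpleDocSet, BitsBackedSet and NotDocSetSketch
are made-up names for illustration, and maxDoc is assumed to be supplied
by the caller:

```java
import java.util.BitSet;

// Hypothetical minimal interface for illustration; not Solr's actual DocSet.
interface SimpleDocSet {
    boolean exists(int doc);
    int size();
}

// Simple BitSet-backed implementation used as the wrapped set in this sketch.
final class BitsBackedSet implements SimpleDocSet {
    private final BitSet bits;
    BitsBackedSet(BitSet bits) { this.bits = bits; }
    public boolean exists(int doc) { return doc >= 0 && bits.get(doc); }
    public int size() { return bits.cardinality(); }
}

// Negating view: stores no bits of its own, so memory use stays constant
// no matter how many documents the flipped set would contain.
final class NotDocSetSketch implements SimpleDocSet {
    private final SimpleDocSet wrapped;
    private final int maxDoc; // assumed: ids are in [0, maxDoc)

    NotDocSetSketch(SimpleDocSet wrapped, int maxDoc) {
        this.wrapped = wrapped;
        this.maxDoc = maxDoc;
    }

    public boolean exists(int doc) {
        return doc >= 0 && doc < maxDoc && !wrapped.exists(doc);
    }

    // Correct only if every id in the wrapped set is below maxDoc.
    public int size() {
        return maxDoc - wrapped.size();
    }
}
```

The trade-off is that the view can answer membership and size cheaply,
but anything that needs to iterate the flipped set still has to walk all
ids up to maxDoc.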
Also, we allow for inverted facet selection as well, allowing a user
to select all documents that do not have a specified value.
So for a certain facet like "platform:pc", you also allow for "-platform:pc"?
If this is a common enough thing for faceted browsing, we should
probably build in support for that in the Solr APIs somehow (w/o
storing DocSets for both).
I currently accomplish this in my loop to build up an aggregate
constraint BitSet by using its .andNot() method. How can I
accomplish this using DocSets?
It's not there yet, but I'd be in favor of andNot functionality in DocSet.
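For reference, the kind of loop being described might look roughly like
this with plain java.util.BitSet. ConstraintSketch and its parameter names
are hypothetical; an andNot on DocSet would presumably play the same role
as BitSet.andNot() does here:

```java
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration of building an aggregate constraint with and()/andNot().
final class ConstraintSketch {
    // all:     every document that is a candidate
    // include: one set per selected facet value (e.g. "platform:pc")
    // exclude: one set per inverted selection (e.g. "-platform:pc")
    static BitSet apply(BitSet all, List<BitSet> include, List<BitSet> exclude) {
        BitSet result = (BitSet) all.clone(); // leave the caller's set untouched
        for (BitSet s : include) {
            result.and(s);    // keep only docs that have the value
        }
        for (BitSet s : exclude) {
            result.andNot(s); // drop docs that have the value
        }
        return result;
    }
}
```

Note that the inverted selection needs only the existing per-term set,
so nothing extra has to be stored for the negated form.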
If I can achieve these capabilities without too much effort, then my
DocSet refactoring will happen sooner rather than later :)
Looks like it might be a little later ;-)
It's great to see the requirements that others have though!
Do you facet on all terms for a particular set of fields, or are the
terms to be faceted on defined outside the system? If the former,
most of your system would fall into what I would think of as "simple"
faceted browsing, which should be supported by default some day. The
latter isn't too big of a leap either... maybe with the terms defined
in solrconfig.xml or something.
-Yonik