User story: We have a lot of people's names in our data ("agents" that in some way contributed to a 19th-century work). We're refactoring our user interface to give better navigation of these names, so that someone can just start typing and immediately (Google Suggest style) see terms and their document frequency within a set of filters. Someone types "yo", pauses, and "Yonik Seely (37)" appears. It would also appear if someone typed "see".

Falling back on my Lucene know-how, I've gotten Solr to respond with almost what I need using this code:

      TreeMap map = new TreeMap();
      String prefix = req.getParam("prefix");

      try {
        TermEnum enumerator = reader.terms(new Term(facet, prefix));

        do {
          Term term = enumerator.term();
          if (term != null && term.field().equals(facet) && term.text().startsWith(prefix)) {
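            DocSet docSet = searcher.getDocSet(new TermQuery(term));
            // Copy before and(): getBits() may expose the DocSet's own BitSet,
            // and that DocSet can be shared through Solr's caches.
            BitSet bits = (BitSet) docSet.getBits().clone();
            bits.and(constraintMask);
            map.put(term.text(), bits.cardinality());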
          } else {
            break;
          }
        }
        while (enumerator.next());
      } catch (IOException e) {
        rsp.setException(e);
        numErrors++;
        return;
      }

      rsp.add(facet, map);

My gut feeling is that Solr provides some handy benefits for me in this regard. For quick-and-dirty's sake I used DocSet.getBits() and did things the way I know how, ANDing it with an existing constraintMask BitSet (built earlier in my custom request handler from the constraint parameters passed in).
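
To make the masking step concrete, here is a minimal sketch that uses only java.util.BitSet and no Solr classes. The class name MaskedCount and the method countWithin are hypothetical, made up for illustration. It intersects on a copy because BitSet.and() modifies the receiver in place, which matters if the BitSet handed back by DocSet.getBits() is shared with a cached DocSet.

```java
import java.util.BitSet;

// Hypothetical helper, not part of Solr or Lucene.
class MaskedCount {
    // Count the documents present in both sets without mutating either argument.
    static int countWithin(BitSet termDocs, BitSet constraintMask) {
        BitSet copy = (BitSet) termDocs.clone(); // and() changes the receiver in place
        copy.and(constraintMask);
        return copy.cardinality();
    }

    public static void main(String[] args) {
        BitSet termDocs = new BitSet();
        termDocs.set(1);
        termDocs.set(4);
        termDocs.set(7);

        BitSet constraintMask = new BitSet();
        constraintMask.set(4);
        constraintMask.set(7);
        constraintMask.set(9);

        // Documents 4 and 7 are in both sets.
        System.out.println(countWithin(termDocs, constraintMask)); // prints 2
    }
}
```

The copy costs one BitSet allocation per term, which is probably acceptable for a short prefix enumeration but worth measuring on a large index.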

The thing I'm missing is retrieving the stored field value and using that instead of term.text() in the data sent back to the client. In the example mentioned above, I currently get back "yonik (37)" if "yo" was sent in as a prefix. But I want the full stored field name, not the analyzed tokens.
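
The behavior wanted here can be sketched in plain Java collections, independent of Solr and Lucene. This is only an illustration of the desired input and output, not an answer in terms of either API: the class NameSuggester, its methods, and the sample data are all hypothetical, and it scans every name linearly rather than walking a term index. It keys counts by the full stored name and matches the prefix against any lowercased word inside that name.

```java
import java.util.BitSet;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration, not Solr or Lucene API.
class NameSuggester {
    // Full stored name -> ids of the documents that carry it.
    private final Map<String, BitSet> docsByName = new TreeMap<>();

    void add(String fullName, int docId) {
        docsByName.computeIfAbsent(fullName, k -> new BitSet()).set(docId);
    }

    // Full names having any word that starts with the prefix,
    // with document counts restricted to the constraint mask.
    Map<String, Integer> suggest(String prefix, BitSet constraintMask) {
        String lowerPrefix = prefix.toLowerCase(Locale.ROOT);
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, BitSet> e : docsByName.entrySet()) {
            if (!anyWordStartsWith(e.getKey(), lowerPrefix)) {
                continue;
            }
            BitSet hits = (BitSet) e.getValue().clone(); // keep the stored set intact
            hits.and(constraintMask);
            if (!hits.isEmpty()) {
                result.put(e.getKey(), hits.cardinality());
            }
        }
        return result;
    }

    private static boolean anyWordStartsWith(String name, String lowerPrefix) {
        for (String word : name.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.startsWith(lowerPrefix)) {
                return true;
            }
        }
        return false;
    }
}
```

With "Yonik Seely" added for two documents that are both inside the mask, suggest("yo", mask) and suggest("see", mask) each return the single entry "Yonik Seely" mapped to 2, which is the shape of response described above.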

Advice on how to implement what I'm after using Solr's infrastructure (or just Lucene's) is welcome.

Thanks,
        Erik
