solr-suggestion - terms that "start with"...

Erik Hatcher Tue, 16 May 2006 18:20:18 -0700

User story: We have a lot of people's names in our data ("agents" that in some way contributed to a 19th century work). We're refactoring our user interface to have better navigation of these names, such that someone can just start typing and immediately (Google Suggest style) see terms and their document frequency within a set of filters. Someone types "yo", pauses, and "Yonik Seely (37)" appears. It would also appear if someone typed "see".

Falling back on my Lucene know-how, I've gotten Solr to respond with almost what I need using this code:


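      // Term text -> number of matching documents within the current constraints.
      TreeMap map = new TreeMap();
      String prefix = req.getParam("prefix");

      try {
        // Position the enumerator at the first term >= (facet, prefix).
        TermEnum enumerator = reader.terms(new Term(facet, prefix));

        do {
          Term term = enumerator.term();

          if (term != null && term.field().equals(facet) && term.text().startsWith(prefix)) {
            DocSet docSet = searcher.getDocSet(new TermQuery(term));
            // Work on a copy: the BitSet may be shared with a cached DocSet.
            BitSet bits = (BitSet) docSet.getBits().clone();
            bits.and(constraintMask);
            map.put(term.text(), bits.cardinality());
          } else {
            // Terms are sorted, so nothing after this point can match.
            break;
          }
        }
        while (enumerator.next());
        enumerator.close();
      } catch (IOException e) {
        rsp.setException(e);
        numErrors++;
        return;
      }

      rsp.add(facet, map);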

I'm going on gut feeling that Solr provides some handy benefits for me in this regard. For quick-and-dirty's sake I used DocSet.getBits() and did things the way I know how in order to AND it with an existing constraintMask BitSet (built earlier in my custom request handler based on constraint parameters passed in).
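
To make the walk-and-count idea concrete outside of Lucene and Solr, here is a minimal sketch using only java.util. The TreeMap stands in for the sorted term dictionary and each BitSet for a term's document set; the class name, method names, and sample terms are all made up for illustration. It ANDs a copy of each set with the mask, on the assumption that the original set may be shared and should not be altered.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

public class PrefixTermCounts {
    /**
     * Starts at the first term >= prefix, stops at the first term that no
     * longer starts with it, and counts for each term the documents that
     * also pass the constraint mask.
     */
    public static TreeMap<String, Integer> count(TreeMap<String, BitSet> termDocs,
                                                 String prefix, BitSet constraintMask) {
        TreeMap<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, BitSet> e : termDocs.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // sorted order: no later term can match
            }
            BitSet bits = (BitSet) e.getValue().clone(); // copy, so the index is untouched
            bits.and(constraintMask);
            counts.put(e.getKey(), bits.cardinality());
        }
        return counts;
    }

    /** Helper: builds a BitSet with the given document numbers set. */
    public static BitSet bits(int... docs) {
        BitSet b = new BitSet();
        for (int d : docs) {
            b.set(d);
        }
        return b;
    }

    public static void main(String[] args) {
        TreeMap<String, BitSet> termDocs = new TreeMap<>();
        termDocs.put("seely", bits(0, 1, 2));
        termDocs.put("yonik", bits(0, 1, 2));
        termDocs.put("york", bits(3));
        termDocs.put("zola", bits(4));
        BitSet mask = bits(0, 1, 3); // hypothetical constraint mask
        System.out.println(count(termDocs, "yo", mask)); // {yonik=2, york=1}
    }
}
```

Like the handler code, this reports analyzed tokens ("yonik"), not the full stored names.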

The thing I'm missing is retrieving the stored field value and using that instead of term.text() in the data sent back to the client. In the example mentioned above, I currently get back "yonik (37)" if "yo" was sent in as a prefix. But I want the full stored field name, not the analyzed tokens.
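
One possible shape for this, sketched in plain Java and not tied to any Solr or Lucene API: union the documents of every token that starts with the prefix, AND the result with the constraint mask, then tally the stored value of each surviving document. The list of stored names, the lowercase-and-split "analyzer", and all class and method names here are assumptions for illustration, and the sketch assumes one stored name per document. In Lucene itself the stored value would presumably be loaded per matching document, which could be slow for large match sets.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StoredNameSuggest {
    /**
     * storedNames.get(doc) is the full stored value for that document;
     * termDocs maps each analyzed token to the documents containing it.
     */
    public static TreeMap<String, Integer> suggest(List<String> storedNames,
            TreeMap<String, BitSet> termDocs, String prefix, BitSet constraintMask) {
        // Union the documents of every token starting with the prefix, so a
        // document that matches through two tokens is still counted once.
        BitSet matches = new BitSet();
        for (Map.Entry<String, BitSet> e : termDocs.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break;
            }
            matches.or(e.getValue());
        }
        matches.and(constraintMask);

        // Tally by full stored name instead of by token.
        TreeMap<String, Integer> counts = new TreeMap<>();
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            counts.merge(storedNames.get(doc), 1, Integer::sum);
        }
        return counts;
    }

    /** Stand-in analyzer: lowercases and splits on whitespace. */
    public static TreeMap<String, BitSet> index(List<String> storedNames) {
        TreeMap<String, BitSet> termDocs = new TreeMap<>();
        for (int doc = 0; doc < storedNames.size(); doc++) {
            for (String token : storedNames.get(doc).toLowerCase().split("\\s+")) {
                termDocs.computeIfAbsent(token, k -> new BitSet()).set(doc);
            }
        }
        return termDocs;
    }

    public static void main(String[] args) {
        // Illustrative sample data, one stored name per document.
        List<String> names = Arrays.asList(
                "Yonik Seely", "Yonik Seely", "Erik Hatcher", "Sean Seely");
        TreeMap<String, BitSet> termDocs = index(names);
        BitSet all = new BitSet();
        all.set(0, names.size()); // no constraints: every document passes

        System.out.println(suggest(names, termDocs, "yo", all)); // {Yonik Seely=2}
        System.out.println(suggest(names, termDocs, "se", all)); // {Sean Seely=1, Yonik Seely=2}
    }
}
```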

Advice on how to implement what I'm after using Solr's infrastructure (or just Lucene's) is welcome.


Thanks,
        Erik
