ShingleFilter include words
---------------------------

                 Key: LUCENE-1917
                 URL: https://issues.apache.org/jira/browse/LUCENE-1917
             Project: Lucene - Java
          Issue Type: Improvement
          Components: contrib/analyzers
    Affects Versions: 2.9
            Reporter: Jason Rutherglen
            Priority: Minor
             Fix For: 3.0

By default, ShingleFilter creates shingles (i.e. combines adjacent tokens into a single token) from all tokens. For some purposes, for example indexing stop words as shingles without creating shingles out of every word, we can supply an include-words CharArraySet to ShingleFilter that determines which tokens are shingled.

This is similar to Nutch CommonGrams and SOLR-908. SOLR-908 does not use the new token attribute API, and I figured this functionality is better suited to being part of Lucene.
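
To make the proposal concrete, here is a minimal sketch of the idea on plain strings. It does not use the Lucene API: the class name IncludeWordShingleSketch and the method shingle are hypothetical, a java.util.Set stands in for CharArraySet, and the rule that a pair is shingled when either of its two tokens is an include word is an assumption borrowed from CommonGrams-style behaviour, not something this issue specifies. It also assumes unigrams are still emitted and that tokens are joined with a single space.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the patch attached to LUCENE-1917.
final class IncludeWordShingleSketch {

    // Emits every original token. In addition, emits a two-token shingle for
    // each adjacent pair in which at least one token is an include word.
    // ASSUMPTION: the "either side" pairing rule is not confirmed by the issue.
    static List<String> shingle(List<String> tokens, Set<String> includeWords) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String current = tokens.get(i);
            out.add(current);
            if (i + 1 < tokens.size()) {
                String next = tokens.get(i + 1);
                if (includeWords.contains(current) || includeWords.contains(next)) {
                    // ASSUMPTION: a single space is used as the token separator.
                    out.add(current + " " + next);
                }
            }
        }
        return out;
    }
}
```

Under these assumptions, the tokens "the quick fox of doom" with include words "the" and "of" yield shingles only around the stop words ("the quick", "fox of", "of doom"), while "quick fox" is never combined. With an empty include set the token stream passes through unchanged.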