The unit tests don't really show how I could use ShingleMatrixFilter for synonyms at index time. Does anyone have sample code? Is it possible?

-----Original Message-----
From: Otis Gospodnetic [mailto:[email protected]]
Sent: Tuesday, January 13, 2009 3:06 PM
To: [email protected]
Subject: Re: ShingleMatrixFilter for synonyms

Eric,

Unit tests should help you see how this can be used:

./contrib/analyzers/src/java/org/apache/lucene/analysis/shingle/ShingleFilter.java
./contrib/analyzers/src/java/org/apache/lucene/analysis/shingle/ShingleAnalyzerWrapper.java
./contrib/analyzers/src/java/org/apache/lucene/analysis/shingle/ShingleMatrixFilter.java
./contrib/analyzers/src/test/org/apache/lucene/analysis/shingle/ShingleAnalyzerWrapperTest.java
./contrib/analyzers/src/test/org/apache/lucene/analysis/shingle/TestShingleMatrixFilter.java
./contrib/analyzers/src/test/org/apache/lucene/analysis/shingle/ShingleFilterTest.java

As for multi-word tokens, you just have to make sure they don't get injected before something that would remove any portion of them.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: "Angel, Eric" <[email protected]>
> To: [email protected]
> Sent: Tuesday, January 13, 2009 2:39:11 PM
> Subject: ShingleMatrixFilter for synonyms
>
> Does anyone have an example using this?
>
> I have a SynonymEngine that returns an array of strings, some of
> which may be multiple words. How can I incorporate ShingleMatrixFilter
> with my SynonymEngine at index time?
>
> Also, the javadoc for the ShingleMatrixFilter class says:
>
>   Without a spacer character it can be used to handle
>   composition and decomposition of words such as searching for "multi
>   dimensional" instead of "multidimensional".
>
> Does anyone have a working example of this?
>
> Here's my synonym engine (taken from the Lucene In Action book):
>
> import java.io.IOException;
> import java.util.HashMap;
> import java.util.Map;
>
> public interface SynonymEngine {
>     public String[] getSynonyms(String word) throws IOException;
> }
>
> public class DexSynonymEngine implements SynonymEngine {
>
>     // word -> synonyms; some synonyms contain more than one word
>     private static final Map<String, String[]> map =
>         new HashMap<String, String[]>();
>
>     static {
>         // numbers
>         map.put("1" , new String[] {"one"});
>         map.put("2" , new String[] {"two"});
>         map.put("3" , new String[] {"three"});
>         map.put("4" , new String[] {"four"});
>         map.put("5" , new String[] {"five"});
>         map.put("6" , new String[] {"six", "seis"});
>         map.put("7" , new String[] {"seven"});
>         map.put("8" , new String[] {"eight"});
>         map.put("9" , new String[] {"nine"});
>         map.put("10" , new String[] {"ten"});
>         map.put("11" , new String[] {"eleven"});
>         map.put("12" , new String[] {"twelve"});
>         map.put("13" , new String[] {"thirteen"});
>         map.put("14" , new String[] {"fourteen"});
>         map.put("15" , new String[] {"fifteen"});
>         map.put("16" , new String[] {"sixteen"});
>         map.put("17" , new String[] {"seventeen"});
>         map.put("18" , new String[] {"eighteen"});
>         map.put("19" , new String[] {"nineteen"});
>         map.put("20" , new String[] {"twenty"});
>         map.put("21" , new String[] {"twenty one"});
>         // words
>         map.put("pharmacy" , new String[] {"drug store"});
>         map.put("hospital" , new String[] {"medical center"});
>         map.put("fast", new String[] {"quick", "speedy"});
>         map.put("search", new String[] {"explore", "hunt", "hunting", "look"});
>         map.put("sound", new String[] {"audio"});
>         map.put("restaurant", new String[] {"eatery"});
>     }
>
>     // Returns null when the word has no synonyms.
>     public String[] getSynonyms(String word) throws IOException {
>         return map.get(word);
>     }
> }

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
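
For what it's worth, here is a minimal sketch of the general index-time idea, not of ShingleMatrixFilter itself. It is plain Java with no Lucene calls, so every name in it (SynonymExpansionSketch, Token, expand) is hypothetical. It assumes the usual approach of emitting each synonym at the same position as the original word (position increment 0), and it keeps a multi-word synonym such as "drug store" as one token, which only works if nothing later in the chain splits or removes part of it, as Otis points out above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SynonymExpansionSketch {

    /** A term plus its position increment (0 = same position as the previous token). */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    /**
     * Emits each input token with increment 1, followed by its synonyms with
     * increment 0. A multi-word synonym is kept as ONE token (an assumption
     * of this sketch, not something a real analyzer chain guarantees).
     */
    static List<Token> expand(List<String> tokens, Map<String, String[]> synonyms) {
        List<Token> out = new ArrayList<Token>();
        for (String token : tokens) {
            out.add(new Token(token, 1));
            String[] syns = synonyms.get(token);
            if (syns == null) {
                continue; // the engine returns null when there are no synonyms
            }
            for (String syn : syns) {
                out.add(new Token(syn, 0));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String[]> synonyms = new HashMap<String, String[]>();
        synonyms.put("pharmacy", new String[] {"drug store"});
        synonyms.put("fast", new String[] {"quick", "speedy"});
        System.out.println(expand(Arrays.asList("fast", "pharmacy"), synonyms));
    }
}
```

In a real analyzer the same logic would presumably live in a TokenFilter that calls the synonym engine for each incoming token. Whether ShingleMatrixFilter can take over the multi-word part is still the open question in this thread.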
