[ https://issues.apache.org/jira/browse/LUCENE-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653285#action_12653285 ]
Mark Miller commented on LUCENE-1469:
-------------------------------------

This makes sense to me. Care to submit a patch?

> isValid should be invoked after analyze rather than before it so it can validate the output of analyze
> ------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1469
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1469
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/*
>    Affects Versions: 2.4
>            Reporter: Vincent Li
>            Priority: Minor
>   Original Estimate: 0.08h
>  Remaining Estimate: 0.08h
>
> SynonymMap has a protected method, String analyze(String word), designed
> for custom stemming.
> However, before analyze is invoked on a word, boolean isValid(String str) is
> used to validate the word, which causes the program to discard words that
> may be usable by the custom analyze method.
> I think that isValid should be invoked after analyze rather than before it, so
> that it can validate the output of analyze and allow implementers to decide what
> is valid for the overridden analyze method. (In fact, if you look at the code
> snippet below, isValid should really go after the empty string check.)
> This is a two-line change in org.apache.lucene.index.memory.SynonymMap:
>
>       /*
>        * Part B: ignore phrases (with spaces and hyphens) and
>        * non-alphabetic words, and let user customize word (e.g. do some
>        * stemming)
>        */
>       if (!isValid(word)) continue; // ignore
>       word = analyze(word);
>       if (word == null || word.length() == 0) continue; // ignore
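
To illustrate the reordering the issue proposes, here is a minimal, self-contained Java sketch. It is not the SynonymMap source. The class name, the hyphen-stripping analyze, and the letters-only isValid are assumptions invented for the example. Only the three statements in the first loop body mirror the snippet quoted in the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; names and rules here are illustrative assumptions,
// not the actual org.apache.lucene.index.memory.SynonymMap code.
public class AnalyzeThenValidateSketch {

    // Assumed custom analyze: lower-case the word and drop hyphens.
    static String analyze(String word) {
        return word.toLowerCase(Locale.ROOT).replace("-", "");
    }

    // Assumed validity rule: every character must be a letter.
    static boolean isValid(String str) {
        for (int i = 0; i < str.length(); i++) {
            if (!Character.isLetter(str.charAt(i))) return false;
        }
        return true;
    }

    // Order quoted in the issue: validate the raw word, then analyze it.
    static List<String> validateThenAnalyze(List<String> words) {
        List<String> kept = new ArrayList<>();
        for (String word : words) {
            if (!isValid(word)) continue; // ignore
            word = analyze(word);
            if (word == null || word.length() == 0) continue; // ignore
            kept.add(word);
        }
        return kept;
    }

    // Proposed order: analyze first, check for empty output, then validate.
    static List<String> analyzeThenValidate(List<String> words) {
        List<String> kept = new ArrayList<>();
        for (String word : words) {
            word = analyze(word);
            if (word == null || word.length() == 0) continue; // ignore
            if (!isValid(word)) continue; // ignore
            kept.add(word);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Dogs", "e-mail", "c3po");
        System.out.println(validateThenAnalyze(words));
        System.out.println(analyzeThenValidate(words));
    }
}
```

Under these assumptions, the quoted order keeps only "dogs", because "e-mail" is rejected before analyze has a chance to remove the hyphen. The proposed order keeps "dogs" and "email". "c3po" is dropped either way, since this analyze does nothing about the digit.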