[jira] [Commented] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017843#comment-16017843 ]

Elvis Rocha commented on SOLR-6468:
---

I created a filter to remove token gaps:

{code:title=RemoveTokenGapsFilterFactory.java|borderStyle=solid}
package filters;

import java.io.IOException;
import java.util.Map;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.analysis.util.TokenFilterFactory;

public class RemoveTokenGapsFilterFactory extends TokenFilterFactory {

    public RemoveTokenGapsFilterFactory(Map<String, String> args) {
        super(args);
    }

    @Override
    public TokenStream create(TokenStream input) {
        return new RemoveTokenGapsFilter(input);
    }
}

final class RemoveTokenGapsFilter extends TokenFilter {

    private final PositionIncrementAttribute posIncrAtt = addAttribute(PositionIncrementAttribute.class);

    public RemoveTokenGapsFilter(TokenStream input) {
        super(input);
    }

    @Override
    public final boolean incrementToken() throws IOException {
        // Force every token to advance exactly one position, which closes the
        // holes left by removed stop words (it also overrides increments of 0).
        if (input.incrementToken()) {
            posIncrAtt.setPositionIncrement(1);
            return true;
        }
        return false;
    }
}
{code}

{code:title=schema.xml|borderStyle=solid}
{code}

!FieldValue.png!

> Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
>
> Key: SOLR-6468
> URL: https://issues.apache.org/jira/browse/SOLR-6468
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.8.1, 4.9
> Reporter: Alexander S.
> Attachments: FieldValue.png
>
> Setup:
> * Schema version is 1.5
> * Field config:
> {code}
> autoGeneratePhraseQueries="true">
>
>
> ignoreCase="true" />
>
>
>
> {code}
> * Stop words:
> {code}
> http
> https
> ftp
> www
> {code}
> So very simple. In the index I have:
> * twitter.com/testuser
> All these queries do match:
> * twitter.com/testuser
> * com/testuser
> * testuser
> But none of these do:
> * https://twitter.com/testuser
> * https://www.twitter.com/testuser
> * www.twitter.com/testuser
> Debug output shows:
> "parsedquery_toString": "+(url_words_ngram:\"? twitter com testuser\")"
> But we need:
> "parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")"
> Complete debug outputs:
> * a valid search: http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za
> * an invalid search: http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww
> The complete discussion and explanation of the problem is here:
> http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html
> I couldn't find a clear explanation of how we can upgrade Solr. There is no
> replacement or workaround for this, so this is not just a major change but
> a major disrespect to all existing Solr users who are using this feature.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
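For anyone following the thread, here is a minimal plain-Java sketch of the position arithmetic behind the `?` in the debug output, which marks the hole left where a stop word was removed. It does not use Lucene at all: `PositionGapSketch`, `Token`, `removeStopWords`, `removeGaps` and `positions` are hypothetical names made up for illustration, and the five-token split of the URL is an assumption, since the field config in the report did not survive. It only models the idea that a stop filter adds the skipped increments to the next kept token, and that a gap-removing filter like the one in the comment resets every increment to 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, Lucene-free model of position increments (not the Lucene API). */
final class PositionGapSketch {

    /** A term plus its position increment relative to the previous token. */
    static final class Token {
        final String term;
        final int posInc;

        Token(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    /** Drops stop words and carries their increments over to the next kept token. */
    static List<Token> removeStopWords(List<Token> in, Set<String> stopWords) {
        List<Token> out = new ArrayList<>();
        int skipped = 0;
        for (Token t : in) {
            if (stopWords.contains(t.term)) {
                skipped += t.posInc;          // remember the hole
            } else {
                out.add(new Token(t.term, t.posInc + skipped));
                skipped = 0;
            }
        }
        return out;
    }

    /** Same idea as the filter in the comment: every token advances exactly one position. */
    static List<Token> removeGaps(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(new Token(t.term, 1));
        }
        return out;
    }

    /** Absolute positions, counting from -1 so that a first increment of 1 yields 0. */
    static List<Integer> positions(List<Token> tokens) {
        List<Integer> result = new ArrayList<>();
        int pos = -1;
        for (Token t : tokens) {
            pos += t.posInc;
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("http", "https", "ftp", "www"));
        // Assumed tokenization of https://www.twitter.com/testuser
        List<Token> query = Arrays.asList(
                new Token("https", 1), new Token("www", 1), new Token("twitter", 1),
                new Token("com", 1), new Token("testuser", 1));

        List<Token> kept = removeStopWords(query, stop);
        System.out.println(positions(kept));             // [2, 3, 4]
        System.out.println(positions(removeGaps(kept))); // [0, 1, 2]
    }
}
```

With the gaps kept, the remaining terms sit at positions 2, 3 and 4; with the gaps removed they sit at 0, 1 and 2, which corresponds to the "twitter com testuser" form the report asks for. Note that resetting every increment to 1 also affects tokens that were deliberately stacked at the same position, so this is a trade-off rather than a drop-in replacement for the removed option.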
[jira] [Updated] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Elvis Rocha updated SOLR-6468:
--
Attachment: FieldValue.png
[jira] [Issue Comment Deleted] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Elvis Rocha updated SOLR-6468:
--
Comment: was deleted

(was: I created a filter to remove gaps between tokens

{code:title=RemoveEmptyTokenFilterFactory.java|borderStyle=solid}
package filter;

import java.io.IOException;
import java.util.Map;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.analysis.util.TokenFilterFactory;

public class RemoveEmptyTokenFilterFactory extends TokenFilterFactory {

    public RemoveEmptyTokenFilterFactory(Map<String, String> args) {
        super(args);
    }

    @Override
    public TokenStream create(TokenStream input) {
        return new RemoveEmptyTokenFilter(input);
    }
}

final class RemoveEmptyTokenFilter extends TokenFilter {

    private final PositionIncrementAttribute posIncrAtt = addAttribute(PositionIncrementAttribute.class);

    public RemoveEmptyTokenFilter(TokenStream input) {
        super(input);
    }

    @Override
    public final boolean incrementToken() throws IOException {
        // Every token advances exactly one position, closing any gaps.
        if (input.incrementToken()) {
            posIncrAtt.setPositionIncrement(1);
            return true;
        }
        return false;
    }
}
{code}

{code:title=schema.xml|borderStyle=solid}
{code})
[jira] [Commented] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017816#comment-16017816 ]

Elvis Rocha commented on SOLR-6468:
---

I created a filter to remove gaps between tokens:

{code:title=RemoveEmptyTokenFilterFactory.java|borderStyle=solid}
package filter;

import java.io.IOException;
import java.util.Map;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.analysis.util.TokenFilterFactory;

public class RemoveEmptyTokenFilterFactory extends TokenFilterFactory {

    public RemoveEmptyTokenFilterFactory(Map<String, String> args) {
        super(args);
    }

    @Override
    public TokenStream create(TokenStream input) {
        return new RemoveEmptyTokenFilter(input);
    }
}

final class RemoveEmptyTokenFilter extends TokenFilter {

    private final PositionIncrementAttribute posIncrAtt = addAttribute(PositionIncrementAttribute.class);

    public RemoveEmptyTokenFilter(TokenStream input) {
        super(input);
    }

    @Override
    public final boolean incrementToken() throws IOException {
        // Every token advances exactly one position, closing any gaps.
        if (input.incrementToken()) {
            posIncrAtt.setPositionIncrement(1);
            return true;
        }
        return false;
    }
}
{code}

{code:title=schema.xml|borderStyle=solid}
{code}