[
https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15638299#comment-15638299
]
Diego Oliveira commented on SOLR-6468:
--------------------------------------
I have read the whole discussion and can't believe this decision. I'm having the
same problem! I need to use a stop word filter together with a shingle filter.
But removing the stop words leaves a hole in the token positions, and that hole
triggers a bug in the shingle filter: it produces duplicate tokens that the
Remove Duplicates filter cannot remove, because the shingles and the tokens
start at different positions. I can't believe the community cannot solve this
problem by re-enabling this old feature, as people have said here. It is better
to stay with the 'simplified' version than with the new (plus) version. Will it
stay this way until Solr X? Come on!
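To make the point about positions concrete, here is a minimal Python sketch. It is not Lucene code: `remove_duplicates` is a hypothetical, simplified model of a filter that drops a token only when the same term has already been seen at the same position, and the token stream is made-up illustration data rather than real analyzer output.

```python
def remove_duplicates(tokens):
    """Simplified model (assumption): a token is dropped only if the same
    term was already emitted at the same position."""
    seen = set()
    kept = []
    for term, position in tokens:
        if (term, position) in seen:
            continue  # true duplicate: same text, same position
        seen.add((term, position))
        kept.append((term, position))
    return kept


# Made-up token stream: the same term appears twice at position 1
# and once more at position 3.
tokens = [("new york", 1), ("new york", 1), ("new york", 3)]
deduped = remove_duplicates(tokens)
print(deduped)
```

Under this model, the copy at position 3 survives because its position differs, which is the situation described above.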
> Regression: StopFilterFactory doesn't work properly without
> enablePositionIncrements="false"
> --------------------------------------------------------------------------------------------
>
> Key: SOLR-6468
> URL: https://issues.apache.org/jira/browse/SOLR-6468
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.8.1, 4.9
> Reporter: Alexander S.
>
> Setup:
> * Schema version is 1.5
> * Field config:
> {code}
> <fieldType name="words_ngram" class="solr.TextField" omitNorms="false"
> autoGeneratePhraseQueries="true">
> <analyzer>
> <tokenizer class="solr.PatternTokenizerFactory" pattern="[^\w]+" />
> <filter class="solr.StopFilterFactory" words="url_stopwords.txt"
> ignoreCase="true" />
> <filter class="solr.LowerCaseFilterFactory" />
> </analyzer>
> </fieldType>
> {code}
> * Stop words:
> {code}
> http
> https
> ftp
> www
> {code}
> So very simple. In the index I have:
> * twitter.com/testuser
> All these queries do match:
> * twitter.com/testuser
> * com/testuser
> * testuser
> But none of these do:
> * https://twitter.com/testuser
> * https://www.twitter.com/testuser
> * www.twitter.com/testuser
> Debug output shows:
> "parsedquery_toString": "+(url_words_ngram:\"? twitter com testuser\")"
> But we need:
> "parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")"
> Complete debug outputs:
> * a valid search:
> http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za
> * an invalid search:
> http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww
> The complete discussion and explanation of the problem is here:
> http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html
> I couldn't find a clear explanation of how we can upgrade Solr. There is no
> replacement or workaround for this, so this is not just a major change but
> a major disrespect to all existing Solr users who are using this feature.
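
The `?` in the parsed query quoted above can be reproduced with a small simulation. This is a minimal Python sketch, not Lucene or Solr code: `analyze`, `to_slots`, and `phrase_matches` are hypothetical helpers, and the matching rule (every query slot, including a hole, must line up with a slot in the indexed value) is an assumption chosen to mirror the behavior reported in this issue.

```python
import re

STOP_WORDS = {"http", "https", "ftp", "www"}  # from url_stopwords.txt in the report


def analyze(text, enable_position_increments=True):
    """Split on non-word characters, drop stop words, lowercase.
    Returns (term, position) pairs with 1-based positions."""
    terms = [t for t in re.split(r"[^\w]+", text) if t]
    out, position, skipped = [], 0, 0
    for term in terms:
        if term.lower() in STOP_WORDS:
            skipped += 1  # remember the hole left by the removed stop word
            continue
        # With position increments enabled, the hole is kept as a gap.
        position += 1 + (skipped if enable_position_increments else 0)
        skipped = 0
        out.append((term.lower(), position))
    return out


def to_slots(analyzed):
    """Turn (term, position) pairs into a list where None marks a hole."""
    slots = [None] * (max(p for _, p in analyzed) if analyzed else 0)
    for term, position in analyzed:
        slots[position - 1] = term
    return slots


def phrase_matches(doc_slots, query_slots):
    """Assumed rule: the query matches if all its slots fit inside the
    document at some offset; a hole (None) accepts any slot, but the
    slot has to exist."""
    for start in range(len(doc_slots) - len(query_slots) + 1):
        if all(q is None or doc_slots[start + i] == q
               for i, q in enumerate(query_slots)):
            return True
    return False


indexed = to_slots(analyze("twitter.com/testuser"))
query = to_slots(analyze("https://twitter.com/testuser"))
print(query)                           # leading None plays the role of "?"
print(phrase_matches(indexed, query))  # no slot before "twitter" in the index
```

Calling `analyze(..., enable_position_increments=False)` in this sketch drops the leading hole, so the same query lines up with the indexed value again, which is the behavior the issue asks to have back.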