[ 
https://issues.apache.org/jira/browse/SOLR-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508769#comment-13508769
 ] 

Robert Muir commented on SOLR-4137:
-----------------------------------

Thanks Marcel. It's useful to know that these failures are not just
theoretical but are happening in real life (though I'm sorry you are having
to deal with it).

We have a test that finds these bugs and documents a list of broken analyzers 
(http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/analysis/common/src/test/org/apache/lucene/analysis/core/TestRandomChains.java)
 

You can see e.g. WordDelimiterFilter, HyphenatedWordsFilter, etc. on these
lists and also in your chain.
Given the specific error, I think it's WordDelimiterFilter in this case.

It's time to start fixing these buggy analyzers! Thanks for reporting this.
                
> FastVectorHighlighter: StringIndexOutOfBoundsException in BaseFragmentsBuilder
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-4137
>                 URL: https://issues.apache.org/jira/browse/SOLR-4137
>             Project: Solr
>          Issue Type: Bug
>          Components: highlighter
>    Affects Versions: 3.6.1
>            Reporter: Marcel
>
> Under some circumstances the BaseFragmentsBuilder generates a 
> StringIndexOutOfBoundsException inside the makeFragment method.
> The start offset is higher than the end offset.
> I wrote a small patch that checks the offsets and posted it on 
> Stack Overflow: 
> http://stackoverflow.com/questions/12456448/solr-highlight-bug-with-usefastvectorhighlighter
> The code in 4.0 seems to be the same as in 3.6.1.
> Example of how to reproduce the behaviour:
> The index contains the term "www.DAKgesundAktivBonus.de". If you 
> search for "dak bonus", the offset calculations go wrong.
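
For illustration only, here is a minimal sketch of the kind of offset check
the description mentions. It is not the patch posted on Stack Overflow and
not Lucene's BaseFragmentsBuilder code; the class OffsetGuard and its methods
are hypothetical names, and the offsets 18 and 4 are made-up values chosen to
show a start offset that is higher than the end offset. Plain Java's
String.substring throws StringIndexOutOfBoundsException in that situation,
which is the exception reported here.

```java
// Hypothetical helper for illustration; not part of Lucene or Solr.
final class OffsetGuard {
    private OffsetGuard() {}

    /** True when start..end is a well-formed range inside a string of the given length. */
    static boolean isValidRange(int start, int end, int length) {
        return start >= 0 && start <= end && end <= length;
    }

    /** Returns the requested substring, or "" when the offsets are inconsistent. */
    static String safeSubstring(String source, int start, int end) {
        if (!isValidRange(start, end, source.length())) {
            return "";
        }
        return source.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "www.DAKgesundAktivBonus.de";

        // Unguarded call: a start offset higher than the end offset throws.
        try {
            text.substring(18, 4);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("unguarded substring failed: " + e.getMessage());
        }

        // Guarded calls: a valid range is returned, an inverted range yields "".
        System.out.println(safeSubstring(text, 4, 7));
        System.out.println("[" + safeSubstring(text, 18, 4) + "]");
    }
}
```

A guard like this only hides the symptom in the highlighter. The inconsistent
offsets would still be produced by the analysis chain, so a check of this
kind is a workaround rather than a fix for the analyzer itself.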

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
