Hi All,

I am using Solr 4.9.1 and trying to use PostingsSolrHighlighter, but I get errors during indexing. I thought LUCENE-5111 had fixed the offset issues with WordDelimiterFilter. The error is as below:

Caused by: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards startOffset=31,endOffset=44,lastStartOffset=37 for field 'description_texts'
    at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:630)
    at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:342)
    at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:301)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:241)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:451)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1539)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:240)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164)

My schema.xml looks like below:

    <dynamicField name="*_texts" stored="true" type="text" multiValued="true" indexed="true" storeOffsetsWithPositions="true"/>

    <fieldType name="text" class="solr.TextField" omitNorms="false">
      <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StemmerOverrideFilterFactory" dictionary="stemdict_en.txt"/>
        <filter class="solr.PatternReplaceFilterFactory" pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
        <filter class="solr.KStemFilterFactory"/>
        <filter class="solr.StopFilterFactory" words="stopwords_english.txt" ignoreCase="true" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1" splitOnNumerics="0" catenateWords="1"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" words="stopwords_english.txt" ignoreCase="true" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1" splitOnNumerics="0" catenateWords="1"/>
        <filter class="solr.StemmerOverrideFilterFactory" dictionary="stemdict_en.txt"/>
        <filter class="solr.KStemFilterFactory"/>
      </analyzer>
    </fieldType>

Any help is appreciated.

Thanks.
Min
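
For reference, the condition named in the exception message can be sketched in plain Java. This is only an illustration, assuming the indexer compares each token's start offset against the previous token's start offset as the message suggests; the class and method names are hypothetical and are not Lucene's API or source.

```java
// Illustrative sketch only: mirrors the three rules stated in the exception
// message. OffsetOrderCheck and offsetsAreValid are hypothetical names.
final class OffsetOrderCheck {

    /**
     * Returns true when a token's offsets satisfy all three rules:
     * start is non-negative, end is not before start, and start does
     * not move backwards relative to the previous token's start.
     */
    static boolean offsetsAreValid(int startOffset, int endOffset, int lastStartOffset) {
        return startOffset >= 0
                && endOffset >= startOffset
                && startOffset >= lastStartOffset;
    }

    public static void main(String[] args) {
        // Values from the reported exception: 31 is less than 37,
        // so the "must not go backwards" rule is the one violated.
        System.out.println(offsetsAreValid(31, 44, 37)); // prints false

        // Same span arriving after a token that started at 31 would pass.
        System.out.println(offsetsAreValid(37, 44, 31)); // prints true
    }
}
```

With the values from the trace, only the third rule fails: a token starting at offset 31 was emitted after one starting at 37.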