Hi Min,

Do you have the specific bit of text that caused this exception to be thrown?
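
If it helps while you look for the text, the condition named in the exception message can be restated outside Lucene. The sketch below is not Lucene's code. `OffsetCheck` and `firstViolation` are hypothetical names, and it only applies the three rules from the message: start offset non-negative, end offset >= start offset, and start offset never smaller than the previous token's start offset. The values 31, 44 and 37 come from your error. The other offsets are made up for illustration, including the end offset of the token that started at 37, which the message does not report.

```java
public class OffsetCheck {

    /**
     * Returns the index of the first token whose offsets break the rules
     * quoted in the exception message, or -1 if every token is fine.
     * Each token is a hypothetical {startOffset, endOffset} pair.
     */
    static int firstViolation(int[][] tokens) {
        int lastStartOffset = 0;
        for (int i = 0; i < tokens.length; i++) {
            int start = tokens[i][0];
            int end = tokens[i][1];
            // Same three conditions the message lists, in the same order.
            if (start < 0 || end < start || start < lastStartOffset) {
                return i;
            }
            lastStartOffset = start;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumed token stream: the last two entries reproduce
        // lastStartOffset=37 followed by startOffset=31,endOffset=44.
        int[][] tokens = { {0, 5}, {37, 44}, {31, 44} };
        System.out.println("first bad token index: " + firstViolation(tokens));
    }
}
```

With the offsets from your error, the third token is flagged because its start offset (31) is smaller than the previous start offset (37). Feeding the real offsets from the analysis output for 'description_texts' into a check like this should point at the token that goes backwards.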

Alan Woodward
www.flax.co.uk

On 4 Nov 2014, at 23:15, Min L wrote:

> Hi All:
>
> I am using Solr 4.9.1 and trying to use PostingsSolrHighlighter, but I got
> errors during indexing. I thought LUCENE-5111 had fixed the issues with
> WordDelimiterFilter. The error is as below:
>
> Caused by: java.lang.IllegalArgumentException: startOffset must be
> non-negative, and endOffset must be >= startOffset, and offsets must
> not go backwards startOffset=31,endOffset=44,lastStartOffset=37 for
> field 'description_texts'
>     at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:630)
>     at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:342)
>     at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:301)
>     at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:241)
>     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:451)
>     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1539)
>     at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:240)
>     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164)
>
> My schema.xml looks like below:
>
> <dynamicField name="*_texts" stored="true" type="text" multiValued="true"
>               indexed="true" storeOffsetsWithPositions="true"/>
>
> <fieldType name="text" class="solr.TextField" omitNorms="false">
>   <analyzer type="index">
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.StemmerOverrideFilterFactory" dictionary="stemdict_en.txt" />
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
>     <filter class="solr.KStemFilterFactory"/>
>     <filter class="solr.StopFilterFactory" words="stopwords_english.txt"
>             ignoreCase="true" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"
>             splitOnNumerics="0" catenateWords="1" />
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.StopFilterFactory" words="stopwords_english.txt"
>             ignoreCase="true" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"
>             splitOnNumerics="0" catenateWords="1" />
>     <filter class="solr.StemmerOverrideFilterFactory" dictionary="stemdict_en.txt" />
>     <filter class="solr.KStemFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> Any help is appreciated.
>
> Thanks.
>
> Min