[ https://issues.apache.org/jira/browse/SOLR-1883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Luke Forehand updated SOLR-1883:
--------------------------------

    Attachment:     (was: invalid_token_exception.txt)

> Highlighting failure caused by InvalidTokenOffsetsException
> -----------------------------------------------------------
>
>                 Key: SOLR-1883
>                 URL: https://issues.apache.org/jira/browse/SOLR-1883
>             Project: Solr
>          Issue Type: Bug
>          Components: highlighter
>    Affects Versions: 1.4
>         Environment: {code:title=java}
> Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
> Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
> {code}
> {code:title=solr lib manifest}
> Manifest-Version: 1.0
> Ant-Version: Apache Ant 1.7.0
> Created-By: 14.1-b02-90 (Apple Inc.)
> Extension-Name: org.apache.solr
> Specification-Title: Apache Solr Search Server
> Specification-Version: 1.4.0
> Specification-Vendor: The Apache Software Foundation
> Implementation-Title: org.apache.solr
> Implementation-Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:33:40
> Implementation-Vendor: The Apache Software Foundation
> X-Compile-Source-JDK: 1.5
> X-Compile-Target-JDK: 1.5
> {code}
> {code:title=OS}
> Linux myhost 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
> {code}
>            Reporter: Luke Forehand
>         Attachments: schema.xml, test_doc_for_invalid_token_offsets_exception.xml
>
>
> This issue seems to be the same as a previous issue that was bulk closed in
> solr 1.4 https://issues.apache.org/jira/browse/SOLR-1404, and I see someone
> reported this bug in lucene 2.9.1
> https://issues.apache.org/jira/browse/LUCENE-2208. We are experiencing this
> issue as well.
> I have pasted the important part of our schema.xml and the solr exception. I
> have also attached the document that fails when queried as a highlight query.
> The invalid token seems to be 'system' which is the very last token in the
> document field if you look at the attached file.
> {code:title=schema.xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <schema name="xxx" version="1.1">
>   <types>
>     <fieldType name="scrubbedText" class="solr.TextField" positionIncrementGap="100">
>       <analyzer>
>         <tokenizer class="solr.StandardTokenizerFactory" />
>         <charFilter class="solr.HTMLStripCharFilterFactory" />
>         <filter class="solr.StandardFilterFactory" />
>         <filter class="solr.LowerCaseFilterFactory" />
>         <filter class="solr.StopFilterFactory" />
>       </analyzer>
>     </fieldType>
>     ...
>   </types>
>   <fields>
>     <field name="id" type="string" stored="true" indexed="true" />
>     <field name="textScrubbed" type="scrubbedText" stored="true" indexed="true" />
>     ...
>   </fields>
>   <uniqueKey>id</uniqueKey>
>   <defaultSearchField>textScrubbed</defaultSearchField>
> </schema>
> {code}
> {code:title=solr.log exception}
> Apr 13, 2010 3:08:35 AM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token system exceeds length of provided text sized 17063
>     at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:342)
>     at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89)
>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>     at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:859)
>     at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:574)
>     at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1527)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token system exceeds length of provided text sized 17063
>     at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:254)
>     at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:335)
>     ... 18 more
> {code}

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
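
For readers following the trace: the message "Token system exceeds length of provided text sized 17063" suggests that the highlighter compares each token's character offsets against the length of the text it was handed, and gives up when an offset points past the end. The sketch below is a simplified stand-in for that kind of bounds check, not the Lucene source. `OffsetCheckSketch`, `Token`, and `checkOffsets` are hypothetical names, `IllegalArgumentException` stands in for `InvalidTokenOffsetsException`, and the shifted offsets in the second call are invented for illustration only; they are not a diagnosis of why the offsets in this report are wrong.

```java
// Simplified stand-in for an offset bounds check; all names here are
// hypothetical and none of this is taken from the Lucene code base.
final class OffsetCheckSketch {

    /** A term plus its character offsets into the text it was analyzed from. */
    static final class Token {
        final String term;
        final int startOffset;
        final int endOffset;

        Token(String term, int startOffset, int endOffset) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Throws if either offset points past the end of the text to highlight. */
    static void checkOffsets(Token token, String text) {
        if (token.endOffset > text.length() || token.startOffset > text.length()) {
            throw new IllegalArgumentException(
                "Token " + token.term
                    + " exceeds length of provided text sized " + text.length());
        }
    }

    public static void main(String[] args) {
        String text = "operating system"; // 16 characters

        // Offsets 10..16 cover "system" exactly, so the check passes.
        checkOffsets(new Token("system", 10, 16), text);

        try {
            // Same term, offsets shifted 5 characters too far (illustrative).
            checkOffsets(new Token("system", 15, 21), text);
        } catch (IllegalArgumentException e) {
            // prints "Token system exceeds length of provided text sized 16"
            System.out.println(e.getMessage());
        }
    }
}
```

A last-token failure like the one reported fits this picture: if offsets drift relative to the stored text, the final token is the first one whose end offset can run past the end of the string.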