Re: Solr 1.4: StringIndexOutOfBoundsException in SpellCheckComponent with HTMLStripCharFilterFactory
Robin Wojciki wrote:
> Koji,
>
> I was able to create a minimal replication. The attached zip has solr.xml, solrconf.xml and Main.java. I was able to replicate the issue by replacing the conf files in apache-solr-1.4.0/example/solr/conf and running the class Main.
>
> Could you please confirm whether this replication is enough? Also, please let me know if I should log the ticket with Lucene or Solr.
>
> Thanks,
> Robin

Robin,

I reproduced the problem with your sample data, but it can also be reproduced without HTMLStripCharFilter ...

I commented out the HTML strippers in schema.xml and rebuilt the index with the following data:

<add>
  <doc>
    <field name="id">debug-1</field>
    <field name="description">hello world WGKEKW AWEHGSE</field>
  </doc>
</add>

The exception still occurred. Can you check it and open a JIRA issue for Solr?

Thank you!

Koji
--
http://www.rondhuit.com/en/
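For anyone rebuilding the index with the sample data above, the update message can be generated rather than typed by hand. The following is only a sketch: the class name `AddDocXml` and its helper methods are hypothetical and were not part of the attached Main.java. The field names `id` and `description` and their values are taken from the message above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AddDocXml {

    // Escape the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Build an <add><doc>...</doc></add> message from field name/value pairs,
    // keeping the fields in the iteration order of the map.
    static String addCommand(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            xml.append("<field name=\"").append(escape(f.getKey())).append("\">")
               .append(escape(f.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "debug-1");
        doc.put("description", "hello world WGKEKW AWEHGSE");
        System.out.println(addCommand(doc));
    }
}
```

The resulting string would then be sent to the update handler of the Solr instance, followed by a commit, before running the spell check query again.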
Re: Solr 1.4: StringIndexOutOfBoundsException in SpellCheckComponent with HTMLStripCharFilterFactory
Koji,

In the sample I sent, the exception comes only if the HTMLStripCharFilter is there. However, your test case seems to capture the essence. Sorry if I sent you on a wild goose chase.

Thanks for taking the time! I will log a ticket.

Robin

On Mon, Dec 7, 2009 at 5:09 PM, Koji Sekiguchi k...@r.email.ne.jp wrote:
> Robin, I reproduced the problem with your sample data, but it could be reproducible without HTMLStripCharFilter ... I commented out HTML Strippers in schema.xml and rebuilt indexes [...] still the exception occurred. Can you check it and open a JIRA issue for Solr?
Re: Solr 1.4: StringIndexOutOfBoundsException in SpellCheckComponent with HTMLStripCharFilterFactory
Logged a ticket for Solr: https://issues.apache.org/jira/browse/SOLR-1630

Thanks,
Robin

On Mon, Dec 7, 2009 at 9:36 PM, Robin Wojciki robin.wojc...@gmail.com wrote:
> Koji,
>
> In the sample I sent, the exception comes only if the HTMLStripCharFilter is there. However, your test case seems to capture the essence. Sorry if I sent you on a wild goose chase.
>
> Thanks for taking the time! I will log a ticket.
>
> Robin
Re: Solr 1.4: StringIndexOutOfBoundsException in SpellCheckComponent with HTMLStripCharFilterFactory
Robin Wojciki wrote:
> I am running a search in Solr 1.4 and I am getting the StringIndexOutOfBoundsException pasted below. The spell check field uses HTMLStripCharFilterFactory. However, the search works fine if I do not use the HTMLStripCharFilterFactory.

I couldn't reproduce it with simple test data. Can you open a JIRA and attach a test case that reproduces the problem with spellchecker definition in solrconfig.xml?

Koji
--
http://www.rondhuit.com/en/
Solr 1.4: StringIndexOutOfBoundsException in SpellCheckComponent with HTMLStripCharFilterFactory
I am running a search in Solr 1.4 and I am getting the StringIndexOutOfBoundsException pasted below. The spell check field uses HTMLStripCharFilterFactory. However, the search works fine if I do not use the HTMLStripCharFilterFactory.

If I set a breakpoint at SpellCheckComponent.java:248, the value of the variable best is as shown in the screenshot: http://yfrog.com/j5solrdebuginspectp

At the end of first iteration, offset = 5 - (24 - 0) = -19

This causes the index out of bounds exception.

The spell check field is defined as:

<fieldType name="text_spell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

Stack Trace:

String index out of range: -19

java.lang.StringIndexOutOfBoundsException: String index out of range: -19
    at java.lang.AbstractStringBuilder.replace(Unknown Source)
    at java.lang.StringBuilder.replace(Unknown Source)
    at org.apache.solr.handler.component.SpellCheckComponent.toNamedList(SpellCheckComponent.java:248)
    at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:143)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
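The arithmetic reported in this message (offset = 5 - (24 - 0) = -19) can be reproduced outside Solr with a plain StringBuilder. The sketch below is not Solr's source: the `Suggestion` class and the `nextOffset` and `collate` methods are hypothetical names, and the loop only assumes, as the numbers above suggest, that a running offset is adjusted by the replacement length minus the token span. The token offsets 0 and 24 and the replacement length 5 come from the message; the query string and replacement words are made up for illustration.

```java
public class CollationOffsetDemo {

    // Hypothetical stand-in for an analyzed token plus its suggested correction.
    static final class Suggestion {
        final int start;          // token start offset
        final int end;            // token end offset
        final String replacement; // suggested correction

        Suggestion(int start, int end, String replacement) {
            this.start = start;
            this.end = end;
            this.replacement = replacement;
        }
    }

    // Running offset after applying one suggestion:
    // replacement length minus the span the token claims to cover.
    static int nextOffset(int offset, Suggestion s) {
        return offset + s.replacement.length() - (s.end - s.start);
    }

    // Apply the suggestions in order, shifting later tokens by the running offset.
    static String collate(String query, Suggestion... suggestions) {
        StringBuilder sb = new StringBuilder(query);
        int offset = 0;
        for (Suggestion s : suggestions) {
            sb.replace(s.start + offset, s.end + offset, s.replacement);
            offset = nextOffset(offset, s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Offsets that match the query text: the running offset stays consistent.
        System.out.println(collate("helo wrld",
                new Suggestion(0, 4, "hello"),
                new Suggestion(5, 9, "world")));

        // A token span of 24 characters against a replacement of length 5 drives
        // the offset to -19, so the next replace starts at a negative index.
        try {
            collate("hello world",
                    new Suggestion(0, 24, "hello"),
                    new Suggestion(0, 5, "world"));
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("failed as in the report: " + e.getMessage());
        }
    }
}
```

The first replace does not fail because StringBuilder.replace clamps an end index that runs past the end of the buffer; only the negative start index in the second iteration raises the exception. A token whose offsets do not line up with the query string is therefore enough to trigger the failure, whatever the reason for the mismatch.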