If query contains only ideographic space, ParseException thrown
---------------------------------------------------------------
Key: LUCENE-3691
URL: https://issues.apache.org/jira/browse/LUCENE-3691
Project: Lucene - Java
Issue Type: Bug
Affects Versions: 2.9.1
Environment: Ubuntu 10.04
Lucene runs under embedded Solr 1.4.0 (all behind Tomcat, which is probably
irrelevant)
Reporter: Antoniya Statelova
Priority: Minor
Searching for "\u3000x", "x\u3000", and similar queries works fine; however,
searching for just "\u3000" throws the following exception:
core.SolrCore - org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse '<80><80>': Encountered "<EOF>" at line 1, column 1.
Was expecting one of:
<NOT> ...
"+" ...
"-" ...
"(" ...
"*" ...
<QUOTED> ...
<TERM> ...
<PREFIXTERM> ...
<WILDTERM> ...
"[" ...
"{" ...
<NUMBER> ...
<TERM> ...
"*" ...
at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:108)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:139)
at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:122)
...
at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
...
at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
...
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
...
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:874)
at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse '<80><80>': Encountered "<EOF>" at line 1, column 1.
Was expecting one of:
<NOT> ...
"+" ...
"-" ...
"(" ...
"*" ...
<QUOTED> ...
<TERM> ...
<PREFIXTERM> ...
<WILDTERM> ...
"[" ...
"{" ...
<NUMBER> ...
<TERM> ...
"*" ...
at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:205)
at org.apache.solr.search.DisMaxQParser.getUserQuery(DisMaxQParser.java:195)
at org.apache.solr.search.DisMaxQParser.addMainQuery(DisMaxQParser.java:158)
at org.apache.solr.search.DisMaxQParser.parse(DisMaxQParser.java:74)
at org.apache.solr.search.QParser.getQuery(QParser.java:131)
at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:89)
... 36 more
Caused by: org.apache.lucene.queryParser.ParseException: Encountered "<EOF>" at line 1, column 1.
Was expecting one of:
<NOT> ...
"+" ...
"-" ...
"(" ...
"*" ...
<QUOTED> ...
<TERM> ...
<PREFIXTERM> ...
<WILDTERM> ...
"[" ...
"{" ...
<NUMBER> ...
<TERM> ...
"*" ...
at org.apache.lucene.queryParser.QueryParser.generateParseException(QueryParser.java:1846)
at org.apache.lucene.queryParser.QueryParser.jj_consume_token(QueryParser.java:1728)
at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1355)
at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1265)
at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1254)
at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:200)
... 41 more
Not querying for empty strings works around the problem, but I'm not sure
this is the expected behavior. Solr applies a WhitespaceTokenizer to the query.
I'm posting the issue here because it looks like a problem in Lucene rather
than Solr; sorry if I'm wrong. ~Antoniya
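
The workaround described above (not sending empty queries) could be sketched
roughly as follows. This is only an illustration under assumptions: the class
QueryGuard and the method isBlankQuery are hypothetical names, not part of
Lucene or Solr, and the check is assumed to run in the calling application
before the query string reaches Solr. It relies on Character.isWhitespace,
which reports U+3000 as whitespace, whereas String.trim() only strips
characters up to U+0020 and would leave the ideographic space in place.

```java
// Hypothetical helper for illustration; not part of Lucene or Solr.
final class QueryGuard {
    private QueryGuard() {}

    /**
     * Returns true if the query is null, empty, or consists only of
     * characters that Character.isWhitespace accepts (this includes the
     * ideographic space U+3000).
     */
    static boolean isBlankQuery(String query) {
        if (query == null) {
            return true;
        }
        // Walk the string by code point so supplementary characters are
        // inspected as a whole rather than as separate surrogates.
        for (int i = 0; i < query.length(); ) {
            int codePoint = query.codePointAt(i);
            if (!Character.isWhitespace(codePoint)) {
                return false;
            }
            i += Character.charCount(codePoint);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isBlankQuery("\u3000"));   // true
        System.out.println(isBlankQuery("\u3000x"));  // false
        System.out.println(isBlankQuery("x\u3000"));  // false
    }
}
```

A caller would skip the search (or return an empty result) when
isBlankQuery returns true, so the parser never sees a whitespace-only query.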