Re: SOLR 4.4 - Slave always replicates full index
Hey Suresh, could you get a little more specific about what solved your problem here? I am currently facing the same problem and am trying to find a proper solution. Thanks! ~ Dom

2014-02-28 7:46 GMT+01:00 sureshrk19 sureshr...@gmail.com:
Thanks Shawn and Erick. I followed the SOLR configuration document and modified my index strategy. It looks good now; I haven't seen any problems in the last week. Thanks for your suggestions.
Re: DIH on Solr
Thanks Ahmet and Wolfgang. I have installed hbase-indexer on one of the servers, but there too I'm unable to start the hbase-indexer server:

Error: Could not find or load main class com.ngdata.hbaseindexer.Main
Properly set the JAVA_HOME and INDEXER_HOME environment variables.

Please guide. Thanks, ATP
default query operator ignored by edismax query parser
Hi, I have defined the following request handler using the edismax query parser:

<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="mm">100%</str>
    <str name="defType">edismax</str>
    <float name="tie">0.01</float>
    <int name="ps">100</int>
    <str name="q.alt">*:*</str>
    <str name="q.op">AND</str>
    <str name="qf">field1^2.0 field2</str>
    <str name="rows">10</str>
    <str name="fl">*</str>
  </lst>
</requestHandler>

My search query looks like: q=(word1 word2) OR (word3 word4)

Since I specified AND as the default query operator, the query should match documents by ((word1 AND word2) OR (word3 AND word4)), but it matches documents by ((word1 OR word2) OR (word3 OR word4)). Could anyone explain this behaviour? Thanks! Johannes

P.S. The query q=(word1 word2) matches all documents by (word1 AND word2).
Re: No results for a wildcard query for text_general field in solr 4.1
Thanks for the answers. I will try to solve my problem by extracting the affected text and indexing that part into another string field, where the wildcard query works as expected. The Solr queries will be extended by an OR against that new field; that should work for my case. Yours truly, Sven

On 24.06.2014 at 17:48, Jack Krupansky j...@basetechnology.com wrote:
I think I am officially tired of having to explain why Solr doesn't do what users expect for this query. I mean, I can accept that low-level Lucene should work strictly on the decomposed terms of test test-or*, but it is very reasonable for users (even EXPERT users) to expect that the Solr query parser will generate what the complex phrase query parser generates. See: https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser Having to use a separate query parser for this obvious, common case is... absurd. (What does Elasticsearch do for this case??) -- Jack Krupansky

-Original Message- From: Erick Erickson Sent: Tuesday, June 24, 2014 11:38 AM To: solr-user@lucene.apache.org; Ahmet Arslan Subject: Re: No results for a wildcard query for text_general field in solr 4.1

Wildcards are a tough thing to get your head around. I think my first post on the users list was titled I just don't get wildcards at all, or something like that... Right, wildcards aren't tokenized. So by getting your term through the query parsing as a single token, including the hyphen, when the analyzer sees that it's a wildcard it doesn't break on the hyphen. So it's looking for a single token. And since there is no single term like test-or123, you get no matches. I'm afraid this is just how it works. You can do something like replace the hyphen at the app layer, but I don't think there's a way to do what you want OOB. Best, Erick

On Tue, Jun 24, 2014 at 1:55 AM, Ahmet Arslan iori...@yahoo.com.invalid wrote:
Hi Sven, StandardTokenizerFactory splits it into two pieces. You can confirm this at the analysis page. If this is something you don't want, let us know. We can help you create an analysis chain that suits your needs. Ahmet

On Tuesday, June 24, 2014 10:39 AM, Sven Schönfeldt schoenfe...@subshell.com wrote:
Hi Erick, that is what I did: I tried that input on the analysis page. At index time the field splits the value into two words: "test" and "or123". Checking the query at the analysis page, the word is likewise split into "test" and "or123". But doing the query and looking into the debug result, I see that there is no splitting of words. That's what I expect…

<str name="rawquerystring">searchField_t:test\-or123*</str>
<str name="querystring">searchField_t:test\-or123*</str>
<str name="parsedquery">searchField_t:test-or123*</str>
<str name="parsedquery_toString">searchField_t:test-or123*</str>

Without the wildcard, the word is split into two parts:

<str name="rawquerystring">searchField_t:test\-or123</str>
<str name="querystring">searchField_t:test\-or123</str>
<str name="parsedquery">searchField_t:test searchField_t:or123</str>
<str name="parsedquery_toString">searchField_t:test searchField_t:or123</str>

Any idea which configuration is responsible for that behavior? Thanks!

On 23.06.2014 at 22:55, Erick Erickson erickerick...@gmail.com wrote:
Well, you can do more than guess by looking at the admin/analysis page and trying your input on the field in question. That'll show you what actual transformations are performed. You're probably right though.
Try adding debug=query to your URL to see what the actual parsed query looks like, and compare with the admin/analysis page. But yeah, it's a matter of getting all the parts (query parser and analysis chains) to do the right thing. Best, Erick

On Mon, Jun 23, 2014 at 7:30 AM, Sven Schönfeldt schoenfe...@subshell.com wrote:
Hi Solr-Users, I am trying to do a wildcard query on a dynamic text field (_t), but don't get the right result. The configuration for the field type is "text_general", the default configuration:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Input for the text field is test-or123 and my query looks like test\-or*. It seems that the input is already split into two words: "test" and "or123".
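For reference, a hedged sketch of the complexphrase approach Jack points to above: that parser supports wildcards inside phrases, and since the index splits test-or123 into the tokens test and or123, a query along these lines (untested; field name taken from the thread) could match without any app-layer rewriting:

q={!complexphrase inOrder=true}searchField_t:"test or*"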
Re: OOM during indexing nested docs
I made two tests, one with MaxRamBuffer=128 and the second with MaxRamBuffer=256; in both I got OOM. I also made two tests on autocommit: one with a commit every 5 min, and the second with a commit every 100,000 docs (soft commit disabled); in both I got OOM. Merge policy is Tiered (max segment size of 5000, max merged at once = 2, merge factor = 12). Any idea for more tests?
Re: default query operator ignored by edismax query parser
On 6/25/2014 1:05 AM, Johannes Siegert wrote:
I have defined the following request handler using the edismax query parser:

<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="mm">100%</str>
    <str name="defType">edismax</str>
    <float name="tie">0.01</float>
    <int name="ps">100</int>
    <str name="q.alt">*:*</str>
    <str name="q.op">AND</str>
    <str name="qf">field1^2.0 field2</str>
    <str name="rows">10</str>
    <str name="fl">*</str>
  </lst>
</requestHandler>

My search query looks like: q=(word1 word2) OR (word3 word4). Since I specified AND as the default query operator, the query should match documents by ((word1 AND word2) OR (word3 AND word4)), but it matches documents by ((word1 OR word2) OR (word3 OR word4)). Could anyone explain the behaviour?

I believe that you are running into this bug: https://issues.apache.org/jira/browse/SOLR-2649

It's a very old bug, coming up on three years. The workaround is to not use boolean operators at all, or to use operators EVERYWHERE so that your intent is explicitly described. It is not much of a workaround, but it does work. Thanks, Shawn
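Concretely, the operators-everywhere workaround means writing the query from the original post with every operator spelled out:

q=(word1 AND word2) OR (word3 AND word4)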
Re: TokenFilter not working at index time
On 24.06.14 17:33, Erick Erickson wrote:
Hmmm. It would help if you posted a couple of other pieces of information. BTW, if this is new code, are you considering donating it back? If so please open a JIRA so we can track it, see: http://wiki.apache.org/solr/HowToContribute

All my other language improvements for the existing Norwegian stemmers have been donated back to Solr, so yes, if possible. I want to experiment a little bit before I open a ticket. But to your questions:

1. See what the admin/analysis page tells you happens.

It shows correct results for index and query. The lemmatizer is able to find the correct stem.

2. Attach debug=query to your test case, see what the parsed query looks like.

Seems to be OK. Remember that the problem is related to indexing, not querying. I have double-checked by indexing all the documents with another stemmer and configuring my lemmatizer only for queries; then everything works as it should. Here's the query. As you can see, studentene is stemmed to student for two fields (content_no and title_no), which is correct:

BoostedQuery(boost(+(title_en:studentene^10.0 | host:studentene^30.0 | content_en:studentene^0.1 | content_no:student^0.1 | title_no:student^10.0 | anchortext_partial:studentene^70.0 | subjectcode:studentene^100.0 | canonicalurl:studentene^5.0)~0.2 () () () () () (product(int(url_toplevel),const(5)))^20.0 (2.0/(1.0*float(int(url_levels))+1.0))^250.0 (product(float(docrank),const(1)))^4.0 (1.0/(3.16E-11*float(ms(const(1403686863701),date(last_modified)))+1.0))^50.0 (product(int(url_landingpage),const(3)))^40.0,product(float(urlboost),map(query(language:no,def=0.0),0.0,0.0,1.0

3. Use the admin/schema browser link for the field in question to see what actually makes it into the index. (Or use Luke or even the TermsComponent.)

I haven't played much around with this, but it says 27 for docs if I select the field content_no. Does this mean that there are only 27 documents in my index with data in this field? Then there is something really bad going on, because if I change to content_en, this number grows to 10526 (because another English stemmer is used for that field instead). If I change to NorwegianMinimalStemFilter and reindex everything, the number grows to 28270.

By writing out debugging info from my stemmer, I just figured out that only the documents' titles are being stemmed at index time, not the content itself. So I have found the root of the problem, but I'm not sure why the field is omitted. Erlend
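For anyone debugging along the same lines, a quick way to check point 3 (what actually made it into the index for a field) is the TermsComponent, assuming the stock /terms handler is configured; host, port, and field name here are illustrative:

http://localhost:8983/solr/terms?terms.fl=content_no&terms.limit=20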
Not able to save SolrInputDocument object in Solr database
Hi, I am getting an exception while saving a SolrInputDocument object from Java on the client server, but on my local machine it works fine.

org.apache.solr.common.SolrException: Unexpected EOF in prolog at [row,col {unknown-source}]: [1,0]
	at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
	at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:768)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:415)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:205)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
	at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
	at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:193)
	at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
	at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:313)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:679)
Caused by: com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog at [row,col {unknown-source}]: [1,0]
	at com.ctc.wstx.sr.StreamScanner.throwUnexpectedEOF(StreamScanner.java:686)
	at com.ctc.wstx.sr.BasicStreamReader.handleEOF(BasicStreamReader.java:2134)
	at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2040)
	at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1069)
	at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:213)
	at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)
	... 22 more

My Java code is:

try {
    SolrInputDocument doc = new SolrInputDocument();
    String docID = doctor.getId();
    doc.addField("_id", docID, 1.0f);
    doc.addField("providerType", "Doctor");
    doc.addField("experience", doctor.getExperience());
    doc.addField("specialities", doctor.getSpecialities());
    doc.addField("specialty", doctor.getSpecialty());
    doc.addField("firstName", doctor.getFirstName());
    doc.addField("lastName", doctor.getLastName());
    doc.addField("medicine", doctor.getMedicine());
    doc.addField("registrationNumber", doctor.getRegistrationNumber());
    doc.addField("registrationYear", doctor.getRegistrationYear());
    doc.addField("description", doctor.getDescription());
    if (doctor.getAddressInfo() != null) {
        doc.addField("primaryClinic", doctor.getAddressInfo().getPrimaryClinic());
        doc.addField("address", doctor.getAddressInfo().getAddress());
        doc.addField("state", doctor.getAddressInfo().getState());
        doc.addField("country", doctor.getAddressInfo().getCountry());
        doc.addField("pincode", doctor.getAddressInfo().getPincode());
        doc.addField("mobile", doctor.getAddressInfo().getMobile());
        doc.addField("phone1", doctor.getAddressInfo().getPhone1());
        doc.addField("email",
[ANNOUNCE] Apache Solr 4.9.0 released
25 June 2014, Apache Solr™ 4.9.0 available

The Lucene PMC is pleased to announce the release of Apache Solr 4.9.0.

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.9.0 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Solr 4.9.0 Release Highlights:

* Numerous optimizations for doc values search-time performance
* Allow a client application to request the minimum achieved replication factor for an update request (single or batch) by sending an optional parameter min_rf.
* Query re-ranking support with the new ReRankingQParserPlugin.
* A new [child ...] DocTransformer for optionally including Block-Join descendant documents inline in the results of a search.
* A new (default) Lucene49NormsFormat to better compress certain cases such as very short fields.

Solr 4.9.0 also includes many other new features as well as numerous optimizations and bugfixes of the corresponding Apache Lucene release.

Please report any feedback to the mailing lists (http://lucene.apache.org/solr/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network for distributing releases. It is possible that the mirror you are using may not have replicated the release yet. If that is the case, please try another mirror. This also goes for Maven access.

On behalf of the Lucene PMC, Happy Searching
Re: Solr 4.8 result page display changes and highlighting
Vicky - were you able to get the results page formatted how you'd like? You may want to tweak results_list.vm, or a sub- (or maybe parent?) template from there, to achieve what you want. Erik

On Jun 18, 2014, at 10:02 AM, vicky vi...@raytheon.com wrote:
Hi Everyone, I just installed the Solr 4.8 release and am playing with the DIH and Velocity configuration. I am trying to change the result page columns to display more fields in a tabular format, since I have 1 rows to display on one page, if I can in the out-of-box configuration. I also tried the highlight feature in 4.8, and out of the box it is not working. Has anyone run into this issue? Please advise; all help is appreciated in advance.
Re: default query operator ignored by edismax query parser
Thanks Shawn! In this case I will use operators everywhere. Johannes

On 25.06.2014 15:09, Shawn Heisey wrote:
I believe that you are running into this bug: https://issues.apache.org/jira/browse/SOLR-2649 It's a very old bug, coming up on three years. The workaround is to not use boolean operators at all, or to use operators EVERYWHERE so that your intent is explicitly described. It is not much of a workaround, but it does work. Thanks, Shawn
Re: OOM during indexing nested docs
How big is your request size from client to server? I ran into OOM problems too. For me the reason was that I was sending big requests (1+ docs) at too fast a pace. So I put a throttle on the client to control the throughput of the requests it sends to the server, and that got rid of the OOM error.

Rebecca Tang
Applications Developer, UCSF CKM
Legacy Tobacco Document Library legacy.library.ucsf.edu/
E: rebecca.t...@ucsf.edu

On 6/25/14 1:45 AM, adfel70 adfe...@gmail.com wrote:
I made two tests, one with MaxRamBuffer=128 and the second with MaxRamBuffer=256; in both I got OOM. I also made two tests on autocommit: one with a commit every 5 min, and the second with a commit every 100,000 docs (soft commit disabled); in both I got OOM. Merge policy is Tiered (max segment size of 5000, max merged at once = 2, merge factor = 12). Any idea for more tests?
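A minimal sketch of the kind of client-side throttling Rebecca describes, assuming SolrJ 4.x; the batch size, pause length, and class name are illustrative choices, not values from the thread:

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ThrottledIndexer {
    private static final int BATCH_SIZE = 500;  // keep each request modest
    private static final long PAUSE_MS = 200;   // crude pacing between requests

    // Sends docs in small batches with a pause in between, so the server
    // never receives one huge request or an unthrottled stream of them.
    public static void index(SolrServer server, List<SolrInputDocument> docs)
            throws Exception {
        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (SolrInputDocument doc : docs) {
            batch.add(doc);
            if (batch.size() >= BATCH_SIZE) {
                server.add(batch);
                batch.clear();
                Thread.sleep(PAUSE_MS);
            }
        }
        if (!batch.isEmpty()) {
            server.add(batch);  // flush the remainder
        }
    }
}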
Re: Solr on S3FileSystem, Kosmos, GlusterFS, etc….
Hi Paul. I'm not using it on S3 -- but yes, I don't think S3 would be ideal for Solr at all. There are several other Hadoop Compatible File Systems, however, some of which might be ideal for certain types of SolrCloud workloads. Anyways... I would love to see a Solr wiki page on FileSystem compatibility, possibly an entry linking here: https://wiki.apache.org/hadoop/HCFS. In the meantime, I will update this thread if I find anything interesting when we increase load size.

On Wed, Jun 25, 2014 at 1:34 AM, Paul Libbrecht p...@hoplahup.net wrote:
I've always been under the impression that file-system access speed is crucial for Lucene-based storage and have always advocated not using NFS for that (for which we saw slowness of a factor of 5, approximately). Has any performance measurement been made for such a setting? Is FS caching suddenly getting so much better that it is not a problem? Also, as far as I know S3 bills by the amount of (giga-)bytes exchanged…. this gives plenty of room, but if each start needs to exchange a big part of the index from the storage to the Solr server because of cache filling, it looks like it won't be that cheap. Thanks for the experience report. paul

On 25 juin 2014, at 07:16, Jay Vyas jayunit100.apa...@gmail.com wrote:
Hi Solr! I got this working. Here's how: with the example Jetty runner, you can extract the tarball and go to the examples/ directory, where you can launch an embedded core. Then, find the solrconfig.xml file. Edit it to contain the following XML:

<directoryFactory name="DirectoryFactory" class="org.apache.solr.core.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">myhcfs:///solr</str>
  <str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
</directoryFactory>

The confdir is important: that is where you will have something like a core-site.xml that defines all the parameters for your filesystem (fs.defaultFS, fs.mycfs.impl…. and so on). This tells Solr, when launched, to use myhcfs as the underlying file store. You also should make sure that the jar for your plugin (in our case GlusterFS, but Hadoop will reference it by looking up the dynamically generated parameters that come from the base URI myhcfs) and its classes are on the classpath, and that the hadoop-common jar is also there (some HCFS shims will need FilterFileSystem to run correctly, which is only in hadoop-common.jar). So - how to modify the running Solr core's classpath? You can update the solrconfig.xml jar directives; there are a bunch of regular-expression templates you can modify in the examples/.../solrconfig.xml file. You can also copy the jars in at runtime, to be really safe. Once your example core with the gluster configuration is set up, launch it with the following properties:

java -Dsolr.directoryFactory=HdfsDirectoryFactory \
     -Dsolr.lock.type=hdfs \
     -Dsolr.data.dir=glusterfs:///solr \
     -Dsolr.updatelog=glusterfs:///solr \
     -Dlog4j.configuration=file:/opt/solr-4.4.0-cdh5.0.2/example/etc/logging.properties \
     -jar start.jar

This starts a basic Solr server on port 8983. If you are running from the simple Jetty-based examples which I've used to describe this above, then you should see the collection1 core up and running, and you should see its index sitting inside the /solr directory of your file system. Hope this helps those interested in expanding the use of SolrCloud outside of a single FS.

On Jun 23, 2014, at 6:16 PM, Jay Vyas jayunit100.apa...@gmail.com wrote:
Hi folks. Does anyone deploy Solr indices on other HCFS implementations (S3FileSystem, for example) regularly?
If so, I'm wondering: 1) Where are the docs for doing this - or examples? It seems like everything, including parameter names for DFS setup, is based around HDFS. Maybe I should file a JIRA similar to https://issues.apache.org/jira/browse/FLUME-2410 (to make the generic deployment of Solr on any file system explicit/obvious). 2) Are there any interesting requirements (i.e. createNonRecursive, atomic mkdirs, sharing, blocking expectations, etc.) which need to be implemented? -- jay vyas
How much free disk space will I need to optimize my index
Hi, I need to de-fragment my index. My question is: how much free disk space do I need before I can do so? My understanding is that I need free disk space equal to 1x the size of my current un-optimized index before I can optimize it. Is this true? That is, say my index is 20 GB (un-optimized); then I must have 20 GB of free disk space to make sure the optimization is successful. The reason for this is that during optimization the index is re-written (is this the case?), and if it is already optimized, the re-write will create a new 20 GB index before it deletes the old one (is this true?), thus why there must be at least 20 GB of free disk space. Can someone help me with this or point me to a wiki on this topic? Thanks!!! - MJ
RE: How much free disk space will I need to optimize my index
-Original message-
From: johnmu...@aol.com
Sent: Wednesday 25th June 2014 20:13
To: solr-user@lucene.apache.org
Subject: How much free disk space will I need to optimize my index

> Hi, I need to de-fragment my index. My question is, how much free disk space I need before I can do so? My understanding is, I need 1X free disk space of my current un-optimized index size before I can optimize it. Is this true?

Yes, 20 GB of FREE space to force merge an existing 20 GB index.

> That is, let say my index is 20 GB (un-optimized) then I must have 20 GB of free disk space to make sure the optimization is successful. The reason for this is because during optimization the index is re-written (is this the case?) and if it is already optimized, the re-write will create a new 20 GB index before it deletes the old one (is this true?), thus why there must be at least 20 GB free disk space. Can someone help me with this or point me to a wiki on this topic? Thanks!!! - MJ
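For reference, a forced merge ("optimize") can be triggered through the update handler; host and core name here are illustrative:

curl 'http://localhost:8983/solr/collection1/update?optimize=true'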
Re: Double cast exception with grouping and sort function
Can you provide some sample data to demonstrate the problem? (ideally using the 4.x example configs - but if you can't reproduce with that then providing your own configs would be helpful)

I repro'd using the example config (with sharding). I was missing one necessary condition: the schema needs a * dynamic field. It looks like serializeSearchGroup matches the sort expression as the * field, thus marshalling the double as TextField. Should I enter a ticket with the full repro? Thanks, Nate

On Tue, Jun 24, 2014, Chris Hostetter hossman_luc...@fucit.org wrote:

: I recently tried upgrading our setup from 4.5.1 to 4.7+, and I'm
: seeing an exception when I use (1) a function to sort and (2) result
: grouping. The same query works fine with either (1) or (2) alone.
: Example below.

Did you modify your schema in any way when upgrading? Can you provide some sample data to demonstrate the problem? (ideally using the 4.x example configs - but if you can't reproduce with that then providing your own configs would be helpful) I was unable to reproduce doing a quick sanity check using the example with a shards param to force a distrib query...

http://localhost:8983/solr/select?q=*:*&shards=localhost:8983/solr&sort=sum%281,1%29%20desc&group=true&group.field=inStock

It's possible that the distributed grouping code has a bug in it related to the marshalling of sort values and I'm just not tickling that bug with my quick check... but if I remember correctly, work was done to fix grouped sorting to correctly deal with this when FieldType.marshalSortValue was introduced.

: Example (v4.8.1):
: {
:   "responseHeader": {
:     "status": 500,
:     "QTime": 14,
:     "params": {
:       "sort": "sum(1,1) desc",
:       "indent": "true",
:       "q": "title:solr",
:       "_": "1403586036335",
:       "group.field": "type",
:       "group": "true",
:       "wt": "json"
:     }
:   },
:   "error": {
:     "msg": "java.lang.Double cannot be cast to org.apache.lucene.util.BytesRef",
:     "trace": "org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: java.lang.Double cannot be cast to org.apache.lucene.util.BytesRef",
:     "code": 500
:   }
: }
:
: From the log:
:
: org.apache.solr.common.SolrException; null:java.lang.ClassCastException: java.lang.Double cannot be cast to org.apache.lucene.util.BytesRef
:   at org.apache.solr.schema.FieldType.marshalStringSortValue(FieldType.java:981)
:   at org.apache.solr.schema.TextField.marshalSortValue(TextField.java:176)
:   at org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.serializeSearchGroup(SearchGroupsResultTransformer.java:125)
:   at org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.transform(SearchGroupsResultTransformer.java:65)
:   at org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.transform(SearchGroupsResultTransformer.java:43)
:   at org.apache.solr.search.grouping.CommandHandler.processResult(CommandHandler.java:193)

-Hoss http://www.lucidworks.com/
Re: Double cast exception with grouping and sort function
: I repo'd using the example config (with sharding). I was missing one : necessary condition: the schema needs a * dynamic field. : It looks like serializeSearchGroup matches the sort expression as the : * field, thus marshalling the double as TextField. : : Should I enter a ticket with the full repro? yes please -- i remember a similar problem coming up in the past, and i know we account for it in the distributed sorting tests (i remember adding it) but i guess we missed an edge case here with the distributed group. -Hoss http://www.lucidworks.com/
suggest not working 4.8.1
My configs below are not returning anything in suggest! Any pointers please?

solrconfig:

<searchComponent class="solr.SuggestComponent" name="mysuggestion">
  <lst name="suggester">
    <str name="name">mysuggestion</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookupFactory</str>
    <str name="field">mysuggestion</str> <!-- the indexed field to derive suggestions from -->
    <float name="threshold">0.0</float>
    <str name="buildOnCommit">true</str>
    <!-- <str name="sourceLocation">american-english</str> -->
  </lst>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">mysuggestion</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components">
    <str>mysuggestion</str>
  </arr>
</requestHandler>

schema:

<fieldType name="textspell" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
  </analyzer>
</fieldType>

response (EMPTY!):

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">15</int>
  </lst>
</response>
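For what it's worth, a suggester configured like this is normally queried through the /suggest handler shown above, and the TST dictionary has to have been built before anything is returned (buildOnCommit=true only fires after a commit; a build can also be forced per request). An illustrative request, with host and port assumed:

http://localhost:8983/solr/suggest?q=sugg&spellcheck=true&spellcheck.build=true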
Re: Adding router.field property to an existing collection.
Hi Modassar, I ran into the same issue (Solr 4.8.1) with an existing collection set to implicit routing but with no router.field defined. I managed to set the router.field by modifying /clusterstate.json and pushing it back to ZooKeeper. For instance, I use the field shard_name for routing. Now, in my /clusterstate.json, I have:

"router": {
  "name": "implicit",
  "field": "shard_name"
}

Warning: you'll probably need to reload your collection (see the Collection API) for the change to be taken into account, or, a more brutal way, restart your Solr nodes. Then you should see the update in http://localhost:8983/solr/admin/collections?action=clusterstatus. I'd be curious to know if there's a cleaner method, though, rather than modifying /clusterstate.json. Otherwise, if you want to create a collection from scratch with implicit routing and a router.field (see the Collection API), use:

http://localhost:8983/solr/admin/collections?action=CREATE&name=my_collection&router.name=implicit&router.field=shard_name

Good luck, Damien

On 05/06/2014 05:59 AM, Modassar Ather wrote:
Hi, I have a setup of two shards with embedded ZooKeeper and one collection on two Tomcat instances. I cannot use the uniqueKey, i.e. compositeId routing, for document routing, since as per my understanding it will change the uniqueKey. Another way mentioned on the Solr wiki is to use router.field, but I could not find a way of setting it in solr.xml or another configuration file. Kindly share your suggestions on: How can I use router.field in an existing collection? How can I create a collection with router.field and implicit routing enabled? Thanks, Modassar
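A sketch of the pull/edit/push cycle Damien describes, using the zkcli.sh script that ships with Solr; the ZooKeeper address and local file name are illustrative:

./zkcli.sh -zkhost localhost:2181 -cmd getfile /clusterstate.json clusterstate.json
# edit clusterstate.json: add "field": "shard_name" to the collection's "router" section
./zkcli.sh -zkhost localhost:2181 -cmd putfile /clusterstate.json clusterstate.json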
Re: POST Vs GET
Ravi, The POST should work. Here's an example that works within Tomcat:

curl -X POST --data "q=*:*&rows=1" http://localhost:8080/solr/collection1/select

Sameer.

On Mon, Jun 23, 2014 at 10:37 AM, EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) external.ravi.tamin...@us.bosch.com wrote:
Hi, I am executing a Solr query that runs 10 to 12 lines with all the boosting and conditions. I changed the HTTP method from GET to POST, since POST doesn't have any restriction on size, but I am getting an error. I am using Tomcat 7. Is there anywhere we need to specify in Tomcat to accept POST? FYI, from my Jetty Solr version everything works fine. Thanks, Ravi

--
Sameer Maggon
http://measuredsearch.com
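One Tomcat setting that commonly limits POSTed queries is the connector's maxPostSize attribute, which caps form-encoded POST bodies and defaults to 2 MB. An illustrative server.xml connector (the values are examples only, not a recommendation):

<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           maxPostSize="2097152"
           redirectPort="8443"/>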
Facet for calculated Column
Hi, Is it possible to get a facet count for calculated values, along with regular columns? E.g., I have Price and MSRP, and I would like to get how many are on sale (Price < MSRP):

Onsale (10)
Jeans (20)
Shirts (50)

Above, Jeans and Shirts are in schema.xml, so I can add them to the facet fields. How can I get Onsale in the same hit? Thanks, Ravi
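A hedged sketch of one common way to do this (untested; Price and MSRP are the field names from the post, the facet.field name is illustrative): add a facet.query that uses a function range query to count documents where Price - MSRP is negative, alongside the regular field facets in the same request:

facet=true&facet.field=Category&facet.query={!frange u=0 incu=false}sub(Price,MSRP)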
Re: How to extend the behavior of a common text field (such as text_general) to recognize regex
Thanks, I tried your suggestion today.

1. Define a text_num fieldType:

<fieldType name="text_num" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="\s*[0-9][0-9-\s]*[0-9]?\s*" group="0"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>

2. Define a new text field to capture numerical data and link it to the text field via a copyField:

<field name="text_il" type="text_num" indexed="true" stored="false" multiValued="true"/>
<copyField source="text" dest="text_il" maxChars="3"/>

3. Restart the server and reindex my test data.

As you can see from a simple analysis test on text copied from my test document (see screenshot), the field and the regex work as expected: http://i.imgur.com/o4y2Q9u.png

However, when I try to use the same query (for the text_il field, not even trying to combine queries across fields) with the edismax parser, I don't get any hits. Also, when I searched the forums and JIRA, I came across these two:

https://issues.apache.org/jira/browse/SOLR-6009
http://lucene.472066.n3.nabble.com/Regex-with-local-params-is-not-working-tt4138257.html

So my questions are: 1. Do the dismax/edismax parsers even support regex syntax? 2. Am I doing something wrong?

Results: a regex using the default parser works:

{
  "responseHeader": {
    "status": 0,
    "QTime": 4,
    "params": {
      "indent": "true",
      "q": "text_il:/.*[7-8].*/",
      "_": "1403729219835",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 1,
    "start": 0,
    "docs": [
      {
        "id": "1",
        "content_type": "parentDocument",
        "_version_": 1471911225402065000
      }
    ]
  }
}

Whereas using the edismax parser, it doesn't return any hits. I used this link as a guide to forming my queries: http://lucidworks.lucidimagination.com/display/solr/The+Extended+DisMax+Query+Parser

{
  "responseHeader": {
    "status": 0,
    "QTime": 3,
    "params": {
      "lowercaseOperators": "true",
      "pf": "text_il",
      "indent": "true",
      "q": "/.*[7-8].*/",
      "qf": "text_il",
      "_": "1403729594057",
      "stopwords": "true",
      "wt": "json",
      "defType": "edismax"
    }
  },
  "response": {
    "numFound": 0,
    "start": 0,
    "docs": []
  }
}

Debug-enabled query at https://gist.github.com/anonymous/625e7669918deba4a071

Thanks

On Tue, Jun 24, 2014 at 7:35 PM, Alexandre Rafalovitch arafa...@gmail.com wrote:
What about copyField'ing the content into a second field where you apply the alternative processing, then eDismax searching both? You don't have to store the other field, just index it. Regards, Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency

On Wed, Jun 25, 2014 at 5:55 AM, Vinay B, vybe3...@gmail.com wrote:
Sorry, my previous post got sent prematurely. Here is the complete post: This is easy if I only need to define a custom field to identify the desired patterns (numbers, in my case). For example, I could define a field thus:

<!-- A text field that identifies numerical entities -->
<fieldType name="text_num" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="\s*[0-9][0-9-]*[0-9]?\s*" group="0"/>
  </analyzer>
</fieldType>

Input: hello, world bye 123-45 abcd sdfssdf --- aaa
Output: 123-45

However, I also want to retain the behavior of the default text_general field, that is, recognize the usual text tokens (hello, world, bye, etc...). What is the best way to achieve this? I've looked at PatternCaptureGroupFilterFactory (http://lucene.apache.org/core/4_7_0/analyzers-common/org/apache/lucene/analysis/pattern/PatternCaptureGroupFilterFactory.html) but I suspect that it too is subject to the behavior of the prior tokenizer (which for text_general is StandardTokenizerFactory). Thanks
Re: Calculating filterCache size
Thank you for your help! I wrote an article, Performance Testing Solr filterCache: Shedding Light on Apache Solr filterCache for VuFind, that I am hoping to get published: https://docs.google.com/document/d/1vl-nmlprSULvNZKQNrqp65eLnLhG9s_ydXQtg9iML10 Anyone can comment, and I would highly appreciate it! My biggest fear is to have something inaccurate about filterCache, or Solr in general, in there. Any and all suggestions welcome! Thanks again, Ben

On Thu, Jun 19, 2014 at 3:42 PM, Erick Erickson erickerick...@gmail.com wrote:
That's specific to using facet.method=enum, but I do admit it's easy to miss that. I added a note about that though... Thanks for pointing that out!

On Thu, Jun 19, 2014 at 9:38 AM, Benjamin Wiens benjamin.wi...@gmail.com wrote:
Thanks to both of you. Yes, the mentioned config is illustrative; we decided on 512 after thorough testing. However, when you google Solr filterCache, the first link is the community wiki, which has a config even higher than the illustration and quite different from the official reference guide. It might be a good idea to change this unless there's a very small index. http://wiki.apache.org/solr/SolrCaching#filterCache

<filterCache class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="4096"/>

On Thu, Jun 19, 2014 at 9:48 AM, Erick Erickson erickerick...@gmail.com wrote:
Ben: As Shawn says, you're on the right track... Do note, though, that a 10K size here is probably excessive, YMMV of course. And an autowarm count of 5,000 is almost _certainly_ far more than you want. All these fq clauses get re-executed whenever a new searcher is opened (soft commit or hard commit with openSearcher=true). I realize this may just be illustrative. Is this your actual setup? And if so, what is your motivation for a 5,000 autowarm count? Best, Erick

On Wed, Jun 18, 2014 at 11:42 AM, Shawn Heisey s...@elyograg.org wrote:
On 6/18/2014 10:57 AM, Benjamin Wiens wrote:
Thanks Erick! So let's say I have a config of

<filterCache class="solr.FastLRUCache" size="10000" initialSize="10000" autowarmCount="5000"/>

MaxDocuments = 1,000,000. So according to your formula, filterCache should roughly have the potential to consume this much RAM:

((1,000,000 / 8) + 128) * 10,000 = 1,251,280,000 bytes / 1,000 = 1,251,280 KB / 1,000 = 1,251.28 MB / 1,000 = 1.25 GB

Yes, this is essentially correct. If you want to arrive at a number that's more accurate for the way that OS tools will report memory, you'll divide by 1024 instead of 1000 for each of the larger units. That results in a size of 1.16 GB instead of 1.25. Computers think in powers of 2; dividing by 1000 assumes a bias toward how people think, in powers of 10. It's the same thing that causes your computer to report 931 GB for a 1 TB hard drive. Thanks, Shawn
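Spelling out Shawn's powers-of-two version of the same calculation:

1,251,280,000 bytes / 1024 ≈ 1,221,953 KiB
1,221,953 KiB / 1024 ≈ 1,193.3 MiB
1,193.3 MiB / 1024 ≈ 1.165 GiB ≈ 1.16 GB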
Am I being dense? Or are real-time gets not exposed in SolrJ?
The subject line kind of says it all... this is the latest thing we have noticed that doesn't seem to have made it in. Am I missing something?

Other awkwardness was doing a deleteByQuery against a collection other than the defaultCollection, and trying to share a CloudSolrServer among different objects that were writing and reading against multiple collections. We managed to hack around the former by doing it with an UpdateRequest. I'm wondering if a valid solution to the latter is actually to create one CloudSolrServer, rip the zkStateReader out of it, and stuff it in subsequent ones. Is that a bad idea? It seems like there might be some overhead to having several going in the same process that could be avoided, but maybe I'm overcomplicating things.

Thanks,
Michael Della Bitta
Applications Developer
appinions inc. | 18 East 41st Street, New York, NY 10017
w: appinions.com
Re: SolrCloud multiple data center support
I have just created https://issues.apache.org/jira/browse/SOLR-6205. I hope the description makes sense. Thanks, Arcadius.

On 23 June 2014 18:49, Mark Miller markrmil...@gmail.com wrote:
We have been waiting for that issue to be finished before thinking too hard about how it can improve things. There have been a couple of ideas (I've mostly wanted it for improving the internal zk mode situation), but no JIRAs yet that I know of. -- Mark Miller about.me/markrmiller

On June 23, 2014 at 10:37:27 AM, Arcadius Ahouansou (arcad...@menelic.com) wrote:
On 3 February 2014 22:16, Daniel Collins danwcoll...@gmail.com wrote:
One other option is in ZK trunk (but not yet in a release): the ability to dynamically reconfigure ZK ensembles (https://issues.apache.org/jira/browse/ZOOKEEPER-107). That would give the ability to create new ZK instances in the event of a DC failure and reconfigure the Solr Cloud without having to reload everything. That would help to some extent.

ZOOKEEPER-107 has now been implemented. I checked the Solr JIRA and it seems there is nothing for multi-data-center support. Do we need to create a ticket, or is there already one? Thanks, Arcadius.

--
Arcadius Ahouansou
Menelic Ltd | Information is Power
M: 07908761999
W: www.menelic.com
RC for 4.9 Solr Ref-Guide imminent, please help look for formatting mistakes
FYI: The current plan is to call a vote for the 4.9 Solr Ref Guide sometime tomorrow (2014-06-26) morning (~11AM UTC-0500 maybe?) The main thing we are currently waiting on is that sarowe is working on a simple page to document using Solr with SSL -- but now would be a great time for folks to help review the existing documentation for typos or formatting glitches. If you have some time, please download the PDF from the following and review it as much as you can -- if you notice any problems, please feel free to reply to this message (with page # please), or post a comment on the affected cwiki page (if you know what it is)... https://people.apache.org/~hossman/tmp/solr_4.9_shrunk__250614-2148-22971.pdf Thanks! -Hoss http://www.lucidworks.com/
Re: Am I being dense? Or are real-time gets not exposed in SolrJ?
On 6/25/2014 3:27 PM, Michael Della Bitta wrote:
The subject line kind of says it all... this is the latest thing we have noticed that doesn't seem to have made it in. Am I missing something?

This code:

SolrServer server;
server = new HttpSolrServer("http://server:port/solr/corename");
((HttpSolrServer) server).setMaxRetries(1);
((HttpSolrServer) server).setConnectionTimeout(5000);
SolrQuery q = new SolrQuery();
q.setRequestHandler("/get");
q.set("id", "ai_spa509997");
System.out.println(q);
QueryResponse r = server.query(q);
System.out.println(r);

Produced this output:

qt=%2Fget&id=ai_spa509997
{doc=SolrDocument{location=PARIS,FRANCE, photographer_id=22213, [lots_redacted]}}

Other awkwardness was doing a deleteByQuery against a collection other than the defaultCollection, and trying to share a CloudSolrServer among different objects that were writing and reading against multiple collections.

If you set the collection parameter on a request to the name of the collection you want to query/update, that should do what you're after. You'll need to do all changes with an UpdateRequest object -- the syntactic-sugar methods (add, deleteByQuery, etc.) don't handle cases where you need to set parameters on the request.

We managed to hack around the former by doing it with an UpdateRequest. I'm wondering if a valid solution to the latter is actually to create one CloudSolrServer, rip the zkStateReader out of it, and stuff it in subsequent ones. Is that a bad idea? It seems like there might be some overhead to having several going in the same process that could be avoided, but maybe I'm overcomplicating things.

Another possibility is to create multiple CloudSolrServer objects and use setDefaultCollection on each of them, but that seems like complete overkill unless you've got a small number of collections. If you are absolutely sure that you won't have multiple threads using the CloudSolrServer object, you could call setDefaultCollection before each use... but IMHO that's sloppy coding. Thanks, Shawn
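A hedged sketch of the explicit UpdateRequest approach Shawn mentions for targeting a non-default collection (the query string and collection name are illustrative; requires org.apache.solr.client.solrj.request.UpdateRequest):

UpdateRequest req = new UpdateRequest();
req.deleteByQuery("category:obsolete");        // illustrative delete query
req.setParam("collection", "othercollection"); // target a non-default collection
req.process(cloudServer);                      // cloudServer is an existing CloudSolrServer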
Sorting date fields
Hey, I try to sort my documents by creation date:

<field name="created" type="date" indexed="true" stored="false" multiValued="false"/>
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>

I see that the result is affected by sorting order (ASC/DESC change the order), but the result is not precise. For example, for query

params={mm=2&pf=tags^10+title^5&sort=created+asc&q=query&qf=tags^10+title^5&wt=javabin&version=2&defType=edismax&rows=10}

the result is like:

2007-05-14
2007-08-13
2007-05-13
2008-03-26
2008-03-19
2007-07-02
...

The general direction is ascending, but between single documents it fluctuates. Is there any way to get strict ordering? Can anybody point me to / explain the current behavior? Best, Pawel
Re: Sorting date fields
: I see that the result is affected by sorting order (ASC/DESC change the order),
: but the result is not precise. For example, for query
:
: params={mm=2&pf=tags^10+title^5&sort=created+asc&q=query&qf=tags^10+title^5&wt=javabin&version=2&defType=edismax&rows=10}

those results don't really make sense -- can you please show us the full complete output you see in your browser from this query...

mm=2&pf=tags^10+title^5&sort=created+asc&q=query&qf=tags^10+title^5&defType=edismax&rows=10&fl=id,created&wt=json&indent=true&echoParams=all

-Hoss http://www.lucidworks.com/
Crawl-Delay in robots.txt and fetcher.threads.per.queue property in Nutch
Hello All, If I set the fetcher.threads.per.queue property to more than 1, I believe the behavior would be to have that many threads per host in Nutch. In that case, would Nutch still respect the Crawl-Delay directive in robots.txt and not crawl at a faster pace than what is specified there? In short, what I am trying to ask is: is setting fetcher.threads.per.queue to 1 required for being as polite as the Crawl-Delay in robots.txt expects? Thx
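For reference, the property in question as it would be set in nutch-site.xml, with 1 being the conservative choice discussed above:

<property>
  <name>fetcher.threads.per.queue</name>
  <value>1</value>
</property>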
Re: SOLR 4.4 - Slave always replicates full index
Dominik: If you optimize your index, then the entire thing will be replicated from the master to the slave every time. In general, optimizing isn't necessary even though it sounds like something that's A Good Thing. I suspect that's the nub of the issue. Erick

On Tue, Jun 24, 2014 at 11:14 PM, Dominik Siebel m...@dsiebel.de wrote:
Hey Suresh, could you get a little more specific about what solved your problem here? I am currently facing the same problem and am trying to find a proper solution. Thanks! ~ Dom

2014-02-28 7:46 GMT+01:00 sureshrk19 sureshr...@gmail.com:
Thanks Shawn and Erick. I followed the SOLR configuration document and modified my index strategy. It looks good now; I haven't seen any problems in the last week. Thanks for your suggestions.
Re: SOLR 4.4 - Slave always replicates full index
Note that this problem can also happen if the RealTimeGet handler is missing from your solrconfig.xml, because PeerSync will always fail and a full replication will be triggered. I added warn-level logging to complain when this happens, but it is possible that you are using an older version of Solr which does not have that logging.

On Thu, Jun 26, 2014 at 5:27 AM, Erick Erickson erickerick...@gmail.com wrote:
Dominik: If you optimize your index, then the entire thing will be replicated from the master to the slave every time. In general, optimizing isn't necessary even though it sounds like something that's A Good Thing. I suspect that's the nub of the issue. Erick

--
Regards, Shalin Shekhar Mangar.
Spellchecker causing 500 (ISE)
Hi, We are getting results for the query, but the spellchecker component is returning a 500. Please help us out.

*query*: http://localhostt:8111/solr/srch/select?q=malerkotla&qt=search

*Error:*

trace: java.lang.StringIndexOutOfBoundsException: String index out of range: -5
	at java.lang.AbstractStringBuilder.replace(AbstractStringBuilder.java:789)
	at java.lang.StringBuilder.replace(StringBuilder.java:266)
	at org.apache.solr.spelling.SpellCheckCollator.getCollation(SpellCheckCollator.java:235)
	at org.apache.solr.spelling.SpellCheckCollator.collate(SpellCheckCollator.java:92)
	at org.apache.solr.handler.component.SpellCheckComponent.addCollationsToResponse(SpellCheckComponent.java:230)
	at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:197)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
	at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
	at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023)
	at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
	at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

The suggestions when I query with the separate words (maler kotla): http://localhostt:8111/solr/srch/select?q=maler%20kotla&qt=search

"facet_counts": {
  "facet_queries": {},
  "facet_fields": {
    "city": ["maler kotla", 2, "ludhiana", 1],
    "datatype": ["company", 2, "product", 1]
  },
  "facet_dates": {},
  "facet_ranges": {}
},
"spellcheck": {
  "suggestions": [
    "maler", {
      "numFound": 7,
      "startOffset": 0,
      "endOffset": 5,
      "origFreq": 9,
      "suggestion": [
        {"word": "maker", "freq": 19751},
        {"word": "mailer", "freq": 1439},
        {"word": "mayer", "freq": 271},
        {"word": "mater", "freq": 214},
        {"word": "malar", "freq": 183},
        {"word": "maier", "freq": 123},
        {"word": "male", "freq": 32169}
      ]
    },
    "kotla", {
      "numFound": 3,
      "startOffset": 6,
      "endOffset": 11,
      "origFreq": 30,
      "suggestion": [
        {"word": "koala", "freq": 282},
        {"word": "kota", "freq": 5355},
        {"word": "kola", "freq": 861}
      ]
    },
    "correctlySpelled", true,
    "collation", "maker koala"
  ]
}

Full response for the errored URL http://localhostt:8111/solr/srch/select?q=malerkotla&qt=search :

{
  "responseHeader": {
    "status": 500,
    "QTime": 49
  },
  "grouped": {
    "glusrid": {
      "matches": 2802,
      "ngroups": 314,
      "groups": []
    }
  },
  "facet_counts": {
    "facet_queries": {},
    "facet_fields": {
      "city": ["maler kotla", 311, "bengaluru", 1, "ludhiana", 1, "mohali", 1],
      "datatype": ["company", 162, "product", 146, "offer", 6]
    },
    "facet_dates": {},
    "facet_ranges": {}
  },
  "error": {
    "msg": "String index out of range: -5",
    "trace": "java.lang.StringIndexOutOfBoundsException: String index out of range: -5\n\tat java.lang.AbstractStringBuilder.replace(AbstractStringBuilder.java:789)\n\tat java.lang.StringBuilder.replace(StringBuilder.java:266)\n\tat org.apache.solr.spelling.SpellCheckCollator.getCollation(SpellCheckCollator.java:235)\n\tat org.apache.solr.spelling.SpellCheckCollator.collate(SpellCheckCollator.java:92)\n\tat