cores/shards with no leader
Hello,

We're running Solr 4.2.0 and recently converted to SolrCloud. We've got 16 cores, each with 1 shard, 3 ZooKeeper instances, and 4 replicas of each core.

We're suddenly having trouble with very slow Tomcat restarts (15-45 minutes), and even when we can get a few replicas up, we aren't seeing a leader for many of our cores. I tried issuing a reload command through the core admin, but it fails because there is no leader. Is there any way to cause an election? Restarting Tomcat on individual servers in the cluster doesn't seem to help. We do have some cores that are serving requests properly and would prefer not to shut down the whole cluster if possible -- this is a production system.

In addition, some cores are reporting a peculiar error, stack trace below. The cores that report this problem seem to be completely down across all replicas.

ERROR org.apache.solr.servlet.SolrDispatchFilter - null:org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1415)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1527)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1304)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1239)
    at org.apache.solr.request.SolrQueryRequestBase.getSearcher(SolrQueryRequestBase.java:94)
    at org.apache.solr.servlet.cache.HttpCacheHeaderUtil.calcLastModified(HttpCacheHeaderUtil.java:145)
    at org.apache.solr.servlet.cache.HttpCacheHeaderUtil.doCacheHeaderValidation(HttpCacheHeaderUtil.java:218)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:334)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:581)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:879)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: Already closed
    at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:237)
    at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:222)
    at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:244)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1326)

Has anyone seen either of these issues before? I'm having trouble finding any information on either situation.

Thanks,
-Cat
Re: cores/shards with no leader
Thanks Shawn. We'll give an upgrade a try and see if that helps.

-Cat

On 08/29/2013 04:32 PM, Shawn Heisey wrote:
> On 8/29/2013 2:16 PM, Cat Bieber wrote:
>> We're running solr 4.2.0 and recently converted to SolrCloud. We've got 16 cores, each with 1 shard. 3 zookeeper instances, 4 replicas of each core. We're suddenly having trouble ...
>
> Solr 4.2.0 had a number of bugs. They were severe enough that a 4.2.1 version was quickly released afterwards. It should be possible to upgrade without changing your config. You should probably upgrade to 4.4, but that would be less straightforward.
Re: alphanumeric interval
I did not use facets in my implementation, so I don't have any facet-specific code snippet that would be helpful to you. However, if your handler extends SearchHandler and calls super.handleRequestBody(), it should be running the facet component code. You have access to the SolrQueryResponse built by it, and may be able to get the data out of that object.

You'll need to look at the javadoc for NamedList. I also found it helpful to dump the list in debug statements so I could examine its structure and contents. I suspect you need something like rsp.getValues().get("facet_counts") to get the facet data, but I haven't tested it.

-Cat Bieber

On 07/05/2012 04:32 AM, AlexR wrote:
> Hi,
>
> thanks a lot for your answer, and sorry for my late response.
>
> This is my first time writing a Solr plugin. I already have a plugin with an empty handleRequestBody() method and I'm able to call it.
>
> I need the list of values for the faceted field person (facet.field=person) in my method, but I don't know how to get it. Do you have a code snippet of your implementation?
>
> thx
> Alex
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/alphanumeric-interval-tp3990965p3993148.html
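As a rough illustration of the lookup suggested above, the sketch below walks a nested name/value structure shaped like the facet section of a Solr response. It is a stand-in under assumptions, not tested against Solr: plain Maps play the role of NamedList so that it runs without Solr on the classpath, the class and method names (FacetWalk, fieldCounts) are hypothetical, and the facet_counts -> facet_fields -> field layout should be confirmed by dumping the real response as described in the message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in sketch: plain Maps play the role of Solr's NamedList so this runs
// without Solr on the classpath. In a real handler the outer object would be
// whatever rsp.getValues().get("facet_counts") returns (assumed layout).
class FacetWalk {

    // Returns term -> count for one facet field, or an empty map if absent.
    @SuppressWarnings("unchecked")
    static Map<String, Integer> fieldCounts(Map<String, Object> facetCounts, String field) {
        Object fields = facetCounts.get("facet_fields");
        if (!(fields instanceof Map)) {
            return new LinkedHashMap<String, Integer>();
        }
        Object counts = ((Map<String, Object>) fields).get(field);
        if (!(counts instanceof Map)) {
            return new LinkedHashMap<String, Integer>();
        }
        return (Map<String, Integer>) counts;
    }

    public static void main(String[] args) {
        // Made-up sample data for the "person" facet field.
        Map<String, Integer> person = new LinkedHashMap<String, Integer>();
        person.put("adams", 12);
        person.put("baker", 7);
        Map<String, Object> facetFields = new LinkedHashMap<String, Object>();
        facetFields.put("person", person);
        Map<String, Object> facetCounts = new LinkedHashMap<String, Object>();
        facetCounts.put("facet_fields", facetFields);

        for (Map.Entry<String, Integer> e : fieldCounts(facetCounts, "person").entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

In an actual plugin the same two-level lookup would be done with NamedList.get(), and the defensive type checks are worth keeping because a request without facet=true would not carry this section.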
Re: alphanumeric interval
I had a similar issue. The solution I ended up using was a custom RequestHandler that extends SearchHandler. In handleRequestBody() it calls super.handleRequestBody(req, rsp), looks for a pageSize parameter (25 in your example), and loops over the array of results inside the response, pulling out the ones I want and building a new array. It performs well enough, and avoids downloading a large result set to the client.

It sounds like you're more interested in the number of buckets whereas I needed a specific page size, but you could just as easily pass in the number of buckets in your param. This solution does require that you be willing to write some Java code and add it to your Solr deploy as a plugin. Then you configure your new request handler in solrconfig.xml, where you can give it a default page size or bucket size.

I actually discussed this with some people at the training before Lucene Revolution and there wasn't a distinct right answer. Because I was looking for a three-letter prefix for my ranges, one suggestion was to add the prefix to the Solr index and facet on it. Then, by adding up counts, you could tell what the endpoints of an interval were. That would still require doing some calculations on the client side, and it won't be useful if you have full values with few duplicates.

-Cat Bieber

On 06/22/2012 09:32 AM, AlexR wrote:
> I need even sized buckets and their borders.
> 100/4 = 25 entries
> Border for first interval is entry 1 and entry 25 in this case
> Alex - John i don't want to load all names and calculate the borders on the client. Is there a way to get the borders from Solr?
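The post-processing step described in this reply (take the sorted results, cut them into pages of a fixed size, and keep only the first and last entry of each page) might look roughly like the sketch below. It is a minimal illustration under assumptions: BucketBorders and borders are hypothetical names, the Solr request handler plumbing is left out so the code runs standalone, and the input list is assumed to be already sorted in the same order the index uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Solr: given values already sorted the way
// the index sorts them, report the first and last entry of each bucket.
class BucketBorders {

    // Returns one {first, last} pair per bucket of at most bucketSize entries.
    static List<String[]> borders(List<String> sorted, int bucketSize) {
        if (bucketSize <= 0) {
            throw new IllegalArgumentException("bucketSize must be positive");
        }
        List<String[]> result = new ArrayList<String[]>();
        for (int start = 0; start < sorted.size(); start += bucketSize) {
            // The last bucket may be shorter than bucketSize.
            int end = Math.min(start + bucketSize, sorted.size()) - 1;
            result.add(new String[] { sorted.get(start), sorted.get(end) });
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample names, already in sorted order.
        List<String> names = Arrays.asList(
                "adams", "baker", "clark", "davis", "evans", "frank", "green");
        for (String[] b : borders(names, 3)) {
            System.out.println(b[0] + " .. " + b[1]);
        }
    }
}
```

With 100 entries and a bucket size of 25, this yields four pairs, the first covering entry 1 and entry 25 as in the example quoted above. To work from a bucket count instead, the caller could derive the size as the total divided by the number of buckets, rounded up.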
phrase query and string/keyword tokenizer
I have documents that are word definitions (basically an online dictionary) that can have alternate titles. For example, the document entitled "Read-only memory" might have an alternate title of "ROM". In search results, I want to boost documents with an alternate title that is a case-insensitive exact match for the query text -- e.g. "rom" should work as well. I'm running Solr 3.6 and using edismax.

I've gone through a few iterations of this. What I have working best so far is a multi-valued text field for the alternate titles with a big boost:

    <fieldType name="lowerCaseSort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
      <analyzer>
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.TrimFilterFactory"/>
      </analyzer>
    </fieldType>

    <field name="bestMatchTitle" type="lowerCaseSort" indexed="true" stored="false" multiValued="true"/>

This produces great results with single-word searches like the ROM example above. It runs into problems with a multi-word alternate title like "Blue Tooth". I have read some of the prior discussions about this, regarding how the query is parsed based on spaces before it gets to the keyword tokenizer for the field type. The question I have is about phrase queries in this case.

My request handler has:

    <str name="qf">bestMatchTitle^20 title^5 summary^3 metaDescription^1.5 body^1 author^0.5</str>
    <str name="pf">bestMatchTitle^20 title^5 summary^3 metaDescription^1.5 body^1 author^0.5</str>

When I run a query, I get this:

    +((DisjunctionMaxQuery((metaDescription:blue^1.5 | summary:blue^3.0 | author:blue^0.5 | body:blue | title:blue^5.0 | bestMatchTitle:blue^20.0)~0.01)
       DisjunctionMaxQuery((metaDescription:tooth^1.5 | summary:tooth^3.0 | author:tooth^0.5 | body:tooth | title:tooth^5.0 | bestMatchTitle:tooth^20.0)~0.01))~2)
    DisjunctionMaxQuery((metaDescription:"blue tooth"~100^1.5 | summary:"blue tooth"~100^3.0 | body:"blue tooth"~100 | title:"blue tooth"~100^5.0)~0.01)

It looks like the phrase isn't being matched against my bestMatchTitle field. It also isn't matched against author, which is type string. So do phrases only get matched against certain field types?

When I put the quotes in the query text:

    /select/?qt=best-match&q="blue+tooth"&debugQuery=on

It builds the query I was hoping to get:

    +DisjunctionMaxQuery((metaDescription:"blue tooth"^1.5 | summary:"blue tooth"^3.0 | author:blue tooth^0.5 | body:"blue tooth" | title:"blue tooth"^5.0 | bestMatchTitle:blue tooth^20.0)~0.01)

But I still need the query on the individual tokens, otherwise it eliminates results that may be good hits. So far, any way I have tried to combine the two queries either opens up matching a ton of documents that shouldn't really match (e.g. total found goes from 24 to 4800+ documents) or doesn't match the one I want, giving poor results.

Does anyone have suggestions for how I can convince the phrase query to match against my bestMatchTitle field, or change the query text I'm passing in to combine these two queries and get the boost I want? Or is there another approach altogether that I'm missing?

Thanks for any help with this.

-Cat Bieber
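One way the two queries might be combined, offered as an untested assumption rather than a confirmed fix for Solr 3.6, is to leave q as the plain per-token text and pass the quoted phrase against bestMatchTitle separately through the dismax/edismax bq (boost query) parameter. The sketch below only builds the request parameters. ExactTitleBoost, quotePhrase, and params are hypothetical names, and the boost value of 20 is simply copied from the qf setting above. Whether the resulting scores are comparable to a qf/pf boost has not been verified.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds request parameters that keep the normal
// per-token query in q and add an exact-title boost through bq.
class ExactTitleBoost {

    // Escapes backslashes and double quotes so the text is safe inside a quoted phrase.
    static String quotePhrase(String userText) {
        String escaped = userText.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    // Returns URL-encoded "q=...&bq=..." with the phrase boost on the given field.
    static String params(String userText, String field, int boost) {
        String bq = field + ":" + quotePhrase(userText.trim()) + "^" + boost;
        try {
            return "q=" + URLEncoder.encode(userText, "UTF-8")
                 + "&bq=" + URLEncoder.encode(bq, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints: q=blue+tooth&bq=bestMatchTitle%3A%22blue+tooth%22%5E20
        System.out.println(params("blue tooth", "bestMatchTitle", 20));
    }
}
```

The idea behind this shape is that bq adds an optional clause, so documents matching only the individual tokens would still be returned while an exact alternate-title match gets extra weight. The query text itself is not escaped for query-parser syntax here, which a real client would need to consider.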
Re: String ordering appears different with sort vs range query
Thanks for looking at this. I'll see if we can sneak an upgrade to 3.6 into the project to get this working.

-Cat

On 04/20/2012 12:03 PM, Erick Erickson wrote:
> BTW, nice problem statement...
>
> Anyway, I see this too in 3.5. I do NOT see this in 3.6 or trunk, so it looks like a bug that got fixed in the 3.6 time-frame. Don't have the time right now to go back over the JIRAs to see...
>
> Best
> Erick
>
> On Thu, Apr 19, 2012 at 3:39 PM, Cat Bieber <cbie...@techtarget.com> wrote:
>> I'm trying to use a Solr query to find the next title in alphabetical order after a given string. The issue I'm facing is that the sort param seems to sort non-alphanumeric characters in a different order from the ordering used by a range filter in the q or fq param. ...
String ordering appears different with sort vs range query
I'm trying to use a Solr query to find the next title in alphabetical order after a given string. The issue I'm facing is that the sort param seems to sort non-alphanumeric characters in a different order from the ordering used by a range filter in the q or fq param. I can't filter the non-alphanumeric characters out because they're integral to the data and it would not be a useful ordering if it were based only on the alphanumeric portion of the strings. I'm running Solr version 3.5.

In my current approach, I have a field that is a unique string for each document:

    <fieldType name="lowerCaseSort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
      <analyzer>
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.TrimFilterFactory"/>
      </analyzer>
    </fieldType>

    <field name="uniqueSortString" type="lowerCaseSort" indexed="true" stored="true"/>

I'm passing the value for the current document in a range to query everything after the current string, sorted ascending:

    /select?fl=uniqueSortString&sort=uniqueSortString+asc&q=uniqueSortString:[$1+ZX+Spectrum+HOBETA+format+file+TO+*]&wt=xml&rows=5&version=2.2

In theory, I expect the first result to be the current item and the second result to be the next one. However, I'm finding that the sort and the range filter seem to use different ordering:

    <result name="response" numFound="448" start="0">
      <doc>
        <str name="uniqueSortString">$1 ZX Spectrum - Emulator</str>
      </doc>
      <doc>
        <str name="uniqueSortString">$1 ZX Spectrum HOBETA format file</str>
      </doc>
      <doc>
        <str name="uniqueSortString">$1 ZX Spectrum Hobetta Picture Format</str>
      </doc>
      <doc>
        <str name="uniqueSortString">$? TR-DOS ZX Spectrum file in HOBETA format</str>
      </doc>
      <doc>
        <str name="uniqueSortString">$A AutoCAD Autosave File ( Autodesk Inc.)</str>
      </doc>
    </result>

Based on the results ordering, sort believes "-" precedes "H", but the range filter should have excluded that first result if it ordered in the same way.

Digging through the code, I think it looks like sorting uses String.compareTo() for ordering on a text/string field. However, I haven't been able to track down where the range filter code is. If someone can point me in the right direction to find that code I'd love to look through it. Or, if anyone has suggestions regarding a different approach or changes I can make to this query/field, that would be very helpful.

Thanks for your time.

-Cat Bieber
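The comparison below is only an illustration of how String.compareTo() orders the two values from the result list. It also tries out one possible explanation that is an assumption, not something confirmed in this thread: that the range bound reaches the index in its original case while the indexed terms have been lowercased by the field's analyzer. OrderingCheck and inRange are hypothetical names, and no Solr code is involved.

```java
import java.util.Locale;

// Illustration only: compares strings the way String.compareTo() does
// (by UTF-16 code unit), the ordering the post suspects is used for sorting.
class OrderingCheck {

    // True if an indexed term falls inside the open-ended range [lower TO *].
    static boolean inRange(String indexedTerm, String lower) {
        return indexedTerm.compareTo(lower) >= 0;
    }

    public static void main(String[] args) {
        String emulator = "$1 ZX Spectrum - Emulator";
        String bound = "$1 ZX Spectrum HOBETA format file";
        String indexed = emulator.toLowerCase(Locale.ROOT);

        // Both sides lowercased: '-' (0x2D) sorts before 'h' (0x68),
        // so the emulator entry would fall outside the range.
        System.out.println(inRange(indexed, bound.toLowerCase(Locale.ROOT)));

        // Lowercased term against a raw, mixed-case bound (assumed scenario):
        // 'z' (0x7A) sorts after 'Z' (0x5A), so the entry would be included.
        System.out.println(inRange(indexed, bound));
    }
}
```

If that assumption holds, lowercasing the bound on the client before building the range query would be a cheap experiment on 3.5.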