Error while adding copy fields to a schema
Hi, I have a requirement where I have to add some fields to the schema at run time, and after that I need to add copy fields for some of the schema fields. To add the fields to the schema I used the following REST API, which gives a success response, as shown below:

Post URL: http://localhost:8080/solr/bookindex/schema/fields
Content-type: application/json
Post Data:
[
  { "indexed": true, "name": "age", "stored": true, "type": "long" },
  { "indexed": true, "name": "sex", "stored": true, "type": "string" },
  { "indexed": true, "name": "_all", "stored": true, "type": "string", "multiValued": true }
]
Output Response:
{ "responseHeader": { "status": 0, "QTime": 202 } }

After adding these fields to the schema, when I execute the second call to add the copy fields, I get the error "Error persisting managed schema at /configs/myconf/managed-schema" in the response. Following are the details of the REST API I am using to add the copy fields, along with the error response.

Post URL: http://localhost:7070/solr/bookindex/schema/copyfields
Content-type: application/json
Post Data:
[
  { "source": "age", "dest": "_all" },
  { "source": "sex", "dest": "_all" }
]
Output Response:
{ "responseHeader": { "status": 500, "QTime": 190 },
  "error": { "msg": "Error persisting managed schema at /configs/myconf/managed-schema" } }

Trace:
org.apache.solr.common.SolrException: Error persisting managed schema at /configs/myconf/managed-schema
at org.apache.solr.schema.ManagedIndexSchema.persistManagedSchemaToZooKeeper(ManagedIndexSchema.java:166)
at org.apache.solr.schema.ManagedIndexSchema.persistManagedSchema(ManagedIndexSchema.java:83)
at org.apache.solr.schema.ManagedIndexSchema.addCopyFields(ManagedIndexSchema.java:281)
at org.apache.solr.rest.schema.CopyFieldCollectionResource.post(CopyFieldCollectionResource.java:174)
at org.restlet.resource.ServerResource.doHandle(ServerResource.java:437)
at org.restlet.resource.ServerResource.doConditionalHandle(ServerResource.java:350)
at org.restlet.resource.ServerResource.handle(ServerResource.java:952)
at org.restlet.resource.Finder.handle(Finder.java:246)
at org.restlet.routing.Filter.doHandle(Filter.java:159)
at org.restlet.routing.Filter.handle(Filter.java:206)
at org.restlet.routing.Router.doHandle(Router.java:431)
at org.restlet.routing.Router.handle(Router.java:648)
at org.restlet.routing.Filter.doHandle(Filter.java:159)
at org.restlet.routing.Filter.handle(Filter.java:206)
at org.restlet.routing.Filter.doHandle(Filter.java:159)
at org.restlet.routing.Filter.handle(Filter.java:206)
at org.restlet.routing.Filter.doHandle(Filter.java:159)
at org.restlet.engine.application.StatusFilter.doHandle(StatusFilter.java:155)
at org.restlet.routing.Filter.handle(Filter.java:206)
at org.restlet.routing.Filter.doHandle(Filter.java:159)
at org.restlet.routing.Filter.handle(Filter.java:206)
at org.restlet.engine.CompositeHelper.handle(CompositeHelper.java:211)
at org.restlet.engine.application.ApplicationHelper.handle(ApplicationHelper.java:84)
at org.restlet.Application.handle(Application.java:381)
at org.restlet.routing.Filter.doHandle(Filter.java:159)
at org.restlet.routing.Filter.handle(Filter.java:206)
at org.restlet.routing.Router.doHandle(Router.java:431)
at org.restlet.routing.Router.handle(Router.java:648)
at org.restlet.routing.Filter.doHandle(Filter.java:159)
at org.restlet.routing.Filter.handle(Filter.java:206)
at org.restlet.routing.Router.doHandle(Router.java:431)
at org.restlet.routing.Router.handle(Router.java:648)
at org.restlet.routing.Filter.doHandle(Filter.java:159)
at org.restlet.routing.Filter.handle(Filter.java:206)
at org.restlet.engine.CompositeHelper.handle(CompositeHelper.java:211)
at org.restlet.Component.handle(Component.java:392)
at org.restlet.Server.handle(Server.java:516)
at org.restlet.engine.ServerHelper.handle(ServerHelper.java:72)
at org.restlet.engine.adapter.HttpServerHelper.handle(HttpServerHelper.java:152)
at org.restlet.ext.servlet.ServerServlet.service(ServerServlet.java:1089)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:669)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:457)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:575)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at
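For anyone who wants to replay this, the two calls can be reproduced with curl. This is a sketch using the URLs and bodies from the post above; note that the post shows port 8080 for the fields call but 7070 for the copyfields call, so if those are two different nodes it is worth confirming that the second one can also write to the /configs/myconf tree in ZooKeeper:

# add the fields (host/port/collection as in the post above)
curl -X POST -H 'Content-type: application/json' \
    'http://localhost:8080/solr/bookindex/schema/fields' \
    --data-binary '[{"indexed":true,"name":"age","stored":true,"type":"long"},
                    {"indexed":true,"name":"sex","stored":true,"type":"string"},
                    {"indexed":true,"name":"_all","stored":true,"type":"string","multiValued":true}]'

# then add the copy fields
curl -X POST -H 'Content-type: application/json' \
    'http://localhost:7070/solr/bookindex/schema/copyfields' \
    --data-binary '[{"source":"age","dest":"_all"},{"source":"sex","dest":"_all"}]'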
Re: Schema API synchronization question
Yes, that is what we are seeing. Thanks for pointing me to the right issues to track. Where can I find out when 4.10 final is going to be released?

Thanks, Matthias

On Sat, Aug 30, 2014 at 9:26 PM, Erick Erickson erickerick...@gmail.com wrote:

There have been some recent improvements in that area; what version of Solr are you running? Is there any chance you could try with 4.10 when the final version is released? Or perhaps checkout/build the 4.10 release candidate? See, for instance, https://issues.apache.org/jira/browse/SOLR-6137 Still open: https://issues.apache.org/jira/browse/SOLR-6249 Do either of these describe what you are seeing? If not, how exactly are things going wonky?

Best, Erick

On Sat, Aug 30, 2014 at 7:02 PM, Matthias Broecheler m...@matthiasb.com wrote:

Hello everybody, from reading the documentation it is not entirely clear what the synchronization behavior of Solr's schema API is. We are seeing some reliability issues in a multi-machine SolrCloud setup. Granted, being new we might be doing something wrong, but at this point I am confused as to what the expected behavior ought to be. It would be wonderful if somebody could point me to or explain how schema changes made through the API are propagated in a cluster, what happens if documents are added concurrently, and any known issues that might exist in that regard. Thank you very much, Matthias

-- Matthias Broecheler http://www.matthiasb.com

-- Matthias Broecheler http://www.matthiasb.com
Re: Schema API synchronization question
The release vote has passed, the release packages are spreading out to the mirrors, and the announcement should appear in the next 12-24 hours.

Steve
www.lucidworks.com

On Sep 2, 2014, at 11:56 PM, Matthias Broecheler m...@matthiasb.com wrote:

Yes, that is what we are seeing. Thanks for pointing me to the right issues to track. Where can I find out when 4.10 final is going to be released? Thanks, Matthias

On Sat, Aug 30, 2014 at 9:26 PM, Erick Erickson erickerick...@gmail.com wrote:

There have been some recent improvements in that area; what version of Solr are you running? Is there any chance you could try with 4.10 when the final version is released? Or perhaps checkout/build the 4.10 release candidate? See, for instance, https://issues.apache.org/jira/browse/SOLR-6137 Still open: https://issues.apache.org/jira/browse/SOLR-6249 Do either of these describe what you are seeing? If not, how exactly are things going wonky?

Best, Erick

On Sat, Aug 30, 2014 at 7:02 PM, Matthias Broecheler m...@matthiasb.com wrote:

Hello everybody, from reading the documentation it is not entirely clear what the synchronization behavior of Solr's schema API is. We are seeing some reliability issues in a multi-machine SolrCloud setup. Granted, being new we might be doing something wrong, but at this point I am confused as to what the expected behavior ought to be. It would be wonderful if somebody could point me to or explain how schema changes made through the API are propagated in a cluster, what happens if documents are added concurrently, and any known issues that might exist in that regard. Thank you very much, Matthias

-- Matthias Broecheler http://www.matthiasb.com

-- Matthias Broecheler http://www.matthiasb.com
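Until the fixes in the 4.10 line are in place, one pragmatic guard is to verify that a schema change is actually visible on every node before indexing against it. A rough sketch with curl; the node list, collection name and field name are placeholders:

# after POSTing a new field, block until each node returns 200 for it
for node in http://node1:8983 http://node2:8983; do
  until [ "$(curl -s -o /dev/null -w '%{http_code}' \
      "$node/solr/mycollection/schema/fields/myNewField")" = "200" ]; do
    sleep 1
  done
done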
looking for a solr/search expert in Paris
Hello, We are looking for a Solr consultant to help us with our devs using Solr. We've been working on this for a little while, and we feel we need an expert point of view on what we're doing; someone who could give us insights about our Solr conf, performance issues, error handling issues (big thing)... well, everything. The enterprise is in the Paris (France) area. Any suggestion is welcome. Thanks, Elisabeth
AUTO: Saravanan Chinnadurai is out of the office (returning 04/09/2014)
I will be out of the office starting 03/09/2014 and will not return until 04/09/2014.

Please email itsta...@actionimages.com for any urgent queries.

Note: This is an automated response to your message "How can I set shard members?" sent on 9/3/2014 5:00:04. This is the only notification you will receive while this person is away.
How to stop Solr delta import from creating a log file
I have solr installed on Debian, and every time a delta import takes place a file gets created in my root directory. The files that get created look like this:

dataimport?command=delta-import.1
dataimport?command=delta-import.2
.
.
.
dataimport?command=delta-import.30

Every time there is a delta import a file gets created. I opened one of the files in the vi editor and it's an XML file. Why are these files getting created, and how do I stop Solr from creating them? To start Solr I use this command:

java -jar start.jar

As far as I know, this command should not create any log files. Please advise and help; I am new to Solr.

-- Regards Madhav Bahuguna
Create collection dynamically in my program
Hi, all: I create a collection per day dynamically in my program, like this: http://lucene.472066.n3.nabble.com/file/n4156601/create1.png

But when I searched data with collection=myCollection-20140903, it showed "Collection not found: myCollection-20140903". I checked the clusterState in debug mode, and myCollection-20140903 was not in it. But myCollection-20140903 actually was in the zk clusterstate.json.

Is there something wrong in my approach? Is there a new or better way to create collections dynamically?

Thanks!
-Xinwu

-- View this message in context: http://lucene.472066.n3.nabble.com/Create-collection-dynamically-in-my-program-tp4156601.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Create collection dynamically in my program
Hello Xinwu,

does it change anything if you use an underline instead of the dash in the collection name? What is the result of the call? Any status or error message? Did you actually feed data into the collection?

Cheers,
--Jürgen

On 03.09.2014 11:21, xinwu wrote:

Hi, all: I create a collection per day dynamically in my program, like this: http://lucene.472066.n3.nabble.com/file/n4156601/create1.png

But when I searched data with collection=myCollection-20140903, it showed "Collection not found: myCollection-20140903". I checked the clusterState in debug mode, and myCollection-20140903 was not in it. But myCollection-20140903 actually was in the zk clusterstate.json.

Is there something wrong in my approach? Is there a new or better way to create collections dynamically?

Thanks!
-Xinwu

-- View this message in context: http://lucene.472066.n3.nabble.com/Create-collection-dynamically-in-my-program-tp4156601.html Sent from the Solr - User mailing list archive at Nabble.com.

-- Mit freundlichen Grüßen/Kind regards/Cordialement vôtre/Atentamente/С уважением
i.A. Jürgen Wagner
Head of Competence Center Intelligence, Senior Cloud Consultant
Devoteam GmbH, Industriestr. 3, 70565 Stuttgart, Germany
Phone: +49 6151 868-8725, Fax: +49 711 13353-53, Mobile: +49 171 864 1543
E-Mail: juergen.wag...@devoteam.com, URL: www.devoteam.de
Managing Board: Jürgen Hatzipantelis (CEO)
Address of Record: 64331 Weiterstadt, Germany; Commercial Register: Amtsgericht Darmstadt HRB 6450; Tax Number: DE 172 993 071
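If the collections are created through the Collections API over HTTP, a call along these lines should both create the collection and register it in clusterstate.json before it returns (names, shard/replica counts and config name are illustrative); a client that keeps its own cached cluster state may still need to refresh that cache afterwards:

curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=myCollection-20140903&numShards=1&replicationFactor=2&collection.configName=myconf'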
Re: HTTPS for SolrCloud
Once I upgraded to 4.9.0, the solr.ssl.checkPeerName option was used, and I was able to create a collection. I'm still wondering if there is a good way to remove references to any collections that didn't complete, but block a collection from being made with the same name?

Thanks!
-- Chris

On Tue, Sep 2, 2014 at 2:30 PM, Christopher Gross cogr...@gmail.com wrote:

Is the solr.ssl.checkPeerName option available in 4.8.1? I have my Tomcat starting up with that as a -D option, but I'm getting an exception on validating the hostname w/ the cert...

-- Chris

On Tue, Sep 2, 2014 at 1:44 PM, Christopher Gross cogr...@gmail.com wrote:

OK -- so I think my previous attempts were causing the problem. Since this is a dev environment (and is still empty), I just went ahead and wiped out the version-2 directories for the zookeeper nodes, reloaded my solr collections, then ran that command (zkcli.sh in the solr distro). That did work. What is a reliable way to remove a file from Zookeeper?

Now I just get this error when trying to create a collection:

org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://server:8444

This brings up another problem that I have: if there's an error creating a collection, and I fix the issue and try to re-create the collection, I get something like this:

<str name="Operation createcollection caused exception:">org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: collection already exists: testcollection</str>

How do I go about cleaning those up? The only reliable thing that I've found is to wipe out the zookeepers and start over.

Thanks Hoss!
-- Chris

On Tue, Sep 2, 2014 at 1:08 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

: ./zkcli.sh -zkhost localhost:2181 -cmd put /clusterprops.json
: '{urlScheme:https}'
...
: Next I start Tomcat, I get this:
: 482 [localhost-startStop-1] ERROR org.apache.solr.core.SolrCore – null:org.noggit.JSONParser$ParseException: JSON Parse Error: char=',position=0 BEFORE=''' AFTER='{urlScheme:https}''

I can't reproduce the error you are describing when I follow all the steps on the SSL doc page (using bash, and the outer single quotes, just like you)...

https://cwiki.apache.org/confluence/display/solr/Enabling+SSL#EnablingSSL-SolrCloud

Are you certain that your solr nodes are talking to the same zookeeper instance? (Because according to that error, there is a stray single quote at the beginning of the clusterprops.json file in the ZK server solr is talking to, and as you already confirmed there are no single quotes in the string you read back from the zk server you are talking to ... perhaps there are 2 zk instances setup somewhere and the one solr is using still has crufty data from before you got the quoting issue straightened out?)

do you see log messages early on in Solr's startup from ZkContainer that say...

1359 [main] INFO org.apache.solr.core.ZkContainer – Zookeeper client=localhost:2181

?

-Hoss
http://www.lucidworks.com/
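On the two cleanup questions above, a sketch of one approach (paths and names match the examples in this thread; the clear command should be available in the 4.x zkcli, but check your build):

# delete a stray znode with Solr's zkcli instead of wiping the ZK data dirs
./zkcli.sh -zkhost localhost:2181 -cmd clear /clusterprops.json

# drop the remnants of a half-created collection via the Collections API,
# then re-issue the CREATE
curl 'https://server:8444/solr/admin/collections?action=DELETE&name=testcollection'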
Re: looking for a solr/search expert in Paris
Don't forget to check out the Solr Support wiki where consultants advertise their services: http://wiki.apache.org/solr/Support

And any Solr or Lucene consultants on this mailing list should be sure that they are registered on that support wiki. Hey, it's free! And be sure to keep your listing up to date, including regional availability and any specialties.

-- Jack Krupansky

-----Original Message----- From: elisabeth benoit Sent: Wednesday, September 3, 2014 4:02 AM To: solr-user@lucene.apache.org Subject: looking for a solr/search expert in Paris

Hello, We are looking for a Solr consultant to help us with our devs using Solr. We've been working on this for a little while, and we feel we need an expert point of view on what we're doing; someone who could give us insights about our Solr conf, performance issues, error handling issues (big thing)... well, everything. The enterprise is in the Paris (France) area. Any suggestion is welcome. Thanks, Elisabeth
Re: How to stop Solr delta import from creating a log file
On 9/3/2014 3:19 AM, madhav bahuguna wrote:

I have solr installed on Debian and every time delta import takes place a file gets created in my root directory. The files that get created look like this

I figure there's one of two possibilities:

1) You've got a misconfiguration in the dataimport handler.
2) Solr has a bug that doesn't show up for most people, because most people don't run Solr with full root/administrator privileges. On Linux systems only root typically has write privileges on the root directory.

You'll need to share your configs to see if there's anything obviously wrong in them. We'll also need to know which version you're on.

Thanks,
Shawn
Re: How to stop Solr delta import from creating a log file
Is 'dataimport?command=delta-import.1' actually a file name? If this is the case, are you running the trigger from a cron job or similar? If I am still on the right track, check your cron job/script and see if you have a misplaced newline, quote (e.g. an MS Word quote instead of a normal one) or some other abnormality. It looks like a Bobby Tables situation with runaway quotes.

Regards,
Alex.
P.s. https://xkcd.com/327/

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On Wed, Sep 3, 2014 at 5:19 AM, madhav bahuguna madhav.bahug...@gmail.com wrote:

I have solr installed on Debian, and every time a delta import takes place a file gets created in my root directory. The files that get created look like this:

dataimport?command=delta-import.1
dataimport?command=delta-import.2
.
.
.
dataimport?command=delta-import.30

Every time there is a delta import a file gets created. I opened one of the files in the vi editor and it's an XML file. Why are these files getting created, and how do I stop Solr from creating them? To start Solr I use this command:

java -jar start.jar

As far as I know, this command should not create any log files. Please advise and help; I am new to Solr.

-- Regards Madhav Bahuguna
Re: WordDelimiter filter, expanding to multiple words, unexpected results
Thanks Erick and Diego. Yes, I noticed in my last message I'm not actually using defaults; not sure why I chose non-defaults originally.

I still need to find time to make a smaller isolation/reproduction case. I'm getting confusing results that suggest some other part of my field def may be pertinent. I'll come back when I've done that (hopefully next week), and include the _parsed_ query from debug=query then. Thanks!

Jonathan

On 9/2/14 4:26 PM, Erick Erickson wrote:

What happens if you append debug=query to your query? IOW, what does the _parsed_ query look like? Also note that the defaults for WDFF are _not_ identical. catenateWords and catenateNumbers are 1 in the index portion and 0 in the query section. Still, this shouldn't be a problem all other things being equal.

Best, Erick

On Tue, Sep 2, 2014 at 12:43 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

On 9/2/14 1:51 PM, Erick Erickson wrote:

bq: In my actual index, query MacBook is matching ONLY mac book, and not macbook

I suspect your query parameters for WordDelimiterFilterFactory doesn't have catenate words set. What do you see when you enter these in both the index and query portions of the admin/analysis page?

Thanks Erick! Our WordDelimiterFilterFactory does have catenate words set, in both index and query phases (is that right?):

<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>

It's hard to cut and paste the results of the analysis page into email (or anywhere!), so I'll give you screenshots, sorry, and I'll give them for our whole real-world complex field definition. I'll also paste in our entire field definition below. But I realize my next step is probably creating a simpler isolation/reproduction case (unless you have a magic answer from this!).

Again, the problem is that MacBook seems to be only matching on indexed macbook and not indexed mac book.

MacBook query analysis: https://www.dropbox.com/s/b8y11usjdlc88un/mixedcasequery.png
MacBook index analysis: https://www.dropbox.com/s/fwae3nz4tdtjhjv/mixedcaseindex.png
mac book index analysis: https://www.dropbox.com/s/mihd58f6zs3rfu8/twowordindex.png

Our entire actual field definition:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer>
    <!-- the rulefiles thing is to keep ICUTokenizerFactory from stripping
         punctuation, so our synonym filter involving C++ etc can still work.
         From: https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201305.mbox/%3C51965E70.6070...@elyograg.org%3E
         the rbbi file is in our local ./conf, copied from lucene source tree -->
    <tokenizer class="solr.ICUTokenizerFactory" rulefiles="Latn:Latin-break-only-on-whitespace.rbbi"/>
    <filter class="solr.SynonymFilterFactory" synonyms="punctuation-whitelist.txt" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <!-- folding needs to be after WordDelimiter, so WordDelimiter can do its
         thing with full cases and such -->
    <filter class="solr.ICUFoldingFilterFactory"/>
    <!-- ICUFolding already includes lowercasing, no need for separate lowercasing step
    <filter class="solr.LowerCaseFilterFactory"/>
    -->
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
Re: How to stop Solr delta import from creating a log file
: I have solr installed on Debian and every time delta import takes place a
: file gets created in my root directory. The files that get created look
: like this
:
: dataimport?command=delta-import.1

that is exactly the output you would expect to see if you have a cron somewhere, running wget against the DIH, as root...

hossman@frisbee:~/tmp/dh$ wget --quiet "http://localhost:8983/solr/rss/dataimport?command=delta-import"
hossman@frisbee:~/tmp/dh$ ls
dataimport?command=delta-import
hossman@frisbee:~/tmp/dh$ wget --quiet "http://localhost:8983/solr/rss/dataimport?command=delta-import"
hossman@frisbee:~/tmp/dh$ wget --quiet "http://localhost:8983/solr/rss/dataimport?command=delta-import"
hossman@frisbee:~/tmp/dh$ wget --quiet "http://localhost:8983/solr/rss/dataimport?command=delta-import"
hossman@frisbee:~/tmp/dh$ wget --quiet "http://localhost:8983/solr/rss/dataimport?command=delta-import"
hossman@frisbee:~/tmp/dh$ ls
dataimport?command=delta-import    dataimport?command=delta-import.3
dataimport?command=delta-import.1  dataimport?command=delta-import.4
dataimport?command=delta-import.2

-Hoss
http://www.lucidworks.com/
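If a root cron job like that is indeed the culprit, pointing wget's output at /dev/null stops the files from piling up; something along these lines, with the URL as in the example above:

wget --quiet -O /dev/null "http://localhost:8983/solr/rss/dataimport?command=delta-import"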
How to change search component parameters dynamically using query
Hi, I use the below highlight search component in one of my request handlers. I am trying to figure out a way to change the values of the highlight search component dynamically from the query. Is it possible to modify the parameters dynamically using the query (without creating another searchComponent)?

<searchComponent class="solr.HighlightComponent" name="highlight">
  <highlighting>
    <boundaryScanner class="solr.highlight.SimpleBoundaryScanner" default="false" name="simple">
      <lst name="defaults">
        <str name="hl.bs.maxScan">200</str>
        <str name="hl.bs.chars">.</str>
      </lst>
    </boundaryScanner>
    <boundaryScanner class="solr.highlight.BreakIteratorBoundaryScanner" default="true" name="breakIterator">
      <lst name="defaults">
        <str name="hl.bs.type">SENTENCE</str>
        <str name="hl.bs.language">en</str>
        <str name="hl.bs.country">US</str>
      </lst>
    </boundaryScanner>
  </highlighting>
</searchComponent>

-- View this message in context: http://lucene.472066.n3.nabble.com/How-to-change-search-component-parameters-dynamically-using-query-tp4156672.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Importing RDF/XML in Solr
iirc, Lucene In Action describes http://rdelbru.github.io/SIREn/ in one of the appendixes. I know that they spoke at LuceneRevolution recently. That's all I know.

On Wed, Sep 3, 2014 at 2:40 PM, Pragati Meena pme...@bostonanalytics.com wrote:

Hi, I want to index an rdf/xml document into Solr. I am attaching the XML input document. I want to identify person, location and organization in Solr, so I have made changes in data-config, schema.xml and added a request handler in solrconfig.xml. But person, organization and location are not indexed into Solr. Please tell me what it is that I am missing here.

Thanks
Pragati Meena
Big Data Engineer, +91-9910584024, pme...@bostonanalytics.com

-- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: looking for a solr/search expert in Paris
Thanks a lot for your answers.

Best regards, Elisabeth

2014-09-03 17:18 GMT+02:00 Jack Krupansky j...@basetechnology.com:

Don't forget to check out the Solr Support wiki where consultants advertise their services: http://wiki.apache.org/solr/Support

And any Solr or Lucene consultants on this mailing list should be sure that they are registered on that support wiki. Hey, it's free! And be sure to keep your listing up to date, including regional availability and any specialties.

-- Jack Krupansky

-----Original Message----- From: elisabeth benoit Sent: Wednesday, September 3, 2014 4:02 AM To: solr-user@lucene.apache.org Subject: looking for a solr/search expert in Paris

Hello, We are looking for a Solr consultant to help us with our devs using Solr. We've been working on this for a little while, and we feel we need an expert point of view on what we're doing; someone who could give us insights about our Solr conf, performance issues, error handling issues (big thing)... well, everything. The enterprise is in the Paris (France) area. Any suggestion is welcome. Thanks, Elisabeth
Is there a way to modify the request handler parameters dynamically?
Hi, I need to change the components (inside a request handler) dynamically using query parameters, instead of creating multiple request handlers. Is it possible to do this on the fly from the query? For example: change the highlight search component to use a different search component based on a query parameter.

<requestHandler class="solr.StandardRequestHandler" name="/test">
  <arr name="components">
    <str>filterbyrole</str>
    <str>landingPage</str>
    <str>firstRulesComp</str>
    <str>query</str>
    <str>highlight</str>
    <str>facet</str>
    <str>spellcheck</str>
    <str>lastRulesComp</str>
    <str>debug</str>
    <str>elevator</str>
  </arr>
</requestHandler>

-- View this message in context: http://lucene.472066.n3.nabble.com/Is-there-a-way-to-modify-the-request-handler-parameters-dynamically-tp4156697.html Sent from the Solr - User mailing list archive at Nabble.com.
Server is shutting down due to threads
We have a SolrCloud instance with 2 Solr nodes and a 3-node ZooKeeper ensemble. One of the Solr nodes goes down as soon as we send search traffic to it, but updates work fine. When I analyzed a thread dump I saw a lot of blocked threads like the following. This explains why it couldn't create any native threads and ran out of memory: the thread count went from 48 to 900 within minutes, and the server came down. The other node, with the same configuration, is taking all the search and update traffic and is running fine. Any pointers would be appreciated.

http-bio-52158-exec-59 - Thread t@589
java.lang.Thread.State: BLOCKED on org.apache.lucene.search.FieldCache$CreationPlaceholder@29e0400b owned by: http-bio-52158-exec-61
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:209)
at org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:901)
at org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:685)
at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97)
at org.apache.lucene.search.TimeLimitingCollector.setNextReader(TimeLimitingCollector.java:158)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1501)
at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1367)
at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:474)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:434)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
at com.trimp.search.filter.LogAndAuthFilter.execute(LogAndAuthFilter.scala:109)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:680)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
- locked org.apache.tomcat.util.net.SocketWrapper@5b4530c8
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)

Locked ownable synchronizers:
- locked java.util.concurrent.ThreadPoolExecutor$Worker@63d2720

-E
Re: Server is shutting down due to threads
Forgot to add the source thread that's blocking every other thread:

http-bio-52158-exec-61 - Thread t@591
java.lang.Thread.State: RUNNABLE
at org.apache.lucene.search.FieldCacheImpl$Uninvert.uninvert(FieldCacheImpl.java:312)
at org.apache.lucene.search.FieldCacheImpl$LongCache.createValue(FieldCacheImpl.java:986)
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:212)
- locked org.apache.lucene.search.FieldCache$CreationPlaceholder@29e0400b
at org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:901)
at org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:685)
at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97)
at org.apache.lucene.search.TimeLimitingCollector.setNextReader(TimeLimitingCollector.java:158)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1501)
at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1367)
at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:474)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:434)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
at com.trimp.search.filter.LogAndAuthFilter.execute(LogAndAuthFilter.scala:109)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:680)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
- locked org.apache.tomcat.util.net.SocketWrapper@7826692
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)

Locked ownable synchronizers:
- locked java.util.concurrent.ThreadPoolExecutor$Worker@2463aef

On Wed, Sep 3, 2014 at 2:31 PM, Ethan eh198...@gmail.com wrote:

We have a SolrCloud instance with 2 Solr nodes and a 3-node ZooKeeper ensemble. One of the Solr nodes goes down as soon as we send search traffic to it, but updates work fine. When I analyzed a thread dump I saw a lot of blocked threads like the following. This explains why it couldn't create any native threads and ran out of memory: the thread count went from 48 to 900 within minutes, and the server came down. The other node, with the same configuration, is taking all the search and update traffic and is running fine. Any pointers would be appreciated.

http-bio-52158-exec-59 - Thread t@589
java.lang.Thread.State: BLOCKED on org.apache.lucene.search.FieldCache$CreationPlaceholder@29e0400b owned by: http-bio-52158-exec-61
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:209)
at org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:901)
at org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:685)
at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97)
at org.apache.lucene.search.TimeLimitingCollector.setNextReader(TimeLimitingCollector.java:158)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1501)
at
Re: Is there a way to modify the request handler parameters dynamically?
Hi,

You can skip certain components. Every component has a name; if you set its name to false, it is skipped. Example: facet=false or query=false. But you cannot change the order of them; you need a custom RequestHandler for that.

Ahmet

On Wednesday, September 3, 2014 10:12 PM, bbarani bbar...@gmail.com wrote:

Hi, I need to change the components (inside a request handler) dynamically using query parameters, instead of creating multiple request handlers. Is it possible to do this on the fly from the query? For example: change the highlight search component to use a different search component based on a query parameter.

<requestHandler class="solr.StandardRequestHandler" name="/test">
  <arr name="components">
    <str>filterbyrole</str>
    <str>landingPage</str>
    <str>firstRulesComp</str>
    <str>query</str>
    <str>highlight</str>
    <str>facet</str>
    <str>spellcheck</str>
    <str>lastRulesComp</str>
    <str>debug</str>
    <str>elevator</str>
  </arr>
</requestHandler>

-- View this message in context: http://lucene.472066.n3.nabble.com/Is-there-a-way-to-modify-the-request-handler-parameters-dynamically-tp4156697.html Sent from the Solr - User mailing list archive at Nabble.com.
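As a concrete illustration of Ahmet's point for the stock components (host and core name are placeholders; note that the switches follow the components' parameter names, e.g. the highlight component is toggled with hl, not highlight):

curl 'http://localhost:8983/solr/collection1/test?q=something&facet=false&hl=false&spellcheck=false'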
Re: WordDelimiter filter, expanding to multiple words, unexpected results
Jonathan:

If at all possible, delete your collection/data directory (the whole directory, including data) between runs after you've changed your schema (at least any of your analysis that pertains to indexing). Mixing old and new schema definitions can add to the confusion!

Good luck!
Erick

On Wed, Sep 3, 2014 at 8:48 AM, Jonathan Rochkind rochk...@jhu.edu wrote:

Thanks Erick and Diego. Yes, I noticed in my last message I'm not actually using defaults; not sure why I chose non-defaults originally. I still need to find time to make a smaller isolation/reproduction case. I'm getting confusing results that suggest some other part of my field def may be pertinent. I'll come back when I've done that (hopefully next week), and include the _parsed_ query from debug=query then. Thanks!

Jonathan

On 9/2/14 4:26 PM, Erick Erickson wrote:

What happens if you append debug=query to your query? IOW, what does the _parsed_ query look like? Also note that the defaults for WDFF are _not_ identical. catenateWords and catenateNumbers are 1 in the index portion and 0 in the query section. Still, this shouldn't be a problem all other things being equal.

Best, Erick

On Tue, Sep 2, 2014 at 12:43 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

On 9/2/14 1:51 PM, Erick Erickson wrote:

bq: In my actual index, query MacBook is matching ONLY mac book, and not macbook

I suspect your query parameters for WordDelimiterFilterFactory doesn't have catenate words set. What do you see when you enter these in both the index and query portions of the admin/analysis page?

Thanks Erick! Our WordDelimiterFilterFactory does have catenate words set, in both index and query phases (is that right?):

<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>

It's hard to cut and paste the results of the analysis page into email (or anywhere!), so I'll give you screenshots, sorry, and I'll give them for our whole real-world complex field definition. I'll also paste in our entire field definition below. But I realize my next step is probably creating a simpler isolation/reproduction case (unless you have a magic answer from this!).

Again, the problem is that MacBook seems to be only matching on indexed macbook and not indexed mac book.

MacBook query analysis: https://www.dropbox.com/s/b8y11usjdlc88un/mixedcasequery.png
MacBook index analysis: https://www.dropbox.com/s/fwae3nz4tdtjhjv/mixedcaseindex.png
mac book index analysis: https://www.dropbox.com/s/mihd58f6zs3rfu8/twowordindex.png

Our entire actual field definition:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer>
    <!-- the rulefiles thing is to keep ICUTokenizerFactory from stripping
         punctuation, so our synonym filter involving C++ etc can still work.
         From: https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201305.mbox/%3C51965E70.6070...@elyograg.org%3E
         the rbbi file is in our local ./conf, copied from lucene source tree -->
    <tokenizer class="solr.ICUTokenizerFactory" rulefiles="Latn:Latin-break-only-on-whitespace.rbbi"/>
    <filter class="solr.SynonymFilterFactory" synonyms="punctuation-whitelist.txt" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <!-- folding needs to be after WordDelimiter, so WordDelimiter can do its
         thing with full cases and such -->
    <filter class="solr.ICUFoldingFilterFactory"/>
    <!-- ICUFolding already includes lowercasing, no need for separate lowercasing step
    <filter class="solr.LowerCaseFilterFactory"/>
    -->
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
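A minimal version of Erick's clean-slate routine, assuming the stock single-core example layout from the 4.x distribution (paths will differ in a real deployment; stop Solr first):

rm -rf example/solr/collection1/data
java -jar start.jar    # restart; Solr recreates an empty index
# ...then reindex all documents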
Re: How to change search component parameters dynamically using query
Depends on which ones. Any parameter in the defaults sections can be overridden dynamically, i.e. hl.bs.language=fr

Best, Erick

On Wed, Sep 3, 2014 at 10:38 AM, bbarani bbar...@gmail.com wrote:

Hi, I use the below highlight search component in one of my request handlers. I am trying to figure out a way to change the values of the highlight search component dynamically from the query. Is it possible to modify the parameters dynamically using the query (without creating another searchComponent)?

<searchComponent class="solr.HighlightComponent" name="highlight">
  <highlighting>
    <boundaryScanner class="solr.highlight.SimpleBoundaryScanner" default="false" name="simple">
      <lst name="defaults">
        <str name="hl.bs.maxScan">200</str>
        <str name="hl.bs.chars">.</str>
      </lst>
    </boundaryScanner>
    <boundaryScanner class="solr.highlight.BreakIteratorBoundaryScanner" default="true" name="breakIterator">
      <lst name="defaults">
        <str name="hl.bs.type">SENTENCE</str>
        <str name="hl.bs.language">en</str>
        <str name="hl.bs.country">US</str>
      </lst>
    </boundaryScanner>
  </highlighting>
</searchComponent>

-- View this message in context: http://lucene.472066.n3.nabble.com/How-to-change-search-component-parameters-dynamically-using-query-tp4156672.html Sent from the Solr - User mailing list archive at Nabble.com.
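So, for the configuration shown earlier in the thread, a per-request override could look like this (host and core are placeholders; the hl.bs.* parameters only take effect when the boundary scanner is actually in use, typically with hl.useFastVectorHighlighter=true):

curl 'http://localhost:8983/solr/collection1/select?q=text&hl=true&hl.useFastVectorHighlighter=true&hl.bs.type=WORD&hl.bs.language=fr&hl.bs.country=FR'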
Re: Server is shutting down due to threads
Do you have indexing traffic going to it? b/c this _looks_ like the node is just starting up or a searcher is being opened and you're loading your index first time. This happens when you index data and when you start up your nodes. Adding some autowarming (firstSearcher in this case) might load up the underlying caches earlier. This could also be a problem due to very short commit intervals, although this latter should be identical for both nodes. And when you say 2 solr nodes, is this one shard or two? I'm guessing that you have some setting that's significantly different, memory perhaps? Best, Erick On Wed, Sep 3, 2014 at 2:40 PM, Ethan eh198...@gmail.com wrote: Forgot to add the source thread thats blocking every other thread http-bio-52158-exec-61 - Thread t@591 java.lang.Thread.State: RUNNABLE at org.apache.lucene.search.FieldCacheImpl$Uninvert.uninvert(FieldCacheImpl.java:312) at org.apache.lucene.search.FieldCacheImpl$LongCache.createValue(FieldCacheImpl.java:986) at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:212) - locked org.apache.lucene.search.FieldCache$CreationPlaceholder@29e0400b at org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:901) at org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:685) at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97) at org.apache.lucene.search.TimeLimitingCollector.setNextReader(TimeLimitingCollector.java:158) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297) at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1501) at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1367) at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:474) at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:434) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703) at com.trimp.search.filter.LogAndAuthFilter.execute(LogAndAuthFilter.scala:109) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947) at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:680) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009) at 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310) - locked org.apache.tomcat.util.net.SocketWrapper@7826692 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) Locked ownable synchronizers: - locked java.util.concurrent.ThreadPoolExecutor$Worker@2463aef On Wed, Sep 3, 2014 at 2:31 PM, Ethan eh198...@gmail.com wrote: We have SolrCloud instance with 2 solr nodes and 3 zk ensemble. One of the solr node goes down as soon as we send search traffic to it, but update works fine. When I analyzed thread dump I saw lot of blocked threads with following error message. This explains why it couldn't create any native threads and ran out of memory. The thread count went from 48 to 900 within minutes and server came down. The other node with same configuration is taking all the search and update traffic, and it running fine. Any pointers would be appreciated. http-bio-52158-exec-59 - Thread t@589 java.lang.Thread.State: BLOCKED on org.apache.lucene.search.FieldCache$CreationPlaceholder@29e0400b owned by: http-bio-52158-exec-61 at
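Pending a firstSearcher entry in solrconfig.xml along the lines Erick suggests, the same cache priming can be done by hand: after startup (or after heavy indexing), fire the sort query once before routing user traffic to the node, so the FieldCache for the sort field is uninverted up front rather than under load. Host, core and field name below are placeholders:

curl 'http://localhost:8983/solr/collection1/select?q=*:*&rows=0&sort=myLongField+asc'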
[ANNOUNCE] Apache Lucene 4.10.0 released
3 September 2014, Apache Lucene™ 4.10.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.10.0 Release Highlights:

* New TermAutomatonQuery using an automaton for proximity queries. http://blog.mikemccandless.com/2014/08/a-new-proximity-query-for-lucene-using.html
* New OrdsBlockTree terms dictionary supporting ord lookup.
* Simplified matchVersion handling for Analyzers with new setVersion method, as well as Analyzer constructors not requiring Version.
* Fixed possible corruption when opening a 3.x index with NRT reader.
* Fixed edge case in StandardTokenizer that caused extremely slow parsing times with long text which partially matched grammar rules.

This release contains numerous bug fixes, optimizations, and improvements. Please read CHANGES.txt for a full list of new features and changes: https://lucene.apache.org/core/4_10_0/changes/Changes.html

Please report any feedback to the mailing lists (http://lucene.apache.org/core/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network for distributing releases. It is possible that the mirror you are using may not have replicated the release yet. If that is the case, please try another mirror. This also goes for Maven access.

On behalf of the Lucene PMC, Happy Searching
[ANNOUNCE] Apache Solr 4.10.0 released
3 September 2014, Apache Solr™ 4.10.0 available

The Lucene PMC is pleased to announce the release of Apache Solr 4.10.0.

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.10.0 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Solr 4.10.0 Release Highlights:

* This release upgrades Solr Cell's (contrib/extraction) dependency on Apache POI to mitigate 2 security vulnerabilities: http://s.apache.org/solr-cell-security-notice
* Scripts for starting, stopping, and running Solr examples
* Distributed query support for facet.pivot
* Interval Faceting for Doc Values fields
* New terms QParser for efficiently filtering documents by a list of values

Solr 4.10.0 also includes many other new features as well as numerous optimizations and bugfixes of the corresponding Apache Lucene release. Please read CHANGES.txt for a full list of new features and changes: https://lucene.apache.org/solr/4_10_0/changes/Changes.html

Note: The Apache Software Foundation uses an extensive mirroring network for distributing releases. It is possible that the mirror you are using may not have replicated the release yet. If that is the case, please try another mirror. This also goes for Maven access.

On behalf of the Lucene PMC, Happy Searching
DELETEREPLICA
I'm confused; wondering if it's a mismatch between the docs and the intent, or just a bug, or whether I'm just not understanding the point. The DELETEREPLICA docs say:

"Delete a replica from a given collection and shard. If the corresponding core is up and running the core is unloaded and the entry is removed from the clusterstate. If the node/core is down, the entry is taken off the clusterstate and if the core comes up later it is automatically unregistered."

However, if I do the following:

1. create a follower on nodeX
2. shut down nodeX (at this point, the clusterstate indicates the follower is down)
3. issue a DELETEREPLICA for the follower (the clusterstate entry for this follower is removed)
4. restart nodeX (clusterstate shows this node is back; it's visible in the cloud view, gets sync'd, etc.)

Based on the docs, I didn't expect to see the node present in step 4, so what am I missing? The core has docs (i.e. it's synched from the leader), etc. So this bit of the documentation is confusing me: "If the node/core is down, the entry is taken off the clusterstate and if the core comes up later it is automatically unregistered." That doesn't square with what I'm seeing, so either the docs are wrong or I'm misunderstanding the intent. If the node _is_ up, then it's removed from the node and clusterstate and stays gone.

Personally, I don't particularly like the idea of queueing up the DELETEREPLICAs for later execution; it seems overly complex. Having the clusterstate info removed if the node is down seems very useful, though.

Thanks, Erick
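For reference, step 3 above boils down to a single Collections API call; a sketch with placeholder names (the replica value is the core_nodeN entry for the follower in clusterstate.json):

curl 'http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=mycollection&shard=shard1&replica=core_node2'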
Solr add document over 20 times slower after upgrade from 4.0 to 4.9
I have a Solr server that indexes 2500 documents (up to 50MB each, average 3MB). When running on Solr 4.0 I managed to finish the index in 3 hours. However, after we upgraded to Solr 4.9, the index needs 3 days to finish. I've done some profiling; the numbers I get are:

Doc size (MB)   Add time (Solr 4.0)   Add time (Solr 4.9)
1.18            6 sec                 123 sec
2.26            12 sec                444 sec
3.35            18 sec                over 600 sec
9.65            46 sec                timeout

From what I can see, indexing time appears to be roughly O(n) in document size for Solr 4.0, but grows much faster than linearly for Solr 4.9. I also tried to comment out some copied fields to narrow down the problem; the size of the document after indexing (we copy fields, and the more fields we copy, the bigger the index size) seems to be the dominating factor for index time.

Just wondering, has anyone experienced a similar problem? Does that sound like a Solr bug, or are we just using Solr 4.9 wrong?

Here is one example of a field definition in my schema file:

<fieldType name="text_stem" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <!-- strip off all apostrophe (') characters -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="'+" replacement=""/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="../../resources/type-index-synonyms.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/> <!-- Used to have language=English - seems this param is gone in 4.9 -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <!-- strip off all apostrophe (') characters -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="'+" replacement=""/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="../../resources/type-query-colloq-synonyms.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/> <!-- Used to have language=English - seems this param is gone in 4.9 -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

Field:

<field name="majorTextSignalStem" type="text_stem" indexed="true" stored="false" multiValued="true" omitNorms="false"/>

Copy:

<copyField dest="majorTextSignalStem" source="majorTextSignalRaw"/>

Thanks, Ryan
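One way to narrow this down might be to time a single representative add against both versions while toggling the copyField lines, so the cost of the analysis chains can be separated from everything else. A rough sketch; the core name and test file are placeholders:

# time one large document against a 4.0 and a 4.9 instance in turn
time curl -s 'http://localhost:8983/solr/collection1/update?commit=true' \
  -H 'Content-type: application/xml' --data-binary @bigdoc.xml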
Re: Server is shutting down due to threads
Erick,

It is just one shard. Indexing traffic is going to the other node and then synced with this one (both are part of the cloud). We kept that setting running for 5 days, as the defective node would just go down with search traffic. So both were in sync when search was turned on.

Soft commit is very low, around 2 secs, but that doesn't seem to affect the other node, which is functioning normally. Memory settings for both nodes are identical, including m/c configuration.

On Wed, Sep 3, 2014 at 4:23 PM, Erick Erickson erickerick...@gmail.com wrote:

Do you have indexing traffic going to it? b/c this _looks_ like the node is just starting up or a searcher is being opened and you're loading your index for the first time. This happens when you index data and when you start up your nodes. Adding some autowarming (firstSearcher in this case) might load up the underlying caches earlier. This could also be a problem due to very short commit intervals, although this latter should be identical for both nodes. And when you say 2 Solr nodes, is this one shard or two? I'm guessing that you have some setting that's significantly different, memory perhaps?

Best,
Erick

On Wed, Sep 3, 2014 at 2:40 PM, Ethan eh198...@gmail.com wrote:

Forgot to add the source thread that's blocking every other thread:

http-bio-52158-exec-61 - Thread t@591
  java.lang.Thread.State: RUNNABLE
    at org.apache.lucene.search.FieldCacheImpl$Uninvert.uninvert(FieldCacheImpl.java:312)
    at org.apache.lucene.search.FieldCacheImpl$LongCache.createValue(FieldCacheImpl.java:986)
    at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:212)
    - locked org.apache.lucene.search.FieldCache$CreationPlaceholder@29e0400b
    at org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:901)
    at org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:685)
    at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97)
    at org.apache.lucene.search.TimeLimitingCollector.setNextReader(TimeLimitingCollector.java:158)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
    at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1501)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1367)
    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:474)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:434)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
    at com.trimp.search.filter.LogAndAuthFilter.execute(LogAndAuthFilter.scala:109)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
    at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:680)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
    - locked org.apache.tomcat.util.net.SocketWrapper@7826692
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)

  Locked ownable synchronizers:
    - locked java.util.concurrent.ThreadPoolExecutor$Worker@2463aef

On Wed, Sep 3, 2014 at 2:31 PM, Ethan eh198...@gmail.com wrote:

We have a SolrCloud instance with 2 Solr nodes and a 3-node ZK ensemble. One of
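The trace above shows Lucene's FieldCache uninverting a long field to satisfy a sorted query, which is exactly the work Erick's firstSearcher suggestion would shift to warm-up time. A minimal solrconfig.xml sketch, assuming a hypothetical sort field named myLongField (substitute whatever long field the slow queries actually sort on):

<!-- Sketch only: run a representative sorted query when the first searcher
     is opened, so the FieldCache entry for the sort field is built before
     user traffic arrives. "myLongField" is a placeholder name. -->
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="sort">myLongField desc</str>
      <str name="rows">0</str>
    </lst>
  </arr>
</listener>

With a ~2-second soft commit a new searcher is opened almost continuously, so a matching newSearcher listener is usually worth adding as well.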
Re: Server is shutting down due to threads
Hmmm, I'm puzzled then. I'm guessing that the node that keeps going down is the follower, which means it should have _less_ work to do than the node that stays up. Not a lot less, but less still.

I'd try lengthening out my commit interval. I realize you've set it to 2 seconds for a reason; this is mostly to see if it has any effect and to have a place to _start_ looking. I'm assuming your hard commit has openSearcher set to false.

Just to double check: these two nodes are just a leader and follower, right? IOW, they're part of the same collection, and your collection just has one shard?

m/c configuration? What's that? If it's a typo for m/s (master/slave), then that may be an issue. In a SolrCloud setup there is no master/slave, and you shouldn't configure the nodes that way.
Best,
Erick
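For the commit-interval experiment Erick suggests, both knobs live in solrconfig.xml. A hedged sketch; the intervals below are arbitrary starting points for the test, not recommendations:

<!-- All times are in milliseconds. -->
<autoCommit>
  <maxTime>60000</maxTime>            <!-- hard commit: flush segments to disk -->
  <openSearcher>false</openSearcher>  <!-- per Erick's assumption: no new searcher on hard commit -->
</autoCommit>
<autoSoftCommit>
  <maxTime>30000</maxTime>            <!-- currently ~2000 (2 s); try something longer -->
</autoSoftCommit>

If lengthening the soft-commit interval stops the failures, the cost of reopening searchers every 2 seconds (and the cache rebuilding behind it) is the place to look next.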