Re:
To expand a bit on the other replies: yes, your order data should definitely be denormalized into a single order schema. We store orders this way in Solr, since near-real-time search among live orders is a requirement for several of our systems.

Something non-Solr though - consider denormalizing your order data in your relational database as well. Sooner or later you will run into trouble keeping orders and their associated products separated via normalization, unless you keep a history of all previous versions of a product, or you never change products. Say a product changes its name one month after an order is placed: if you keep the data normalized, all previous orders will show the new name of the product, not the name it had when the order was placed. In my experience that behaviour is usually not what you want. Denormalizing would, of course, also make a direct mapping to Solr more straightforward.

-- Henrik Ossipoff Hansen

On 1. dec. 2013 at 02.06.54, subacini Arunkumar (subac...@gmail.com) wrote:

Thanks Walter for the reply. Here is my complete requirement. Please let me know the possible solutions to address it.
* The two tables might have millions of records, with 50 columns in each table.
* The expected output is the same as what we get from a SQL inner join.

For example, I have two tables, Product and Order.

Product Table
id   Name
P1   ipad
P2   iphone 4
P3   iphone 5

Order Table
id   order date    product_id
O1   1-Dec-2012    P1
O2   1-Dec-2012    P2
O3   2-Dec-2012    P2

Expected output - I want to show the details in the UI as below [SQL inner join]:
O1   01-Dec-2012   ipad
O2   1-Dec-2012    iPhone 4
O3   2-Dec-2012    iPhone 5

I tried setting up two Solr cores: a Product core and an Order core.

Option 1: Using Solr join
I got the expected result, but I was able to get columns only from one core (i.e. 3 records in total, but only the Product table columns):
http://…./product/select?q=*&fq={!join from=product_id to=id fromIndex=order}*

Option 2: Using shards
I created a third core, but the number of records is the sum of the Product core and the Order core, since the documents are of different types and are all unique (i.e. 6 records).

So how could I generate a single document with all fields from two different document types in different cores?

On Sat, Nov 30, 2013 at 8:04 AM, Walter Underwood (wun...@wunderwood.org) wrote:

1. Flatten the data into a single table.
2. Solr does not seem like a good solution for order data, especially live orders that need to be transactional. That is a great match for a standard relational DB.

wunder

On Nov 30, 2013, at 12:15 AM, subacini Arunkumar (subac...@gmail.com) wrote:

Hi, we are using Solr 4.4. Please let me know the possible solutions to address my requirement. We have to fetch data from two tables, Product and Order.

Product Table
id   Name
P1   ipad
P2   iphone 4
P3   iphone 5

Order Table
id   order date    product_id
O1   1-Dec-2012    P1
O2   1-Dec-2012    P2
O3   2-Dec-2012    P2

I want to show the details in the UI as below:
O1 01-Dec-2012

On Sat, Nov 30, 2013 at 12:13 AM, subacini Arunkumar (subac...@gmail.com) wrote:

Hi, we are using Solr 4.4. Please let me know the possible solutions to address my requirement. We have to fetch data from two tables, Product and Order.

Product Table
id   Name
P1   ipad
P2   iphone 4
P3   iphone 5

Order Table
id   order date   product_id
O1

-- Walter Underwood
wun...@wunderwood.org
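To make the denormalization above concrete, a flattened order document posted to Solr could look roughly like the sketch below. The field names (order_date, product_id, product_name) are only illustrative - they are not taken from the thread - but the idea is that the product name is copied into the order document at the time the order is placed:

    <add>
      <doc>
        <field name="id">O1</field>
        <field name="order_date">2012-12-01T00:00:00Z</field>
        <field name="product_id">P1</field>
        <!-- copied from the Product row when the order was created,
             so a later product rename does not rewrite order history -->
        <field name="product_name">ipad</field>
      </doc>
    </add>

With such a layout, the UI result shown above becomes a plain query against a single core, with no join needed.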
Re: SolrCloud unstable
Hello, I'm experiencing sort of the same issue, but with much smaller indexes - although with much higher disk latency during backup sessions on our NFS. I have a feeling the solution could be the same, so I'll just leave my story here in case; no solution found yet.
http://lucene.472066.n3.nabble.com/SolrCloud-never-fully-recovers-after-slow-disks-td4099350.html

-- Henrik Ossipoff Hansen
Developer, Entertainment Trading

On 12. nov. 2013 at 09.47.01, Martin de Vries (mar...@downnotifier.com) wrote:

Hi,

We have:
- Solr 4.5.1 - 5 servers, 36 cores, 2 shards each, 2 servers per shard (every core is on 4 servers)
- about 4.5 GB total data on disk per server
- 4 GB JVM memory per server, 3 GB average in use
- Zookeeper 3.3.5 - 3 servers (one shared with Solr)
- haproxy load balancing

Our SolrCloud is very unstable. About once a week some cores go into recovery state or down state. Many timeouts occur and we have to restart servers to get them back to work. The failover doesn't work in many cases, because one server has the core in down state and the other in recovering state. Other cores work fine.

When the cloud is stable I sometimes see log messages like:
- shard update error StdNode: http://033.downnotifier.com:8983/solr/dntest_shard2_replica1/:org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://033.downnotifier.com:8983/solr/dntest_shard2_replica1
- forwarding update to http://033.downnotifier.com:8983/solr/dn_shard2_replica2/ failed - retrying ...
- null:ClientAbortException: java.io.IOException: Broken pipe

Before the cloud problems start there are many large QTimes in the log (sometimes over 50 seconds), but there are no other errors until the recovery problems start.

Any clue about what could be wrong?

Kind regards,
Martin
RE: Why do people want to deploy to Tomcat?
I agree with the previous statements that the 'example' name puts people off. Not only that though - I believe there are still some official wiki pages that directly state that the shipped Jetty is not appropriate for production use, which was what made us use Tomcat for a long while (that, and one developer had previous experience with Tomcat configuration).

-- Henrik Ossipoff Hansen
Developer, Entertainment Trading

On 12. nov. 2013 at 15.45.42, Hoggarth, Gil (gil.hogga...@bl.uk) wrote:

For me, a side-effect of 'example' is that it's just that: not appropriate for production. But there's also the organisation factor beyond Solr, which is about staff expertise - we don't have any systems that utilise Jetty, so we're unfamiliar with its configuration, issues, or oddities. Tomcat is our de facto container, so it makes sense for us to implement Solr within Tomcat. If we ruled out these reasons, I'd still be looking for a container that:
- is a standalone installation (i.e., outside of the Solr tarball) so that it can be managed via yum (we run on RHEL). This separates any issues of Solr from issues of Jetty, which given our current lack of Jetty knowledge would be a helpful thing.
- can be managed as a service via standard SysV startup processes. To be fair, I've implemented our own for Tomcat and could do this for Jetty, but I'd prefer Jetty included this (which would suggest it is more prepared for enterprise use).
- Likewise, I assume all of Jetty's configuration can be changed to use the normal RHEL /etc/ and /var/ directories, but I'd prefer that Jetty did this for me (to demonstrate, again, its enterprise-ready status).

Yes, I could do all the necessary bespoke configuration so that Jetty meets the above, but because I'd have to, I question whether it's ready for our enterprise setup (which mainly means that our Operations team will fight against unusual configurations). Having said all of this, I have to admit that I like the idea of using Jetty, because you tell me that Solr is effectively pre-configured for Jetty. But then I'd want to know what in particular these Jetty configurations are!

BTW, very pleased that this is being discussed - the views can help me argue our case to use Jetty if it is indeed more beneficial to do so.

Gil

-Original Message-
From: Sebastián Ramírez [mailto:sebastian.rami...@senseta.com]
Sent: 12 November 2013 13:38
To: solr-user@lucene.apache.org
Subject: Re: Why do people want to deploy to Tomcat?

I agree with Doug. When I started, I had to spend some time figuring out what was just an example and what I would have to change in a production environment... until I found that the whole example was ready for production. Of course, you commonly have to change the settings, parameters, fields, etc. of your Solr system, but the example doesn't have anything that is not for production.

Sebastián Ramírez
http://www.senseta.com/

On Tue, Nov 12, 2013 at 8:18 AM, Amit Aggarwal (amit.aggarwa...@gmail.com) wrote:

Agreed with Doug

On 12-Nov-2013 6:46 PM, Doug Turnbull (dturnb...@opensourceconnections.com) wrote:

As an aside, I think one reason people feel compelled to deviate from the distributed Jetty distribution is because the folder is named example. I've had to explain to a few clients that this is a bit of a misnomer. The IT dept especially sees example and feels uncomfortable using that as a starting point for a Jetty install.
I wish it was called default or bin or something where it's more obviously the default Jetty distribution of Solr.

On Tue, Nov 12, 2013 at 7:06 AM, Roland Everaert (reveatw...@gmail.com) wrote:

In my case, the first time I had to deploy and configure Solr on Tomcat (and JBoss), it was a requirement to reuse the application/web server already in place as much as possible. For the next deployment I also used Tomcat, because I was used to deploying on Tomcat and I don't know Jetty at all. I could ask the same question with regard to Jetty: why use/bundle (if not recommend) Jetty with Solr over other web server solutions?

Regards,
Roland Everaert.

On Tue, Nov 12, 2013 at 12:33 PM, Alvaro Cabrerizo (topor...@gmail.com) wrote:

In my case, the selection of the servlet container has never been a hard requirement. I mean, some customers provide us a virtual machine configured with Java/Tomcat, others have a Tomcat installed and want to share it with Solr, others prefer Jetty because their sysadmins are used to configuring it... At least in the projects I've been working on, the selection of the servlet engine has not been a key factor in project success.

Regards.

On Tue, Nov 12, 2013 at 12:11 PM, Andre Bois-Crettez (andre.b...@kelkoo.com) wrote:

We are using Solr running
RE: SolrCloud never fully recovers after slow disks
The joy was short-lived. Tonight our environment was "down/slow" a bit longer than usual. It looks like two of our nodes never recovered, even though clusterstate says everything is active. All nodes are throwing this in the log (the nodes they have trouble reaching are the ones that are affected) - the error occurs for several cores:

ERROR - 2013-11-11 09:16:42.735; org.apache.solr.common.SolrException; Error while trying to recover. core=products_se_shard1_replica2:org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://solr04.cd-et.com:8080/solr at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:431) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:198) at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:342) at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:219) Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:150) at java.net.SocketInputStream.read(SocketInputStream.java:121) at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166) at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90) at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62) at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254) at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289) at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252) at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191) at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300) at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127) at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:717) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:522) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) ... 4 more

ERROR - 2013-11-11 09:16:42.736; org.apache.solr.cloud.RecoveryStrategy; Recovery failed - trying again... (30) core=products_se_shard1_replica2

-- Henrik Ossipoff Hansen
Developer, Entertainment Trading

On 10. nov. 2013 at 21.07.32, Henrik Ossipoff Hansen (h...@entertainment-trading.com) wrote:

Solr version is 4.5.0. I have done some tweaking. Doubling my Zookeeper timeout values in zoo.cfg and the Zookeeper timeout in solr.xml seemed to somewhat minimize the problem, but it still did occur. I next stopped all larger batch indexing in the period where the issues happened, which also seemed to help somewhat.
Now, the next thing weirds me out a bit - I switched from using Tomcat7 to using the Jetty that ships with Solr, and that actually seems to have fixed the last issues (together with stopping a few smaller updates - very few). During the slow period in the night, I get something like this:

03:11:49 ERROR ZkController There was a problem finding the leader in zk:org.apache.solr.common.SolrException: Could not get leader props
03:06:47 ERROR Overseer Could not create Overseer node
03:06:47 WARN LeaderElector
03:06:47 WARN ZkStateReader ZooKeeper watch triggered, but Solr cannot talk to ZK
03:07:41 WARN RecoveryStrategy Stopping recovery for zkNodeName=solr04.cd-et.com:8080_solr_auto_suggest_shard1_replica2core=auto_suggest_shard1_replica2

After this, the cluster state seems to be fine, and I'm not being spammed with errors in the log files. Bottom line is that the issues seem fixed for now, but I still find it weird that Solr was not able to fully recover.

// Henrik Ossipoff

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: 10. november 2013 19:27
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud never fully recovers after
Re: SolrCloud never fully recovers after slow disks
I will file a JIRA later today. What I don't get though (I haven't looked much into any actual Solr code) is that at this point our systems are running fine, so timeouts shouldn't be an issue. Those two nodes, though, are somehow left in a state where their response time is up to around 120k ms - which is fairly high - while everything else is running like normal at this point.

-- Henrik Ossipoff Hansen
Developer, Entertainment Trading

On 11. nov. 2013 at 16.01.58, Mark Miller (markrmil...@gmail.com) wrote:

The socket read timeouts are actually fairly short for recovery - we should probably bump them up. Can you file a JIRA issue? It may be a symptom rather than a cause, but given a slow env, bumping them up makes sense.

- Mark

On Nov 11, 2013, at 8:27 AM, Henrik Ossipoff Hansen (h...@entertainment-trading.com) wrote:

The joy was short-lived. Tonight our environment was "down/slow" a bit longer than usual. It looks like two of our nodes never recovered, even though clusterstate says everything is active. All nodes are throwing this in the log (the nodes they have trouble reaching are the ones that are affected) - the error occurs for several cores:

ERROR - 2013-11-11 09:16:42.735; org.apache.solr.common.SolrException; Error while trying to recover. core=products_se_shard1_replica2:org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://solr04.cd-et.com:8080/solr at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:431) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:198) at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:342) at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:219) Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:150) at java.net.SocketInputStream.read(SocketInputStream.java:121) at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166) at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90) at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62) at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254) at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289) at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252) at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191) at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300) at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127) at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:717) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:522) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) ... 4 more

ERROR - 2013-11-11 09:16:42.736; org.apache.solr.cloud.RecoveryStrategy; Recovery failed - trying again... (30) core=products_se_shard1_replica2

-- Henrik Ossipoff Hansen
Developer, Entertainment Trading

On 10. nov. 2013 at 21.07.32, Henrik Ossipoff Hansen (h...@entertainment-trading.com) wrote:

Solr version is 4.5.0. I have done some tweaking. Doubling my Zookeeper timeout values in zoo.cfg and the Zookeeper timeout in solr.xml seemed to somewhat minimize the problem, but it still did occur. I next stopped all larger batch indexing in the period where the issues happened, which also seemed to help somewhat.

Now, the next thing weirds me out a bit - I switched from using Tomcat7 to using the Jetty that ships with Solr, and that actually seems to have fixed the last issues (together with stopping a few smaller updates - very few). During the slow period in the night, I get something like this:

03:11:49 ERROR ZkController There was a problem finding the leader in zk:org.apache.solr.common.SolrException: Could not get leader props
03:06:47 ERROR Overseer
RE: SolrCloud never fully recovers after slow disks
Solr version is 4.5.0. I have done some tweaking. Doubling my Zookeeper timeout values in zoo.cfg and the Zookeeper timeout in solr.xml seemed to somewhat minimize the problem, but it still did occur. I next stopped all larger batch indexing in the period where the issues happened, which also seemed to help somewhat.

Now, the next thing weirds me out a bit - I switched from using Tomcat7 to using the Jetty that ships with Solr, and that actually seems to have fixed the last issues (together with stopping a few smaller updates - very few). During the slow period in the night, I get something like this:

03:11:49 ERROR ZkController There was a problem finding the leader in zk:org.apache.solr.common.SolrException: Could not get leader props
03:06:47 ERROR Overseer Could not create Overseer node
03:06:47 WARN LeaderElector
03:06:47 WARN ZkStateReader ZooKeeper watch triggered, but Solr cannot talk to ZK
03:07:41 WARN RecoveryStrategy Stopping recovery for zkNodeName=solr04.cd-et.com:8080_solr_auto_suggest_shard1_replica2core=auto_suggest_shard1_replica2

After this, the cluster state seems to be fine, and I'm not being spammed with errors in the log files. Bottom line is that the issues seem fixed for now, but I still find it weird that Solr was not able to fully recover.

// Henrik Ossipoff

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: 10. november 2013 19:27
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud never fully recovers after slow disks

Which version of Solr are you using? Regardless of your env, this is a fail-safe that you should not hit.

- Mark

On Nov 5, 2013, at 8:33 AM, Henrik Ossipoff Hansen (h...@entertainment-trading.com) wrote:

I previously made a post on this, but have since narrowed down the issue and am now giving this another try, with another spin to it.

We are running a 4-node setup (on Tomcat7) with an external 3-node ZooKeeper ensemble. This runs on a total of 7 (4+3) different VMs, and each VM uses our storage system (an NFS share in VMware). Now, I do realize and have heard that NFS is not the greatest way to run Solr, but we have never had this issue on non-SolrCloud setups.

Basically, each night when we run our backup jobs, our storage becomes a bit slow to respond - this is obviously something we're trying to solve, but the bottom line is that all our other systems somehow stay alive or recover gracefully when bandwidth returns. SolrCloud - not so much. Typically after a session like this, 3-5 nodes will either go into a Down state or a Recovering state - and stay that way. Sometimes such a node will even be marked as leader. Such a node will have something like this in the log:

ERROR - 2013-11-05 08:57:45.764; org.apache.solr.update.processor.DistributedUpdateProcessor; ClusterState says we are the leader, but locally we don't think so
ERROR - 2013-11-05 08:57:45.768; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: ClusterState says we are the leader (http://solr04.cd-et.com:8080/solr/products_fi_shard1_replica2), but locally we don't think so.
Request came from http://solr01.cd-et.com:8080/solr/products_fi_shard2_replica1/ at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:381) at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:243) at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:428) at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247) at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174) at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168
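For reference, the two timeouts mentioned in the message above live roughly in the places sketched below. The values shown are purely illustrative (the thread never gives the actual numbers), and the solr.xml fragment assumes the newer, 4.4+ style solr.xml:

    # zoo.cfg (ZooKeeper side) - the server only grants session timeouts between
    # minSessionTimeout and maxSessionTimeout, which default to 2x and 20x tickTime
    tickTime=2000
    maxSessionTimeout=60000

    <!-- solr.xml (Solr side) - the session timeout Solr requests from ZooKeeper -->
    <solrcloud>
      <int name="zkClientTimeout">${zkClientTimeout:60000}</int>
    </solrcloud>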
Re: SolrCloud never fully recovers after slow disks
Hey Erick, I have tried upping the timeouts quite a bit now, and have also tried upping the zkTimeout setting in Solr itself (I found a few old posts on the mailing list suggesting this). I realise this is a sort of weird situation, where we are actually trying to work around a rather horrible hardware setup. Thank you for your post - I will make another post in a day or two once I see how it performs.

-- Henrik Ossipoff Hansen
Developer, Entertainment Trading

On 7. nov. 2013 at 13.23.59, Erick Erickson (erickerick...@gmail.com) wrote:

Right, can you up your ZK timeouts significantly? It sounds like your ZK timeout is short enough that when your system slows down, the timeout is exceeded and it's throwing Solr into a tailspin. See zoo.cfg.

Best,
Erick

On Tue, Nov 5, 2013 at 3:33 AM, Henrik Ossipoff Hansen (h...@entertainment-trading.com) wrote:

I previously made a post on this, but have since narrowed down the issue and am now giving this another try, with another spin to it.

We are running a 4-node setup (on Tomcat7) with an external 3-node ZooKeeper ensemble. This runs on a total of 7 (4+3) different VMs, and each VM uses our storage system (an NFS share in VMware). Now, I do realize and have heard that NFS is not the greatest way to run Solr, but we have never had this issue on non-SolrCloud setups.

Basically, each night when we run our backup jobs, our storage becomes a bit slow to respond - this is obviously something we're trying to solve, but the bottom line is that all our other systems somehow stay alive or recover gracefully when bandwidth returns. SolrCloud - not so much. Typically after a session like this, 3-5 nodes will either go into a Down state or a Recovering state - and stay that way. Sometimes such a node will even be marked as leader. Such a node will have something like this in the log:

ERROR - 2013-11-05 08:57:45.764; org.apache.solr.update.processor.DistributedUpdateProcessor; ClusterState says we are the leader, but locally we don't think so
ERROR - 2013-11-05 08:57:45.768; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: ClusterState says we are the leader (http://solr04.cd-et.com:8080/solr/products_fi_shard1_replica2), but locally we don't think so.
Request came from http://solr01.cd-et.com:8080/solr/products_fi_shard2_replica1/ at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:381) at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:243) at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:428) at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247) at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174) at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:579) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:307) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) On the other nodes, an error similar to this will be in the log: 09:27:34 - ERROR - SolrCmdDistributor shard update error RetryNode: http://solr04.cd-et.com:8080/solr/products_dk_shard1_replica2
SolrCloud never fully recovers after slow disks
I previously made a post on this, but have since narrowed down the issue and am now giving this another try, with another spin to it.

We are running a 4-node setup (on Tomcat7) with an external 3-node ZooKeeper ensemble. This runs on a total of 7 (4+3) different VMs, and each VM uses our storage system (an NFS share in VMware). Now, I do realize and have heard that NFS is not the greatest way to run Solr, but we have never had this issue on non-SolrCloud setups.

Basically, each night when we run our backup jobs, our storage becomes a bit slow to respond - this is obviously something we're trying to solve, but the bottom line is that all our other systems somehow stay alive or recover gracefully when bandwidth returns. SolrCloud - not so much. Typically after a session like this, 3-5 nodes will either go into a Down state or a Recovering state - and stay that way. Sometimes such a node will even be marked as leader. Such a node will have something like this in the log:

ERROR - 2013-11-05 08:57:45.764; org.apache.solr.update.processor.DistributedUpdateProcessor; ClusterState says we are the leader, but locally we don't think so
ERROR - 2013-11-05 08:57:45.768; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: ClusterState says we are the leader (http://solr04.cd-et.com:8080/solr/products_fi_shard1_replica2), but locally we don't think so. Request came from http://solr01.cd-et.com:8080/solr/products_fi_shard2_replica1/ at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:381) at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:243) at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:428) at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247) at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174) at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:579) at
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:307) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) On the other nodes, an error similar to this will be in the log: 09:27:34 - ERROR - SolrCmdDistributor shard update error RetryNode: http://solr04.cd-et.com:8080/solr/products_dk_shard1_replica2/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Server at http://solr04.cd-et.com:8080/solr/products_dk_shard1_replica2 returned non ok status:503, message:Service Unavailable 09:27:34 -ERROR - SolrCmdDistributor forwarding update to http://solr04.cd-et.com:8080/solr/products_dk_shard1_replica2/ failed - retrying ... Does anyone have any ideas or leads towards a solution - one that doesn’t involve getting a new storage system (a solution we *are* actively working on, but that’s not a quick fix in our case). Shouldn’t a setup like this be possible? And even more so - shouldn’t SolrCloud be able to gracefully recover after issues like this? -- Henrik Ossipoff Hansen Developer, Entertainment
Pivot faceting not working after upgrading to 4.5
Hello,

We have a rather weird behavior I don't really understand. As written in a few other threads, we're migrating from a master/slave setup running 4.3 to a SolrCloud setup running 4.5. Both run on the same data set (the 4.5 instances have been re-indexed under 4.5, obviously).

The following query works fine under our 4.3 setup:

?q=*:*&facet.pivot=facet_category,facet_platform&facet=true&rows=0

However, in our 4.5 setup the facet_pivot entry in facet_counts is straight up missing from the response. I've been digging around the logs for a bit, but I'm unable to find anything relating to this. If I remove one of the facet.pivot elements (i.e. only having facet.pivot=facet_category) I get an error as expected, so that part of the component is at least working.

Does anyone have an idea about something obvious I might have missed? I've been unable to find any change logs suggesting changes to this part of the facet component.

Thanks.

Regards,
Henrik
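For reference, when pivot faceting does work, the response carries a facet_pivot section nested under facet_counts, keyed by the comma-separated field list, roughly like the JSON sketch below (the field values here are made up for illustration):

    "facet_counts": {
      "facet_fields": { },
      "facet_pivot": {
        "facet_category,facet_platform": [
          {
            "field": "facet_category",
            "value": "games",
            "count": 12,
            "pivot": [
              { "field": "facet_platform", "value": "ps3", "count": 5 },
              { "field": "facet_platform", "value": "xbox", "count": 7 }
            ]
          }
        ]
      }
    }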
Re: Pivot faceting not working after upgrading to 4.5
After some digging around the internet, I realise now that distributed pivot faceting is not implemented yet in SolrCloud. Apologies :)

On 21/10/2013 at 18.20, Henrik Ossipoff Hansen (h...@entertainment-trading.com) wrote:

Hello,

We have a rather weird behavior I don't really understand. As written in a few other threads, we're migrating from a master/slave setup running 4.3 to a SolrCloud setup running 4.5. Both run on the same data set (the 4.5 instances have been re-indexed under 4.5, obviously).

The following query works fine under our 4.3 setup:

?q=*:*&facet.pivot=facet_category,facet_platform&facet=true&rows=0

However, in our 4.5 setup the facet_pivot entry in facet_counts is straight up missing from the response. I've been digging around the logs for a bit, but I'm unable to find anything relating to this. If I remove one of the facet.pivot elements (i.e. only having facet.pivot=facet_category) I get an error as expected, so that part of the component is at least working.

Does anyone have an idea about something obvious I might have missed? I've been unable to find any change logs suggesting changes to this part of the facet component.

Thanks.

Regards,
Henrik
Re: SolrCloud Query Balancing
What you could do (and what we do) is to have a simple proxy in front of your Solr instances. We, for example, run with Nginx in front of all of our Tomcats, and use Nginx's upstream capabilities as a simple load balancer for our SolrCloud cluster.

http://wiki.nginx.org/HttpUpstreamModule

I'm sure other web servers have similar modules.

On 16/10/2013 at 12.08, michael.boom (my_sky...@yahoo.com) wrote:

Thanks! I've read a lil' bit about that, but my app is PHP-based, so I'm afraid I can't use that.

--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854p4095857.html
Sent from the Solr - User mailing list archive at Nabble.com.
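As a rough sketch of the Nginx approach described above: the upstream block lists the Solr nodes and proxy_pass forwards /solr to whichever node Nginx picks. The hostnames and ports below are invented for illustration:

    upstream solrcloud {
        # round-robin across the Solr nodes; a node that stops answering
        # is temporarily taken out of rotation by Nginx
        server solr01.example.com:8080;
        server solr02.example.com:8080;
        server solr03.example.com:8080;
        server solr04.example.com:8080;
    }

    server {
        listen 80;

        location /solr {
            proxy_pass http://solrcloud;
        }
    }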
Re: SolrCloud Query Balancing
I did not actually realize this - I apologize for my previous reply! Haproxy would then definitely be the right choice for the poster's setup, for redundancy.

On 16/10/2013 at 15.53, Shawn Heisey (s...@elyograg.org) wrote:

On 10/16/2013 3:52 AM, michael.boom wrote:
I have set up a SolrCloud system with 3 shards, replicationFactor=3, on 3 machines, along with 3 Zookeeper instances. My web application makes queries to Solr specifying the hostname of one of the machines. So that machine will always get the request and the other ones will just serve as an aid. So I would like to set up a load balancer that would fix that, balancing the queries to all machines, and maybe doing the same while indexing.

SolrCloud actually handles load balancing for you. You'll find that when you send requests to one server, they are actually being redirected across the entire cloud, unless you include a distrib=false parameter on the request - but that would also limit the search to one shard, which is probably not what you want.

The only thing that you don't get with a non-Java client is redundancy. If you can't build in failover capability yourself, which is a very advanced programming technique, then you need a load balancer. For my large non-Cloud Solr install, I use haproxy as a load balancer. Most of the time it doesn't actually balance the load, it just makes sure that Solr is always reachable even if part of it goes down. The haproxy program is simple and easy to use, but performs extremely well. I've got a pacemaker cluster making sure that the shared IP address, haproxy, and other homegrown utility applications related to Solr are only running on one machine.

Thanks,
Shawn
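A minimal haproxy sketch of the redundancy setup Shawn describes - hostnames, ports and the health-check path are invented for illustration (the ping path would normally point at a real core's /admin/ping handler):

    frontend solr_front
        bind *:8983
        default_backend solr_back

    backend solr_back
        balance roundrobin
        # the health check takes a dead node out of rotation,
        # which is the "always reachable" part described above
        option httpchk GET /solr/collection1/admin/ping
        server solr01 solr01.example.com:8080 check
        server solr02 solr02.example.com:8080 check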
Hardware dimension for new SolrCloud cluster
We're in the process of moving onto SolrCloud, and have gotten to the point where we are considering how to do our hardware setup. We're limited to VMs running on our server cluster and storage system, so buying new physical servers is out of the question - the question is how we should dimension the new VMs.

Our document volume is somewhat small, with about 1.2 million orders (rising, of course), 75k products (divided into 5 countries, each of which will be its own collection/core) and some million customers. In our current master/slave setup we only index the products, with each country taking up about 35 MB of disk space. The indexing frequency is more or less 8 updates per hour (mostly this is not the full data set though, but atomic updates with new stock data, new prices etc.). Our upcoming order and customer indexes, however, will more or less receive updates on the fly as they happen (soft commit), and we expect the same to be the case for products in the near future.

The open questions on hardware:
- CPU: 1 or 2 cores? Our current master runs with 2 cores.
- RAM: currently our master runs with only 6 GB.
- How much heap space should we allocate for max heap?

We currently plan on this setup:
- 1 machine for a simple load balancer
- 4 VMs in total for the Solr machines themselves (for both leaders and replicas; just one replica per shard is enough for our use case)
- A quorum of 3 ZKs

The question is - is this machine setup enough? And how exactly do we dimension the Solr machines? Any help, pointers or resources will be much appreciated :)

Thank you!
SolrCloud loses connection to Zookeeper but stays down?
We are slowly starting to move from a master/slave setup to SolrCloud, and with the addition of some new functionality on our site, we decided to give it a go in production (with a very minimal setup so far). We are experiencing that our nodes lose connection to ZK during the night, according to the log:

02:17:33 WARN OverseerCollectionProcessor Overseer cannot talk to ZK
02:17:33 WARN Overseer Solr cannot talk to ZK, exiting Overseer main queue loop

The node is listed as down in the cloud window in the Solr admin. However, as I'm writing this, it seems to be able to talk to ZK just fine; I can push updated configurations from ZK to the SolrCloud nodes without problems - but the nodes are still listed as down. Everything seems to work. Is this a known bug, that they are still listed as down even though they're up and active? We're running 4.4.0.
RE: Facet sorting seems weird
This is indeed an interesting idea, but I think it's a bit too manual for our use case. I do see that it would solve the problem though, so thank you for sharing it with the community! :)

-Original Message-
From: James Thomas [mailto:jtho...@camstar.com]
Sent: 15. juli 2013 17:08
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hi Henrik,

We did something related to this that I'll share. I'm rather new to Solr, so take this idea cautiously :-)

Our requirement was to show exact values but have case-insensitive sorting and facet filtering (prefix filtering). We created an index field (type=string) for creating facets, so that the values are indexed as-is. The values we indexed were given the format "lowercase value|exact value". So for example, given the value bObles, we would index the string bobles|bObles. When displaying the facet we split the facet value from Solr in half and display the second half to the user. Of course the caveat is that you could have 2 facets that differ only in case, but to me that's a data cleansing issue.

James

-Original Message-
From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com]
Sent: Monday, July 15, 2013 10:57 AM
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hello, thank you for the quick reply! But given that facet.sort=index just sorts by the indexed facet values (and I don't want the facet itself to be in lower-case), would that really work?

Regards,
Henrik Ossipoff

-Original Message-
From: David Quarterman [mailto:da...@corexe.com]
Sent: 15 July 2013 16:46
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hi Henrik,

Try setting up a copyField in your schema and set the copied field to use something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on the copyField.

Regards,

DQ

-Original Message-
From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com]
Sent: 15 July 2013 15:08
To: solr-user@lucene.apache.org
Subject: Facet sorting seems weird

Hello, first time writing to the list. I am a developer for a company where we recently switched all of our search cores from Sphinx to Solr, with great results. In general we've been very happy with the switch, and everything seems to work just as we want it to.

Today however we've run into a bit of an issue regarding faceted sort. For example, we have a field called brand in our core, defined as the text_en datatype from the example Solr core. This field is copied into facet_brand with the datatype string (since we don't really need to do much with it except show it for faceted navigation). Now, given these two entries in the field on different documents, LEGO and bObles, and given facet.sort=index, it appears that LEGO is sorted before bObles. I assume this is because of casing differences.

My question then is: how do we define a decent datatype in our schema where the casing is exact, but we are able to sort it without casing mattering?

Thank you :)

Best regards,
Henrik Ossipoff
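A minimal Java sketch of the "lowercase|exact" trick James describes above - the class and method names are made up for illustration; the encoded string is what you would index into the string facet field, and the decode step is what the UI would do before display:

    import java.util.Locale;

    public class CaseInsensitiveFacetValue {

        // Indexing side: prefix the exact value with its lowercase form so that
        // facet.sort=index effectively sorts case-insensitively.
        public static String encode(String exactValue) {
            return exactValue.toLowerCase(Locale.ROOT) + "|" + exactValue;
        }

        // Display side: strip the lowercase prefix and show the exact value.
        public static String decode(String storedFacetValue) {
            int sep = storedFacetValue.indexOf('|');
            return sep >= 0 ? storedFacetValue.substring(sep + 1) : storedFacetValue;
        }

        public static void main(String[] args) {
            System.out.println(encode("bObles"));    // bobles|bObles
            System.out.println(encode("LEGO"));      // lego|LEGO
            System.out.println(decode("lego|LEGO")); // LEGO
        }
    }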
RE: Facet sorting seems weird
Hi Alex,

Yes, this makes sense. My Java is a bit dusty, but depending on how much we end up needing this feature, it's definitely something we will look into creating, and if successful, we will definitely be submitting a patch.

Thank you for your time and detailed answer!

Best regards,
Henrik Ossipoff

-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: 15. juli 2013 17:16
To: solr-user@lucene.apache.org
Subject: Re: Facet sorting seems weird

Hi Henrik,

If I understand the question correctly (case-insensitive sorting of the facet values), then this is a limitation of the current facet component. You can see the full implementation at:
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java#L818

If you are comfortable with Java code, the easiest thing might be to copy/fix the component and use your own one for faceting. The components are defined in solrconfig.xml, and FacetComponent is in a default chain. See:
https://github.com/apache/lucene-solr/blob/trunk/solr/example/solr/collection1/conf/solrconfig.xml#L1194

If you do manage to do this (I would recommend doing it as an extra option), it would be nice to have it contributed back to Solr. I think you are not the only one with this requirement.

Regards,
Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)

On Mon, Jul 15, 2013 at 10:08 AM, Henrik Ossipoff Hansen (h...@entertainment-trading.com) wrote:

Hello, first time writing to the list. I am a developer for a company where we recently switched all of our search cores from Sphinx to Solr, with great results. In general we've been very happy with the switch, and everything seems to work just as we want it to.

Today however we've run into a bit of an issue regarding faceted sort. For example, we have a field called brand in our core, defined as the text_en datatype from the example Solr core. This field is copied into facet_brand with the datatype string (since we don't really need to do much with it except show it for faceted navigation). Now, given these two entries in the field on different documents, LEGO and bObles, and given facet.sort=index, it appears that LEGO is sorted before bObles. I assume this is because of casing differences.

My question then is: how do we define a decent datatype in our schema where the casing is exact, but we are able to sort it without casing mattering?

Thank you :)

Best regards,
Henrik Ossipoff
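As a rough sketch of what Alex suggests - registering your own copy of the facet component in solrconfig.xml and putting it into a handler's component chain - the component name and class below are invented; the class would be your modified copy of FacetComponent:

    <!-- solrconfig.xml: register the custom component under a new name -->
    <searchComponent name="caseInsensitiveFacet"
                     class="com.example.solr.CaseInsensitiveFacetComponent"/>

    <!-- list the components explicitly, swapping the stock facet component out -->
    <requestHandler name="/select" class="solr.SearchHandler">
      <arr name="components">
        <str>query</str>
        <str>caseInsensitiveFacet</str>
        <str>mlt</str>
        <str>highlight</str>
        <str>stats</str>
        <str>debug</str>
      </arr>
    </requestHandler>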
Facet sorting seems weird
Hello, first time writing to the list. I am a developer for a company where we recently switched all of our search cores from Sphinx to Solr, with great results. In general we've been very happy with the switch, and everything seems to work just as we want it to.

Today however we've run into a bit of an issue regarding faceted sort. For example, we have a field called brand in our core, defined as the text_en datatype from the example Solr core. This field is copied into facet_brand with the datatype string (since we don't really need to do much with it except show it for faceted navigation). Now, given these two entries in the field on different documents, LEGO and bObles, and given facet.sort=index, it appears that LEGO is sorted before bObles. I assume this is because of casing differences.

My question then is: how do we define a decent datatype in our schema where the casing is exact, but we are able to sort it without casing mattering?

Thank you :)

Best regards,
Henrik Ossipoff
RE: Facet sorting seems weird
Hello, thank you for the quick reply! But given that facet.sort=index just sorts by the indexed facet values (and I don't want the facet itself to be in lower-case), would that really work?

Regards,
Henrik Ossipoff

-Original Message-
From: David Quarterman [mailto:da...@corexe.com]
Sent: 15. juli 2013 16:46
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hi Henrik,

Try setting up a copyField in your schema and set the copied field to use something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on the copyField.

Regards,

DQ

-Original Message-
From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com]
Sent: 15 July 2013 15:08
To: solr-user@lucene.apache.org
Subject: Facet sorting seems weird

Hello, first time writing to the list. I am a developer for a company where we recently switched all of our search cores from Sphinx to Solr, with great results. In general we've been very happy with the switch, and everything seems to work just as we want it to.

Today however we've run into a bit of an issue regarding faceted sort. For example, we have a field called brand in our core, defined as the text_en datatype from the example Solr core. This field is copied into facet_brand with the datatype string (since we don't really need to do much with it except show it for faceted navigation). Now, given these two entries in the field on different documents, LEGO and bObles, and given facet.sort=index, it appears that LEGO is sorted before bObles. I assume this is because of casing differences.

My question then is: how do we define a decent datatype in our schema where the casing is exact, but we are able to sort it without casing mattering?

Thank you :)

Best regards,
Henrik Ossipoff
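For completeness, David's copyField suggestion would look roughly like the schema.xml sketch below, assuming a fieldType that actually lowercases. The type and field names are invented, and whether sorting on a separate lowercased field helps with facet ordering is exactly what is being questioned above:

    <!-- keep the exact value for display/faceting,
         and a lowercased copy for case-insensitive sorting -->
    <fieldType name="string_lc" class="solr.TextField" sortMissingLast="true">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

    <field name="facet_brand"      type="string"    indexed="true" stored="true"/>
    <field name="facet_brand_sort" type="string_lc" indexed="true" stored="false"/>

    <copyField source="facet_brand" dest="facet_brand_sort"/>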