Re: Solr Hangs During Updates for over 10 minutes
We are planning an upgrade to 4.4, but it's still weeks out. We offer a high-availability search service, and there are a number of changes in 4.4 that are not backward compatible (i.e. clusterstate.json and no solr.xml), so there must be lots of testing; additionally, this upgrade cannot be performed without downtime. Regardless, I need to find a band-aid right now.

Does anyone know if it's possible to set the timeout for distributed update requests to/from the leader? Currently we see it's set to 0. Maybe via a -D startup param, or something?

Jed

On 7/10/13 1:23 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Hi Jed,

This is really with Solr 4.0? If so, it may be wiser to jump to 4.4, which is about to be released. We did not have fun working with 4.0 in SolrCloud mode a few months ago. You will save time, hair, and money if you convince your manager to let you use Solr 4.4. :)

Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
Performance Monitoring -- http://sematext.com/spm
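On the "-D startup param" question: later 4.x releases expose distributed-update timeouts as attributes on the cores element in solr.xml. Whether anything like this exists in 4.0 is doubtful, so treat the fragment below as a sketch of what an upgrade buys, not a setting verified against 4.0; the attribute names, values, and property defaults are assumptions based on later-4.x solr.xml.

```xml
<!-- Sketch only: distribUpdateConnTimeout / distribUpdateSoTimeout are
     solr.xml attributes from later 4.x releases, assumed (not verified)
     for 4.0. Values are in milliseconds. -->
<solr persistent="true">
  <cores adminPath="/admin/cores"
         hostPort="${jetty.port:8983}"
         distribUpdateConnTimeout="${distribUpdateConnTimeout:60000}"
         distribUpdateSoTimeout="${distribUpdateSoTimeout:120000}">
    <!-- core entries as before -->
  </cores>
</solr>
```

With the property-substitution defaults shown, the values could then be overridden at startup with e.g. -DdistribUpdateSoTimeout=120000.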
Re: Solr Hangs During Updates for over 10 minutes
Hey Daniel,

Thanks for the response. I think we'll give this a try to see if it helps.

Jed.

On 7/10/13 10:48 AM, Daniel Collins danwcoll...@gmail.com wrote:

We had something similar in terms of update times suddenly spiking up for no obvious reason. We never got quite as bad as you in terms of the other knock-on effects, but we certainly saw updates jumping from 10ms up to 3s, all our external queues backed up and we rejected some updates, then after a while things quietened down. We were running Solr 4.3.0, but with Java 6 and the CMS GC. We swapped to Java 7 and the G1 GC (and increased heap size from 8GB to 12GB) and the problem went away.

Now, I admit it's not exactly the same as your case, we never had the follow-on effects, but I'd consider Java 7 and the G1 GC; it has certainly reduced the spikes in our indexing times. We run the following settings now (the usual caveats apply, it might not work for you):

GC_OPTIONS="-XX:+AggressiveOpts -XX:+UseG1GC -XX:+UseStringCache -XX:+OptimizeStringConcat -XX:-UseSplitVerifier -XX:+UseNUMA -XX:MaxGCPauseMillis=50 -XX:GCPauseIntervalMillis=1000"

I set MaxGCPauseMillis/GCPauseIntervalMillis to try to minimise application pauses; that's our goal. If we have to use more memory in the short term then so be it, but we couldn't afford application pauses, because we are using NRT (soft commits every 1s, hard commits every 60s) and we get a lot of updates. I know there have been other discussions on G1 and it has received mixed results overall, but for us it seems to be a winner.

Hope that helps,
Re: Solr Hangs During Updates for over 10 minutes
It is certainly 'more' possible, as we have additional code that revolves around reading the clusterstate.json, and since Solr decided to change the format of the clusterstate.json from 4.0 to 4.1, it requires additional code changes to our service: the SolrJ lib from 4.0 isn't compatible with anything after 4.0 due to the clusterstate.json change.

I can, however, run Java 7 with these GC settings in a dev env under load to see if they blow up or if it's even possible, and then roll it out to the replica, and then to the leader. I cannot do this with a Solr upgrade without significant coding changes to our service, which would require us to roll out new code for our service as well as new Solr instances. So, while it's 'just as risky' as you say, it's 'less risky' than a new version of Java, and is possible to implement without downtime.

It is actually something of a pain point that the upgrade path to SolrCloud seems to frequently require downtime (clusterstate.json changes in 4.1, and then again this big change in 4.4 with no solr.xml). So we'll do what we can quickly to see if we can 'band-aid' the problem until we can upgrade to Solr 4.4.

Speaking of band-aids: does anyone know of a way to change the socket timeout/connection timeout for distributed updates?

Jed.

On 7/10/13 2:38 PM, Erick Erickson erickerick...@gmail.com wrote:

Jed:

I'm not sure changing the Java runtime is any less scary than upgrading Solr. Wait, I know! Ask your manager if you can do both at once (evil smirk). I have a t-shirt that says "I don't test, but when I do, it's in production..."

Erick
AW: Solr Hangs During Updates for over 10 minutes
Hi Shawn,

This code is for the SolrJ lib, which we already use. I'm talking about Solr's internal communication from leader to replica via the DistributedCmdUpdate class. I want to force the leader to time out after a fixed period instead of waiting for 15 minutes for the server to figure out the other end of the socket was closed. I don't know of any flags or settings in the solrconfig.xml to do this, or if it's even possible without modifying source code.

Jed

Sent from Samsung Mobile

Original message
From: Shawn Heisey s...@elyograg.org
Date: 07.10.2013 17:35 (GMT+01:00)
To: solr-user@lucene.apache.org
Subject: Re: Solr Hangs During Updates for over 10 minutes

On 7/10/2013 6:57 AM, Jed Glazner wrote:
So we'll do what we can quickly to see if we can 'band-aid' the problem until we can upgrade to solr 4.4 Speaking of band-aids - does anyone know of a way to change the socket timeout/connection timeout for distributed updates?

If you need to change HttpClient parameters for CloudSolrServer, here's how you can do it:

String zkHost = "zk1.REDACTED.com:2181,zk2.REDACTED.com:2181,zk3.REDACTED.com:2181/chroot";
ModifiableSolrParams params = new ModifiableSolrParams();
params.set(HttpClientUtil.PROP_MAX_CONNECTIONS, 1000);
params.set(HttpClientUtil.PROP_MAX_CONNECTIONS_PER_HOST, 200);
params.set(HttpClientUtil.PROP_SO_TIMEOUT, 30);
params.set(HttpClientUtil.PROP_CONNECTION_TIMEOUT, 5000);
HttpClient client = HttpClientUtil.createClient(params);
ResponseParser parser = new BinaryResponseParser();
LBHttpSolrServer lbServer = new LBHttpSolrServer(client, parser);
CloudSolrServer server = new CloudSolrServer(zkHost, lbServer);

Thanks,
Shawn
Solr Hangs During Updates for over 10 minutes
I'll give you the high level before delving deep into setup etc. I have been struggling at work with a seemingly random problem where Solr will hang for 10-15 minutes during updates. This outage always seems to be immediately preceded by an EOF exception on the replica. Then 10-15 minutes later we see an exception on the leader for a socket timeout to the replica. The leader will then tell the replica to recover, which in most cases it does, and then the outage is over.

Here are the setup details:

We are currently using Solr 4.0.0 with an external ZK ensemble of 5 machines. We have 2 active collections, each with only 1 shard (we have in total about 15 collections, but most are empty or have less than 100 docs). The first index (collection1) is 6.5GB and has ~18M documents. The 2nd index (collection2) is 9GB and has about 13M documents. In all cases the leader resides on 1 server and the replica resides on the other. Both servers are AWS XL High Mem instances (8 CPUs @ 2.67GHz, 70GB RAM), with the index residing on a 1TB RAID 10 using ephemeral storage disks.

We are starting Solr using the embedded Jetty with the following Java memory and GC options:

-Xmx16382m -Xms4092m -XX:MaxPermSize=256m -Xss256k -XX:NewSize=1536m -XX:SurvivorRatio=16 -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:ParallelCMSThreads=2 -XX:+CMSClassUnloadingEnabled -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=80 -XX:+CMSParallelRemarkEnabled

Both collections receive a constant stream of updates, ~10k per hour (both adds/deletes). Approximately once per day the following events transpire:

1. We see a log entry for a distributed update that takes just over 50 seconds, followed by an EOF write exception on the replica. In all cases this exception is triggered by an update to the 9GB collection.
2. Occasionally we'll see a 503 shard update error on the leader, but usually not.
3.
Approximately 15 minutes after this exception, we see a timeout error for this distributed update request on the leader.
4. The leader then creates a new connection and tells the replica to recover, which it does, and everything is OK again.
5. During the 15 minute window from when the replica throws the EOF until the SocketTimeout by the leader, no other updates are processed.

ERROR ON REPLICA:

Jul 8, 2013 6:38:16 PM org.apache.solr.core.SolrCore execute
INFO: [collection2_0] webapp=/solr path=/update params={distrib.from=http://Solr4-1-1.domain.com:8983/solr/collection2_0/update&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=50012
Jul 8, 2013 6:38:16 PM org.apache.solr.common.SolrException log
SEVERE: null:org.eclipse.jetty.io.EofException
    at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:154)
    at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:101)
    at org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:203)
    at org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:196)
    at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:94)
    at org.apache.solr.response.BinaryResponseWriter.write(BinaryResponseWriter.java:49)
    at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:404)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:289)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
    at org.eclipse.jetty.server.Server.handle(Server.java:351)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
AW: Solr Hangs During Updates for over 10 minutes
Hi Shawn,

I have been trying to duplicate this problem without success for the last 2 weeks, which is one reason I'm getting flustered. It seems reasonable to be able to duplicate it, but I can't. We do have a story to upgrade, but it is still weeks if not months before that gets rolled out to production.

We have another cluster running the same version but with 8 shards and 8 replicas, with each shard at 100GB, and more load and more indexing requests, without this problem; but we send docs in batches there and all fields are stored. Whereas the trouble index has only 1 or 2 stored fields and we only send docs 1 at a time. Could that have anything to do with it?

Jed

Sent from Samsung Mobile

Original message
From: Shawn Heisey s...@elyograg.org
Date: 07.09.2013 18:33 (GMT+01:00)
To: solr-user@lucene.apache.org
Subject: Re: Solr Hangs During Updates for over 10 minutes

On 7/9/2013 9:50 AM, Jed Glazner wrote:
I'll give you the high level before delving deep into setup etc. I have been struggling at work with a seemingly random problem where Solr will hang for 10-15 minutes during updates. This outage always seems to be immediately preceded by an EOF exception on the replica. Then 10-15 minutes later we see an exception on the leader for a socket timeout to the replica. The leader will then tell the replica to recover, which in most cases it does, and then the outage is over. Here are the setup details: We are currently using Solr 4.0.0 with an external ZK ensemble of 5 machines.

After 4.0.0 was released, a *lot* of problems with SolrCloud surfaced and have since been fixed. You're five releases and about nine months behind what's current. My recommendation: Upgrade to 4.3.1, ensure your configuration is up to date with changes to the example config between 4.0.0 and 4.3.1, and reindex. Ideally, you should set up a 4.0.0 testbed, duplicate your current problem, and upgrade the testbed to see if the problem goes away.
A testbed will also give you practice for a smooth upgrade of your production system. Thanks, Shawn
Re: Writing new indexes from index readers slow!
Thanks Otis,

I had not considered that approach; however, not all of our fields are stored, so that's not going to work for me.

I'm wondering if it's slow because there is just the one reader getting passed to the index writer... I noticed today that the addIndexes method can take an array of readers. Maybe I can send in an array of readers for the individual segments in the index... I'll try that tomorrow.

Jed

Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Jed,

While this is something completely different, have you considered using SolrEntityProcessor instead? (assuming all your fields are stored) http://wiki.apache.org/solr/DataImportHandler#SolrEntityProcessor

Otis
--
Solr & ElasticSearch Support
http://sematext.com/
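The "array of readers" idea mentioned above might look roughly like this against the Lucene 4.x API. This is an unverified sketch, not code from the thread: sourceDirectory and writer are assumed to exist, and how FakeDeleteIndexReader's per-segment "fake" deletes would be threaded through each leaf is an open question.

```java
// Sketch: feed addIndexes one reader per segment instead of a single
// composite reader. Whether this actually speeds up the merge depends
// on the IndexWriter's merge scheduler and policy.
DirectoryReader top = DirectoryReader.open(sourceDirectory);
List<AtomicReaderContext> leaves = top.leaves();
IndexReader[] perSegment = new IndexReader[leaves.size()];
for (int i = 0; i < leaves.size(); i++) {
    // In the splitter, each leaf would need to be wrapped so that the
    // deletes marked for this target shard are visible here.
    perSegment[i] = leaves.get(i).reader();
}
writer.addIndexes(perSegment); // varargs overload taking IndexReader...
writer.commit();
top.close();
```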
Writing new indexes from index readers slow!
Hey Hey Everybody!

I'm not sure if I should have posted this to the developers list... if I'm totally barking up the wrong tree here, please let me know!

Anywho, I've developed a command line utility based on the MultiPassIndexSplitter class from the Lucene library, but I'm finding that on our large index (350GB), it's taking WAY too long to write the newly split indexes! It took 20.5 hours for execution to finish. I should note that Solr is not running while I'm splitting the index. Because Solr can't really be running while I run this tool, performance is critical, as our service will be down! I am aware that there is an API currently under development on trunk in SolrCloud (https://issues.apache.org/jira/browse/SOLR-3755), but I need something now, as our large index is wreaking havoc on our service.

Here is some basic context info:

The Index:
==========
Solr/Lucene 4.1
Index Size: 350GB
Documents: 185,194,528

The Hardware (http://aws.amazon.com/ec2/instance-types/):
=========================================================
AWS High-Memory X-Large (m2.xlarge) instance
CPU: 8 cores (2 virtual cores with 3.25 EC2 Compute Units each)
17.1 GB RAM
1.2TB EBS RAID

The Process (splitting 1 index into 8):
=======================================
I'm trying to split this index into 8 separate indexes using this tool. To do this I create 8 worker threads. Each thread gets a new FakeDeleteIndexReader object, loops over every document, and uses a hash algorithm to decide if it should keep or delete the document. Note that the documents are not actually deleted at this point because (as I understand it) the FakeDeleteIndexReader emulates deletes without actually modifying the underlying index.

After each worker has determined which documents it should keep, I create a new Directory object, instantiate a new IndexWriter, and pass the FakeDeleteIndexReader object to the addIndexes method (this is the part that takes forever!). It only takes about an hour for all of the threads to hash/delete the documents they don't want.
However, it takes 19+ hours to write all of the new indexes! Watching iowait, the disk doesn't look to be overworked (about 85% idle), so I'm baffled as to why it would take that long! I've tried running the write operations inside the worker threads, and serially, with no real difference!

Here is the relevant code that I'm using to write the indexes:

/**
 * Creates/merges a new index with a FakeDeleteIndexReader. The reader should have marked/deleted all
 * of the documents that should not be included in this new index. When the index is written/committed
 * these documents will be removed.
 *
 * @param directory
 *            The directory object of the new index
 * @param version
 *            The lucene version of the index
 * @param reader
 *            A FakeDeleteIndexReader that contains lots of uncommitted deletes.
 * @throws IOException
 */
private void writeToDisk(Directory directory, Version version, FakeDeleteIndexReader reader) throws IOException {
    IndexWriterConfig cfg = new IndexWriterConfig(version, new WhitespaceAnalyzer(version));
    cfg.setOpenMode(OpenMode.CREATE);
    IndexWriter w = new IndexWriter(directory, cfg);
    w.addIndexes(reader);
    w.commit();
    w.close();
    reader.close();
}

Any ideas?? I'm happy to share more snippets of source code if that is helpful.

--
Jed Glazner
Sr. Software Engineer
Adobe Social

385.221.1072 (tel)
801.360.0181 (cell)
jglaz...@adobe.com

550 East Timpanogus Circle
Orem, UT 84097-6215, USA
www.adobe.com
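The keep/delete decision described above hinges on a stable hash of each document's unique key; the thread doesn't show that code, so here is a minimal illustrative version (the class and method names are mine, not from the utility).

```java
// Minimal sketch of a stable hash-based routing decision, as used when
// splitting one index into N: each worker keeps only the documents
// whose unique key hashes to its own partition.
public class ShardHasher {
    // Map a unique key to a partition in [0, numPartitions).
    public static int targetPartition(String uniqueKey, int numPartitions) {
        int h = uniqueKey.hashCode();
        // Normalize to non-negative before the modulo, since
        // String.hashCode() can be negative.
        return ((h % numPartitions) + numPartitions) % numPartitions;
    }

    public static void main(String[] args) {
        // The same key must always land in the same partition, otherwise
        // documents would be duplicated or lost across the split indexes.
        System.out.println(targetPartition("doc-12345", 8));
    }
}
```

The important property is determinism across threads and runs; any stable hash works, but it must match whatever routing the queries expect afterwards.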
Re: How to make a server become a replica / leader for a collection at startup
Hey Mark,

Thanks for the extra effort in responding. :) Are you OK if I file a JIRA ticket and complete this feature on trunk? We need this feature for a project.

Jed Glazner
Sr. Software Engineer
Adobe
jglaz...@adobe.com

- Reply message -
From: Mark Miller markrmil...@gmail.com
To: solr-user@lucene.apache.org
Subject: How to make a server become a replica / leader for a collection at startup
Date: Sun, Aug 19, 2012 9:11 am

Hmm... last email was blocked from the list as spam. :) Let me try again, forcing plain text:

Hey Jed,

I think what you are looking for is something I have proposed, but it is not implemented yet. We started with a fairly simple collections API, since we just wanted to make sure we had something in 4.0. I would like it to be better though. My proposal was that when you create a new collection with n shards and z replicas, that should be recorded in ZooKeeper by the Overseer. The Overseer should then watch for when a new node comes up, and then trigger a process that compares the config for the collection against the real world, and remove or add based on that info.

I don't think it's that difficult to do, but given a lot of other things we are working on, and the worry of destabilizing anything before the 4.0 release, I think it's more likely to come in a point release later. It's not super complicated work, but there are some tricky corner cases, I think.

- Mark
How to make a server become a replica / leader for a collection at startup
Hello All,

I'm working to solve an interesting problem. The problem is that when I pull a server out of the cloud (to do maintenance, say) and then bring it back up, it won't automatically sync up with ZooKeeper and become a leader or replica for any collections that I created while it was off-line, even though I specified a number of shards or replicas higher than the number of servers that are registered with ZooKeeper.

Here is my setup:

External ZooKeeper (v3.3.5) ensemble (zk1, zk2, zk3)
SolrCloud (4.0.0-BETA) with 2 shards and 2 replicas (shard1, shard2, shard1a, shard2a)

Here is the detailed scenario:

I create a new collection named 'collection2' using the collections API and specify 2 shards and 2 replicas (curl 'http://shard1:8983/solr/admin/collections?action=CREATE&name=collection2&numShards=2&replicationFactor=2'). The result of the call creates (as I would expect) 2 shards and 2 replicas. I then push some docs into 'collection2' and I see the documents are distributed between shard1 and shard2 and are replicated to 1a and 2a. So far so good.

Now, to simulate a node failure, I take down shard1a while pushing some more docs into 'collection2'. Additionally, while shard1a is down, I also create a new collection named 'collection3' using the collections API and specify 2 shards and 2 replicas. The result of the call creates (as I would expect) 2 shards and 1 replica; since shard1a is down, there are not enough servers to create all of the replicas. Before bringing back up shard1a, I push some documents into 'collection3' and see the docs are distributed between shard1 and shard2, with shard2a replicating shard2. Everything looks great and working as expected, thus far.

When I bring shard1a back on-line, however, here is what I would expect to happen:

1. Shard1a registers with ZooKeeper; ZooKeeper assigns it as a replica of shard1 for 'collection2' (it knows about collection2 because it's stored in the solr.xml).
2.
Shard1a asks zookeeper if there are any collections that have missing replicas, or not enough shards. 2. Zookeeper responds that 'collection3' on shard1 doesn't have a replica (remember I created the collection with 2 replicas but only one is present). 4. Shard1a creates a new core and becomes a replica for 'collection3' on shard1 5. Shard1a synchronizes with shard1 and replicates the missing documents for 'collection2' and 'collection3'. However here is what really happens: 1. shard1a registers with zookeeper and is assigned a replica of shard1 for 'collection2' 2. shard1a synchronizes with shard1 and replicates the missing documents for 'collection2' Nothing else happens. How I can I make shard1a automatically become a replica or a leader for missing cores within a collection when it comes online? -- Jed Glazner Sr. Software Engineer Adobe Social 385.221.1072 (tel) 801.360.0181 (cell) jglaz...@adobe.com 550 East Timpanogus Circle Orem, UT 84097-6215, USA www.adobe.com
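The desired-vs-actual comparison Mark proposes elsewhere in this thread (the Overseer recording the collection spec in ZooKeeper and reconciling it against live replicas when a node rejoins) can be sketched in a few lines. This is a purely illustrative sketch; none of the class or method names below are real Solr APIs:

```java
import java.util.*;

// Hypothetical sketch of the proposed Overseer reconciliation: compare
// the replica count each shard *should* have (recorded at collection
// creation) against the replicas actually registered, and report gaps.
public class ReconcileSketch {
    static Map<String, Integer> missing(Map<String, Integer> desired,
                                        Map<String, Integer> live) {
        Map<String, Integer> gaps = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : desired.entrySet()) {
            int have = live.getOrDefault(e.getKey(), 0);
            if (have < e.getValue()) gaps.put(e.getKey(), e.getValue() - have);
        }
        return gaps;
    }

    public static void main(String[] args) {
        // collection3 was created with 2 replicas per shard while shard1a
        // was down, so shard1 came up one replica short.
        Map<String, Integer> desired = Map.of("collection3/shard1", 2,
                                              "collection3/shard2", 2);
        Map<String, Integer> live = Map.of("collection3/shard1", 1,
                                           "collection3/shard2", 2);
        System.out.println(missing(desired, live)); // {collection3/shard1=1}
    }
}
```

In the scenario above, a rejoining shard1a would consult exactly this kind of gap report to decide it should host the missing 'collection3' replica.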
Re: Replication Fails with Unreachable error when master host is responding.
So it turns out that it's the host names. According to the DNS RFC, underscores are not valid in host names. Most DNS servers now support them, but strictly speaking it's not in the RFC. So there must be something in the underlying Java classes that borks when using underscores in host names, though I didn't see anything in the stack trace that indicated an invalid host name exception. That was the whole issue, though. Once I changed the host name to the master's IP address, replication worked great. So I'm working with our IT to remove underscores from our host names. Just thought I would post my answer here in case anyone else had that issue. Thanks.

Jed.

On 04/28/2011 02:03 PM, Mike Sokolov wrote:
No clue. Try wireshark to gather more data?

On 04/28/2011 02:53 PM, Jed Glazner wrote:
Anybody?

On 04/27/2011 01:51 PM, Jed Glazner wrote:
[...]
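The underscore problem is easy to reproduce from plain Java, no Solr needed: java.net.URI treats an underscore as illegal in the host component (per the RFC 952/1123 hostname grammar) and silently returns null from getHost(), which is exactly the kind of failure that surfaces higher up as a confusing "Unreachable". The hostnames below are made up for illustration:

```java
import java.net.URI;

public class UnderscoreHost {
    public static void main(String[] args) throws Exception {
        // DNS hostnames allow only letters, digits, and hyphens, so
        // java.net.URI parses an underscore host as a registry-based
        // authority and getHost() comes back null.
        URI bad = new URI("http://solr-master-01_dev.la.bo:8983/solr");
        System.out.println(bad.getHost());  // null

        // With the underscore replaced by a hyphen, parsing succeeds.
        URI good = new URI("http://solr-master-01-dev.la.bo:8983/solr");
        System.out.println(good.getHost()); // solr-master-01-dev.la.bo
    }
}
```

This matches Jed's observation that no "invalid host name" exception ever shows up in the logs: the parse doesn't throw, it just leaves the host empty.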
Re: Replication Fails with Unreachable error when master host is responding.
Anybody?

On 04/27/2011 01:51 PM, Jed Glazner wrote:
[...]
Replication Fails with Unreachable error when master host is responding.
Hello All,

I'm having a very strange problem that I just can't figure out. The slave is not able to replicate from the master, even though the master is reachable from the slave machine. I can telnet to the port it's running on, and I can use text-based browsers to navigate the master from the slave. I just don't understand why it won't replicate. The admin screen gives me an Unreachable in the status, and in the log there is an exception thrown. Details below:

BACKGROUND:
OS: Arch Linux
Solr Version: svn revision 1096983 from https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/
No custom plugins, just whatever came with the version above.
Java Setup: java version 1.6.0_22, OpenJDK Runtime Environment (IcedTea6 1.10) (ArchLinux-6.b22_1.10-1-x86_64), OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)

We have 3 cores running; all 3 cores are not able to replicate. The admin on the slave shows the Master as http://solr-master-01_dev.la.bo:8983/solr/music/replication - *Unreachable*

Replication def on the slave:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="${slave:slave}">
    <str name="masterUrl">http://solr-master-01_dev.la.bo:8983/solr/music/replication</str>
    <str name="pollInterval">00:15:00</str>
  </lst>
</requestHandler>

Replication def on the master:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="${master:master}">
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">startup</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

Below is the log, start to finish, for replication attempts. Note that it says connection refused; however, I can telnet to 8983 from the slave to the master, so I know it's up and reachable from the slave:

telnet solr-master-01_dev.la.bo 8983
Trying 172.12.65.58...
Connected to solr-master-01_dev.la.bo.
Escape character is '^]'.

I double checked the master to make sure that it didn't have replication turned off, and it's not.
So I should be able to replicate but it can't. I just don't know what else to check. The log from the slave is below.

Apr 27, 2011 7:39:45 PM org.apache.solr.request.SolrQueryResponse init
WARNING: org.apache.solr.request.SolrQueryResponse is deprecated. Please use the corresponding class in org.apache.solr.response
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: I/O exception (java.net.ConnectException) caught when processing request: Connection refused
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: Retrying request
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: I/O exception (java.net.ConnectException) caught when processing request: Connection refused
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: Retrying request
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: I/O exception (java.net.ConnectException) caught when processing request: Connection refused
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: Retrying request
Apr 27, 2011 7:39:45 PM org.apache.solr.handler.ReplicationHandler getReplicationDetails
WARNING: Exception while invoking 'details' method for replication on master
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
    at java.net.Socket.connect(Socket.java:546)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.commons.httpclient.protocol.ReflectionSocketFactory.createSocket(ReflectionSocketFactory.java:140)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:125)
    at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(MultiThreadedHttpConnectionManager.java:1361)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at
Re: HTTP ERROR 400 undefined field: *
So I re-indexed some of the content, but no dice. Per Hoss, I tried disabling the TVC and it worked great. We're not really using the TVC right now since we made a decision to turn off highlighting for the moment, so this isn't a huge deal. I'll create a new jira issue. FYI here are my queries from the logs:

This one breaks (undefined field):
webapp=/solr path=/select params={explainOther=&fl=*,score&indent=on&start=0&q=bruce&hl.fl=&qt=standard&wt=standard&fq=&version=2.2&rows=10} hits=114 status=400 QTime=21

This one works:
webapp=/solr path=/select params={explainOther=&indent=on&hl.fl=&wt=standard&version=2.2&rows=10&fl=*,score&start=0&q=bruce&tv=false&qt=standard&fq=} hits=128 status=0 QTime=48

Though I'm not sure why, when the TVC is disabled, there are more hits but the qtime is slower. That's a different issue though, and something I can work through. Thanks for your help.

On 02/07/2011 11:38 AM, Chris Hostetter wrote:
: The stack trace is attached. I also saw this warning in the logs not sure

From your attachment...

853 SEVERE: org.apache.solr.common.SolrException: undefined field: score
854 at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:142)
855 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
856 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
857 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1357)

...this is one of the key pieces of info that was missing from your earlier email: that you are using the TermVectorComponent. It's likely that something changed in the TVC on 3x between the two versions you were using and that change freaks out now on * or score in the fl. You still haven't given us an example of the full URLs you are using that trigger this error. (It's possible there is something slightly off in your syntax - we don't know because you haven't shown us.) All in: this sounds like a newly introduced bug in TVC, please post the details into a new Jira issue.

As to the warning you asked about...

: Feb 3, 2011 8:14:10 PM org.apache.solr.core.Config getLuceneVersion
: WARNING: the luceneMatchVersion is not specified, defaulting to LUCENE_24
: emulation. You should at some point declare and reindex to at least 3.0,
: because 2.4 emulation is deprecated and will be removed in 4.0. This parameter
: will be mandatory in 4.0.

If you look at the example configs on the 3x branch it should be explained. It's basically just a new feature that lets you specify which quirks of the underlying Lucene code you want (so on upgrading you are in control of whether you eliminate old quirks or not).

-Hoss
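The failure mode Hoss identifies - TermVectorComponent choking on * and score in fl - comes down to the component treating every fl entry as a schema field. A hedged sketch of the kind of guard the fix would need (this is illustrative only, not the actual SOLR-2352 patch):

```java
import java.util.*;

public class FlGuard {
    // Filter an fl parameter down to entries that could be real schema
    // fields, skipping the glob "*" and the "score" pseudo-field that
    // tripped TermVectorComponent.
    static List<String> schemaFields(String fl) {
        List<String> out = new ArrayList<>();
        for (String f : fl.split(",")) {
            f = f.trim();
            if (f.isEmpty() || f.equals("*") || f.equals("score")) continue;
            out.add(f);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(schemaFields("*,score"));           // []
        System.out.println(schemaFields("id, name_title, score")); // [id, name_title]
    }
}
```

With fl=*,score, the list of genuine schema fields is empty, which explains why the component has nothing valid to look up and throws "undefined field" instead.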
Re: HTTP ERROR 400 undefined field: *
here is the ticket: https://issues.apache.org/jira/browse/SOLR-2352

On 02/08/2011 11:27 AM, Jed Glazner wrote:
[...]
Re: HTTP ERROR 400 undefined field: *
Thanks Otis, I'll give that a try.

Jed.

On 02/06/2011 08:06 PM, Otis Gospodnetic wrote:
Yup, here it is, warning about needing to reindex: http://twitter.com/#!/lucene/status/28694113180192768

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

- Original Message -
From: Erick Erickson erickerick...@gmail.com
To: solr-user@lucene.apache.org
Sent: Sun, February 6, 2011 9:43:00 AM
Subject: Re: HTTP ERROR 400 undefined field: *

I *think* that there was a post a while ago saying that if you were using trunk 3_x one of the recent changes required re-indexing, but don't quote me on that. Have you tried that?

Best
Erick

On Fri, Feb 4, 2011 at 2:04 PM, Jed Glazner jglaz...@beyondoblivion.com wrote:
[...]
Re: HTTP ERROR 400 undefined field: *
Sorry for the lack of details. It's all clear in my head.. :)

We checked out the head revision from the 3.x branch a few weeks ago (https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/). We picked up r1058326. We upgraded from a previous checkout (r960098). I am using our customized schema.xml and the solrconfig.xml from the old revision with the new checkout. After upgrading I just copied the data folders from each core into the new checkout (hoping I wouldn't have to re-index the content, as this takes days). Everything seems to work fine, except that now I can't get the score to return. The stack trace is attached. I also saw this warning in the logs, not sure exactly what it's talking about:

Feb 3, 2011 8:14:10 PM org.apache.solr.core.Config getLuceneVersion
WARNING: the luceneMatchVersion is not specified, defaulting to LUCENE_24 emulation. You should at some point declare and reindex to at least 3.0, because 2.4 emulation is deprecated and will be removed in 4.0. This parameter will be mandatory in 4.0.

Here is my request handler. The actual fields here are different than what is in mine, but I'm a little uncomfortable publishing how our company's search service works to the world:

<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="defType">edismax</str>
    <bool name="tv">true</bool>
    <!-- standard fields to query on -->
    <str name="qf">field_a^2 field_b^2 field_c^4</str>
    <!-- automatic phrase boosting! -->
    <str name="pf">field_d^10</str>
    <!-- boost function -->
    <!-- we'll comment this out for now because we're passing it to solr as a parameter.
         Once we finalize the exact function we should move it here and take it out of
         the query string. -->
    <!--<str name="bf">log(linear(field_e,0.001,1))^10</str>-->
    <str name="tie">0.1</str>
  </lst>
  <arr name="last-components">
    <str>tvComponent</str>
  </arr>
</requestHandler>

Anyway, hopefully this is enough info; let me know if you need more.

Jed.
On 02/03/2011 10:29 PM, Chris Hostetter wrote:
: I was working on a checkout of the 3.x branch from about 6 months ago.
: Everything was working pretty well, but we decided that we should update and
: get what was at the head. However after upgrading, I am now getting this

FWIW: please be specific. Head of what? The 3x branch? Or trunk? What revision in svn does that correspond to? (The svnversion command will tell you.)

: HTTP ERROR 400 undefined field: *
:
: If I clear the fl parameter (default is set to *, score) then it works fine
: with one big problem, no score data. If I try and set fl=score I get the same
: error except it says undefined field: score?!
:
: This works great in the older version, what changed? I've googled for about
: an hour now and I can't seem to find anything.

I can't reproduce this using either trunk (r1067044) or 3x (r1067045). All of these queries work just fine...

http://localhost:8983/solr/select/?q=*
http://localhost:8983/solr/select/?q=solr&fl=*,score
http://localhost:8983/solr/select/?q=solr&fl=score
http://localhost:8983/solr/select/?q=solr

...you'll have to provide us with a *lot* more details to help understand why you might be getting an error (like: what your configs look like, what the request looks like, what the full stack trace of your error is in the logs, etc...)
-Hoss

844 Feb 3, 2011 8:16:58 PM org.apache.solr.core.SolrCore execute
845 INFO: [music] webapp=/solr path=/select params={explainOther=&fl=*,score&indent=on&start=0&q=test&hl.fl=&qt=standard&wt=standard&fq=&version=2.2&rows=10} hits=2201 status=400 QTime=143
846 Feb 3, 2011 8:17:00 PM org.apache.solr.core.SolrCore execute
847 INFO: [rovi] webapp=/solr path=/replication params={command=indexversion&wt=javabin} status=0 QTime=0
848 Feb 3, 2011 8:17:00 PM org.apache.solr.core.SolrCore execute
849 INFO: [rovi] webapp=/solr path=/replication params={command=filelist&wt=javabin&indexversion=1277332208072} status=0 QTime=0
850 Feb 3, 2011 8:17:00 PM org.apache.solr.core.SolrCore execute
851 INFO: [rovi] webapp=/solr path=/replication params={command=indexversion&wt=javabin} status=0 QTime=0
852 Feb 3, 2011 8:17:09 PM org.apache.solr.common.SolrException log
853 SEVERE: org.apache.solr.common.SolrException: undefined field: score
854 at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:142)
855 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
856 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
857 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1357)
858 at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
859 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
860 at
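The boost function that appears commented out in Jed's request handler, bf=log(linear(field_e,0.001,1))^10, is worth unpacking. In Solr function-query syntax, linear(x,m,c) evaluates to m*x + c and log() is base 10; the trailing ^10 acts as a weight on the result. A small sketch of the arithmetic (my reading of the syntax, so treat the ^10-as-multiplier part as an assumption):

```java
public class BoostFunction {
    // Sketch of what bf=log(linear(field_e,0.001,1))^10 evaluates to,
    // assuming Solr's base-10 log() and linear(x,m,c) = m*x + c, with
    // the ^10 applied as a multiplicative weight on the function value.
    static double boost(double fieldValue) {
        double linear = 0.001 * fieldValue + 1.0; // linear(field_e,0.001,1)
        return Math.log10(linear) * 10.0;          // log(...), weighted by 10
    }

    public static void main(String[] args) {
        System.out.println(boost(0.0));     // 0.0 -- a zero field adds no boost
        System.out.println(boost(9000.0));  // 10.0 -- log10(10) * 10
    }
}
```

The +1 offset inside linear() keeps the log defined (and zero) when field_e is 0, and the 0.001 slope compresses large field values so the boost grows slowly.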
HTTP ERROR 400 undefined field: *
Hey Guys,

I was working on a checkout of the 3.x branch from about 6 months ago. Everything was working pretty well, but we decided that we should update and get what was at the head. However, after upgrading I am now getting this error through the admin:

HTTP ERROR 400 undefined field: *

If I clear the fl parameter (default is set to *, score) then it works fine, with one big problem: no score data. If I try and set fl=score I get the same error, except it says undefined field: score?! This works great in the older version; what changed? I've googled for about an hour now and I can't seem to find anything.

Jed.
Solr Highlighting Question
Thanks for taking time to read through this. I'm using a checkout from the Solr 3.x branch. My problem is with the highlighter and wildcards. I can get the highlighter to work with wildcards just fine; the problem is that Solr is returning the whole matched term, when what I want it to do is highlight only the chars in the term that were matched.

Example:
http://192.168.1.75:8983/solr/music/select?indent=on&q=name_title:wel*&qt=beyond&hl=true&hl.fl=name_title&f.name_title.hl.usePhraseHighlighter=true&f.name_title.hl.highlightMultiTerm=true

The results that come back look like this:
<em>Welcome</em> to the Jungle

What I want them to look like is this:
<em>Wel</em>come to the Jungle

From what I gathered by searching the archives, Solr 1.1 used to do this... Is there a way to get that functionality? Thanks!
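Until the highlighter supports partial-term highlighting again, one client-side workaround for prefix queries like name_title:wel* is to wrap only the matched prefix yourself after the results come back. A rough sketch, pure string handling and nothing Solr-specific:

```java
import java.util.regex.Pattern;

public class PrefixHighlight {
    // Wrap only the characters that matched a wildcard prefix query
    // (e.g. q=name_title:wel*) in <em> tags, instead of the whole term.
    static String highlightPrefix(String text, String prefix) {
        Pattern p = Pattern.compile("\\b(" + Pattern.quote(prefix) + ")",
                                    Pattern.CASE_INSENSITIVE);
        return p.matcher(text).replaceAll("<em>$1</em>");
    }

    public static void main(String[] args) {
        System.out.println(highlightPrefix("Welcome to the Jungle", "wel"));
        // <em>Wel</em>come to the Jungle
    }
}
```

The \b word boundary keeps the match anchored to the start of a term, mirroring how a prefix query matches, and $1 preserves the original casing of the highlighted characters.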
Re: Solr Highlighting Question
Anybody?

On 09/08/2010 11:26 AM, Jed Glazner wrote:
[...]
Help with partial term highlighting
Hello Everyone,

Thanks for taking time to read through this. I'm using a checkout from the Solr 3.x branch. My problem is with the highlighter and wildcards, and is exactly the same as this guy's, but I can't find a reply to his problem: http://search-lucene.com/m/EARFMs6eR4/partial+highlight+wildcard (subject: Re: old wildcard highlighting behaviour)

I can get the highlighter to work with wildcards just fine; the problem is that Solr is returning the whole matched term, when what I want it to do is highlight only the chars in the term that were matched.

Example:
http://192.168.1.75:8983/solr/music/select?indent=on&q=name_title:wel*&qt=beyond&hl=true&hl.fl=name_title&f.name_title.hl.usePhraseHighlighter=true&f.name_title.hl.highlightMultiTerm=true

The results that come back look like this:
<em>Welcome</em> to the Jungle

What I want them to look like is this:
<em>Wel</em>come to the Jungle

From what I gathered by searching the archives, Solr 1.1 used to do this... Is there any way to get what I want without customizing the highlighting feature? Thanks!