Re: Solr 1.3 - response time very long
Thanks a lot guys for your time, I appreciate it. I will follow all your advice.

Yonik Seeley wrote: On Wed, Dec 3, 2008 at 11:49 AM, sunnyfr [EMAIL PROTECTED] wrote: Sorry the request is more: /select?q=text:svr09\+tutorial+AND+status_published:1+AND+status_moderated:0+AND+status_personal:0+AND+status_explicit:0+AND+status_private:0+AND+status_deleted:0+AND+status_error:0+AND+status_read or even I tried:

There are a bunch of things you could try to speed things up a bit: 1) optimize the index if you haven't; 2) use a faster response writer with a more compact format (i.e. add wt=javabin for a binary format or wt=json for JSON); 3) use fl (field list) to restrict the results to only the fields you need; 4) never use debugQuery to benchmark performance (I don't think you actually did, but you did list it in the example dismax URL); 5) pull out clauses that match many documents and that are common across many queries into filters: /select?q=text:svr09\+tutorial&fq=status_published:1+AND+status_moderated:0+AND+status_personal:0+AND+status_explicit:0+AND+status_private:0+AND+status_deleted:0+AND+status_error:0+AND+status_read

You can also use multiple filter queries for better caching if some of the clauses appear in smaller groups or in isolation. If you can give more examples, we can tell what the common parts are. -Yonik

-- View this message in context: http://www.nabble.com/Solr-1.3---response-time-very-long-tp20795134p20829777.html Sent from the Solr - User mailing list archive at Nabble.com.
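Yonik's point 5 — moving common clauses out of q and into filter queries — can be sketched like this (a minimal illustration only; the host, field names, and the `build_search_url` helper are made up for the example, and using one fq per clause is the "multiple filter queries for better caching" variant he mentions):

```python
from urllib.parse import urlencode

def build_search_url(base, text_query, status_filters):
    """Build a Solr select URL with the main query in q and each
    common status clause in its own fq parameter, so Solr can cache
    every filter independently in its filterCache."""
    params = [("q", text_query)]
    for field, value in status_filters.items():
        # One fq per clause instead of one big AND-ed fq.
        params.append(("fq", f"{field}:{value}"))
    params.append(("wt", "json"))      # compact response format
    params.append(("fl", "id,title"))  # only return the fields you need
    return f"{base}/select?{urlencode(params)}"

url = build_search_url(
    "http://localhost:8983/solr",
    "text:svr09 tutorial",
    {"status_published": 1, "status_deleted": 0},
)
print(url)
```

Whether to use one fq with all clauses or one fq per clause depends on how the clauses co-occur across queries, which is exactly why Yonik asks for more example queries.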
changing schema is dynamic or not
Hi, Every time I make any change in the schema, I have to restart the server. Is this because I have made some mistake, or is it simply how it works? In other words: if we make any kind of change to schema.xml, do we need to restart the server, or can we continue without restarting it? DISCLAIMER == This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.
Re: changing schema is dynamic or not
You have to restart the server. You may also need to re-index the data if the changes are incompatible.

On Thu, Dec 4, 2008 at 3:09 PM, Neha Bhardwaj [EMAIL PROTECTED] wrote: Hi, Every time I make any change in the schema, I have to restart the server. Is this because I have made some mistake, or is it simply how it works?

-- --Noble Paul
RE: changing schema is dynamic or not
Is there any way by which this can be avoided?

-Original Message- From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED] Sent: Thursday, December 04, 2008 3:12 PM To: solr-user@lucene.apache.org Subject: Re: changing schema is dynamic or not

You have to restart the server. You may also need to re-index the data if the changes are incompatible.

-- --Noble Paul
Re: changing schema is dynamic or not
It is possible: you can reload a core through the HTTP API, but if the changes are incompatible you will have to re-index the data.

On Thu, Dec 4, 2008 at 3:16 PM, Neha Bhardwaj [EMAIL PROTECTED] wrote: Is there any way by which this can be avoided?
-- --Noble Paul
RE: changing schema is dynamic or not
Could you briefly explain what exactly I need to do?

-Original Message- From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED] Sent: Thursday, December 04, 2008 3:47 PM To: solr-user@lucene.apache.org Subject: Re: changing schema is dynamic or not

It is possible: you can reload a core through the HTTP API, but if the changes are incompatible you will have to re-index the data.

-- --Noble Paul
Re: changing schema is dynamic or not
http://wiki.apache.org/solr/CoreAdmin#head-3f125034c6a64611779442539812067b8b430930

On Thu, Dec 4, 2008 at 4:06 PM, Neha Bhardwaj [EMAIL PROTECTED] wrote: Could you briefly explain what exactly I need to do?

-- --Noble Paul
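The CoreAdmin RELOAD action Noble Paul points to boils down to one HTTP GET against the cores admin handler. A minimal sketch (the core name "core0" and the helper function are illustrative; this assumes a multicore Solr setup where /admin/cores is enabled):

```python
from urllib.parse import urlencode

def core_reload_url(solr_base, core_name):
    """Build the CoreAdmin RELOAD URL. Reloading a core picks up
    schema.xml/solrconfig.xml changes without restarting the servlet
    container (a re-index is still needed for incompatible changes)."""
    query = urlencode({"action": "RELOAD", "core": core_name})
    return f"{solr_base}/admin/cores?{query}"

url = core_reload_url("http://localhost:8983/solr", "core0")
print(url)
# To actually trigger the reload against a running Solr:
# urllib.request.urlopen(url)
```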
Re: Multi Language Search
On Dec 2, 2008, at 4:52 AM, tushar kapoor wrote: 1. Russian Word 1 AND Russian Word 2

This is the way the query should look, but there's no reason why you can't let your users input AND in Russian and then substitute it when you create the query.

or rather, 2. Russian Word 1 AND-in-Russian Russian Word 2

Now over to the Solr-specific question. In case the answer to the above is either 1 or 2, how does one do it using Solr? I tried using the language analyzers but I'm not too sure how exactly they work.

Just send the string, with AND, into Solr and the default query parser will know what to do. -Grant

-- Grant Ingersoll Lucene Helpful Hints: http://wiki.apache.org/lucene-java/BasicsOfPerformance http://wiki.apache.org/lucene-java/LuceneFAQ
Re: Newbie question - using existing Lucene Index
On Dec 3, 2008, at 11:53 AM, Sudarsan, Sithu D. wrote: Hi All, Using Lucene, an index has been created. It has five different fields. How can I use that index from Solr for searching? I tried changing the schema as in the tutorial, and copied the index to the data directory, but all searches return empty and there is no error message!

You also need to make sure the analyzers in schema.xml are set up the same way as the ones you originally indexed with. You might also try going to wherever you are running, i.e. http://localhost:8983/solr/admin (or whatever your URL is), and entering *:* into the query box there. This should return all documents in the index and doesn't require any analysis. Also, check your logs to see if there were any exceptions on startup.

Is there a sample project available which shows using Tomcat as the web engine rather than Jetty?

Instructions should be on the wiki: http://wiki.apache.org/solr

Your help is appreciated, Sincerely, Sithu D Sudarsan ORISE Fellow, DESE/OSEL/CDRH WO62 - 3209 GRA, UALR [EMAIL PROTECTED] [EMAIL PROTECTED]

-- Grant Ingersoll Lucene Helpful Hints: http://wiki.apache.org/lucene-java/BasicsOfPerformance http://wiki.apache.org/lucene-java/LuceneFAQ
Re: Solr 1.3 - response time very long
Hi Yonik, I've tried everything but it doesn't change anything; I tried the latest trunk version as well, but nothing changed. There is nothing more I can do about the indexing... maybe I can optimize something before searching? I'm using a Linux system, Apache 5.5, and the latest Solr version. Memory: 8G, Intel. Do you think the index size is a lot — 7.6G for 8.5M documents? Any idea what I can do? Thanks a lot for your time.
Re: Re[4]: solr performance
Hi, I was reading this post and I was wondering: how can I parallelize document processing? Thanks, Erik.

Erik Hatcher wrote: On Feb 21, 2007, at 4:25 PM, Jack L wrote: couple of times today at around 158 documents / sec. This is not bad at all. How about search performance? How many concurrent queries have people been having? What does the response time look like?

I'm the only user :) What I've done is a proof-of-concept for our library. We have 3.7M records that I've indexed and faceted. Search performance (in my unrealistic single-user scenario) is blazing (50ms or so) for purely full-text queries. For queries that return facets, the response times are actually quite good too (~900ms, or less depending on the request) - provided the filter cache is warmed and large enough. This is running on my laptop (MacBook Pro, 2GB RAM, 1.83GHz) - I'm sure on a beefier box it'll only get better.

Thanks to the others that clarified. I run my indexers in parallel... but a single instance of Solr (which in turn handles requests in parallel as well).

Do you feel multi-threaded posting is helpful?

It depends. If the data processing can be parallelized and your hardware supports it, it can certainly make a big difference... it did in my case. Both CPUs were cooking during my parallel indexing runs. Erik
Re: solr performance
Hi, When I check my CPUs, they are not fully utilized. How can I change this? Do I have to change a parameter? Thanks a lot, Johanna

Walter Underwood wrote: Try running your submits while watching a CPU load meter. Do this on a multi-CPU machine. If all CPUs are busy, you are running as fast as possible. If one CPU is busy (around 50% usage on a dual-CPU system), parallel submits might help. If no CPU is 100% busy, the bottleneck is probably disk or network. wunder

On 2/20/07 10:46 AM, Jack L [EMAIL PROTECTED] wrote: Thanks to all who replied. It's encouraging :) The numbers vary quite a bit though, from 13 docs/s (Burkamp) to 250 docs/s (Walter) to 1000 docs/s. I understand the results also depend on the doc size and hardware. I have a question for Erik: you mentioned a single-threaded indexer (below). I'm not familiar with Solr at all, and a search on the Solr wiki for "thread" didn't find anything. Is it so that I can actually configure Solr to be single-threaded or multi-threaded? And I'm not sure what you meant by parallelizing the indexer: running multiple instances of the indexer, or multiple instances of Solr? Thanks, Jack

My largest Solr index is currently at 1.4M and it takes a max of 3ms to add a document (according to Solr's console), most of them 1ms. My single-threaded indexer is indexing around 1000 documents per minute, but I think I can get this number even faster by parallelizing the indexer.
Re: solr performance
Kick off some indexing more than once - e.g., post a folder of docs, and while that's working, post another. I've been thinking about a multi-threaded UpdateProcessor as well - that could be interesting. - Mark

sunnyfr wrote: Hi, I was reading this post and I was wondering: how can I parallelize document processing?
Re: solr performance
Ok... Actually my problem is more that multi-threaded queries take a long time... like 3 sec at 100 threads/sec. I thought that could have helped me, but it's not actually linked to my problem :s sorry

markrmiller wrote: Kick off some indexing more than once - e.g., post a folder of docs, and while that's working, post another. I've been thinking about a multi-threaded UpdateProcessor as well - that could be interesting. - Mark
Re: solr performance
On Thu, Dec 4, 2008 at 8:39 AM, Mark Miller [EMAIL PROTECTED] wrote: Kick off some indexing more than once - e.g., post a folder of docs, and while that's working, post another. I've been thinking about a multi-threaded UpdateProcessor as well - that could be interesting.

Not sure how that would work (unless you didn't want responses), but I've thought about it from the SolrJ side - something you could quickly add documents to, and it would manage a number of threads under the covers to maximize throughput. Not sure what would be best for error handling though - perhaps just polling (allow the user to ask for failed or successful operations). -Yonik
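Yonik's idea of a client-side wrapper — add documents quickly, let a thread pool do the posting, poll for failures afterwards — could be sketched roughly like this (an illustration only, not SolrJ; `ParallelIndexer` and `send_batch` are hypothetical names, and `send_batch` stands in for whatever HTTP call posts a batch to Solr's /update handler):

```python
from concurrent.futures import ThreadPoolExecutor

class ParallelIndexer:
    """Queue documents, send them in batches from a thread pool,
    and record failed batches so the caller can poll for errors."""

    def __init__(self, send_batch, batch_size=10, workers=4):
        self.send_batch = send_batch   # e.g. an HTTP POST to /update
        self.batch_size = batch_size
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.pending = []
        self.futures = []

    def add(self, doc):
        self.pending.append(doc)
        if len(self.pending) >= self.batch_size:
            self._flush()

    def _flush(self):
        batch, self.pending = self.pending, []
        self.futures.append(self.pool.submit(self.send_batch, batch))

    def finish(self):
        """Flush remaining docs, wait for all workers, return failures."""
        if self.pending:
            self._flush()
        failures = []
        for f in self.futures:
            try:
                f.result()
            except Exception as exc:
                failures.append(exc)
        self.pool.shutdown()
        return failures

# Demo with a stub sender that just collects the batches it receives.
sent = []
indexer = ParallelIndexer(send_batch=sent.append, batch_size=2, workers=2)
for i in range(5):
    indexer.add({"id": i})
failures = indexer.finish()
print(len(sent), len(failures))  # 5 docs at batch_size=2 -> 3 batches, 0 failures
```

The polling-based error handling Yonik describes is what `finish()` does here: callers never block per-document, and failed batches surface only when asked for.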
Re: solr performance
On Thu, Dec 4, 2008 at 8:36 AM, sunnyfr [EMAIL PROTECTED] wrote: When I check my CPU, all my CPU are not full, how can I change this ?

If this is while you are indexing, then it simply means that you are not feeding documents to Solr fast enough (use multiple threads to send to Solr, and send multiple documents in each update request if possible). If CPU utilization is still low, then it means you are IO (disk) bound... if you want to go faster, get faster disks. -Yonik
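Yonik's second tip — sending multiple documents in each update request — amounts to packing a whole batch into one `<add>` payload per POST. A minimal sketch (the field names and the helper are made up for illustration; this builds the classic Solr XML update format):

```python
from xml.sax.saxutils import escape

def docs_to_add_xml(docs):
    """Render a batch of documents as a single Solr <add> payload,
    so one POST to /update indexes the whole batch instead of one
    request per document."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for field, value in doc.items():
            parts.append(
                f'<field name="{escape(field)}">{escape(str(value))}</field>'
            )
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)

payload = docs_to_add_xml([
    {"id": 1, "title": "first"},
    {"id": 2, "title": "second"},
])
print(payload)
```

Batching this way cuts per-request HTTP and parsing overhead, which is usually what keeps the CPUs idle when a single-document-per-request feeder can't keep up.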
Re: solr performance
When I run my stress test... sending multiple threads... around 100/sec, I'm not indexing at all...? Maybe my cache??? I will check that.

Yonik Seeley wrote: If this is while you are indexing, then it simply means that you are not feeding documents to Solr fast enough (use multiple threads to send to Solr, and send multiple documents in each update request if possible). If CPU utilization is still low, then it means you are IO (disk) bound... if you want to go faster, get faster disks. -Yonik
Re: solr performance
On Thu, Dec 4, 2008 at 8:52 AM, sunnyfr [EMAIL PROTECTED] wrote: When I run my stress test... sending multiple threads... around 100/sec, I'm not indexing at all...

If you can't go higher than 100 requests/sec and the CPUs aren't at 100%, then the possibilities are:
- If the index is bigger than the free memory the OS can use to cache it, then cache misses (at the OS level) can cause CPU usage to go lower - these cache misses are most likely to happen when retrieving stored fields for hits.
- You can also be network IO bound if you are doing requests from a different machine.
- Internal locking contention... pretty much every system will reach a peak number of requests/sec and then start declining as you add more concurrent requests.

If you haven't yet, try a nightly build from December - the index-level locking should be improved under high load for non-Windows systems. -Yonik
Re: Solr 1.3 - response time very long
On Thu, Dec 4, 2008 at 8:13 AM, sunnyfr [EMAIL PROTECTED] wrote: Hi Yonik, I've tried everything but it doesn't change anything; I tried the last trunk version as well but nothing changed. There is nothing more I can do about the indexing ... maybe I can optimize something before searching ??? Did you optimize the index (send in the optimize command) after indexing but before searching? curl http://localhost:8983/solr/update?optimize=true I'm using a Linux system, apache 5.5, latest Solr version. Memory: 8G, Intel. Do you think it's a lot, an index size of 7.6G for 8.5M documents? So it could be due to the index being slightly too big - subtract out memory for Solr and other stuff, and there's not enough left for everything to be fully cached by the OS. You can make it bigger or smaller depending on how you have the schema configured. The example schema isn't necessarily optimized for speed or size - it serves as an example of many field types and operations. Make sure you only index fields you need to search, sort, or facet on. Make sure you only store fields (marked as stored in the schema) that you really need returned in results. The example schema has copyFields and default values that you don't need - hopefully you've removed them. What's your schema, and do you have more examples of URLs you are sending to Solr (all the parameters)? -Yonik
Response status
In the standard response format, what does the status mean? It always seems to be 0. Thanks Rob
Re: Solr 1.3 - response time very long
Huge thanks for your help Yonik. I optimized the index, so I will try to reduce the size ... as I explained, I stored all the language text ... so I will reduce my stored data. Cheers... I will let you know :) Yonik Seeley wrote: [quoted message trimmed] -- View this message in context: http://www.nabble.com/Solr-1.3---response-time-very-long-tp20795134p20834935.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Response status
It means the request was successful. If the status is non-zero (err, 1) then there was an error of some sort. Erik On Dec 4, 2008, at 9:32 AM, Robert Young wrote: In the standard response format, what does the status mean? It always seems to be 0. Thanks Rob
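For reference, the status Erik describes lives in the responseHeader of the standard XML response. A minimal sketch of pulling it out follows; a real client would use an XML parser (or Solrj, whose QueryResponse exposes it as getStatus()) rather than this string scan, which is only here to show where the value sits.

```java
// Read the status code out of Solr's standard XML response header.
// 0 means the request succeeded; non-zero means an error of some sort.
public class StatusCheck {
    static int parseStatus(String responseXml) {
        String marker = "<int name=\"status\">";
        int i = responseXml.indexOf(marker);
        if (i < 0) throw new IllegalArgumentException("no status in response");
        int start = i + marker.length();
        int end = responseXml.indexOf("</int>", start);
        return Integer.parseInt(responseXml.substring(start, end));
    }

    public static void main(String[] args) {
        String ok = "<response><lst name=\"responseHeader\">"
                  + "<int name=\"status\">0</int><int name=\"QTime\">1</int></lst></response>";
        System.out.println(parseStatus(ok)); // prints 0 for a successful request
    }
}
```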
Re: Response status
Thanks On Thu, Dec 4, 2008 at 2:53 PM, Erik Hatcher [EMAIL PROTECTED] wrote: It means the request was successful. If the status is non-zero (err, 1) then there was an error of some sort. Erik On Dec 4, 2008, at 9:32 AM, Robert Young wrote: In the standard response format, what does the status mean? It always seems to be 0. Thanks Rob
Re: Solr 1.3 - response time very long
Hi Yonik, I will index my data again. Can you advise me how to optimize my data, and tell me if you see something very wrong or bad for memory, given that I just need to return the ID, that's it. But I need to boost some fields ... like description .. description_country according to the country ... Thanks a lot, I would appreciate it, fields field name=id type=sint indexed=true stored=true omitNorms=true / field name=status_private type=boolean indexed=true stored=false omitNorms=true / field name=status_... 6 more like that field name=duration type=sint indexed=true stored=false omitNorms=true / field name=created type=date indexed=true stored=false omitNorms=true / field name=modified type=date indexed=true stored=false omitNorms=true / field name=rating_binrate type=sint indexed=true stored=false omitNorms=true / field name=user_id type=sint indexed=true stored=false omitNorms=true / field name=country type=string indexed=true stored=false omitNorms=true / field name=language type=string indexed=true stored=false omitNorms=true / field name=creative_type type=string indexed=true stored=false omitNorms=true / field name=rel_group_ids type=sint indexed=true stored=false omitNorms=true multiValued=true / ... 3 more like that field name=rel_featured_user_ids type=sint indexed=true stored=false omitNorms=true multiValued=true / field name=rel_featured_group_ids type=sint indexed=true stored=false omitNorms=true multiValued=true / field name=stat_views type=sint indexed=true stored=false omitNorms=true / ... 6 more like that ...
field name=title type=text indexed=true stored=false / field name=title_fr type=text_fr indexed=true stored=false / field name=title_en type=text_en indexed=true stored=false / field name=title_de type=text_de indexed=true stored=false / field name=title_es type=text_es indexed=true stored=false / field name=title_ru type=text_ru indexed=true stored=false / field name=title_pt type=text_pt indexed=true stored=false / field name=title_nl type=text_nl indexed=true stored=false / field name=title_el type=text_el indexed=true stored=false / field name=title_ja type=text_ja indexed=true stored=false / field name=title_it type=text_it indexed=true stored=false / field name=description type=text indexed=true stored=false / field name=description_fr type=text_fr indexed=true stored=false / field name=description_en type=text_en indexed=true stored=false / field name=description_de type=text_de indexed=true stored=false / field name=description_es type=text_es indexed=true stored=false / field name=description_ru type=text_ru indexed=true stored=false / field name=description_pt type=text_pt indexed=true stored=false / field name=description_nl type=text_nl indexed=true stored=false / field name=description_el type=text_el indexed=true stored=false / field name=description_ja type=text_ja indexed=true stored=false / field name=description_it type=text_it indexed=true stored=false / field name=tag1 type=string indexed=true stored=false omitNorms=true / field name=tag2 type=string indexed=true stored=false omitNorms=true / field name=tag3 type=string indexed=true stored=false omitNorms=true / field name=tag4 type=string indexed=true stored=false omitNorms=true / field name=tags type=string indexed=true stored=false omitNorms=true multiValued=true termVectors=true / field name=owner_login type=string indexed=true stored=false omitNorms=true / field name=text type=text indexed=true stored=false multiValued=false / field name=timestamp type=date indexed=true stored=true default=NOW multiValued=false / field name=spell type=textSpell indexed=true stored=false multiValued=true / dynamicField name=random* type=random / /fields Yonik Seeley wrote: On Thu, Dec 4, 2008 at 8:13 AM, sunnyfr [EMAIL PROTECTED] wrote: Hi Yonik, I've tried everything but it doesn't change anything; I tried the last trunk version as well but nothing changed. There is nothing more I can do about the indexing ... maybe I can optimize something before
Re: Solr 1.3 - response time very long
remove this entry from the example schema unless you need the timestamp when it was indexed: field name=timestamp type=date indexed=true stored=true default=NOW multiValued=false / Also, only index fields you really need to search separately. For example, if the description field is also indexed into the text field via copyField, and you only search it via the text field, then don't store or index the description field. Retrieving only ids is something that could be optimized in Solr, but hasn't been done yet. -Yonik On Thu, Dec 4, 2008 at 11:10 AM, sunnyfr [EMAIL PROTECTED] wrote: [quoted schema trimmed]
Re: solr performance
Yonik Seeley wrote: Not sure what would be the best for error handling though - perhaps just polling (allow the user to ask for failed or successful operations). That's how I've handled similar situations in the past. You're submitting a batch of data to be processed, and if you're so inclined to see how it went, you can inspect some kind of report object. If the batch process blocks, you could return the report object; if not, you could return a batch/job id (with reports valid for x amount of time after they are done?). It seems like a sound enough method to me, but it would be interesting to hear if someone has a better idea. - Mark
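The polling idea Mark describes could look something like the sketch below. None of these classes exist in Solr; the names (BatchTracker, Report) are hypothetical and only illustrate the API shape under discussion: submitting a batch yields a job id, and the caller later polls for a report of failed vs successful operations.

```java
import java.util.*;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical report-object API: record per-document outcomes for a batch,
// return a job id, and let callers poll for the report afterwards.
public class BatchTracker {
    public static class Report {
        public final List<String> failedIds = new ArrayList<>();
        public int succeeded = 0;
    }

    private final AtomicLong nextId = new AtomicLong();
    private final Map<Long, Report> reports = new HashMap<>();

    // Record the outcome of a processed batch (doc id -> success flag)
    // and hand back its job id.
    public long submit(Map<String, Boolean> outcomes) {
        Report r = new Report();
        for (Map.Entry<String, Boolean> e : outcomes.entrySet()) {
            if (e.getValue()) r.succeeded++;
            else r.failedIds.add(e.getKey());
        }
        long id = nextId.incrementAndGet();
        reports.put(id, r); // a real system would expire reports after a while
        return id;
    }

    // Poll for the report of a previously submitted batch (null if unknown/expired).
    public Report report(long jobId) { return reports.get(jobId); }
}
```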
Re: Solr 1.3 - response time very long
Ok thanks a lot, so I can remove all this part: field name=title type=text indexed=true stored=false / field name=description type=text indexed=true stored=false / field name=tag1 type=string indexed=true stored=false omitNorms=true / field name=tag2 type=string indexed=true stored=false omitNorms=true / field name=tag3 type=string indexed=true stored=false omitNorms=true / field name=tag4 type=string indexed=true stored=false omitNorms=true / field name=tags type=string indexed=true stored=false omitNorms=true multiValued=true termVectors=true / and just keep ... : copyField source=title dest=text/ copyField source=title_en dest=text/ ..title_es ... copyField source=description dest=text/ copyField source=description_en dest=text/ ..title_en ... copyField source=tag1 dest=text/ copyField source=tag2 dest=text/ copyField source=tag3 dest=text/ copyField source=tag4 dest=text/ Just to be sure ... I index title and description separately if, for example, I need to boost them separately ... bf. title^2 description^1.5. The language ones ... need to be indexed to apply the analyzer/stemmer and boost them differently according to the country.. but I copy them to be searchable. Thanks so much for your time .. again and again... Yonik Seeley wrote: remove this entry from the example schema unless you need the timestamp when it was indexed: field name=timestamp type=date indexed=true stored=true default=NOW multiValued=false / Also, only index fields you really need to search separately. For example, if the description field is also indexed into the text field via copyField, and you only search it via the text field, then don't store or index the description field. Retrieving only ids is something that could be optimized in Solr, but hasn't been done yet. -Yonik On Thu, Dec 4, 2008 at 11:10 AM, sunnyfr [EMAIL PROTECTED] wrote: [quoted schema trimmed]
Re: Solr 1.3 - response time very long
On Thu, Dec 4, 2008 at 11:41 AM, sunnyfr [EMAIL PROTECTED] wrote: Ok thanks a lot, so I can remove all this part I wouldn't remove them if they are the source of a copyField (with the destination being text). Simply change to indexed=false stored=false otherwise you may get an undefined field exception. -Yonik [quoted message trimmed]
Re: Solr 1.3 - response time very long
right !!! Yonik Seeley wrote: On Thu, Dec 4, 2008 at 11:41 AM, sunnyfr [EMAIL PROTECTED] wrote: Ok thanks a lot, so I can remove all this part I wouldn't remove them if they are the source of a copyField (with the destination being text). Simply change to indexed=false stored=false otherwise you may get an undefined field exception. -Yonik [quoted thread trimmed]
Re: Throughput Optimization
It looks like file locking was the bottleneck - CPU usage is up to ~98% (from the previous peak of ~50%). I'm running the trunk code from Dec 2 with the faceting improvement (SOLR-475) turned off. Thanks for all the help! Yonik Seeley wrote: FYI, SOLR-465 has been committed. Let us know if it improves your scenario. -Yonik -- View this message in context: http://www.nabble.com/Throughput-Optimization-tp20335132p20840017.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Throughput Optimization
On Thu, Dec 4, 2008 at 1:54 PM, wojtekpia [EMAIL PROTECTED] wrote: It looks like file locking was the bottleneck - CPU usage is up to ~98% (from the previous peak of ~50%). Great to hear it! I'm running the trunk code from Dec 2 with the faceting improvement (SOLR-475) turned off. Thanks for all the help! new faceting stuff off because it didn't improve things in your case, or because you didn't want to change that variable just now? -Yonik Yonik Seeley wrote: FYI, SOLR-465 has been committed. Let us know if it improves your scenario. -Yonik -- View this message in context: http://www.nabble.com/Throughput-Optimization-tp20335132p20840017.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Ordering updates
It is not clear how you are using Solr, i.e. distributed vs single index. In short, Solr does not update documents: it overwrites the old document with the new one if an old document with the same uniqueKey exists in the index. Does that answer your question? On Thu, Dec 4, 2008 at 1:46 AM, Laurence Rowe [EMAIL PROTECTED] wrote: Hi, Our CMS is distributed over a cluster and I was wondering how I can ensure that index records of newer versions of documents are never overwritten by older ones. Amazon AWS uses a timestamp on requests to ensure 'eventual consistency' of operations. Is there a way to supply a transaction ID with an update so an update is conditional on the supplied transaction id being greater than the existing indexed transaction id? Laurence -- Regards, Shalin Shekhar Mangar.
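Since Solr 1.3 has no conditional ("only if newer") updates, the ordering Laurence asks about has to be enforced on the client side before documents are sent. One possible sketch follows; the transaction ids and the guard class are hypothetical, not part of any Solr API, and a multi-node CMS would additionally need a shared store rather than this in-process map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Client-side guard: before sending a document to Solr, compare its
// transaction id against the highest one already sent for that uniqueKey,
// and drop stale updates so an older version never overwrites a newer one.
public class VersionGuard {
    private final ConcurrentMap<String, Long> lastSent = new ConcurrentHashMap<>();

    // Returns true if this version should be sent to Solr, false if it is
    // older than (or the same as) a version already indexed for this key.
    public boolean shouldSend(String uniqueKey, long txnId) {
        while (true) {
            Long prev = lastSent.get(uniqueKey);
            if (prev != null && prev >= txnId) return false; // stale update
            boolean won = (prev == null)
                    ? lastSent.putIfAbsent(uniqueKey, txnId) == null
                    : lastSent.replace(uniqueKey, prev, txnId);
            if (won) return true;
            // another thread raced us; re-read and re-check
        }
    }

    public static void main(String[] args) {
        VersionGuard guard = new VersionGuard();
        System.out.println(guard.shouldSend("doc-1", 5)); // true: first version seen
        System.out.println(guard.shouldSend("doc-1", 3)); // false: older than txn 5
        System.out.println(guard.shouldSend("doc-1", 9)); // true: newer version
    }
}
```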
Re: new faceting algorithm
I'm seeing some strange behavior with my garbage collector that disappears when I turn off this optimization. I'm running load tests on my deployment. For the first few minutes, everything is fine (and this patch does make things faster - I haven't quantified the improvement yet). After that, the garbage collector stops collecting. Specifically, the new generation part of the heap is full but never garbage collected, and the old generation is emptied, then never gets anything more. This throttles Solr performance (average response times that used to be ~500ms are now ~25s). I described my deployment scenario in an earlier post: http://www.nabble.com/Throughput-Optimization-td20335132.html Does it sound like the new faceting algorithm could be the culprit? wojtekpia wrote: Definitely, but it'll take me a few days. I'll also report findings on SOLR-465. (I've been on holiday for a few weeks) Noble Paul നോബിള് नोब्ळ् wrote: wojtek, you can report back the numbers if possible. It would be nice to know how the new impl performs in the real world -- View this message in context: http://www.nabble.com/new-faceting-algorithm-tp20674902p20840622.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Throughput Optimization
New faceting stuff off because I'm encountering some problems when I turn it on, I posted the details: http://www.nabble.com/new-faceting-algorithm-td20674902.html#a20840622 Yonik Seeley wrote: On Thu, Dec 4, 2008 at 1:54 PM, wojtekpia [EMAIL PROTECTED] wrote: It looks like file locking was the bottleneck - CPU usage is up to ~98% (from the previous peak of ~50%). Great to hear it! I'm running the trunk code from Dec 2 with the faceting improvement (SOLR-475) turned off. Thanks for all the help! new faceting stuff off because it didn't improve things in your case, or because you didn't want to change that variable just now? -Yonik -- View this message in context: http://www.nabble.com/Throughput-Optimization-tp20335132p20840668.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Throughput Optimization
On Thu, Dec 4, 2008 at 2:30 PM, wojtekpia [EMAIL PROTECTED] wrote: New faceting stuff off because I'm encountering some problems when I turn it on, I posted the details: http://www.nabble.com/new-faceting-algorithm-td20674902.html#a20840622 Missed that, thanks... will respond there. -Yonik Yonik Seeley wrote: On Thu, Dec 4, 2008 at 1:54 PM, wojtekpia [EMAIL PROTECTED] wrote: It looks like file locking was the bottleneck - CPU usage is up to ~98% (from the previous peak of ~50%). Great to hear it! I'm running the trunk code from Dec 2 with the faceting improvement (SOLR-475) turned off. Thanks for all the help! new faceting stuff off because it didn't improve things in your case, or because you didn't want to change that variable just now? -Yonik -- View this message in context: http://www.nabble.com/Throughput-Optimization-tp20335132p20840668.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: new faceting algorithm
On Thu, Dec 4, 2008 at 2:28 PM, wojtekpia [EMAIL PROTECTED] wrote: I'm seeing some strange behavior with my garbage collector that disappears when I turn off this optimization. I'm running load tests on my deployment. For the first few minutes, everything is fine (and this patch does make things faster - I haven't quantified the improvement yet). After that, the garbage collector stops collecting. Specifically, the new generation part of the heap is full, but never garbage collected, and the old generation is emptied, then never gets anything more. Are you doing commits at any time? One possibility is the caching mechanism (weak-ref on the IndexReader)... that's going to be changing soon hopefully. -Yonik This throttles Solr performance (average response times that used to be ~500ms are now ~25s). I described my deployment scenario in an earlier post: http://www.nabble.com/Throughput-Optimization-td20335132.html Does it sound like the new faceting algorithm could be the culprit? wojtekpia wrote: Definitely, but it'll take me a few days. I'll also report findings on SOLR-465. (I've been on holiday for a few weeks) Noble Paul നോബിള് नोब्ळ् wrote: wojtek, you can report back the numbers if possible It would be nice to know how the new impl performs in real-world -- View this message in context: http://www.nabble.com/new-faceting-algorithm-tp20674902p20840622.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: new faceting algorithm
Yonik Seeley wrote: Are you doing commits at any time? One possibility is the caching mechanism (weak-ref on the IndexReader)... that's going to be changing soon hopefully. -Yonik No commits during this test. Should I start looking into my heap size distribution and garbage collector selection? -- View this message in context: http://www.nabble.com/new-faceting-algorithm-tp20674902p20841219.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: new faceting algorithm
On Thu, Dec 4, 2008 at 2:57 PM, wojtekpia [EMAIL PROTECTED] wrote:
> Yonik Seeley wrote:
>> Are you doing commits at any time? One possibility is the caching mechanism (weak-ref on the IndexReader)... that's going to be changing soon hopefully.
>> -Yonik
>
> No commits during this test. Should I start looking into my heap size distribution and garbage collector selection?

Hmm, OK. The other big difference would then be that retrieving the top facets requires creating a Lucene TermEnum (not all facet values are stored in memory). The Lucene version in Solr has changed since I did long running tests... with various Lucene changes to thread-local caching, etc. I'll try and reproduce. Or maybe this is somehow a GC bug just tickled by the current caching mechanism? (weak hash map)

-Yonik
Is there a clean way to determine whether a core exists?
The ping command gives me a 500 status if the core exists, or a 404 if it doesn't. For example, when I hit http://doom:8983/solr/content_item_representations_20081201/admin/ping I see:

    HTTP ERROR: 500
    INTERNAL_SERVER_ERROR
    RequestURI=/solr/admin/ping
    Powered by Jetty://

(Obviously I'm using Jetty. Moving to Tomcat is on our list.) I could depend on this behavior, but that seems ugly, so I decided to try a probe query instead. I am using the Solrj client library. Unfortunately, for core non-existence or any other problem, Solrj uses an unhelpful catch-all exception:

    /** Exception to catch all types of communication / parsing issues associated with talking to SOLR
     *
     * @version $Id: SolrServerException.java 555343 2007-07-11 17:46:25Z hossman $
     * @since solr 1.3
     */
    public class SolrServerException extends Exception {
        ...
    }

So I have wound up, thus far, with the following code:

    private boolean solrCoreExists(String url) throws SolrException, MalformedURLException, SolrServerException {
        try {
            SolrQuery solrQuery = new SolrQuery();
            solrQuery.setQuery("xyzzy=plugh");
            new CommonsHttpSolrServer(url).query(solrQuery);
            return true;
        } catch (SolrServerException e) {
            if (e.getCause() != null && e.getCause().getMessage().startsWith("Not Found")) {
                return false;
            } else {
                throw e;
            }
        }
    }

Hopefully there's a better solution?

Dean
Re: Is there a clean way to determine whether a core exists?
what about just calling:

http://doom:8983/solr/content_item_representations_20081201/select

That should give you a 404 if it does not exist. The admin stuff will behave funny if the core does not exist (perhaps you can file a JIRA issue for that).

ryan

On Dec 4, 2008, at 3:38 PM, Dean Thompson wrote:
> The ping command gives me a 500 status if the core exists, or a 404 if it doesn't. [...]
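A minimal sketch of the probe Ryan suggests: build the core's /select URL and treat an HTTP 404 as "core does not exist" (any other status, even a 500, means the servlet answered for that core). The class and method names here are illustrative, not part of Solr or Solrj, and the `q=*:*` probe query is an assumption.

```java
import java.net.HttpURLConnection;

public class CoreProbe {

    // Build the per-core select URL to probe, e.g.
    // http://doom:8983/solr/content_item_representations_20081201/select?q=*:*
    // (a q parameter is included because select without one throws an NPE,
    // as seen later in this thread)
    static String buildProbeUrl(String solrBase, String coreName) {
        return solrBase + "/" + coreName + "/select?q=*:*";
    }

    // 404 means the core is not deployed; anything else means the
    // dispatcher recognized the core, so it exists.
    static boolean coreExists(int httpStatus) {
        return httpStatus != HttpURLConnection.HTTP_NOT_FOUND;
    }

    public static void main(String[] args) {
        System.out.println(buildProbeUrl("http://doom:8983/solr",
                "content_item_representations_20081201"));
        System.out.println(coreExists(404));
        System.out.println(coreExists(200));
    }
}
```

In a real client you would open the URL with HttpURLConnection (or Commons HttpClient) and pass `getResponseCode()` to `coreExists`.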
Re: Is there a clean way to determine whether a core exists?
Thanks for the quick response, Ryan! Actually, my admin/ping call gives me a 404 if the core doesn't exist, which seemed reasonable. I get the 500 if the core *did* exist.

Thanks for the suggestion of using the select URL, but that gives me:

    HTTP ERROR: 500 null
    java.lang.NullPointerException
        at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:37)
        at org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104)
        at org.apache.solr.search.QParser.getQuery(QParser.java:88)
        at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:82)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:148)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
        at org.mortbay.jetty.Server.handle(Server.java:285)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
        at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
    RequestURI=/solr/content_item_representations_20081201/select

On Dec 4, 2008, at 3:48 PM, Ryan McKinley wrote:
> http://doom:8983/solr/content_item_representations_20081201/select
Re: Ordering updates
Hi,

We currently have a single Solr server, with a single index. There are a number of CMS processes distributed over a number of servers, with each CMS process sending an update to the Solr index when changes are made to a content object. My concern is that a scenario is possible where a content object is changed and reindexed concurrently by two CMS processes. The database ensures consistency within the CMS; these transactions get committed as T1 and T2. But I cannot see how to ensure that the reindexing operations (that result in a delete and add for the document) are processed in the order R1 then R2, rather than R2 then R1. In the second case the index record is now inconsistent with the content object in the database.

I would like to supply a transaction id with the reindex request, and configure Solr such that a reindex operation is processed if and only if the supplied transaction id is greater than the currently indexed transaction id. Otherwise the only way I can see to guarantee consistency is 1) have index operations processed by a single writer, or 2) commit the index operation between database prepare and commit statements. The first is not desirable as we introduce a single point of failure (in addition to the single Solr server) and delay updating the index. The second is not desirable because it reduces the throughput of the database, and with a distributed Solr setup would not solve the problem.

From what I can tell this conditional indexing feature is not supported by Solr. Might it be supported by Lucene but not exposed by Solr?

Thanks,
Laurence

2008/12/4 Shalin Shekhar Mangar [EMAIL PROTECTED]:
> It is not clear how you are using Solr, i.e. distributed vs single index. Summarily, Solr does not update documents. It overwrites the old document with the new one if an old document with the same uniqueKey exists in the index. Does that answer your question?
>
> On Thu, Dec 4, 2008 at 1:46 AM, Laurence Rowe [EMAIL PROTECTED] wrote:
>> Hi, Our CMS is distributed over a cluster and I was wondering how I can ensure that index records of newer versions of documents are never overwritten by older ones. Amazon AWS uses a timestamp on requests to ensure 'eventual consistency' of operations. Is there a way to supply a transaction ID with an update so an update is conditional on the supplied transaction id being greater than the existing indexed transaction id?
>> Laurence
>
> --
> Regards,
> Shalin Shekhar Mangar.
Re: Ordering updates
On Fri, Dec 5, 2008 at 2:42 AM, Laurence Rowe [EMAIL PROTECTED] wrote:
> We currently have a single Solr server, with a single index. There are a number of CMS processes distributed over a number of servers, with each CMS process sending an update to the Solr index when changes are made to a content object. My concern is that a scenario is possible where a content object is changed and reindexed concurrently by two CMS processes. The database ensures consistency within the CMS; these transactions get committed as T1 and T2.

If each CMS process has a consistent view of the data and it wishes to update Solr with that data, where is the question of inconsistency here?

> But I cannot see how to ensure that the reindexing operations (that result in a delete and add for the document) are processed in the order R1 then R2, rather than R2 then R1. In the second case the index record is now inconsistent with the content object in the database.

When you need to update a document in Solr, you need to send the complete document and it will automatically do the replace. They will be visible to searchers when you call commit on Solr. From your CMS's perspective, it is a single operation. I hope I am understanding your problem correctly.

> I would like to supply a transaction id with the reindex request, and configure Solr such that a reindex operation is processed if and only if the supplied transaction id is greater than the currently indexed transaction id. Otherwise the only way I can see to guarantee consistency is 1) have index operations processed by a single writer, or 2) commit the index operation between database prepare and commit statements. The first is not desirable as we introduce a single point of failure (in addition to the single Solr server) and delay updating the index. The second is not desirable because it reduces the throughput of the database, and with a distributed Solr setup would not solve the problem.
>
> From what I can tell this conditional indexing feature is not supported by Solr. Might it be supported by Lucene but not exposed by Solr?

No, this is not supported by either Lucene or Solr.

--
Regards,
Shalin Shekhar Mangar.
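Since conditional indexing is not supported server-side, one client-side workaround is to track the highest transaction id already sent for each document and silently drop reindex requests that arrive out of order. This is an illustrative sketch only, not a Solr feature; the class and method names (VersionGuard, shouldIndex) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class VersionGuard {
    // highest transaction id already indexed, per document uniqueKey
    private final Map<String, Long> lastTxn = new HashMap<String, Long>();

    // Returns true only if txnId is strictly newer than anything already
    // sent for this document, so an R2-then-R1 reordering rejects R1.
    public synchronized boolean shouldIndex(String docId, long txnId) {
        Long prev = lastTxn.get(docId);
        if (prev != null && prev >= txnId) {
            return false; // stale update: skip instead of overwriting
        }
        lastTxn.put(docId, txnId);
        return true;
    }
}
```

Note the caveat Laurence already raises: this only works if every writer routes through one shared guard, which reintroduces the single-writer bottleneck he wants to avoid, so it is at best a stopgap for the single-server case.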
Boost a query by field at query time - Standard Request Handler
Here is the problem I am trying to solve. I have to use the Standard Request Handler.

Query (can be quite complex, as it gets built from an advanced search form):

    term1^2.0 OR term2 OR "term3 term4"

I have 3 fields - content (the default search field), title and url. Any matches in the title or url fields should be weighed more. I can specify index-time boosting for these two fields, but I would rather not, as it is a heavy-handed solution. I need to make it user configurable for advanced search. What should my query to SOLR be? Something like this?

    content:term1^2.0 OR content:term2 OR content:"term3 term4" OR title:term1^2.0 OR title:term2 OR title:"term3 term4" OR url:term1^2.0 OR url:term2 OR url:"term3 term4"

It looks like it can get pretty long and error prone. With the 'dismax' handler I can simply specify qf=content title^2 url^2 no matter how complex the 'q' parameter is. Is there a similar easier way I can do query-time boosting with the Standard Request Handler that I am missing?

Thanks for your help
- ashok

--
View this message in context: http://www.nabble.com/Boost-a-query--by-field-at-query-time---Standard-Request-Handler-tp20842675p20842675.html
Sent from the Solr - User mailing list archive at Nabble.com.
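The manual expansion described above can at least be generated rather than hand-written. A sketch of building the per-field OR expansion for one clause, approximating dismax's qf=content title^2 url^2 for the standard handler (field names and boosts are the examples from the question; this handles one clause per call and does not cover per-term user boosts):

```java
public class FieldBoostQuery {

    // Expand a single query clause (a term or a quoted phrase) across
    // several fields, appending a boost where it differs from 1.0.
    static String expand(String clause, String[] fields, float[] boosts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) sb.append(" OR ");
            sb.append(fields[i]).append(':').append(clause);
            if (boosts[i] != 1.0f) sb.append('^').append(boosts[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] fields = { "content", "title", "url" };
        float[] boosts = { 1.0f, 2.0f, 2.0f };
        // prints: content:term1 OR title:term1^2.0 OR url:term1^2.0
        System.out.println(expand("term1", fields, boosts));
    }
}
```

Each user clause would be expanded this way and the results joined with the user's original operators; dismax remains the cleaner solution when it is available.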
Standard request with functional query
Hi guys,

I have a standard query that searches across multiple text fields, such as:

    q=title:iphone OR bodytext:iphone OR title:firmware OR bodytext:firmware

This comes back with documents that have iphone and firmware (I know I can use the dismax handler but it seems to be really slow), which is great. Now I want to give some more weight to more recent documents (there is a dateCreated field in each document). So I've modified the query as such:

    (title:iphone OR bodytext:iphone OR title:firmware OR bodytext:firmware) AND _val_:ord(dateCreated)^0.1

URL-encoded to:

    q=(title%3Aiphone+OR+bodytext%3Aiphone+OR+title%3Afirmware+OR+bodytext%3Afirmware)+AND+_val_%3Aord(dateCreated)^0.1

However, the results are not as one would expect. The first few documents only come back with the word iphone and appear to be sorted by date created. It seems to completely ignore the score and use the dateCreated field for the score.

On a not directly related issue, it seems like if you put the weight within the double quotes:

    (title:iphone OR bodytext:iphone OR title:firmware OR bodytext:firmware) AND _val_:"ord(dateCreated)^0.1"

the parser complains:

    org.apache.lucene.queryParser.ParseException: Cannot parse '(title:iphone OR bodytext:iphone OR title:firmware OR bodytext:firmware) AND _val_:"ord(dateCreated)^0.1"': Expected ',' at position 16 in 'ord(dateCreated)^0.1'

Thanks,
Sammy
Re: Standard request with functional query
On Thu, Dec 4, 2008 at 4:35 PM, Sammy Yu [EMAIL PROTECTED] wrote:
> bodytext:firmware) AND _val_:"ord(dateCreated)^0.1"': Expected ',' at position 16 in 'ord(dateCreated)^0.1'

^0.1 is not function query syntax, it's Lucene/Solr QueryParser syntax. Try:

    _val_:"ord(dateCreated)"^0.1

-Yonik
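A small sketch of assembling and URL-encoding a query that follows Yonik's correction — the boost goes outside the quoted function value, since ^ belongs to the QueryParser, not the function parser. The method name and the fixed ^0.1 boost are illustrative:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class BoostedQuery {

    // Wrap the user's boolean query and AND in a function-query clause,
    // with the boost placed after the closing quote:
    //   (userQuery) AND _val_:"ord(dateCreated)"^0.1
    static String buildQ(String userQuery) throws UnsupportedEncodingException {
        String q = "(" + userQuery + ") AND _val_:\"ord(dateCreated)\"^0.1";
        return "q=" + URLEncoder.encode(q, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildQ("title:iphone OR bodytext:iphone"));
    }
}
```

Encoding the whole q value in one step also avoids the half-encoded URLs shown earlier in the thread, where characters like ^ and " were left raw.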
Merging Indices
The SOLR wiki says: "3. Make sure both indexes you want to merge are closed." What exactly does 'closed' mean?

1. Do I need to stop SOLR search on both indexes before running the merge command? So a brief downtime is required? Or do I simply prevent any 'updates/deletes' to these indices during the merge time so they can still serve up results (read only?) while I am creating a new merged index?

2. Before the new index replaces the old index, do I need to stop SOLR for that instance? Or can I simply move the old index out and place the new index in the same place, without having to stop SOLR?

3. If SOLR has to be stopped during the merge operation, can we work with a redundant/failover instance and stagger the merge so the search service will not go down?

Any guidelines here are welcome.

Thanks
- ashok

--
View this message in context: http://www.nabble.com/Merging-Indices-tp20845009p20845009.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solrj queries
Hi! I'm new to Solr. I use Solrj, and your question is how to query with Solrj. Here is a small description of what you can do:

1. Build your JSP or PHP (etc.) page where the user enters the first and last name, then read those values, perhaps with some getText().

2. Then build the query. Suppose that in your schema you defined the first-name field as name and the last-name field as lname, and that in your JSP or PHP form the text fields where the user enters the first and last name are called nam and lnam, respectively. Then you would do something like this:

    String Consulta = "name:" + nam.getText() + " lname:" + lnam.getText();

So far you only have a string containing the query you will send to Solr. This string is taken as a parameter by the query object, which you then construct:

    SolrQuery query;
    QueryResponse qrsp;

    // query function
    public SolrDocumentList consultar(String Consulta) throws SolrServerException {
        SolrDocumentList docs;
        query = new SolrQuery();
        query.setQuery(Consulta);
        qrsp = server.query(query);
        docs = qrsp.getResults();
        return docs;
    }

Then you would just call that function, passing the query string you built earlier as the parameter: consultar(Consulta)

--
View this message in context: http://www.nabble.com/Solrj-queries-tp20494859p20845145.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solrj queries
I left out a ; hehe

--
View this message in context: http://www.nabble.com/Solrj-queries-tp20494859p20845222.html
Sent from the Solr - User mailing list archive at Nabble.com.
Quick thanks for all the assistance getting me up to speed
Hi Yonik Seeley, Erik Hatcher and others,

Thanks for all the help fixing the bugs I ran into using the new 1.3 distributed features with rails (shards). I now have medline fully indexed in 7 solr shards (with 2 spare). Each server has 8GB RAM and a Quad Core 2.4GHz. As a test, I ran about 2 million queries over it last night (pubmed gets about 3 million per day) and the response time was within a few seconds. This was also under write load, which made me feel very confident in the scalability of Solr and Lucene.

--
Regards,
Ian Connor
pubget.com
Cambridge, MA
iconnor [at] mit.edu
Re: NIO not working yet
I've updated my deployment to use NIOFSDirectory. Now I'd like to confirm some previous results with the original FSDirectory. Can I turn it off with a parameter? I tried:

    java -Dorg.apache.lucene.FSDirectory.class=org.apache.lucene.store.FSDirectory ...

but that didn't work.

--
View this message in context: http://www.nabble.com/NIO-not-working-yet-tp20468152p20845732.html
Sent from the Solr - User mailing list archive at Nabble.com.
Solr on Solaris
We are running Solr on a Solaris box with 4 CPUs (8 cores) and 3GB RAM. When we try to index, sometimes the HTTP connection just hangs and the client which is posting documents to Solr doesn't get any response back. We have since added timeouts to our HTTP requests from the clients. I then get this error:

    java.lang.OutOfMemoryError: requested 239848 bytes for Chunk::new. Out of swap space?
    java.lang.OutOfMemoryError: unable to create new native thread
    Exception in thread JmxRmiRegistryConnectionPoller java.lang.OutOfMemoryError: unable to create new native thread

We are running JDK 1.6_10 on the Solaris box. The weird thing is we are running the same application on a Linux box with JDK 1.6 and we haven't seen any problem like this. Any suggestions?

-Raghu
Re: Solr on Solaris
Just curious, is this off a zone by any chance?

- Jon

On Dec 4, 2008, at 10:40 PM, Kashyap, Raghu wrote:
> We are running solr on a solaris box with 4 CPU's(8 cores) and 3GB Ram. When we try to index sometimes the HTTP Connection just hangs and the client which is posting documents to solr doesn't get any response back. [...]
Re: Is there a clean way to determine whether a core exists?
: Subject: Is there a clean way to determine whether a core exists?

doesn't the CoreAdminHandler's STATUS feature make this easy?

-Hoss
Re: Is there a clean way to determine whether a core exists?
SOLR-880 is an issue that was raised for the same problem.

On Fri, Dec 5, 2008 at 10:16 AM, Chris Hostetter [EMAIL PROTECTED] wrote:
> : Subject: Is there a clean way to determine whether a core exists?
>
> doesn't the CoreAdminHandler's STATUS feature make this easy?
>
> -Hoss

--
--Noble Paul
Re: Is there a clean way to determine whether a core exists?
On Dec 4, 2008, at 3:57 PM, Dean Thompson wrote:
> Thanks for the quick response, Ryan! Actually, my admin/ping call gives me a 404 if the core doesn't exist, which seemed reasonable. I get the 500 if the core *did* exist.

aaah -- check what ping query you have configured and make sure that is a valid query. If you use the example one and then change your schema, it may be referencing fields that don't exist and give you the 500.

> Thanks for the suggestion of using the select URL, but that gives me:
>
> java.lang.NullPointerException
>     at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:37)

This is a really stupid error we need to fix -- it should actually say: "missing required parameter q"

https://issues.apache.org/jira/browse/SOLR-435

ryan
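Following Ryan's point: in Solr 1.3 the ping handler runs the query configured in solrconfig.xml's admin section, and a ping query that references fields later removed from the schema will return 500 even when the core is healthy. A sketch of a ping query that avoids schema-specific fields (the values shown are illustrative, not the shipped example config):

```xml
<!-- solrconfig.xml: the ping query must stay valid against the current
     schema; a match-all query avoids depending on any particular field. -->
<admin>
  <defaultQuery>*:*</defaultQuery>
  <pingQuery>q=*:*&amp;rows=0</pingQuery>
</admin>
```

With this in place, admin/ping distinguishes "core missing" (404) from "core unhealthy" (500) without tripping over schema changes.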
Re: Is there a clean way to determine whether a core exists?
yes:

http://localhost:8983/solr/admin/cores?action=STATUS

will give you a list of running cores. However, that is not as easy to check as a simple status != 404.

see: http://wiki.apache.org/solr/CoreAdmin

On Dec 4, 2008, at 11:46 PM, Chris Hostetter wrote:
> : Subject: Is there a clean way to determine whether a core exists?
>
> doesn't the CoreAdminHandler's STATUS feature make this easy?
>
> -Hoss