Re: Question on StreamingUpdateSolrServer
Hi,

Lots of little things to look at here:
- Run lsof as root; it looks like you aren't doing that.
- Double-check Tomcat's maxThreads param in server.xml.
- Give Jetty a try.
- I don't think you said anything about checking the container's or Solr's logs for errors.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: vivek sar vivex...@gmail.com
To: solr-user@lucene.apache.org
Sent: Wednesday, April 15, 2009 7:28:57 PM
Subject: Re: Question on StreamingUpdateSolrServer

Thanks Otis. I did increase the number of file descriptors to 22K, but I still get this problem. I've noticed the following so far:

1) As soon as I get to around 1140 index segments (the total over multiple cores) I start seeing this problem.
2) When the problem starts, the index request (solrserver.commit) also occasionally fails with the following error: java.net.SocketException: Connection reset
3) Whenever the commit fails, I'm able to access Solr from the browser (http://ets11.co.com/solr). While a commit is succeeding and in progress, I get a blank page in Firefox, and even telnet to 8080 fails with "Connection closed by foreign host".

It does seem like a resource issue, as it happens only once we reach a breaking point (too many index segment files) - lsof at that point usually shows around 1400, but my ulimit is much higher than that. I already use the compound format for index files. I can also run optimize occasionally (though that's not preferred, as it blocks the whole index cycle for a long time). I do want to find out which resource limitation is causing this; it has something to do with the Indexer committing records while there is a large number of segment files. Any other ideas?

Thanks,
-vivek

On Wed, Apr 15, 2009 at 3:10 PM, Otis Gospodnetic wrote:

One more thing. I don't think this was mentioned, but you can:
- optimize your indices
- use the compound index format
That will lower the number of open file handles.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: vivek sar
To: solr-user@lucene.apache.org
Sent: Friday, April 10, 2009 5:59:37 PM
Subject: Re: Question on StreamingUpdateSolrServer

I also noticed that the Solr app has over 6000 file handles open - lsof | grep solr | wc -l shows 6455. I have 10 cores (using multi-core) managed by the same Solr instance. As soon as I start up Tomcat, the open file count goes up to 6400.

A few questions:
1) Why is Solr holding on to all the segments from all the cores - is it because of the auto-warmer?
2) How can I reduce the open file count?
3) Is there a way to stop the auto-warmer?
4) Could this be related to Tomcat returning a blank page for every request?

Any ideas?

Thanks,
-vivek

On Fri, Apr 10, 2009 at 1:48 PM, vivek sar wrote:

Hi,

I was using CommonsHttpSolrServer for indexing, but having two threads writing (10K batches) at the same time was throwing: ProtocolException: Unbuffered entity enclosing request can not be repeated.

I switched to StreamingUpdateSolrServer (using addBeans) and I don't see the problem anymore. The speed is very fast - around 25K/sec with a single thread - but I'm facing another problem. While the indexer using StreamingUpdateSolrServer is running, I'm not able to send any URL request from the browser to the Solr web app; I just get a blank page. I can't even get to the admin interface. I'm also not able to shut down the Tomcat running the Solr webapp while the Indexer is running - I have to stop the Indexer app first and then stop Tomcat. I don't have this problem when using CommonsHttpSolrServer.

Here is how I'm creating it:

server = new StreamingUpdateSolrServer(url, 1000, 3);

I then simply call server.addBeans(...) on it. Is there anything else I need to do to make use of StreamingUpdateSolrServer? Why does Tomcat become unresponsive while the Indexer using StreamingUpdateSolrServer is running (though indexing itself happens fine)?

Thanks,
-vivek
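The lsof count above (~1400) is far below the 22K limit vivek raised, which suggests the limit the shell reports may not be the one the Tomcat process is actually running with. A minimal sketch of checking the process itself rather than the shell (assumes Linux /proc; the pgrep pattern for finding Tomcat's PID is an assumption, substitute your own):

```shell
#!/bin/sh
# Sketch (assumes Linux /proc): `ulimit -n` only describes the current
# shell. Inspect the limit and usage of the actual process instead.
fd_report() {
    pid="$1"
    # The open-file limit the kernel actually applies to this process:
    grep 'Max open files' "/proc/$pid/limits"
    # Descriptors the process currently holds:
    ls "/proc/$pid/fd" | wc -l
}

# Example: inspect this shell itself; for Solr, pass Tomcat's PID,
# e.g. fd_report "$(pgrep -f catalina | head -n 1)".
fd_report $$
```

If the per-process limit printed here is lower than the shell's ulimit, the limit was raised in the wrong place (e.g. not in Tomcat's startup environment).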
Re: Question on StreamingUpdateSolrServer
On Wed, Apr 15, 2009 at 7:28 PM, vivek sar vivex...@gmail.com wrote:
lsof at this point usually shows at 1400, but my ulimit is much higher than that.

Could you be hitting a kernel limit?

cat /proc/sys/fs/file-max
cat /proc/sys/fs/file-nr

http://www.netadmintools.com/art295.html

-Yonik
http://www.lucidimagination.com
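Yonik's two commands can be combined into one quick check; a sketch (Linux-specific, and the warning text is mine, not from the thread):

```shell
#!/bin/sh
# Sketch: compare system-wide allocated file handles (first field of
# file-nr) against the kernel ceiling (file-max). A process can hit this
# system-wide ceiling even when its own ulimit looks generous.
allocated=$(awk '{print $1}' /proc/sys/fs/file-nr)
maximum=$(cat /proc/sys/fs/file-max)
echo "allocated=$allocated max=$maximum"
if [ "$allocated" -ge "$maximum" ]; then
    echo "WARNING: at the kernel-wide file handle limit"
fi
```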
Re: Question on StreamingUpdateSolrServer
Quick comment - why so shy with the number of open file descriptors? On some nothing-special machines from several years ago I had this limit set to 30K+ - here, for example: http://www.simpy.com/user/otis :)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: vivek sar vivex...@gmail.com
To: solr-user@lucene.apache.org
Sent: Tuesday, April 14, 2009 3:12:41 AM
Subject: Re: Question on StreamingUpdateSolrServer

The machine's ulimit is set to 9000 and the OS has an upper limit of 12000 on files. What would explain this? Has anyone tried Solr with 25 cores on the same Solr instance?

Thanks,
-vivek

2009/4/13 Noble Paul നോബിള് नोब्ळ् :

On Tue, Apr 14, 2009 at 7:14 AM, vivek sar wrote:

Some more update. As I mentioned earlier, we are using multi-core Solr (up to 65 cores in one Solr instance, each core 10G). This was opening around 3000 file descriptors (lsof). I removed some cores, and after some trial and error I found that at 25 cores the system seems to work fine (around 1400 file descriptors). Tomcat is responsive even while indexing is happening in Solr (for 25 cores). But as soon as it goes to 26 cores, Tomcat becomes unresponsive again. The puzzling thing is that if I stop indexing I can search on even 65 cores, but while indexing is happening it seems to support only up to 25 cores.

1) Is there a limit on the number of cores a Solr instance can handle?
2) Does Solr do anything to the existing cores while indexing? I'm writing to only one core at a time.

There is no hard limit (it is Integer.MAX_VALUE). But in reality your mileage depends on your hardware and the number of file handles the OS can open.

We are struggling to find out why Tomcat stops responding at a high number of cores while indexing is in progress. Any help is very much appreciated.

Thanks,
-vivek

On Mon, Apr 13, 2009 at 10:52 AM, vivek sar wrote:

Here is some more information about my setup:

Solr - v1.4 (nightly build 03/29/09)
Servlet Container - Tomcat 6.0.18
JVM - 1.6.0 (64 bit)
OS - Mac OS X Server 10.5.6

Hardware Overview:
Processor Name: Quad-Core Intel Xeon
Processor Speed: 3 GHz
Number Of Processors: 2
Total Number Of Cores: 8
L2 Cache (per processor): 12 MB
Memory: 20 GB
Bus Speed: 1.6 GHz

JVM Parameters (for Solr):
export CATALINA_OPTS="-server -Xms6044m -Xmx6044m -DSOLR_APP -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log -Dsun.rmi.dgc.client.gcInterval=360 -Dsun.rmi.dgc.server.gcInterval=360"

Other:
lsof | grep solr | wc -l => 2493
ulimit -a => open files (-n) 9000
Tomcat: <Connector connectionTimeout="2" maxThreads="100" />
Total Solr cores on same instance - 65
useCompoundFile - true

The tests I ran:

While the Indexer is running:
1) Go to http://juum19.co.com:8080/solr - returns a blank page (no error in catalina.out)
2) Try telnet juum19.co.com 8080 - returns "Connection closed by foreign host"

Stop the Indexer program (Tomcat is still running with Solr):
3) Go to http://juum19.co.com:8080/solr - works OK, shows the list of all the Solr cores
4) Try telnet - able to telnet fine

5) Now comment out all the caches in solrconfig.xml and try the same tests - Tomcat still doesn't respond. Is there a way to stop the auto-warmer? I commented out the caches in solrconfig.xml but still see the following log:

INFO: autowarming result for searc...@3aba3830 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
INFO: Closing searc...@175dc1e2 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

6) Change the Indexer frequency so it runs every 2 min (instead of all the time). I noticed that once the commit is done I'm able to run my searches; during the commit and auto-warming period I just get a blank page.

7) Changed from Solrj to XML update - I still get the blank page whenever an update/commit is happening.

Apr 13, 2009 6:46:18
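The figures quoted above (about 3000 descriptors at 65 cores, ~1400 at 25, a 9000 ulimit) allow a quick sanity check. A sketch of the arithmetic (the numbers come from the thread, not from any benchmark of mine):

```shell
#!/bin/sh
# Sketch: rough per-core descriptor arithmetic from the thread's figures.
# If ~46 descriptors per core were the whole story, a 9000 ulimit would
# allow far more than the observed 25-core ceiling - suggesting the
# binding limit is something other than the per-process ulimit.
fds_total=3000
cores=65
ulimit_n=9000
per_core=$((fds_total / cores))
echo "approx descriptors per core: $per_core"
echo "cores a $ulimit_n ulimit would allow: $((ulimit_n / per_core))"
```

The mismatch between that estimate and the observed ceiling is consistent with Yonik's suspicion of a kernel-wide limit, or with a container-level resource (threads, connections) being exhausted instead.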
Re: Question on StreamingUpdateSolrServer
One more thing. I don't think this was mentioned, but you can:
- optimize your indices
- use the compound index format
That will lower the number of open file handles.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: vivek sar vivex...@gmail.com
To: solr-user@lucene.apache.org
Sent: Friday, April 10, 2009 5:59:37 PM
Subject: Re: Question on StreamingUpdateSolrServer

I also noticed that the Solr app has over 6000 file handles open - lsof | grep solr | wc -l shows 6455. I have 10 cores (using multi-core) managed by the same Solr instance. As soon as I start up Tomcat, the open file count goes up to 6400.

A few questions:
1) Why is Solr holding on to all the segments from all the cores - is it because of the auto-warmer?
2) How can I reduce the open file count?
3) Is there a way to stop the auto-warmer?
4) Could this be related to Tomcat returning a blank page for every request?

Any ideas?

Thanks,
-vivek

On Fri, Apr 10, 2009 at 1:48 PM, vivek sar wrote:

Hi,

I was using CommonsHttpSolrServer for indexing, but having two threads writing (10K batches) at the same time was throwing: ProtocolException: Unbuffered entity enclosing request can not be repeated.

I switched to StreamingUpdateSolrServer (using addBeans) and I don't see the problem anymore. The speed is very fast - around 25K/sec with a single thread - but I'm facing another problem. While the indexer using StreamingUpdateSolrServer is running, I'm not able to send any URL request from the browser to the Solr web app; I just get a blank page. I can't even get to the admin interface. I'm also not able to shut down the Tomcat running the Solr webapp while the Indexer is running - I have to stop the Indexer app first and then stop Tomcat. I don't have this problem when using CommonsHttpSolrServer.

Here is how I'm creating it:

server = new StreamingUpdateSolrServer(url, 1000, 3);

I then simply call server.addBeans(...) on it. Is there anything else I need to do to make use of StreamingUpdateSolrServer? Why does Tomcat become unresponsive while the Indexer using StreamingUpdateSolrServer is running (though indexing itself happens fine)?

Thanks,
-vivek
Re: Question on StreamingUpdateSolrServer
Thanks Otis. I did increase the number of file descriptors to 22K, but I still get this problem. I've noticed the following so far:

1) As soon as I get to around 1140 index segments (the total over multiple cores) I start seeing this problem.
2) When the problem starts, the index request (solrserver.commit) also occasionally fails with the following error: java.net.SocketException: Connection reset
3) Whenever the commit fails, I'm able to access Solr from the browser (http://ets11.co.com/solr). While a commit is succeeding and in progress, I get a blank page in Firefox, and even telnet to 8080 fails with "Connection closed by foreign host".

It does seem like a resource issue, as it happens only once we reach a breaking point (too many index segment files) - lsof at that point usually shows around 1400, but my ulimit is much higher than that. I already use the compound format for index files. I can also run optimize occasionally (though that's not preferred, as it blocks the whole index cycle for a long time). I do want to find out which resource limitation is causing this; it has something to do with the Indexer committing records while there is a large number of segment files. Any other ideas?

Thanks,
-vivek

On Wed, Apr 15, 2009 at 3:10 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote:

One more thing. I don't think this was mentioned, but you can:
- optimize your indices
- use the compound index format
That will lower the number of open file handles.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: vivek sar vivex...@gmail.com
To: solr-user@lucene.apache.org
Sent: Friday, April 10, 2009 5:59:37 PM
Subject: Re: Question on StreamingUpdateSolrServer

I also noticed that the Solr app has over 6000 file handles open - lsof | grep solr | wc -l shows 6455. I have 10 cores (using multi-core) managed by the same Solr instance. As soon as I start up Tomcat, the open file count goes up to 6400.

A few questions:
1) Why is Solr holding on to all the segments from all the cores - is it because of the auto-warmer?
2) How can I reduce the open file count?
3) Is there a way to stop the auto-warmer?
4) Could this be related to Tomcat returning a blank page for every request?

Any ideas?

Thanks,
-vivek

On Fri, Apr 10, 2009 at 1:48 PM, vivek sar wrote:

Hi,

I was using CommonsHttpSolrServer for indexing, but having two threads writing (10K batches) at the same time was throwing: ProtocolException: Unbuffered entity enclosing request can not be repeated.

I switched to StreamingUpdateSolrServer (using addBeans) and I don't see the problem anymore. The speed is very fast - around 25K/sec with a single thread - but I'm facing another problem. While the indexer using StreamingUpdateSolrServer is running, I'm not able to send any URL request from the browser to the Solr web app; I just get a blank page. I can't even get to the admin interface. I'm also not able to shut down the Tomcat running the Solr webapp while the Indexer is running - I have to stop the Indexer app first and then stop Tomcat. I don't have this problem when using CommonsHttpSolrServer.

Here is how I'm creating it:

server = new StreamingUpdateSolrServer(url, 1000, 3);

I then simply call server.addBeans(...) on it. Is there anything else I need to do to make use of StreamingUpdateSolrServer? Why does Tomcat become unresponsive while the Indexer using StreamingUpdateSolrServer is running (though indexing itself happens fine)?

Thanks,
-vivek
Re: Question on StreamingUpdateSolrServer
The machine's ulimit is set to 9000 and the OS has an upper limit of 12000 on files. What would explain this? Has anyone tried Solr with 25 cores on the same Solr instance?

Thanks,
-vivek

2009/4/13 Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com:

On Tue, Apr 14, 2009 at 7:14 AM, vivek sar vivex...@gmail.com wrote:

Some more update. As I mentioned earlier, we are using multi-core Solr (up to 65 cores in one Solr instance, each core 10G). This was opening around 3000 file descriptors (lsof). I removed some cores, and after some trial and error I found that at 25 cores the system seems to work fine (around 1400 file descriptors). Tomcat is responsive even while indexing is happening in Solr (for 25 cores). But as soon as it goes to 26 cores, Tomcat becomes unresponsive again. The puzzling thing is that if I stop indexing I can search on even 65 cores, but while indexing is happening it seems to support only up to 25 cores.

1) Is there a limit on the number of cores a Solr instance can handle?
2) Does Solr do anything to the existing cores while indexing? I'm writing to only one core at a time.

There is no hard limit (it is Integer.MAX_VALUE). But in reality your mileage depends on your hardware and the number of file handles the OS can open.

We are struggling to find out why Tomcat stops responding at a high number of cores while indexing is in progress. Any help is very much appreciated.

Thanks,
-vivek

On Mon, Apr 13, 2009 at 10:52 AM, vivek sar vivex...@gmail.com wrote:

Here is some more information about my setup:

Solr - v1.4 (nightly build 03/29/09)
Servlet Container - Tomcat 6.0.18
JVM - 1.6.0 (64 bit)
OS - Mac OS X Server 10.5.6

Hardware Overview:
Processor Name: Quad-Core Intel Xeon
Processor Speed: 3 GHz
Number Of Processors: 2
Total Number Of Cores: 8
L2 Cache (per processor): 12 MB
Memory: 20 GB
Bus Speed: 1.6 GHz

JVM Parameters (for Solr):
export CATALINA_OPTS="-server -Xms6044m -Xmx6044m -DSOLR_APP -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log -Dsun.rmi.dgc.client.gcInterval=360 -Dsun.rmi.dgc.server.gcInterval=360"

Other:
lsof | grep solr | wc -l => 2493
ulimit -a => open files (-n) 9000
Tomcat: <Connector port="8080" protocol="HTTP/1.1" connectionTimeout="2" maxThreads="100" />
Total Solr cores on same instance - 65
useCompoundFile - true

The tests I ran:

While the Indexer is running:
1) Go to http://juum19.co.com:8080/solr - returns a blank page (no error in catalina.out)
2) Try telnet juum19.co.com 8080 - returns "Connection closed by foreign host"

Stop the Indexer program (Tomcat is still running with Solr):
3) Go to http://juum19.co.com:8080/solr - works OK, shows the list of all the Solr cores
4) Try telnet - able to telnet fine

5) Now comment out all the caches in solrconfig.xml and try the same tests - Tomcat still doesn't respond. Is there a way to stop the auto-warmer? I commented out the caches in solrconfig.xml but still see the following log:

INFO: autowarming result for searc...@3aba3830 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
INFO: Closing searc...@175dc1e2 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

6) Change the Indexer frequency so it runs every 2 min (instead of all the time). I noticed that once the commit is done I'm able to run my searches; during the commit and auto-warming period I just get a blank page.

7) Changed from Solrj to XML update - I still get the blank page whenever an update/commit is happening.

Apr 13, 2009 6:46:18 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[621094001, 621094002, 621094003, 621094004, 621094005, 621094006, 621094007, 621094008, ...(6992 more)]} 0 1948
Apr 13, 2009 6:46:18 PM org.apache.solr.core.SolrCore execute
INFO: [20090413_12] webapp=/solr path=/update params={} status=0 QTime=1948

So, it looks like it's not just StreamingUpdateSolrServer: whenever an update/commit is happening I'm not able to search. I don't know if it's related to using
Re: Question on StreamingUpdateSolrServer
I index in 10K batches and commit after 5 index cycles (after 50K). Is there any limitation that I can't search during a commit or auto-warming? I've got 8 CPU cores and only 2 were showing busy (using top) - so it's unlikely that the CPU was pegged.

2009/4/12 Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com:

If you use StreamingUpdateSolrServer it POSTs all the docs in a single request. 10 million docs may be a bit too much for a single request; I guess you should batch it in multiple requests of smaller chunks. It is likely that the CPU is really hot when the autowarming is happening. Getting decent search performance without autowarming is not easy. autowarmCount is an attribute of a cache - see http://wiki.apache.org/solr/SolrCaching

On Mon, Apr 13, 2009 at 3:32 AM, vivek sar vivex...@gmail.com wrote:

Thanks Shalin. I noticed a couple more things. As I index around 100 million records a day, my Indexer is running pretty much at all times. Whenever I run a search query, I usually get a connection reset while the commit is happening and a blank page while the auto-warming of searchers is happening. Here are my questions:

1) Is this a coincidence or a known issue? Can't we search while a commit or auto-warming is happening?
2) How do I stop auto-warming? My search traffic is very low, so I'm trying to turn off auto-warming after a commit has happened - is there anything in solrconfig.xml to do that?
3) What would be the best strategy for searching in my scenario, where commits may be happening all the time (I commit every 50K records - so every 30-60 sec there is a commit followed by auto-warming that takes 40 sec)? Search frequency is pretty low for us, but we want to make sure that whenever a search happens it is fast enough and returns results (instead of an exception or a blank screen).

Thanks for all the help.

-vivek

On Sat, Apr 11, 2009 at 1:48 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

On Sun, Apr 12, 2009 at 2:15 AM, vivek sar vivex...@gmail.com wrote:

The problem is I don't see any error message in catalina.out. I don't even see the request coming in - I simply get a blank page in the browser. If I keep trying, the request goes through and I get a response from Solr, but then it becomes unresponsive again, or sometimes throws a connection reset error. I'm not sure why it would work sometimes and not other times for the same query. As soon as I stop the Indexer process things start working fine. Is there any way I can debug this problem?

I'm not sure. I've never seen this issue myself. Could you try using the bundled Jetty instead of Tomcat, or a different box, just to make sure this is not an environment-specific issue?

--
Regards,
Shalin Shekhar Mangar.

--
--Noble Paul
Re: Question on StreamingUpdateSolrServer
On Mon, Apr 13, 2009 at 12:36 PM, vivek sar vivex...@gmail.com wrote:

I index in 10K batches and commit after 5 index cycles (after 50K). Is there any limitation that I can't search during commit or auto-warming? I got 8 CPU cores and only 2 were showing busy (using top) - so it's unlikely that the CPU was pegged.

No, there is no such limitation. The old searcher will continue to serve search requests until the new one is warmed and registered. So, CPU does not seem to be an issue. Does this happen only when you use StreamingUpdateSolrServer? Which OS and file system? What JVM parameters are you using? Which servlet container and version?

--
Regards,
Shalin Shekhar Mangar.
Re: Question on StreamingUpdateSolrServer
Here is some more information about my setup:

Solr - v1.4 (nightly build 03/29/09)
Servlet Container - Tomcat 6.0.18
JVM - 1.6.0 (64 bit)
OS - Mac OS X Server 10.5.6

Hardware Overview:
Processor Name: Quad-Core Intel Xeon
Processor Speed: 3 GHz
Number Of Processors: 2
Total Number Of Cores: 8
L2 Cache (per processor): 12 MB
Memory: 20 GB
Bus Speed: 1.6 GHz

JVM Parameters (for Solr):
export CATALINA_OPTS="-server -Xms6044m -Xmx6044m -DSOLR_APP -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log -Dsun.rmi.dgc.client.gcInterval=360 -Dsun.rmi.dgc.server.gcInterval=360"

Other:
lsof | grep solr | wc -l => 2493
ulimit -a => open files (-n) 9000
Tomcat: <Connector port="8080" protocol="HTTP/1.1" connectionTimeout="2" maxThreads="100" />
Total Solr cores on same instance - 65
useCompoundFile - true

The tests I ran:

While the Indexer is running:
1) Go to http://juum19.co.com:8080/solr - returns a blank page (no error in catalina.out)
2) Try telnet juum19.co.com 8080 - returns "Connection closed by foreign host"

Stop the Indexer program (Tomcat is still running with Solr):
3) Go to http://juum19.co.com:8080/solr - works OK, shows the list of all the Solr cores
4) Try telnet - able to telnet fine

5) Now comment out all the caches in solrconfig.xml and try the same tests - Tomcat still doesn't respond. Is there a way to stop the auto-warmer? I commented out the caches in solrconfig.xml but still see the following log:

INFO: autowarming result for searc...@3aba3830 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
INFO: Closing searc...@175dc1e2 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

6) Change the Indexer frequency so it runs every 2 min (instead of all the time). I noticed that once the commit is done I'm able to run my searches; during the commit and auto-warming period I just get a blank page.

7) Changed from Solrj to XML update - I still get the blank page whenever an update/commit is happening.

Apr 13, 2009 6:46:18 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[621094001, 621094002, 621094003, 621094004, 621094005, 621094006, 621094007, 621094008, ...(6992 more)]} 0 1948
Apr 13, 2009 6:46:18 PM org.apache.solr.core.SolrCore execute
INFO: [20090413_12] webapp=/solr path=/update params={} status=0 QTime=1948

So, it looks like it's not just StreamingUpdateSolrServer: whenever an update/commit is happening I'm not able to search. I don't know if it's related to using multi-core.

In this test I was using only a single thread updating a single core on a single Solr instance. So, it's clearly related to the index process (update, commit and auto-warming). As soon as the update/commit/auto-warming completes, I'm able to run my queries again. Is there anything that could stop searching while the update process is in progress - some lock or something? Any other ideas?

Thanks,
-vivek

On Mon, Apr 13, 2009 at 12:14 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

On Mon, Apr 13, 2009 at 12:36 PM, vivek sar vivex...@gmail.com wrote:

I index in 10K batches and commit after 5 index cycles (after 50K). Is there any limitation that I can't search during commit or auto-warming? I got 8 CPU cores and only 2 were showing busy (using top) - so it's unlikely that the CPU was pegged.

No, there is no such limitation. The old searcher will continue to serve search requests until the new one is warmed and registered. So, CPU does not seem to be an issue. Does this happen only when you use StreamingUpdateSolrServer? Which OS, file system? What JVM parameters are you using? Which servlet container and version?

--
Regards,
Shalin Shekhar Mangar.
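A small probe makes it easier to line the blank-page windows up against the commit/auto-warm timestamps in catalina.out. This is a sketch of mine, not from the thread; the URL is illustrative (substitute your own host, port, and core), and it assumes curl is installed:

```shell
#!/bin/sh
# Sketch: probe a Solr URL and log whether it answered, so repeated runs
# (e.g. from a `while` loop) can be correlated with the commit and
# auto-warming entries in the container log.
probe() {
    if curl -s -m 5 -o /dev/null "$1"; then
        echo "$(date '+%H:%M:%S') OK $1"
    else
        echo "$(date '+%H:%M:%S') NO RESPONSE $1"
    fi
}

# Illustrative URL - replace with your Solr instance:
probe "http://localhost:8080/solr/admin/ping"
```

Running this once a second while the indexer is active (`while true; do probe ...; sleep 1; done`) should show whether the outages start exactly at commit time.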
Re: Question on StreamingUpdateSolrServer
Some more update. As I mentioned earlier, we are using multi-core Solr (up to 65 cores in one Solr instance, each core 10G). This was opening around 3000 file descriptors (lsof). I removed some cores, and after some trial and error I found that at 25 cores the system seems to work fine (around 1400 file descriptors). Tomcat is responsive even while indexing is happening in Solr (for 25 cores). But as soon as it goes to 26 cores, Tomcat becomes unresponsive again. The puzzling thing is that if I stop indexing I can search on even 65 cores, but while indexing is happening it seems to support only up to 25 cores.

1) Is there a limit on the number of cores a Solr instance can handle?
2) Does Solr do anything to the existing cores while indexing? I'm writing to only one core at a time.

We are struggling to find out why Tomcat stops responding at a high number of cores while indexing is in progress. Any help is very much appreciated.

Thanks,
-vivek

On Mon, Apr 13, 2009 at 10:52 AM, vivek sar vivex...@gmail.com wrote:

Here is some more information about my setup:

Solr - v1.4 (nightly build 03/29/09)
Servlet Container - Tomcat 6.0.18
JVM - 1.6.0 (64 bit)
OS - Mac OS X Server 10.5.6

Hardware Overview:
Processor Name: Quad-Core Intel Xeon
Processor Speed: 3 GHz
Number Of Processors: 2
Total Number Of Cores: 8
L2 Cache (per processor): 12 MB
Memory: 20 GB
Bus Speed: 1.6 GHz

JVM Parameters (for Solr):
export CATALINA_OPTS="-server -Xms6044m -Xmx6044m -DSOLR_APP -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log -Dsun.rmi.dgc.client.gcInterval=360 -Dsun.rmi.dgc.server.gcInterval=360"

Other:
lsof | grep solr | wc -l => 2493
ulimit -a => open files (-n) 9000
Tomcat: <Connector port="8080" protocol="HTTP/1.1" connectionTimeout="2" maxThreads="100" />
Total Solr cores on same instance - 65
useCompoundFile - true

The tests I ran:

While the Indexer is running:
1) Go to http://juum19.co.com:8080/solr - returns a blank page (no error in catalina.out)
2) Try telnet juum19.co.com 8080 - returns "Connection closed by foreign host"

Stop the Indexer program (Tomcat is still running with Solr):
3) Go to http://juum19.co.com:8080/solr - works OK, shows the list of all the Solr cores
4) Try telnet - able to telnet fine

5) Now comment out all the caches in solrconfig.xml and try the same tests - Tomcat still doesn't respond. Is there a way to stop the auto-warmer? I commented out the caches in solrconfig.xml but still see the following log:

INFO: autowarming result for searc...@3aba3830 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
INFO: Closing searc...@175dc1e2 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

6) Change the Indexer frequency so it runs every 2 min (instead of all the time). I noticed that once the commit is done I'm able to run my searches; during the commit and auto-warming period I just get a blank page.

7) Changed from Solrj to XML update - I still get the blank page whenever an update/commit is happening.

Apr 13, 2009 6:46:18 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[621094001, 621094002, 621094003, 621094004, 621094005, 621094006, 621094007, 621094008, ...(6992 more)]} 0 1948
Apr 13, 2009 6:46:18 PM org.apache.solr.core.SolrCore execute
INFO: [20090413_12] webapp=/solr path=/update params={} status=0 QTime=1948

So, it looks like it's not just StreamingUpdateSolrServer: whenever an update/commit is happening I'm not able to search. I don't know if it's related to using multi-core.

In this test I was using only a single thread updating a single core on a single Solr instance. So, it's clearly related to the index process (update, commit and auto-warming). As soon as the update/commit/auto-warming completes, I'm able to run my queries again. Is there anything that could stop searching while the update process is in progress - some lock or something? Any other ideas?

Thanks,
-vivek

On Mon, Apr 13, 2009 at 12:14 AM, Shalin Shekhar
Re: Question on StreamingUpdateSolrServer
On Tue, Apr 14, 2009 at 7:14 AM, vivek sar vivex...@gmail.com wrote:

Some more update. As I mentioned earlier, we are using multi-core Solr (up to 65 cores in one Solr instance, with each core 10G). This was opening around 3000 file descriptors (lsof). I removed some cores, and after some trial and error I found that at 25 cores the system seems to work fine (around 1400 file descriptors). Tomcat is responsive even while indexing is happening in Solr (for 25 cores). But as soon as it goes to 26 cores, Tomcat becomes unresponsive again. The puzzling thing is that if I stop indexing I can search on even 65 cores, but while indexing is happening it seems to support only up to 25 cores.

1) Is there a limit on the number of cores a Solr instance can handle?
2) Does Solr do anything to the existing cores while indexing? I'm writing to only one core at a time.

There is no hard limit (it is Integer.MAX_VALUE). But in reality your mileage depends on your hardware and the number of file handles the OS can open.

We are struggling to find out why Tomcat stops responding at a high number of cores while indexing is in progress. Any help is very much appreciated.
Thanks, -vivek

On Mon, Apr 13, 2009 at 10:52 AM, vivek sar vivex...@gmail.com wrote:

Here is some more information about my setup:

Solr - v1.4 (nightly build 03/29/09)
Servlet Container - Tomcat 6.0.18
JVM - 1.6.0 (64 bit)
OS - Mac OS X Server 10.5.6

Hardware Overview:
Processor Name: Quad-Core Intel Xeon
Processor Speed: 3 GHz
Number Of Processors: 2
Total Number Of Cores: 8
L2 Cache (per processor): 12 MB
Memory: 20 GB
Bus Speed: 1.6 GHz

JVM Parameters (for Solr):
export CATALINA_OPTS=-server -Xms6044m -Xmx6044m -DSOLR_APP -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log -Dsun.rmi.dgc.client.gcInterval=360 -Dsun.rmi.dgc.server.gcInterval=360

Other:
lsof|grep solr|wc -l shows 2493
ulimit -a shows open files (-n) 9000
Tomcat Connector: port=8080 protocol=HTTP/1.1 connectionTimeout=2 maxThreads=100
Total Solr cores on same instance - 65
useCompoundFile - true

The tests I ran:

While the Indexer is running:
1) Go to http://juum19.co.com:8080/solr - returns a blank page (no error in catalina.out)
2) Try telnet juum19.co.com 8080 - returns with Connection closed by foreign host

Stop the Indexer program (Tomcat is still running with Solr):
3) Go to http://juum19.co.com:8080/solr - works ok, shows the list of all the Solr cores
4) Try telnet - able to telnet fine

5) Now comment out all the caches in solrconfig.xml. Try the same tests, but Tomcat still doesn't respond. Is there a way to stop the auto-warmer?
I commented out the caches in solrconfig.xml but still see the following log:

INFO: autowarming result for searc...@3aba3830 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
INFO: Closing searc...@175dc1e2 main
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

6) Changed the Indexer frequency so it runs every 2 min (instead of all the time). I noticed that once the commit is done I'm able to run my searches. During the commit and auto-warming period I just get a blank page.
7) Changed from Solrj to XML update - I still get the blank page whenever an update/commit is happening.

Apr 13, 2009 6:46:18 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[621094001, 621094002, 621094003, 621094004, 621094005, 621094006, 621094007, 621094008, ...(6992 more)]} 0 1948
Apr 13, 2009 6:46:18 PM org.apache.solr.core.SolrCore execute
INFO: [20090413_12] webapp=/solr path=/update params={} status=0 QTime=1948

So it looks like it's not just StreamingUpdateSolrServer - whenever an update/commit is happening I'm not able to search. I don't know if it's related to using multi-core.
In this test I was using only a single thread for updates, to a single core, using only a single Solr instance. So it's clearly related to the index process (update, commit and auto-warming). As soon as update/commit/auto-warming is completed I'm able to run my queries again. Is there anything that could stop searching while the update process is in progress - like a lock or something? Any other ideas? Thanks, -vivek
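As a rough sanity check on the descriptor counts reported upthread (this is only back-of-the-envelope arithmetic from the lsof figures in the thread, nothing authoritative), both data points work out to roughly 46-56 descriptors per core, which suggests per-core segment files rather than connections dominate the handle count:

```java
public class FdEstimate {
    public static void main(String[] args) {
        // lsof counts reported in the thread
        int fdsAt65Cores = 3000;
        int fdsAt25Cores = 1400;
        // Integer division: rough file descriptors per core
        System.out.println(fdsAt65Cores / 65); // prints 46
        System.out.println(fdsAt25Cores / 25); // prints 56
    }
}
```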
Re: Question on StreamingUpdateSolrServer
Thanks Shalin. I noticed a couple more things. As I index around 100 million records a day, my Indexer is running pretty much at all times throughout the day. Whenever I run a search query, I usually get a connection reset while the commit is happening, and a blank page while the auto-warming of searchers is happening. Here are my questions:

1) Is this a coincidence or a known issue? Can't we search while a commit or auto-warming is happening?
2) How do I stop auto-warming? My search traffic is very low, so I'm trying to turn off auto-warming after the commit has happened - is there anything in solrconfig.xml to do that?
3) What would be the best strategy for searching in my scenario, where commits may be happening all the time (I commit every 50K records, so every 30-60 sec there is a commit followed by auto-warming that takes 40 sec)? Search frequency is pretty low for us, but we want to make sure that whenever a search happens it is fast enough and returns results (instead of an exception or a blank screen).

Thanks for all the help. -vivek

On Sat, Apr 11, 2009 at 1:48 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

On Sun, Apr 12, 2009 at 2:15 AM, vivek sar vivex...@gmail.com wrote:

The problem is I don't see any error message in catalina.out. I don't even see the request coming in - I simply get a blank page in the browser. If I keep trying, the request goes through and I get a response from Solr, but then it becomes unresponsive again, or sometimes throws a connection reset error. I'm not sure why it would work sometimes and not other times for the same query. As soon as I stop the Indexer process, things start working fine. Any way I can debug this problem?

I'm not sure. I've never seen this issue myself. Could you try using the bundled Jetty instead of Tomcat, or a different box, just to make sure this is not an environment-specific issue?

-- Regards, Shalin Shekhar Mangar.
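On question 2) above: autowarming is controlled per-cache in solrconfig.xml via the autowarmCount attribute. A hedged sketch of what disabling it might look like (the cache sizes here are placeholders, not recommendations):

```xml
<!-- Sketch only: autowarmCount="0" disables autowarming for that cache. -->
<filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
<!-- documentCache is not autowarmed, since internal doc ids change on commit -->
<documentCache class="solr.LRUCache" size="512" initialSize="512"/>
```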
Re: Question on StreamingUpdateSolrServer
If you use StreamingUpdateSolrServer, it POSTs all the docs in a single request. 10 million docs may be a bit too much for a single request; I guess you should batch it into multiple requests of smaller chunks.

It is likely that the CPU is really hot when the autowarming is happening. Getting decent search performance w/o autowarming is not easy. autowarmCount is an attribute of a cache - see http://wiki.apache.org/solr/SolrCaching

On Mon, Apr 13, 2009 at 3:32 AM, vivek sar vivex...@gmail.com wrote:

Thanks Shalin. I noticed a couple more things. As I index around 100 million records a day, my Indexer is running pretty much at all times throughout the day. Whenever I run a search query, I usually get a connection reset while the commit is happening, and a blank page while the auto-warming of searchers is happening. Here are my questions:

1) Is this a coincidence or a known issue? Can't we search while a commit or auto-warming is happening?
2) How do I stop auto-warming? My search traffic is very low, so I'm trying to turn off auto-warming after the commit has happened - is there anything in solrconfig.xml to do that?
3) What would be the best strategy for searching in my scenario, where commits may be happening all the time (I commit every 50K records, so every 30-60 sec there is a commit followed by auto-warming that takes 40 sec)? Search frequency is pretty low for us, but we want to make sure that whenever a search happens it is fast enough and returns results (instead of an exception or a blank screen).

Thanks for all the help. -vivek

On Sat, Apr 11, 2009 at 1:48 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

On Sun, Apr 12, 2009 at 2:15 AM, vivek sar vivex...@gmail.com wrote:

The problem is I don't see any error message in catalina.out. I don't even see the request coming in - I simply get a blank page in the browser. If I keep trying, the request goes through and I get a response from Solr, but then it becomes unresponsive again, or sometimes throws a connection reset error.
I'm not sure why it would work sometimes and not other times for the same query. As soon as I stop the Indexer process, things start working fine. Any way I can debug this problem?

I'm not sure. I've never seen this issue myself. Could you try using the bundled Jetty instead of Tomcat, or a different box, just to make sure this is not an environment-specific issue?

-- Regards, Shalin Shekhar Mangar.

-- Noble Paul
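Noble's batching suggestion can be illustrated generically (a plain-Java sketch with no SolrJ dependency; the chunk size and the per-batch addBeans call mentioned in the comment are assumptions, not anything prescribed in the thread):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchUtil {
    // Split a large document list into fixed-size chunks so that each update
    // request stays small (e.g. one addBeans call per chunk), instead of
    // streaming millions of docs in a single POST.
    public static <T> List<List<T>> partition(List<T> docs, int size) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += size) {
            batches.add(new ArrayList<>(docs.subList(i, Math.min(i + size, docs.size()))));
        }
        return batches;
    }
}
```

The caller would then loop over the batches, sending each one as its own request and committing at whatever interval fits the indexing pipeline.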
Re: Question on StreamingUpdateSolrServer
On Sat, Apr 11, 2009 at 3:29 AM, vivek sar vivex...@gmail.com wrote:

I also noticed that the Solr app has over 6000 file handles open - "lsof | grep solr | wc -l" shows 6455. I've 10 cores (using multi-core) managed by the same Solr instance. As soon as I start up Tomcat, the open file count goes up to 6400. Few questions:

1) Why is Solr holding on to all the segments from all the cores - is it because of the auto-warmer?

You have 10 cores, so Solr opens 10 indexes, each of which contains multiple files. That is one reason. Apart from that, Tomcat will keep some file handles for incoming connections.

2) How can I reduce the open file count? Are they causing a problem?

Tomcat will log messages when it cannot accept incoming connections because it has run out of available file handles. But if you are experiencing issues, you can increase the file handle limit, or you can set useCompoundFile=true in solrconfig.xml.

3) Is there a way to stop the auto-warmer?
4) Could this be related to Tomcat returning a blank page for every request?

It could be. Check the Tomcat and Solr logs.

-- Regards, Shalin Shekhar Mangar.
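Shalin's useCompoundFile suggestion lives in the index settings of solrconfig.xml. A hedged sketch for the Solr 1.x config layout (the mergeFactor value is just an example, not part of his advice):

```xml
<!-- Sketch: compound format packs each segment's files into a single .cfs,
     greatly reducing open file handles at some indexing-speed cost. -->
<mainIndex>
  <useCompoundFile>true</useCompoundFile>
  <mergeFactor>10</mergeFactor> <!-- lower values also keep fewer live segments -->
</mainIndex>
```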
Re: Question on StreamingUpdateSolrServer
Thanks Shalin. The problem is I don't see any error message in catalina.out. I don't even see the request coming in - I simply get a blank page in the browser. If I keep trying, the request goes through and I get a response from Solr, but then it becomes unresponsive again, or sometimes throws a connection reset error. I'm not sure why it would work sometimes and not other times for the same query. As soon as I stop the Indexer process, things start working fine. Any way I can debug this problem?

-vivek

On Fri, Apr 10, 2009 at 11:05 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

On Sat, Apr 11, 2009 at 3:29 AM, vivek sar vivex...@gmail.com wrote:

I also noticed that the Solr app has over 6000 file handles open - "lsof | grep solr | wc -l" shows 6455. I've 10 cores (using multi-core) managed by the same Solr instance. As soon as I start up Tomcat, the open file count goes up to 6400. Few questions:

1) Why is Solr holding on to all the segments from all the cores - is it because of the auto-warmer?

You have 10 cores, so Solr opens 10 indexes, each of which contains multiple files. That is one reason. Apart from that, Tomcat will keep some file handles for incoming connections.

2) How can I reduce the open file count? Are they causing a problem?

Tomcat will log messages when it cannot accept incoming connections because it has run out of available file handles. But if you are experiencing issues, you can increase the file handle limit, or you can set useCompoundFile=true in solrconfig.xml.

3) Is there a way to stop the auto-warmer?
4) Could this be related to Tomcat returning a blank page for every request?

It could be. Check the Tomcat and Solr logs.

-- Regards, Shalin Shekhar Mangar.
Re: Question on StreamingUpdateSolrServer
On Sun, Apr 12, 2009 at 2:15 AM, vivek sar vivex...@gmail.com wrote:

The problem is I don't see any error message in catalina.out. I don't even see the request coming in - I simply get a blank page in the browser. If I keep trying, the request goes through and I get a response from Solr, but then it becomes unresponsive again, or sometimes throws a connection reset error. I'm not sure why it would work sometimes and not other times for the same query. As soon as I stop the Indexer process, things start working fine. Any way I can debug this problem?

I'm not sure. I've never seen this issue myself. Could you try using the bundled Jetty instead of Tomcat, or a different box, just to make sure this is not an environment-specific issue?

-- Regards, Shalin Shekhar Mangar.
Question on StreamingUpdateSolrServer
Hi, I was using CommonsHttpSolrServer for indexing, but having two threads writing (10K batches) at the same time was throwing: ProtocolException: Unbuffered entity enclosing request can not be repeated.

I switched to StreamingUpdateSolrServer (using addBeans) and I don't see the problem anymore. The speed is very fast - around 25K docs/sec with a single thread - but I'm facing another problem. When the indexer using StreamingUpdateSolrServer is running, I'm not able to send any URL request from the browser to the Solr web app - I just get a blank page. I can't even get to the admin interface. I'm also not able to shut down the Tomcat running the Solr webapp while the Indexer is running; I have to first stop the Indexer app and then stop Tomcat. I don't have this problem when using CommonsHttpSolrServer.

Here is how I'm creating it:

server = new StreamingUpdateSolrServer(url, 1000, 3);

I simply call server.addBeans(...) on it. Is there anything else I need to do to make use of StreamingUpdateSolrServer? Why does Tomcat become unresponsive while the Indexer using StreamingUpdateSolrServer is running (though indexing happens fine)?

Thanks, -vivek
Re: Question on StreamingUpdateSolrServer
I also noticed that the Solr app has over 6000 file handles open - "lsof | grep solr | wc -l" shows 6455. I've 10 cores (using multi-core) managed by the same Solr instance. As soon as I start up Tomcat, the open file count goes up to 6400. Few questions:

1) Why is Solr holding on to all the segments from all the cores - is it because of the auto-warmer?
2) How can I reduce the open file count?
3) Is there a way to stop the auto-warmer?
4) Could this be related to Tomcat returning a blank page for every request?

Any ideas? Thanks, -vivek

On Fri, Apr 10, 2009 at 1:48 PM, vivek sar vivex...@gmail.com wrote:

Hi, I was using CommonsHttpSolrServer for indexing, but having two threads writing (10K batches) at the same time was throwing: ProtocolException: Unbuffered entity enclosing request can not be repeated.

I switched to StreamingUpdateSolrServer (using addBeans) and I don't see the problem anymore. The speed is very fast - around 25K docs/sec with a single thread - but I'm facing another problem. When the indexer using StreamingUpdateSolrServer is running, I'm not able to send any URL request from the browser to the Solr web app - I just get a blank page. I can't even get to the admin interface. I'm also not able to shut down the Tomcat running the Solr webapp while the Indexer is running; I have to first stop the Indexer app and then stop Tomcat. I don't have this problem when using CommonsHttpSolrServer.

Here is how I'm creating it:

server = new StreamingUpdateSolrServer(url, 1000, 3);

I simply call server.addBeans(...) on it. Is there anything else I need to do to make use of StreamingUpdateSolrServer? Why does Tomcat become unresponsive while the Indexer using StreamingUpdateSolrServer is running (though indexing happens fine)?

Thanks, -vivek