Re: No server response code on insert: how do I avoid this at high speed?
Otis Gospodnetic wrote:
> Perhaps the container logs explain what happened?
> How about just throttling to the point where the failure rate is 0%? Too slow?

Otis's questions regarding dropped inserts sent me back to the drawing board. The system had been tuned against a slower database, to optimize speed while accepting a few drops. When I migrated to a faster DB I didn't retune. Here are the results of testing indexing performance for Tomcat and Jetty. The DB speedup apparently moved the bottleneck from getting records out of the database (around 400 rps) to cramming records into the servlet container.

System: 16 processors, 2.5 GHz, 64G memory
Index: 33 Gig, freshly optimized, avg record size 1.4k
Insert load: 250,000 records

I calculate records/sec by dividing the number of successful inserts by the raw time. The adjusted time is the estimated time it would take to insert the full 250,000 records with no errors, which is the raw time plus the additional time required to insert the dropped records, i.e., raw time * (1 + error-rate * 0.01), with the error rate expressed in percent.

Judging from processor/memory/IO utilization, it appears the write speed of a single Java thread is dominating the Solr indexing speed, which makes sense.

Take-home lessons:
The speed limit is about 450 records per second in our environment.
Three or four threads posting inserts max out speed. More threads don't help.
Jetty is significantly faster than Tomcat at sane thread counts in our environment.

I hope this is useful.

-Jim

PS: If you have formatting issues with this table, try viewing with a fixed width font

        | Tomcat                                             | Jetty
Threads | Raw time  # Drops  % Error  Records/sec  Adj. time | Raw time  # Drops  % Error  Records/sec  Adj. time
     16 |      533    17131     6.85        436.9     569.51 |      594    24222     9.69        380.1     651.55
     15 |      520    16878     6.75       448.31      555.1 |      518    28581    11.43       427.45     577.22
     14 |      547    16378     6.55        427.1     582.83 |      496    30047    12.02       443.45     555.61
     13 |      540    16638     6.65       432.15     575.91 |      495    27076    10.83       450.35     548.61
     12 |      545    15920     6.36        429.5     579.66 |      494    28785    11.51        447.8     550.88
     11 |      523    16192     6.47       447.05     556.84 |      484    26495     10.6       461.79     535.29
     10 |      540    15643     6.26       433.99      573.8 |      497    27190    10.88       448.31     551.05
      9 |      553    15543     6.21       423.97     587.34 |      494    25862    10.34       453.72      545.1
      8 |      541    14095     5.64       436.05     571.51 |      501    23482     9.39       452.13     548.06
      7 |      549    10735     4.29       435.82     572.55 |      499    24657     9.86       451.59     548.22
      6 |      566     9468     3.79       424.97     587.45 |      502    23074     9.23       452.04     548.33
      5 |      588     7754     3.10       411.98     606.23 |      527    20779     8.31       434.95      570.8
      4 |      577     4201     1.68       425.99     586.69 |      513    16608     6.64       454.96     547.08
      3 |      613        0        0       407.83        613 |      537     9503      3.8       447.85     557.41
      2 |      801        0        0       312.11        801 |      633        0        0       394.94        633
      1 |     1365        0        0       183.15       1365 |     1122        0        0       222.82       1122
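For anyone who wants to re-derive the columns, the arithmetic described in the post can be sketched as below. This is only an illustration of the stated formulas: the function names are made up here, and it assumes the raw times are in seconds (which is what the records/sec figures imply).

```python
TOTAL_RECORDS = 250_000  # insert load stated in the post


def error_rate_pct(drops, total=TOTAL_RECORDS):
    """Dropped inserts as a percentage of the full load."""
    return 100.0 * drops / total


def records_per_sec(drops, raw_time_s, total=TOTAL_RECORDS):
    """Successful inserts divided by the raw time (assumed to be seconds)."""
    return (total - drops) / raw_time_s


def adjusted_time(raw_time_s, error_pct):
    """Raw time scaled up by the error rate: raw * (1 + error-rate * 0.01)."""
    return raw_time_s * (1 + error_pct * 0.01)


# Tomcat at 16 threads, taken from the first table row: 533 raw time, 17131 drops.
print(round(error_rate_pct(17131), 2))        # 6.85
print(round(records_per_sec(17131, 533), 2))  # 436.9
print(round(adjusted_time(533, 6.85), 2))     # 569.51
```

The adjusted time uses the rounded percentage (6.85) because that is what reproduces the posted figure.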
Re: No server response code on insert: how do I avoid this at high speed?
Good questions.

Otis Gospodnetic wrote:
> Perhaps the container logs explain what happened?

1) I can't find anything interesting in the container logs. To the best of my knowledge, neither of the containers notices the drop. Jetty did show out-of-threads type errors before I tweaked the thread parameters. Once it was tuned a bit, I stopped seeing these entries in the log, but did not stop getting the errors.

> How about just throttling to the point where the failure rate is 0%? Too slow?

2) Throttling to 0 errors really slows things down. The last time I ran stats, performance scaled almost linearly with additional threads until we reached the approximate number of CPUs in the system. Anything above two threads shows progressively more errors if I don't apply any throttling. The churn I need to keep up with makes throttling that hard undesirable. I'll put together some stats on insert rates, number of threads, and error rates and post them here.

It's a classic trade-off: tolerating poor results that require additional processing in exchange for higher performance. A set of heuristics for this situation might be useful, since I'm likely not the only one with an indexing bottleneck.

-Jim
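As a rough illustration of the throttling option discussed in this message, here is a minimal sketch of a shared rate limiter that posting threads could call before each insert. It is not what was actually run on this system: the class name and its parameters are invented for the example, and any target rate (say, 400 per second) would be an assumption to tune against the observed error rate.

```python
import threading
import time


class Throttle:
    """Space calls so that at most `rate` of them proceed per second."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / rate   # minimum spacing between inserts
        self.clock = clock           # injectable so the class can be tested
        self.sleep = sleep
        self.next_slot = clock()     # earliest time the next call may proceed
        self.lock = threading.Lock() # one shared limiter for all posting threads

    def wait(self):
        # Holding the lock while sleeping is deliberate: it serializes the
        # callers so the combined rate of all threads stays under the limit.
        with self.lock:
            now = self.clock()
            if now < self.next_slot:
                self.sleep(self.next_slot - now)
                now = self.next_slot
            self.next_slot = now + self.interval
```

Each worker would call `throttle.wait()` just before posting a record. Lowering `rate` until the drop count reaches zero gives the "0% failure" operating point; raising it trades drops for throughput.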
Re: No server response code on insert: how do I avoid this at high speed?
On Mon, Sep 15, 2008 at 2:17 PM, Paleo Tek [EMAIL PROTECTED] wrote:
> 1) I can't find anything interesting in the container logs.

Is the client timing out the connection? If Solr were encountering errors, they would be logged.

-Yonik
No server response code on insert: how do I avoid this at high speed?
I have a largish index with a lot of churn, and inserts that come in large bursts. My server is a multiprocessor with plenty of memory, so I can multi-thread and stuff in about 1.6 million records per hour, going full speed. I use a dozen or so threads to post curl inserts, and monitor the responses.

Using Jetty, there is a ~10% failure rate with no server response code received. Switching to Tomcat reduces the error rate to around 2% (which makes me like Tomcat a lot, even though I'm a dog person...).

I suspect I'm overrunning the capacity of the servlet container. Tweaking parameters in Jetty improved performance, and I can tune Tomcat. But then I'll just be overrunning a tuned system, at a slightly faster rate. My workaround is to keep track of which inserts fail, but I suspect there's a better approach. Any suggestions how I can balance maximum insert speed with a low error rate?

Thanks!

-Jim
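The workaround mentioned above (keeping track of which inserts fail) might look roughly like the following. This is a sketch under assumptions, not the poster's actual script: `post_fn` is a hypothetical callable that sends one record and returns True only when a server response code came back, and the thread and round counts are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def _safe(post_fn, record):
    """Treat an exception from post_fn the same as a missing response."""
    try:
        return bool(post_fn(record))
    except Exception:
        return False


def post_all(records, post_fn, threads=3, max_rounds=3):
    """Post every record from a small thread pool, re-queueing failures.

    Returns the records that still had no acknowledged insert after
    max_rounds passes, so the caller can log or retry them later.
    """
    pending = list(records)
    for _ in range(max_rounds):
        if not pending:
            break
        with ThreadPoolExecutor(max_workers=threads) as pool:
            results = list(pool.map(lambda r: _safe(post_fn, r), pending))
        # Keep only the records whose insert was not acknowledged.
        pending = [r for r, ok in zip(pending, results) if not ok]
    return pending
```

Retrying only makes sense if re-posting a record is harmless. In Solr, adding a document whose unique key already exists replaces the earlier copy, so a retry of an insert that actually succeeded should not create a duplicate, provided the schema defines a unique key.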
Re: No server response code on insert: how do I avoid this at high speed?
Perhaps the container logs explain what happened?
How about just throttling to the point where the failure rate is 0%? Too slow?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: Paleo Tek [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Friday, September 12, 2008 11:19:52 AM
Subject: No server response code on insert: how do I avoid this at high speed?

I have a largish index with a lot of churn, and inserts that come in large bursts. My server is a multiprocessor with plenty of memory, so I can multi-thread and stuff in about 1.6 million records per hour, going full speed. I use a dozen or so threads to post curl inserts, and monitor the responses.

Using jetty, there is ~10% failure rate with no server response code received. Switching to tomcat reduces the error rate to around 2%. (which makes me like tomcat a lot, even though I'm a dog person...).

I suspect I'm overrunning the capacity of the servlet container. Tweaking parameters in Jetty improved performance, and I can tune Tomcat. But then I'll just be overrunning a tuned system, at a slightly faster rate. My work around is to keep track of which inserts fail, but I suspect there's a better approach. Any suggestions how I can balance maximum insert speed with a low error rate?

Thanks!

-Jim
Re: No server response code on insert: how do I avoid this at high speed?
On Fri, Sep 12, 2008 at 11:19 AM, Paleo Tek [EMAIL PROTECTED] wrote:
> Using jetty, there is ~10% failure rate with no server response code received.

What happened then? Did the network connection just drop, or did the server or client time it out? How can you tell it failed?

-Yonik
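One way to answer these questions from the client side is to record why each insert came back without a response code. The sketch below uses only the Python standard library rather than curl, so it illustrates the idea and is not a description of the original setup. The function names, the timeout value, and the text/xml content type are assumptions; the URL would be whatever update endpoint the installation exposes.

```python
import http.client
import socket
import urllib.error
import urllib.request


def classify_failure(exc):
    """Map an exception from an HTTP POST to a coarse failure category."""
    if isinstance(exc, urllib.error.HTTPError):
        # The server did answer, just not with a success code.
        return f"http-error-{exc.code}"
    if isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, BaseException):
        exc = exc.reason  # unwrap the underlying socket-level error
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "client-timeout"
    if isinstance(exc, (http.client.RemoteDisconnected,
                        ConnectionResetError, BrokenPipeError)):
        # The connection was closed or reset before a status line arrived.
        return "connection-dropped"
    if isinstance(exc, ConnectionRefusedError):
        return "connection-refused"
    return "other"


def post_once(url, body, timeout=30.0):
    """POST one update message and report either the status or the failure type."""
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": "text/xml"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return f"ok-{resp.status}"
    except Exception as exc:
        return classify_failure(exc)
```

Counting the categories over a full run would show whether the missing response codes are mostly client-side timeouts or connections the container dropped, which points at different tuning knobs.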