Re: Why does Solr commit block indexing?
Hi Michael,

Thanks for your answer. Is the Solr team aware of the problem? Is there an issue open for it, or any ongoing work on it?

Regards,
-- Renaud Delbru

On 16/12/10 16:45, Michael McCandless wrote:
> Unfortunately, (I think?) Solr currently commits by closing the IndexWriter, which must wait for any running merges to complete, and then opening a new one.
>
> This is really rather silly because IndexWriter has had its own commit method (which does not block ongoing indexing nor merging) for quite some time now.
>
> I'm not sure why we haven't switched over already... there must be some trickiness involved.
>
> Mike

Re: Why does Solr commit block indexing?
I'm not sure if there is an issue open, but I know I've talked w/ Yonik about this and a few other changes to the DirectUpdateHandler2 in the past. It does indeed need to be fixed.

-Grant

On Dec 17, 2010, at 7:04 AM, Renaud Delbru wrote:
> Hi Michael,
>
> Thanks for your answer. Is the Solr team aware of the problem? Is there an issue open for it, or any ongoing work on it?
>
> Regards,
> -- Renaud Delbru

-- Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem docs using Solr/Lucene: http://www.lucidimagination.com/search

Re: Why does Solr commit block indexing?
Hi Grant,

Looking forward to a fix ;o). It would considerably improve Solr's update throughput (even though its performance is already quite impressive).

cheers
-- Renaud Delbru

On 17/12/10 13:05, Grant Ingersoll wrote:
> I'm not sure if there is an issue open, but I know I've talked w/ Yonik about this and a few other changes to the DirectUpdateHandler2 in the past. It does indeed need to be fixed.
>
> -Grant

Re: Why does Solr commit block indexing?
On Fri, Dec 17, 2010 at 8:05 AM, Grant Ingersoll gsing...@apache.org wrote:
> I'm not sure if there is an issue open, but I know I've talked w/ Yonik about this and a few other changes to the DirectUpdateHandler2 in the past. It does indeed need to be fixed.

It stems from the APIs that were available at the time in Lucene 1.4. IIRC, Mark worked up a patch that (I think) avoided ever closing the reader and delegated more of the concurrency control to Lucene (since it can handle it these days). I think maybe there was just a problem with rollback or something...

-Yonik
http://www.lucidimagination.com

Why does Solr commit block indexing?
Hi,

See log at [1]. We are using the latest snapshot of lucene_branch3.1. We have configured Solr to use the ConcurrentMergeScheduler:

<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>

When a commit() runs, it blocks indexing (all incoming update requests are blocked until the commit operation is finished) ... at the end of the log we notice a 4-minute gap during which none of the Solr clients trying to add data receive any attention.

This is a bit annoying, as it leads to timeout exceptions on the client side. Here, the commit time is only 4 minutes, but it can be longer if there are merges of large segments.

I thought Solr was able to handle commits and updates at the same time: the commit operation should be done in the background, and the server should continue to receive update requests (maybe at a slower rate than normal). But it looks like that is not the case. Is this normal behaviour?

[1] http://pastebin.com/KPkusyVb

Regards
-- Renaud Delbru

Re: Why does Solr commit block indexing?
Unfortunately, (I think?) Solr currently commits by closing the IndexWriter, which must wait for any running merges to complete, and then opening a new one.

This is really rather silly because IndexWriter has had its own commit method (which does not block ongoing indexing nor merging) for quite some time now.

I'm not sure why we haven't switched over already... there must be some trickiness involved.

Mike

On Thu, Dec 16, 2010 at 9:39 AM, Renaud Delbru renaud.del...@deri.org wrote:
> Hi,
>
> See log at [1]. We are using the latest snapshot of lucene_branch3.1. We have configured Solr to use the ConcurrentMergeScheduler:
>
> <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
>
> When a commit() runs, it blocks indexing (all incoming update requests are blocked until the commit operation is finished) ... at the end of the log we notice a 4-minute gap during which none of the Solr clients trying to add data receive any attention.
>
> This is a bit annoying, as it leads to timeout exceptions on the client side. Here, the commit time is only 4 minutes, but it can be longer if there are merges of large segments.
>
> I thought Solr was able to handle commits and updates at the same time: the commit operation should be done in the background, and the server should continue to receive update requests (maybe at a slower rate than normal). But it looks like that is not the case. Is this normal behaviour?
>
> [1] http://pastebin.com/KPkusyVb
>
> Regards
> -- Renaud Delbru

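The difference Mike describes can be illustrated with a toy model. This is only a sketch under assumptions: it uses no Lucene or Solr classes, and the class name, the read/write lock and the sleep that stands in for running merges are all hypothetical. It is not how DirectUpdateHandler2 or IndexWriter are actually implemented. It only shows why an add request stalls when the commit holds exclusive access for the duration of the merges, and why it does not stall when the commit leaves the writer open.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model only: no Lucene or Solr classes are used, and all names are hypothetical.
class CommitBlockingSketch {
    // Adds share the read lock; a close-and-reopen commit needs the write lock.
    private final ReentrantReadWriteLock writerLock = new ReentrantReadWriteLock();

    // Simulates one update request and returns how long it waited, in milliseconds.
    long addDocument() {
        long start = System.nanoTime();
        writerLock.readLock().lock();
        try {
            // A real handler would pass the document to the index writer here.
        } finally {
            writerLock.readLock().unlock();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Simulates a commit that takes mergeMillis to finish.
    // closeWriter=true models close-and-reopen: exclusive access is held throughout.
    // closeWriter=false models committing on the open writer: no exclusive access.
    void commit(boolean closeWriter, long mergeMillis, CountDownLatch started) {
        if (closeWriter) {
            writerLock.writeLock().lock();
        }
        try {
            started.countDown();        // signal that the commit is now in progress
            Thread.sleep(mergeMillis);  // stand-in for the time the commit takes
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            if (closeWriter) {
                writerLock.writeLock().unlock();
            }
        }
    }

    // Starts a commit in the background, then measures one add issued while it runs.
    static long addLatencyDuringCommit(boolean closeWriter, long mergeMillis) {
        CommitBlockingSketch index = new CommitBlockingSketch();
        CountDownLatch started = new CountDownLatch(1);
        Thread committer = new Thread(() -> index.commit(closeWriter, mergeMillis, started));
        committer.start();
        try {
            started.await();
            long latency = index.addDocument();
            committer.join();
            return latency;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("close-and-reopen commit: add waited "
                + addLatencyDuringCommit(true, 400) + " ms");
        System.out.println("commit on open writer:   add waited "
                + addLatencyDuringCommit(false, 400) + " ms");
    }
}
```

In the first case the add should wait for roughly the whole simulated merge time, which is the same shape as the 4-minute gap in the log. In the second case it should return almost immediately.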