Re: Why does Solr commit block indexing?

2010-12-17 Thread Renaud Delbru

Hi Michael,

Thanks for your answer.
Is the Solr team aware of the problem? Is there an issue open for
this, or any ongoing work on it?


Regards,
--
Renaud Delbru

On 16/12/10 16:45, Michael McCandless wrote:

Unfortunately, (I think?) Solr currently commits by closing the
IndexWriter, which must wait for any running merges to complete, and
then opening a new one.

This is really rather silly because IndexWriter has had its own commit
method (which does not block ongoing indexing nor merging) for quite
some time now.

I'm not sure why we haven't switched over already... there must be
some trickiness involved.

Mike

On Thu, Dec 16, 2010 at 9:39 AM, Renaud Delbru renaud.del...@deri.org wrote:

Hi,

See log at [1].
We are using the latest snapshot of lucene_branch3.1. We have configured
Solr to use the ConcurrentMergeScheduler:
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>

When a commit() runs, it blocks indexing (all incoming update requests are
blocked until the commit operation is finished) ... at the end of the log we
notice a 4 minute gap during which none of the Solr clients trying to add
data receive any attention.
This is a bit annoying as it leads to timeout exceptions on the client side.
Here, the commit time is only 4 minutes, but it can be longer if there are
merges of large segments.
I thought Solr was able to handle commits and updates at the same time: the
commit operation should be done in the background, and the server should
still continue to receive update requests (maybe at a slower rate than normal).
But it looks like this is not the case. Is this normal behaviour?

[1] http://pastebin.com/KPkusyVb

Regards
--
Renaud Delbru





Re: Why does Solr commit block indexing?

2010-12-17 Thread Grant Ingersoll
I'm not sure if there is an issue open, but I know I've talked w/ Yonik about 
this and a few other changes to the DirectUpdateHandler2 in the past.  It does 
indeed need to be fixed.

-Grant

 

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem docs using Solr/Lucene:
http://www.lucidimagination.com/search



Re: Why does Solr commit block indexing?

2010-12-17 Thread Renaud Delbru

Hi Grant,

Looking forward to a fix ;o). Such a fix would improve Solr's update 
throughput quite a lot (even if its performance is already quite 
impressive).


cheers
--
Renaud Delbru






Re: Why does Solr commit block indexing?

2010-12-17 Thread Yonik Seeley
On Fri, Dec 17, 2010 at 8:05 AM, Grant Ingersoll gsing...@apache.org wrote:
 I'm not sure if there is an issue open, but I know I've talked w/ Yonik about 
 this and a few other changes to the DirectUpdateHandler2 in the past.  It 
 does indeed need to be fixed.

It stems from the APIs that were available at the time in Lucene 1.4.
IIRC, Mark worked up a patch that avoided ever closing the reader I
think, and delegated more of the concurrency control to Lucene (since
it can handle it these days).  I think maybe there was just a problem
with rollback or something...

-Yonik
http://www.lucidimagination.com








Why does Solr commit block indexing?

2010-12-16 Thread Renaud Delbru

Hi,

See log at [1].
We are using the latest snapshot of lucene_branch3.1. We have configured 
Solr to use the ConcurrentMergeScheduler:

<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>

When a commit() runs, it blocks indexing (all incoming update requests 
are blocked until the commit operation is finished) ... at the end of 
the log we notice a 4 minute gap during which none of the Solr clients 
trying to add data receive any attention.
This is a bit annoying as it leads to timeout exceptions on the client 
side. Here, the commit time is only 4 minutes, but it can be longer if 
there are merges of large segments.
I thought Solr was able to handle commits and updates at the same time: 
the commit operation should be done in the background, and the server 
should still continue to receive update requests (maybe at a slower rate 
than normal). But it looks like this is not the case. Is this normal 
behaviour?


[1] http://pastebin.com/KPkusyVb

Regards
--
Renaud Delbru


Re: Why does Solr commit block indexing?

2010-12-16 Thread Michael McCandless
Unfortunately, (I think?) Solr currently commits by closing the
IndexWriter, which must wait for any running merges to complete, and
then opening a new one.

This is really rather silly because IndexWriter has had its own commit
method (which does not block ongoing indexing nor merging) for quite
some time now.

I'm not sure why we haven't switched over already... there must be
some trickiness involved.

Mike
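
To make the difference described above concrete, here is a toy model in
plain Java. It is only a sketch and uses no Lucene or Solr code: ToyIndex,
commitByClose, commitInPlace and finishMerge are made-up names, and the
read/write lock is an assumption standing in for whatever exclusion the
update handler really applies. It shows why an add times out when the commit
holds exclusive access until a merge finishes, and why it goes through when
the commit does not wait for the merge.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model only: ToyIndex and its methods are hypothetical, not Lucene or Solr classes.
final class CommitBlockingSketch {

    static final class ToyIndex {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        // Counted down when the simulated background merge finishes.
        private final CountDownLatch mergeFinished = new CountDownLatch(1);

        void finishMerge() {
            mergeFinished.countDown();
        }

        // An update takes the shared lock; returns false if it times out waiting.
        boolean tryAdd(long timeoutMillis) throws InterruptedException {
            if (!lock.readLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false;
            }
            lock.readLock().unlock();
            return true;
        }

        // Close-and-reopen style: exclusive lock is held until the merge completes.
        void commitByClose(CountDownLatch started) throws InterruptedException {
            lock.writeLock().lock();
            try {
                started.countDown();
                mergeFinished.await();
            } finally {
                lock.writeLock().unlock();
            }
        }

        // In-place style: returns without waiting for the merge or taking the exclusive lock.
        void commitInPlace(CountDownLatch started) {
            started.countDown();
        }
    }

    // Runs one commit on a background thread and reports whether a concurrent add got through.
    static boolean addSucceedsDuringCommit(boolean closeStyle) {
        ToyIndex index = new ToyIndex();
        CountDownLatch started = new CountDownLatch(1);
        Thread committer = new Thread(() -> {
            try {
                if (closeStyle) {
                    index.commitByClose(started);
                } else {
                    index.commitInPlace(started);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            committer.start();
            started.await();                   // the commit is now under way
            boolean added = index.tryAdd(200); // client-side timeout of 200 ms
            index.finishMerge();               // let the simulated merge complete
            committer.join();
            return added;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("close-style commit, add accepted: " + addSucceedsDuringCommit(true));
        System.out.println("in-place commit, add accepted: " + addSucceedsDuringCommit(false));
    }
}
```

Running main prints false for the close-style commit and true for the
in-place one. The 200 ms timeout is arbitrary; in the report above the
equivalent wait was about 4 minutes.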
