Re: How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?

2009-09-21 Thread John Wang
Yonik:

It would be great if Solr could be configured through some sort of
dependency injection framework like Spring! A big +1 from me!

-John

On Fri, Sep 18, 2009 at 11:10 PM, Yonik Seeley
yo...@lucidimagination.com wrote:

 On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar
 shalinman...@gmail.com wrote:
  I was wondering if there is a way I can modify calibrateSizeByDeletes just
  by configuration ?
 
 
  Alas, no. The only option that I see for you is to sub-class
  LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the
  constructor. However, please open a Jira issue so we don't forget about
  it.

 It's the continuing stuff like this that makes me feel like we should
 be Spring (or equivalent) based someday... I'm just not sure how we're
 going to get there.

 -Yonik
 http://www.lucidimagination.com



Re: How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?

2009-09-21 Thread Noble Paul നോബിള്‍ नोब्ळ्
With SOLR-1447, you should be able to use ZoieMergePolicy as well.

On Mon, Sep 21, 2009 at 11:43 AM, John Wang john.w...@gmail.com wrote:
 Yonik:

 It would be great if Solr could be configured through some sort of
 dependency injection framework like Spring! A big +1 from me!

 -John

 On Fri, Sep 18, 2009 at 11:10 PM, Yonik Seeley
 yo...@lucidimagination.com wrote:

 On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar
 shalinman...@gmail.com wrote:
  I was wondering if there is a way I can modify calibrateSizeByDeletes just
  by configuration ?
 
 
  Alas, no. The only option that I see for you is to sub-class
  LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the
  constructor. However, please open a Jira issue so we don't forget about
  it.

 It's the continuing stuff like this that makes me feel like we should
 be Spring (or equivalent) based someday... I'm just not sure how we're
 going to get there.

 -Yonik
 http://www.lucidimagination.com





-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?

2009-09-21 Thread Jason Rutherglen
John,

It would be great if Lucene's benchmark were used so everyone
could execute the test in their own environment and verify. It's
not clear which settings or code were used to generate the
results, so it's difficult to draw any reliable conclusions.

The steep spike is stronger evidence that the IO cache is being
cleared during large merges, which degrades search performance.
See:
http://www.lucidimagination.com/search/?q=madvise

Merging is IO intensive and less CPU intensive. If the
ConcurrentMergeScheduler is used, which defaults to 3 threads,
then the CPU could be maxed out. Using a single thread on
synchronous spinning magnetic media seems more logical. Queries
are usually the inverse: CPU intensive, not IO intensive, when
the index is in the IO cache. After (or during) the merge of a
large segment, queries would start hitting disk, and the results
clearly show that. The queries are suddenly more time consuming
because they seek on disk at a time when IO activity is at its
peak from merging large segments. Using madvise would prevent
usable indexes from being swapped to disk during a merge, so
query performance would continue unabated.

As we move to a sharded model of indexes, large merges will
naturally not occur. Shards will reach a specified size and new
documents will be sent to new shards.
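
As a rough illustration of that rollover idea, here is a minimal sketch that assumes a shard is capped by document count. The cap, the in-memory lists standing in for indexes, and every name below are hypothetical; a real deployment would more likely cap by bytes on disk.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: documents go to the newest shard until it reaches
// a fixed cap, then a new shard is opened. Older shards are never grown,
// so they never need another large merge.
public class ShardRolloverSketch {
    private final int maxDocsPerShard;
    // Each inner list stands in for one shard's index.
    private final List<List<String>> shards = new ArrayList<>();

    public ShardRolloverSketch(int maxDocsPerShard) {
        if (maxDocsPerShard <= 0) {
            throw new IllegalArgumentException("maxDocsPerShard must be positive");
        }
        this.maxDocsPerShard = maxDocsPerShard;
    }

    /** Adds a document to the newest shard, opening a new one if it is full. Returns the shard index. */
    public int add(String doc) {
        if (shards.isEmpty() || shards.get(shards.size() - 1).size() >= maxDocsPerShard) {
            shards.add(new ArrayList<>());
        }
        shards.get(shards.size() - 1).add(doc);
        return shards.size() - 1;
    }

    public int shardCount() {
        return shards.size();
    }

    public static void main(String[] args) {
        ShardRolloverSketch router = new ShardRolloverSketch(2);
        for (String doc : new String[] {"a", "b", "c", "d", "e"}) {
            System.out.println(doc + " -> shard " + router.add(doc));
        }
        System.out.println("shards: " + router.shardCount());
    }
}
```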

-J

On Sun, Sep 20, 2009 at 11:12 PM, John Wang john.w...@gmail.com wrote:
 The current default Lucene MergePolicy does not handle frequent updates
 well.

 We have done some performance analysis with that and a custom merge policy:

 http://code.google.com/p/zoie/wiki/ZoieMergePolicy

 -John

 On Mon, Sep 21, 2009 at 1:08 PM, Jason Rutherglen 
 jason.rutherg...@gmail.com wrote:

 I opened SOLR-1447 for this

 2009/9/18 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
  We can use a simple reflection based implementation to simplify
  reading too many parameters.
 
  What I wish to emphasize is that Solr should be agnostic of XML
  altogether. It should only be aware of specific objects and
  interfaces. If users wish to plug in something else in some other way,
  it should be fine.
 
  People have invested a great deal in learning the current
  solrconfig.xml. Let us not make them throw that away.
 
  On Sat, Sep 19, 2009 at 1:59 AM, Jason Rutherglen
  jason.rutherg...@gmail.com wrote:
  Over the weekend I may write a patch to allow simple reflection based
  injection from within solrconfig.
 
  On Fri, Sep 18, 2009 at 8:10 AM, Yonik Seeley
  yo...@lucidimagination.com wrote:
  On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar
  shalinman...@gmail.com wrote:
  I was wondering if there is a way I can modify calibrateSizeByDeletes just
  by configuration ?
 
 
  Alas, no. The only option that I see for you is to sub-class
  LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the
  constructor. However, please open a Jira issue so we don't forget about
  it.
 
  It's the continuing stuff like this that makes me feel like we should
  be Spring (or equivalent) based someday... I'm just not sure how we're
  going to get there.
 
  -Yonik
  http://www.lucidimagination.com
 
 
 
 
 
  --
  -
  Noble Paul | Principal Engineer| AOL | http://aol.com
 




Re: How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?

2009-09-18 Thread Yonik Seeley
On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
 I was wondering if there is a way I can modify calibrateSizeByDeletes just
 by configuration ?


 Alas, no. The only option that I see for you is to sub-class
 LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the
 constructor. However, please open a Jira issue so we don't forget about
 it.

It's the continuing stuff like this that makes me feel like we should
be Spring (or equivalent) based someday... I'm just not sure how we're
going to get there.

-Yonik
http://www.lucidimagination.com


Re: How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?

2009-09-18 Thread Jason Rutherglen
Over the weekend I may write a patch to allow simple reflection based
injection from within solrconfig.
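
A rough sketch of what reflection based injection could look like, assuming the properties arrive as name/value strings and map onto JavaBean-style setters. StandInMergePolicy and every other name here is hypothetical: the stand-in is not the Lucene class, and this is not the actual patch, only an illustration of the technique.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

public class ReflectionInjectionSketch {

    /** Hypothetical stand-in for a merge policy; NOT the Lucene class. */
    public static class StandInMergePolicy {
        private boolean calibrateSizeByDeletes = false;
        private int mergeFactor = 10;

        public void setCalibrateSizeByDeletes(boolean value) { calibrateSizeByDeletes = value; }
        public boolean getCalibrateSizeByDeletes() { return calibrateSizeByDeletes; }
        public void setMergeFactor(int value) { mergeFactor = value; }
        public int getMergeFactor() { return mergeFactor; }
    }

    /** Instantiates className via its no-arg constructor, then calls setX(value) for each property x. */
    public static Object create(String className, Map<String, String> props) {
        try {
            Object target = Class.forName(className).getDeclaredConstructor().newInstance();
            for (Map.Entry<String, String> entry : props.entrySet()) {
                String name = entry.getKey();
                String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
                Method method = findSetter(target.getClass(), setter);
                method.invoke(target, convert(entry.getValue(), method.getParameterTypes()[0]));
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot configure " + className, e);
        }
    }

    /** Finds a public one-argument method with the given name. */
    static Method findSetter(Class<?> type, String setterName) {
        for (Method method : type.getMethods()) {
            if (method.getName().equals(setterName) && method.getParameterCount() == 1) {
                return method;
            }
        }
        throw new IllegalArgumentException("No setter " + setterName + " on " + type.getName());
    }

    /** Converts the configured string to the setter's parameter type (only a few types handled). */
    static Object convert(String raw, Class<?> type) {
        if (type == boolean.class || type == Boolean.class) return Boolean.valueOf(raw);
        if (type == int.class || type == Integer.class) return Integer.valueOf(raw);
        if (type == long.class || type == Long.class) return Long.valueOf(raw);
        if (type == double.class || type == Double.class) return Double.valueOf(raw);
        if (type == String.class) return raw;
        throw new IllegalArgumentException("Unsupported parameter type: " + type.getName());
    }

    public static void main(String[] args) {
        // These pairs stand in for values that would be read from the config file.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("calibrateSizeByDeletes", "true");
        props.put("mergeFactor", "5");

        StandInMergePolicy policy =
                (StandInMergePolicy) create(StandInMergePolicy.class.getName(), props);
        System.out.println("calibrateSizeByDeletes=" + policy.getCalibrateSizeByDeletes());
        System.out.println("mergeFactor=" + policy.getMergeFactor());
    }
}
```

With something along these lines, a new setter on a policy class becomes configurable without any change to the config-reading code, which is the appeal over sub-classing just to flip one flag.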

On Fri, Sep 18, 2009 at 8:10 AM, Yonik Seeley
yo...@lucidimagination.com wrote:
 On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar
 shalinman...@gmail.com wrote:
 I was wondering if there is a way I can modify calibrateSizeByDeletes just
 by configuration ?


 Alas, no. The only option that I see for you is to sub-class
 LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the
 constructor. However, please open a Jira issue so we don't forget about
 it.

 It's the continuing stuff like this that makes me feel like we should
 be Spring (or equivalent) based someday... I'm just not sure how we're
 going to get there.

 -Yonik
 http://www.lucidimagination.com



Re: How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?

2009-09-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
We can use a simple reflection based implementation to simplify
reading too many parameters.

What I wish to emphasize is that Solr should be agnostic of XML
altogether. It should only be aware of specific objects and
interfaces. If users wish to plug in something else in some other way,
it should be fine.

People have invested a great deal in learning the current
solrconfig.xml. Let us not make them throw that away.

On Sat, Sep 19, 2009 at 1:59 AM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
 Over the weekend I may write a patch to allow simple reflection based
 injection from within solrconfig.

 On Fri, Sep 18, 2009 at 8:10 AM, Yonik Seeley
 yo...@lucidimagination.com wrote:
 On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar
 shalinman...@gmail.com wrote:
 I was wondering if there is a way I can modify calibrateSizeByDeletes just
 by configuration ?


 Alas, no. The only option that I see for you is to sub-class
 LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the
 constructor. However, please open a Jira issue so we don't forget about
 it.

 It's the continuing stuff like this that makes me feel like we should
 be Spring (or equivalent) based someday... I'm just not sure how we're
 going to get there.

 -Yonik
 http://www.lucidimagination.com





-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?

2009-09-17 Thread Jibo John

Hello,

I came across a Lucene patch (http://issues.apache.org/jira/browse/LUCENE-1634)
that would consider the number of deleted documents as a criterion
when deciding which segments to merge.


Since we expect to have very frequent deletes, we hope this would help  
reclaim the space consumed by the deleted documents in a much more  
efficient way.
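
To make the idea concrete, here is a minimal, self-contained sketch of how such a calibration could work. It assumes the policy scales a segment's byte size by its fraction of non-deleted documents, so a segment with many deletes looks smaller and becomes a merge candidate sooner. The class and method names are hypothetical and this is not the Lucene source.

```java
// Hypothetical sketch, not Lucene code: shows how a delete-aware size
// could be computed for a segment when choosing merge candidates.
public class CalibratedSizeSketch {

    /**
     * Returns the size used for merge selection. With calibrateSizeByDeletes
     * off (or an empty segment) this is the raw byte size; with it on, the
     * raw size is scaled by the fraction of documents that are still live.
     */
    public static long segmentSize(long byteSize, int docCount, int delCount,
                                   boolean calibrateSizeByDeletes) {
        if (!calibrateSizeByDeletes || docCount <= 0) {
            return byteSize;
        }
        double delRatio = (double) delCount / (double) docCount;
        return (long) (byteSize * (1.0 - delRatio));
    }

    public static void main(String[] args) {
        long rawBytes = 1000L * 1024 * 1024; // assumed 1000 MB segment
        int docCount = 1_000_000;
        int delCount = 500_000;              // half the documents are deleted

        // Without calibration the segment is treated at its full raw size.
        System.out.println("raw:        " + segmentSize(rawBytes, docCount, delCount, false));
        // With calibration it is treated as half that size.
        System.out.println("calibrated: " + segmentSize(rawBytes, docCount, delCount, true));
    }
}
```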


Currently, we can specify a mergepolicy in solrconfig.xml like this:


 <!--<mergePolicy>org.apache.lucene.index.LogByteSizeMergePolicy</mergePolicy>-->



However, by default, calibrateSizeByDeletes = false in LogMergePolicy.

I was wondering if there is a way I can modify calibrateSizeByDeletes  
just by configuration ?


Thanks,
-Jibo