Re: How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?
Yonik:

It would be great if Solr could be configured through some sort of dependency injection framework like Spring! A big +1 from me!

-John

On Fri, Sep 18, 2009 at 11:10 PM, Yonik Seeley yo...@lucidimagination.com wrote:

On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

I was wondering if there is a way I can modify calibrateSizeByDeletes just by configuration?

Alas, no. The only option that I see for you is to sub-class LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the constructor. However, please open a Jira issue so we don't forget about it.

It's the continuing stuff like this that makes me feel like we should be Spring (or equivalent) based someday... I'm just not sure how we're going to get there.

-Yonik
http://www.lucidimagination.com
Re: How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?
With SOLR-1447, you should be able to use ZoieMergePolicy as well.

On Mon, Sep 21, 2009 at 11:43 AM, John Wang john.w...@gmail.com wrote:

Yonik:

It would be great if Solr could be configured through some sort of dependency injection framework like Spring! A big +1 from me!

-John

On Fri, Sep 18, 2009 at 11:10 PM, Yonik Seeley yo...@lucidimagination.com wrote:

On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

I was wondering if there is a way I can modify calibrateSizeByDeletes just by configuration?

Alas, no. The only option that I see for you is to sub-class LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the constructor. However, please open a Jira issue so we don't forget about it.

It's the continuing stuff like this that makes me feel like we should be Spring (or equivalent) based someday... I'm just not sure how we're going to get there.

-Yonik
http://www.lucidimagination.com

--
Noble Paul | Principal Engineer | AOL | http://aol.com
Re: How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?
John,

It would be great if Lucene's benchmark were used so everyone could run the test in their own environment and verify the results. It's not clear which settings or code were used to generate the results, so it's difficult to draw any reliable conclusions. The steep spike is stronger evidence of the IO cache being cleared during large merges, resulting in search performance degradation. See: http://www.lucidimagination.com/search/?q=madvise

Merging is IO intensive and less CPU intensive. If the ConcurrentMergeScheduler is used, which defaults to 3 threads, then the CPU could be maxed out. Using a single thread on synchronous spinning magnetic media seems more logical. Queries are usually the inverse: CPU intensive, not IO intensive, when the index is in the IO cache. After (or during) the merge of a large segment, queries would start hitting disk, and the results clearly show that. The queries suddenly take longer because they seek on disk at a time when IO activity is at its peak from merging large segments. Using madvise would prevent usable indexes from being swapped to disk during a merge, and query performance would continue unabated.

As we move to a sharded model of indexes, large merges will naturally not occur. Shards will reach a specified size and new documents will be sent to new shards.

-J

On Sun, Sep 20, 2009 at 11:12 PM, John Wang john.w...@gmail.com wrote:

The current default Lucene MergePolicy does not handle frequent updates well. We have done some performance analysis with that and a custom merge policy: http://code.google.com/p/zoie/wiki/ZoieMergePolicy

-John

On Mon, Sep 21, 2009 at 1:08 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote:

I opened SOLR-1447 for this.

2009/9/18 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com:

We can use a simple reflection-based implementation to simplify reading a large number of parameters. What I wish to emphasize is that Solr should be agnostic of XML altogether. It should only be aware of specific objects and interfaces. If users wish to plug in something else in some other way, that should be fine. There is a huge amount of learning invested in the current solrconfig.xml. Let us not make people throw that away.

On Sat, Sep 19, 2009 at 1:59 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote:

Over the weekend I may write a patch to allow simple reflection-based injection from within solrconfig.

On Fri, Sep 18, 2009 at 8:10 AM, Yonik Seeley yo...@lucidimagination.com wrote:

On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

I was wondering if there is a way I can modify calibrateSizeByDeletes just by configuration?

Alas, no. The only option that I see for you is to sub-class LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the constructor. However, please open a Jira issue so we don't forget about it.

It's the continuing stuff like this that makes me feel like we should be Spring (or equivalent) based someday... I'm just not sure how we're going to get there.

-Yonik
http://www.lucidimagination.com

--
Noble Paul | Principal Engineer | AOL | http://aol.com
Re: How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?
On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

I was wondering if there is a way I can modify calibrateSizeByDeletes just by configuration?

Alas, no. The only option that I see for you is to sub-class LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the constructor. However, please open a Jira issue so we don't forget about it.

It's the continuing stuff like this that makes me feel like we should be Spring (or equivalent) based someday... I'm just not sure how we're going to get there.

-Yonik
http://www.lucidimagination.com
Re: How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?
Over the weekend I may write a patch to allow simple reflection-based injection from within solrconfig.

On Fri, Sep 18, 2009 at 8:10 AM, Yonik Seeley yo...@lucidimagination.com wrote:

On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

I was wondering if there is a way I can modify calibrateSizeByDeletes just by configuration?

Alas, no. The only option that I see for you is to sub-class LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the constructor. However, please open a Jira issue so we don't forget about it.

It's the continuing stuff like this that makes me feel like we should be Spring (or equivalent) based someday... I'm just not sure how we're going to get there.

-Yonik
http://www.lucidimagination.com
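
For readers following the thread, reflection-based injection of this kind could look roughly like the sketch below. It is only an illustration under stated assumptions and is not the SOLR-1447 patch: DemoMergePolicy is a hypothetical stand-in class (not Lucene's), ReflectiveConfigurer and its create method are made-up names, and the property map stands in for whatever name/value pairs would be read from solrconfig.xml. The idea is to instantiate a class by name and call a matching one-argument setter for each property.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a merge policy; this is NOT a Lucene class.
class DemoMergePolicy {
    private boolean calibrateSizeByDeletes = false;
    private int mergeFactor = 10;

    public void setCalibrateSizeByDeletes(boolean value) { this.calibrateSizeByDeletes = value; }
    public boolean getCalibrateSizeByDeletes() { return calibrateSizeByDeletes; }
    public void setMergeFactor(int value) { this.mergeFactor = value; }
    public int getMergeFactor() { return mergeFactor; }
}

// Hypothetical helper: builds an object by class name and applies properties via setters.
final class ReflectiveConfigurer {
    private ReflectiveConfigurer() {}

    /** Instantiates className with its no-arg constructor, then calls setXxx for each property. */
    static Object create(String className, Map<String, String> props) {
        try {
            Object target = Class.forName(className).getDeclaredConstructor().newInstance();
            for (Map.Entry<String, String> e : props.entrySet()) {
                String key = e.getKey();
                String setter = "set" + Character.toUpperCase(key.charAt(0)) + key.substring(1);
                Method m = findSetter(target.getClass(), setter);
                m.invoke(target, convert(e.getValue(), m.getParameterTypes()[0]));
            }
            return target;
        } catch (ReflectiveOperationException ex) {
            throw new IllegalArgumentException("Cannot configure " + className, ex);
        }
    }

    // Finds the first public method with the given name and exactly one parameter.
    private static Method findSetter(Class<?> type, String name) throws NoSuchMethodException {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == 1) {
                return m;
            }
        }
        throw new NoSuchMethodException(type.getName() + "." + name);
    }

    // Converts the textual config value to the setter's parameter type.
    private static Object convert(String raw, Class<?> type) {
        if (type == boolean.class || type == Boolean.class) return Boolean.parseBoolean(raw);
        if (type == int.class || type == Integer.class) return Integer.parseInt(raw);
        if (type == long.class || type == Long.class) return Long.parseLong(raw);
        if (type == double.class || type == Double.class) return Double.parseDouble(raw);
        if (type == String.class) return raw;
        throw new IllegalArgumentException("Unsupported parameter type: " + type);
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("calibrateSizeByDeletes", "true");
        DemoMergePolicy policy =
            (DemoMergePolicy) create(DemoMergePolicy.class.getName(), props);
        System.out.println(policy.getCalibrateSizeByDeletes());
    }
}
```

With something along these lines, a boolean such as calibrateSizeByDeletes could be set from configuration without writing a subclass, provided the target class exposes a matching setter.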
Re: How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?
We can use a simple reflection-based implementation to simplify reading a large number of parameters. What I wish to emphasize is that Solr should be agnostic of XML altogether. It should only be aware of specific objects and interfaces. If users wish to plug in something else in some other way, that should be fine. There is a huge amount of learning invested in the current solrconfig.xml. Let us not make people throw that away.

On Sat, Sep 19, 2009 at 1:59 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote:

Over the weekend I may write a patch to allow simple reflection-based injection from within solrconfig.

On Fri, Sep 18, 2009 at 8:10 AM, Yonik Seeley yo...@lucidimagination.com wrote:

On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

I was wondering if there is a way I can modify calibrateSizeByDeletes just by configuration?

Alas, no. The only option that I see for you is to sub-class LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the constructor. However, please open a Jira issue so we don't forget about it.

It's the continuing stuff like this that makes me feel like we should be Spring (or equivalent) based someday... I'm just not sure how we're going to get there.

-Yonik
http://www.lucidimagination.com

--
Noble Paul | Principal Engineer | AOL | http://aol.com
How to leverage the LogMergePolicy calibrateSizeByDeletes patch in Solr ?
Hello,

I came across a Lucene patch (http://issues.apache.org/jira/browse/LUCENE-1634) that takes the number of deleted documents into account when deciding which segments to merge. Since we expect to have very frequent deletes, we hope this would help reclaim the space consumed by the deleted documents much more efficiently.

Currently, we can specify a merge policy in solrconfig.xml like this:

<!--<mergePolicy>org.apache.lucene.index.LogByteSizeMergePolicy</mergePolicy>-->

However, by default, calibrateSizeByDeletes = false in LogMergePolicy. I was wondering if there is a way I can modify calibrateSizeByDeletes just by configuration?

Thanks,
-Jibo
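
To make the effect of the flag concrete, here is a minimal, self-contained sketch of the arithmetic. It is not Lucene's code: CalibratedSegmentSize and effectiveSize are hypothetical names, and the sketch assumes that calibration simply scales a segment's byte size by the fraction of its documents that are not deleted. Under that assumption, a segment carrying many deletes looks smaller to a size-based merge policy than its on-disk size suggests.

```java
// Illustrative sketch only; the class and method names are hypothetical, not Lucene's API.
final class CalibratedSegmentSize {
    private CalibratedSegmentSize() {}

    /**
     * Returns the size a size-based merge policy would compare against its thresholds.
     * With calibration enabled, the raw byte size is scaled by the share of live documents.
     * Assumption: delCount is between 0 and docCount.
     */
    static long effectiveSize(long byteSize, int docCount, int delCount,
                              boolean calibrateSizeByDeletes) {
        if (!calibrateSizeByDeletes || docCount <= 0) {
            return byteSize; // no calibration, or an empty segment: use the raw size
        }
        double delRatio = (double) delCount / docCount;
        return (long) (byteSize * (1.0 - delRatio));
    }

    public static void main(String[] args) {
        long rawBytes = 1_000_000L;
        // Same segment, 250 of 1000 documents deleted, with and without calibration.
        System.out.println(effectiveSize(rawBytes, 1000, 250, false));
        System.out.println(effectiveSize(rawBytes, 1000, 250, true));
    }
}
```

In this sketch, a segment with a quarter of its documents deleted is treated as three quarters of its raw size once the flag is on, which is the behaviour the question above hopes to enable through configuration.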