Re: Lookback and/or time-aware Merge Policy?

2013-07-25 Thread Otis Gospodnetic
Thanks for showing I wasn't completely crazy to think this made sense, Mike.

I added:
https://issues.apache.org/jira/browse/LUCENE-5134
https://issues.apache.org/jira/browse/LUCENE-5135

Otis



On Mon, Jul 15, 2013 at 1:28 PM, Michael McCandless
luc...@mikemccandless.com wrote:
 Lookback is a good idea: you could at least gather statistics and
 assess, later, whether good merges had been selected, and maybe play
 "what if" games to explore whether different merge selections would have
 resulted in less copying.

 A time-based MergeScheduler would make sense: e.g., it would allow
 small merges to run any time, but big ones must wait until after
 hours.

 Also, RateLimitedDirectoryWrapper can be used to limit the IO impact of
 ongoing merges.  It's like a naive ionice, for merging.

 Mike McCandless

 http://blog.mikemccandless.com


 On Mon, Jul 8, 2013 at 10:41 PM, Otis Gospodnetic
 otis.gospodne...@gmail.com wrote:
 Hi,

 I was (re-re-re-re)-reading Mike's post about Lucene segment merges -
 http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

 Mike mentioned lookahead as something that could possibly yield more
 optimal merges.

 But what about lookback? :)

 What if some sort of stats were kept about which segments were
 picked for merges?  With some sort of stats in hand, could one look
 back and, knowing what happened after those merges, evaluate if more
 optimal merge choices could have been made and then use that next
 time?

 Also, what about time of day and query rates?  Very often search
 traffic follows the wave pattern, which could mean that more
 aggressive merging could be done during periods with lower query
 rates... or maybe during that time more segments could be allowed to
 live in the index, assuming that after allowing that for some time,
 the subsequent merge could be bigger/more thorough, so to speak.

 Thoughts?

 Otis
 --
 Solr & ElasticSearch Support -- http://sematext.com/
 Performance Monitoring -- http://sematext.com/spm

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Lookback and/or time-aware Merge Policy?

2013-07-15 Thread Michael McCandless
Lookback is a good idea: you could at least gather statistics and
assess, later, whether good merges had been selected, and maybe play
"what if" games to explore whether different merge selections would have
resulted in less copying.
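
As a rough illustration of the statistics gathering described above, here is a minimal sketch in plain Java. It does not use any Lucene API: MergeRecord, MergeLog, and copyRatio are hypothetical names invented for this example, and "copying" is assumed to mean the total bytes rewritten by merges relative to the bytes originally flushed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical record of one completed merge (not a Lucene class). */
final class MergeRecord {
    final long timestampMillis; // when the merge finished
    final int segmentCount;     // how many segments were merged
    final long bytesCopied;     // size of the merged output

    MergeRecord(long timestampMillis, int segmentCount, long bytesCopied) {
        this.timestampMillis = timestampMillis;
        this.segmentCount = segmentCount;
        this.bytesCopied = bytesCopied;
    }
}

/** Hypothetical in-memory log that can be inspected after the fact. */
final class MergeLog {
    private final List<MergeRecord> records = new ArrayList<>();

    void record(long timestampMillis, int segmentCount, long bytesCopied) {
        records.add(new MergeRecord(timestampMillis, segmentCount, bytesCopied));
    }

    /** Total bytes rewritten by all recorded merges. */
    long totalBytesCopied() {
        long total = 0;
        for (MergeRecord r : records) {
            total += r.bytesCopied;
        }
        return total;
    }

    /** Bytes rewritten by merges per byte originally flushed; lower means less copying. */
    double copyRatio(long bytesFlushed) {
        if (bytesFlushed <= 0) {
            throw new IllegalArgumentException("bytesFlushed must be positive");
        }
        return (double) totalBytesCopied() / bytesFlushed;
    }
}
```

Comparing copyRatio across two runs over the same documents with different merge settings would be one simple way to play the "what if" game.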

A time-based MergeScheduler would make sense: e.g., it would allow
small merges to run any time, but big ones must wait until after
hours.
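
The gating rule above could be sketched roughly as follows. This is only the decision logic in plain Java, not an actual MergeScheduler: AfterHoursMergeGate, the size threshold, and the quiet-window parameters are all hypothetical names and assumptions made for illustration.

```java
import java.time.LocalTime;

/** Hypothetical gate: small merges always run, big ones only in a quiet window. */
final class AfterHoursMergeGate {
    private final long bigMergeBytes;   // merges at or above this size count as "big"
    private final LocalTime quietStart; // inclusive start of the after-hours window
    private final LocalTime quietEnd;   // exclusive end of the after-hours window

    AfterHoursMergeGate(long bigMergeBytes, LocalTime quietStart, LocalTime quietEnd) {
        this.bigMergeBytes = bigMergeBytes;
        this.quietStart = quietStart;
        this.quietEnd = quietEnd;
    }

    /** True if now falls inside the quiet window, which may wrap past midnight. */
    boolean inQuietWindow(LocalTime now) {
        if (quietStart.isBefore(quietEnd)) {
            return !now.isBefore(quietStart) && now.isBefore(quietEnd);
        }
        // Window wraps midnight, e.g. 22:00 to 06:00.
        return !now.isBefore(quietStart) || now.isBefore(quietEnd);
    }

    /** Small merges may run any time; big merges must wait for the quiet window. */
    boolean mayRunNow(long mergeBytes, LocalTime now) {
        return mergeBytes < bigMergeBytes || inQuietWindow(now);
    }
}
```

A scheduler built on this would still need to queue the deferred merges and re-check them when the window opens, which the sketch leaves out.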

Also, RateLimitedDirectoryWrapper can be used to limit the IO impact of
ongoing merges.  It's like a naive ionice, for merging.
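
To show the general idea behind throttling merge writes, here is a standalone sketch in plain Java. It is not Lucene's implementation: SimpleWriteThrottle and reserve are hypothetical names, and the caller is assumed to pass in the current time in nanoseconds and to sleep for the returned duration before writing.

```java
/** Hypothetical throttle that spaces writes out to stay under a target rate. */
final class SimpleWriteThrottle {
    private final double bytesPerSecond;
    // Earliest time (in nanos) at which the next write may start.
    private long nextAllowedNanos = Long.MIN_VALUE;

    SimpleWriteThrottle(double mbPerSec) {
        if (mbPerSec <= 0) {
            throw new IllegalArgumentException("mbPerSec must be positive");
        }
        this.bytesPerSecond = mbPerSec * 1024 * 1024;
    }

    /**
     * Reserves time for writing the given number of bytes.
     * Returns how many nanoseconds the caller should wait before writing.
     */
    long reserve(long bytes, long nowNanos) {
        long start = Math.max(nowNanos, nextAllowedNanos);
        long waitNanos = start - nowNanos;
        // Writing these bytes "uses up" bytes / rate seconds of the budget.
        long costNanos = (long) (bytes / bytesPerSecond * 1_000_000_000L);
        nextAllowedNanos = start + costNanos;
        return waitNanos;
    }
}
```

Unlike ionice, a throttle like this does not look at what else is using the disk; it simply caps the write rate of whoever calls it.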

Mike McCandless

http://blog.mikemccandless.com


On Mon, Jul 8, 2013 at 10:41 PM, Otis Gospodnetic
otis.gospodne...@gmail.com wrote:
 Hi,

 I was (re-re-re-re)-reading Mike's post about Lucene segment merges -
 http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

 Mike mentioned lookahead as something that could possibly yield more
 optimal merges.

 But what about lookback? :)

 What if some sort of stats were kept about which segments were
 picked for merges?  With some sort of stats in hand, could one look
 back and, knowing what happened after those merges, evaluate if more
 optimal merge choices could have been made and then use that next
 time?

 Also, what about time of day and query rates?  Very often search
 traffic follows the wave pattern, which could mean that more
 aggressive merging could be done during periods with lower query
 rates... or maybe during that time more segments could be allowed to
 live in the index, assuming that after allowing that for some time,
 the subsequent merge could be bigger/more thorough, so to speak.

 Thoughts?

 Otis
 --
 Solr & ElasticSearch Support -- http://sematext.com/
 Performance Monitoring -- http://sematext.com/spm
