The merge policy code seems to offer many opportunities for improvement. ES ships with several pre-coded merge policies (tiered, log byte size, log doc), but only one of them (tiered) is used at a time in a node. I don't know why an index cannot have its own merge policy at creation time. That could simplify the development of custom implementations, e.g. for blob indices or key-value stores.
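To make the wrapping problem from the quoted message easier to see, here is a minimal, self-contained sketch in plain Java. It does not use Lucene or ES classes: OneMerge, SortingOneMerge, UpgraderOneMerge and MergeSpecification are simplified stand-ins I made up for illustration, and describeReaders() is a hypothetical hook standing in for overridden methods such as getMergeReaders(). The DelegatingSpecification is only one possible alternative, not something ES does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergeWrapDemo {

    // Stand-in for a merge: holds the segments and a hook subclasses may override.
    static class OneMerge {
        final List<String> segments;
        OneMerge(List<String> segments) { this.segments = segments; }
        String describeReaders() { return "plain view of " + segments; }
    }

    // Stand-in for a custom merge that wants a sorted view of the segments.
    static class SortingOneMerge extends OneMerge {
        SortingOneMerge(List<String> segments) { super(segments); }
        @Override String describeReaders() { return "sorted view of " + segments; }
    }

    // Stand-in for the upgrader merge; it only knows about the segments.
    static class UpgraderOneMerge extends OneMerge {
        UpgraderOneMerge(List<String> segments) { super(segments); }
    }

    static class MergeSpecification {
        final List<OneMerge> merges = new ArrayList<>();
        void add(OneMerge merge) { merges.add(merge); }
    }

    // Same shape as the add() quoted in the thread: only merge.segments is
    // copied, so any behaviour overridden in the incoming merge is dropped.
    static class RewrappingSpecification extends MergeSpecification {
        @Override void add(OneMerge merge) {
            super.add(new UpgraderOneMerge(merge.segments));
        }
    }

    // Hypothetical alternative: wrap the original merge and forward to it.
    static class DelegatingOneMerge extends OneMerge {
        private final OneMerge delegate;
        DelegatingOneMerge(OneMerge delegate) {
            super(delegate.segments);
            this.delegate = delegate;
        }
        @Override String describeReaders() { return delegate.describeReaders(); }
    }

    static class DelegatingSpecification extends MergeSpecification {
        @Override void add(OneMerge merge) {
            super.add(new DelegatingOneMerge(merge));
        }
    }

    // Adds a sorting merge to the given specification and reports which
    // describeReaders() implementation ends up being called.
    static String readersAfterAdd(MergeSpecification spec) {
        spec.add(new SortingOneMerge(Arrays.asList("_0", "_1")));
        return spec.merges.get(0).describeReaders();
    }

    public static void main(String[] args) {
        System.out.println(readersAfterAdd(new RewrappingSpecification()));
        System.out.println(readersAfterAdd(new DelegatingSpecification()));
    }
}
```

With the re-wrapping specification the sorting override is never reached, which matches the behaviour reported on the optimize call. Whether delegation would be acceptable inside ES is a separate question, since the upgrader merge has its own job to do.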
Jörg

On Tue, Feb 17, 2015 at 7:05 PM, ElasticGuy <[email protected]> wrote:

> So it looks like that SortingMergePolicyProvider instantiated the
> necessary MergePolicies as planned, but is not behaving as expected. On an
> optimize call on a test index, forcing a merge, the overridden functions in
> SortingOneMerge are not called. These functions, like getMergeReaders(),
> instruct ES to retrieve a sorted view of the existing segments prior to
> merging.
>
> After some digging, it appears that this is caused by the
> ElasticsearchMergePolicy.IndexUpgraderMergeSpecification. It looks like
> this was designed with the understanding that ES merge policies would only
> be used for deciding which segments to merge. The
> IndexUpgraderMergeSpecification strips out any custom merge logic with the
> following:
>
>     @Override
>     public void add(OneMerge merge) {
>         super.add(new IndexUpgraderOneMerge(merge.segments));
>     }
>
> I doubt that this was the original intent, with SortingMergePolicy only
> being a recent addition to the Lucene codebase. I've created an issue
> request: https://github.com/elasticsearch/elasticsearch/issues/9731
>
>
> On Tuesday, February 17, 2015 at 11:07:50 AM UTC-5, ElasticGuy wrote:
>>
>> I went ahead and gave it a shot. Haven't tried it yet. If anyone wants
>> to, take a look at the gist and let me know your thoughts:
>> https://gist.github.com/ebradshaw/d29c80a9b843a5d1e77a
>>
>> I don't love using reflection directly to instantiate the delegated merge
>> policy provider. I'm not sure how to bind multiple providers at
>> configuration time though. There's probably a cleaner way of handling this.
>>
>> Right now this is just thrown together to support field sorting.
>>
>> Now if that works to sort on merge, is there a way to ensure that all
>> segments are sorted, even those that have been flushed without merging?
>>
>> On Tuesday, February 17, 2015 at 9:00:59 AM UTC-5, ElasticGuy wrote:
>>>
>>> Thanks Jorg. I'll look into that. Has a similar plugin been written
>>> that could be used for reference?
>>>
>>>
>>> On Tuesday, February 17, 2015 at 8:13:45 AM UTC-5, Jörg Prante wrote:
>>>>
>>>> It is possible to write a plugin which implements SortingMergePolicy in
>>>> ES.
>>>>
>>>> Jörg
>>>>
>>>> On Tue, Feb 17, 2015 at 1:33 PM, ElasticGuy <[email protected]> wrote:
>>>>
>>>>> I am investigating other key value stores actually. However, I am
>>>>> also using elasticsearch for other purposes. I noticed Lucene has the
>>>>> SortingMergePolicy. Are there plans/is there a way to use this in
>>>>> Elasticsearch?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoH1Q39t814ZhF0q76Bbsq2KDbOy8XbWC-pwo%2BzHqkq3Kw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
