The merge policy code seems to have many opportunities for improvement. ES
comes with some pre-coded merge strategies (tiered, log byte-size, log doc),
but only one (tiered) is used at a time in a node. I don't know why an index
cannot have its own merge policy at creation time. Allowing that could
simplify the development of custom implementations, e.g. blob indices or
key-value stores.

Jörg

On Tue, Feb 17, 2015 at 7:05 PM, ElasticGuy <[email protected]> wrote:

> So it looks like SortingMergePolicyProvider instantiated the necessary
> MergePolicies as planned, but it is not behaving as expected.  On an
> optimize call on a test index, which forces a merge, the overridden methods
> in SortingOneMerge are not called.  These methods, such as getMergeReaders(),
> instruct ES to retrieve a sorted view of the existing segments prior to
> merging.
>
> After some digging, it appears that this is caused by the
> ElasticsearchMergePolicy.IndexUpgraderMergeSpecification. It looks like
> this was designed with the understanding that ES merge policies would only
> be used for deciding which segments to merge.  The
> IndexUpgraderMergeSpecification strips out any custom merge logic by
> replacing each OneMerge with a new IndexUpgraderOneMerge built only from
> its segments:
>
>         @Override
>         public void add(OneMerge merge) {
>             super.add(new IndexUpgraderOneMerge(merge.segments));
>         }
>
> I doubt that this was the original intent, since SortingMergePolicy is only
> a recent addition to the Lucene codebase.  I've opened an issue:
> https://github.com/elasticsearch/elasticsearch/issues/9731
>
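To illustrate the wrapping problem described above, here is a minimal Java
sketch. It does not use the real Lucene or ES classes: OneMerge,
SortingOneMerge, RewrappingSpecification, DelegatingOneMerge and
DelegatingSpecification are hypothetical stand-ins, and plain strings take
the place of segment readers. It shows why rebuilding a merge from
merge.segments alone drops the subclass overrides, and how a delegating
wrapper would keep them. Whether delegation is the right fix for ES is an
assumption, not something confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for Lucene's OneMerge: holds the segments to merge
// and exposes a hook that subclasses may override.
class OneMerge {
    final List<String> segments;

    OneMerge(List<String> segments) {
        this.segments = new ArrayList<>(segments);
    }

    // Default behaviour: hand the segments over in their stored order.
    List<String> getMergeReaders() {
        return segments;
    }
}

// Stand-in for a sorting merge: overrides the hook to return a sorted view.
class SortingOneMerge extends OneMerge {
    SortingOneMerge(List<String> segments) {
        super(segments);
    }

    @Override
    List<String> getMergeReaders() {
        List<String> sorted = new ArrayList<>(segments);
        Collections.sort(sorted);
        return sorted;
    }
}

// Mirrors the quoted add(): the merge is rebuilt from its segment list only,
// so the runtime type of the argument (and its overrides) is lost.
class RewrappingSpecification {
    final List<OneMerge> merges = new ArrayList<>();

    void add(OneMerge merge) {
        merges.add(new OneMerge(merge.segments));
    }
}

// Assumed alternative: a wrapper that keeps the original merge and forwards
// the hook to it.
class DelegatingOneMerge extends OneMerge {
    private final OneMerge delegate;

    DelegatingOneMerge(OneMerge delegate) {
        super(delegate.segments);
        this.delegate = delegate;
    }

    @Override
    List<String> getMergeReaders() {
        return delegate.getMergeReaders();
    }
}

class DelegatingSpecification {
    final List<OneMerge> merges = new ArrayList<>();

    void add(OneMerge merge) {
        merges.add(new DelegatingOneMerge(merge));
    }
}

class MergeWrappingDemo {
    public static void main(String[] args) {
        OneMerge custom = new SortingOneMerge(Arrays.asList("c", "a", "b"));

        RewrappingSpecification rewrapping = new RewrappingSpecification();
        rewrapping.add(custom);
        DelegatingSpecification delegating = new DelegatingSpecification();
        delegating.add(custom);

        System.out.println(rewrapping.merges.get(0).getMergeReaders()); // prints [c, a, b]
        System.out.println(delegating.merges.get(0).getMergeReaders()); // prints [a, b, c]
    }
}
```

In the rewrapping case the sorted view is never requested, which matches the
symptom of the SortingOneMerge overrides not being called.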
>
> On Tuesday, February 17, 2015 at 11:07:50 AM UTC-5, ElasticGuy wrote:
>>
>> I went ahead and gave it a shot, but I haven't tested it yet.  If anyone
>> wants to, take a look at the gist and let me know your thoughts:
>> https://gist.github.com/ebradshaw/d29c80a9b843a5d1e77a
>>
>> I don't love using reflection directly to instantiate the delegated merge
>> policy provider.  I'm not sure how to bind multiple providers at
>> configuration time though.  There's probably a cleaner way of handling this.
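The gist is not reproduced here, so the following is only a generic Java
sketch of the trade-off mentioned above, not the code from the gist.
PolicyProvider, TieredProvider and ProviderLoader are hypothetical names. It
contrasts instantiating a delegate by class name through reflection with
passing a factory in directly; how either would be wired into the ES
configuration is left open.

```java
import java.lang.reflect.Constructor;
import java.util.function.Supplier;

// Hypothetical stand-in for a merge policy provider interface.
interface PolicyProvider {
    String name();
}

// Hypothetical delegate with a public no-argument constructor.
class TieredProvider implements PolicyProvider {
    public TieredProvider() {
    }

    @Override
    public String name() {
        return "tiered";
    }
}

class ProviderLoader {
    // Reflection-based lookup: the class name would typically come from a
    // setting. Failures only show up at runtime.
    static PolicyProvider byClassName(String className) {
        try {
            Class<? extends PolicyProvider> type =
                    Class.forName(className).asSubclass(PolicyProvider.class);
            Constructor<? extends PolicyProvider> ctor = type.getDeclaredConstructor();
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + className, e);
        }
    }

    // Reflection-free alternative: the caller supplies a factory, so the
    // compiler checks that the delegate type exists and can be constructed.
    static PolicyProvider byFactory(Supplier<? extends PolicyProvider> factory) {
        return factory.get();
    }
}

class ProviderLoaderDemo {
    public static void main(String[] args) {
        PolicyProvider a = ProviderLoader.byClassName(TieredProvider.class.getName());
        PolicyProvider b = ProviderLoader.byFactory(TieredProvider::new);
        System.out.println(a.name() + " " + b.name()); // prints "tiered tiered"
    }
}
```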
>>
>> Right now this is just thrown together to support field sorting.
>>
>> Now if that works to sort on merge, is there a way to ensure that all
>> segments are sorted, even those that have been flushed without merging?
>>
>> On Tuesday, February 17, 2015 at 9:00:59 AM UTC-5, ElasticGuy wrote:
>>>
>>> Thanks, Jörg.  I'll look into that.  Has a similar plugin been written
>>> that could be used for reference?
>>>
>>>
>>> On Tuesday, February 17, 2015 at 8:13:45 AM UTC-5, Jörg Prante wrote:
>>>>
>>>> It is possible to write a plugin which implements SortingMergePolicy in
>>>> ES.
>>>>
>>>> Jörg
>>>>
>>>> On Tue, Feb 17, 2015 at 1:33 PM, ElasticGuy <[email protected]> wrote:
>>>>
>>>>> I am investigating other key-value stores actually.  However, I am
>>>>> also using Elasticsearch for other purposes.  I noticed Lucene has the
>>>>> SortingMergePolicy.  Are there plans to support it, or is there already
>>>>> a way to use it, in Elasticsearch?
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "elasticsearch" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/elasticsearch/f8d9db27-b949-40ae-a8b8-34e299e32193%40googlegroups.com.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>

