I haven't really delved into the MergePolicy work that's been done, but a
recent Jira comment got me poking around the javadocs -- MergePolicy is
a public interface, which suggests clients are allowed to implement it,
leading me to wonder about two things...
1) Writing a MergePolicy requires knowing about the package protected
SegmentInfos class ... how do we expect people to make that work? (I know
we've said in the past that people shouldn't have to implement classes in
the o.a.l namespace just to make things work for them)
2) should we instead make this an abstract base class to help "future
proof" ourselves against wanting to add support for more "optional"
methods we might want to allow MergePolicies to specify?
(this being the age old interface vs base class discussion ... providing a
base class allows us to add support for new methods later by providing
defaults; interfaces can never be changed except in major releases, ie: X.0)
For example: suppose down the road we want to support an option like yonik
describes here...
https://issues.apache.org/jira/browse/LUCENE-1043?#action_12539675
    More controversial: maybe even expand the number of docs that can be
    bulk copied by not bothering removing deleted docs if it's some very small
    number (unless it's an optimize). This is probably not worth it.
...this is the kind of thing a MergePolicy could specify with some new
method...
    public float getMaxAllowedPercentageOfDeletedDocsIgnored() {
        return 0.0f;
    }
...that individual MergePolicies could override.
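To make the base class argument concrete, here is a minimal sketch. It is not Lucene's actual API: SketchMergePolicy, LegacyPolicy, TolerantPolicy and the name() method are hypothetical stand-ins, and 0.5f is an arbitrary value. Only the getMaxAllowedPercentageOfDeletedDocsIgnored() method comes from the example above.

```java
// Hypothetical sketch only: none of these types are Lucene's real classes.
abstract class SketchMergePolicy {
    // Stand-in for the existing contract that every policy must implement.
    abstract String name();

    // "Optional" method added in a later release. Because the base class
    // supplies a default, subclasses written earlier still compile and
    // keep their old behavior.
    public float getMaxAllowedPercentageOfDeletedDocsIgnored() {
        return 0.0f;
    }
}

// Written before the new method existed; needs no changes.
class LegacyPolicy extends SketchMergePolicy {
    @Override
    String name() {
        return "legacy";
    }
}

// A newer policy that opts in to the new behavior (0.5f is arbitrary).
class TolerantPolicy extends SketchMergePolicy {
    @Override
    String name() {
        return "tolerant";
    }

    @Override
    public float getMaxAllowedPercentageOfDeletedDocsIgnored() {
        return 0.5f;
    }
}

class MergePolicySketchDemo {
    public static void main(String[] args) {
        SketchMergePolicy[] policies = { new LegacyPolicy(), new TolerantPolicy() };
        for (SketchMergePolicy p : policies) {
            System.out.println(p.name() + ": "
                + p.getMaxAllowedPercentageOfDeletedDocsIgnored());
        }
    }
}
```

Had SketchMergePolicy been an interface, adding the new method would have broken LegacyPolicy at compile time, which is the "major releases only" constraint mentioned above.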
Perhaps the broader question is: do we really want/expect people to write
their own MergePolicies, or is the interface just there to provide an
abstraction for picking one of the provided implementations? ... in that
case, it seems like we should lock down the API a bit more (we can always
open it up later).
-Hoss