[
https://issues.apache.org/jira/browse/LUCENE-5527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959254#comment-13959254
]
Hoss Man commented on LUCENE-5527:
----------------------------------
bq. The patch only moves 3 methods of the Collector class into a new
LeafCollector class that gets instantiated per-segment. It doesn't add any
functionnality but just changes the way documents are collected.
sure ... but "in for a penny in for a pound" ... this change is going to
require every existing implementation of "Collector" to change, so *if* we're
already going to require that of all existing classes, why not also add a
{{close()}} method (that can easily be a No-Op for most existing
implementations) at the same time?
bq. I think having callbacks upon collection completion would be useful and
help clean up code as Shikhar Bhushan mentionned. But this LeafCollector
refactoring has already been stalled in LUCENE-5299 because LUCENE-5299 also
discussed parallelizing collection so I would like to keep this change as small
as possible and get it in.
Fair enough -- as i said, i don't have strong opinions about any of this, i
just wanted to raise the question. But personally i think that adding
{{Closable}} here and now as part of this API change issue makes sense (since
it's about the "life cycle" of the Collector/LeafCollector) independent of any
question of parallelization., Even if parallelization is controversial, adding
{{close()}} doesn't seem to be, so why not do all of the known desirable,
non-controversial, Collector API changes at once?
If i'm wrong, and there is some dissent about whether {{close()}} makes sense
on these classes, then by all means yes: ignore the suggestion and focus purely
on what you've got in the current patch.
> Make the Collector API work per-segment
> ---------------------------------------
>
> Key: LUCENE-5527
> URL: https://issues.apache.org/jira/browse/LUCENE-5527
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Fix For: 5.0
>
> Attachments: LUCENE-5527.patch
>
>
> Spin-off of LUCENE-5299.
> LUCENE-5229 proposes different changes, some of them being controversial, but
> there is one of them that I really really like that consists in refactoring
> the {{Collector}} API in order to have a different Collector per segment.
> The idea is, instead of having a single Collector object that needs to be
> able to take care of all segments, to have a top-level Collector:
> {code}
> public interface Collector {
> AtomicCollector setNextReader(AtomicReaderContext context) throws
> IOException;
>
> }
> {code}
> and a per-AtomicReaderContext collector:
> {code}
> public interface AtomicCollector {
> void setScorer(Scorer scorer) throws IOException;
> void collect(int doc) throws IOException;
> boolean acceptsDocsOutOfOrder();
> }
> {code}
> I think it makes the API clearer since it is now obious {{setScorer}} and
> {{acceptDocsOutOfOrder}} need to be called after {{setNextReader}} which is
> otherwise unclear.
> It also makes things more flexible. For example, a collector could much more
> easily decide to use different strategies on different segments. In
> particular, it makes the early-termination collector much cleaner since it
> can return different atomic collectors implementations depending on whether
> the current segment is sorted or not.
> Even if we have lots of collectors all over the place, we could make it
> easier to migrate by having a Collector that would implement both Collector
> and AtomicCollector, return {{this}} in setNextReader and make current
> concrete Collector implementations extend this class instead of directly
> extending Collector.
--
This message was sent by Atlassian JIRA
(v6.2#6252)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]