[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected

2014-05-16 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999732#comment-13999732
 ] 

Shikhar Bhushan commented on LUCENE-4370:
-

I have been thinking about the semantics of these done callbacks not being 
invoked in case of exceptions, which was a concern raised by [~jpountz] in 
LUCENE-5527. This is not very helpful when, e.g., a TimeExceededException or 
EarlyTerminatingCollectorException is thrown and you need to merge some state 
into the parent collector in {{LeafCollector.leafDone()}}, or finalize results 
in {{Collector.done()}}.

Maybe we need a special kind of exception, just like 
CollectionTerminatedException. The current semantics of 
CollectionTerminatedException are that collection continues with the next 
leaf. So perhaps a new base class for the "rethrow me, but invoke the done 
callbacks" case?

For any other kind of exception, such as IOException, I don't think we should 
invoke the done() callbacks, because the collector's results should not be 
expected to be usable.
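
To make the three cases concrete, here is a minimal sketch of a search loop 
with those semantics. It does not use the real Lucene classes: the 
{{Collector}} and {{LeafCollector}} interfaces below are simplified stand-ins, 
and {{CollectionAbortedException}} is a hypothetical name for the proposed 
base class, not an existing Lucene type.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class DoneCallbackSketch {

  /** Hypothetical base class for "rethrow me, but invoke the done callbacks". */
  static class CollectionAbortedException extends RuntimeException {
    CollectionAbortedException(String message) { super(message); }
  }

  /** Simplified stand-in for a per-leaf collector (not the Lucene interface). */
  interface LeafCollector {
    void collect(int doc) throws IOException;
    void leafDone();
  }

  /** Simplified stand-in for a top-level collector (not the Lucene interface). */
  interface Collector {
    LeafCollector leafCollector(int leaf);
    void done();
  }

  /** Drives collection over leaves, each modelled as an array of doc IDs. */
  static void search(int[][] leaves, Collector collector) throws IOException {
    for (int leaf = 0; leaf < leaves.length; leaf++) {
      LeafCollector lc = collector.leafCollector(leaf);
      try {
        for (int doc : leaves[leaf]) {
          lc.collect(doc);
        }
      } catch (CollectionAbortedException e) {
        // Results are still usable: run the callbacks, then rethrow.
        lc.leafDone();
        collector.done();
        throw e;
      }
      // Any other exception (e.g. IOException) propagates with no callbacks.
      lc.leafDone();
    }
    collector.done();
  }

  /** Runs a collector that aborts after `limit` docs and fails on negative docs. */
  static List<String> run(int[][] leaves, final int limit) {
    final List<String> events = new ArrayList<>();
    Collector collector = new Collector() {
      int seen = 0;
      public LeafCollector leafCollector(final int leaf) {
        return new LeafCollector() {
          public void collect(int doc) throws IOException {
            if (doc < 0) throw new IOException("simulated I/O failure");
            if (seen == limit) throw new CollectionAbortedException("limit reached");
            seen++;
          }
          public void leafDone() { events.add("leafDone:" + leaf); }
        };
      }
      public void done() { events.add("done:" + seen); }
    };
    try {
      search(leaves, collector);
    } catch (CollectionAbortedException e) {
      events.add("aborted");
    } catch (IOException e) {
      events.add("io-failure");
    }
    return events;
  }

  public static void main(String[] args) {
    System.out.println(run(new int[][] {{1, 2}, {3, 4}}, 3)); // aborted mid-leaf
    System.out.println(run(new int[][] {{1, -1}}, 10));       // I/O failure
  }
}
```

In this sketch the aborted run still sees {{leafDone()}} and {{done()}} before 
the exception reaches the caller, while the I/O failure reaches the caller 
with no callbacks invoked.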

 Let Collector know when all docs have been collected
 

 Key: LUCENE-4370
 URL: https://issues.apache.org/jira/browse/LUCENE-4370
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor
 Attachments: LUCENE-4370.patch, LUCENE-4370.patch


 Collectors are a good point for extension/customization of Lucene/Solr, 
 however sometimes it's necessary to know when the last document has been 
 collected (for example, for flushing cached data).
 It would be nice to have a method that gets called after the last doc has 
 been collected.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected

2014-05-14 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997666#comment-13997666
 ] 

Shikhar Bhushan commented on LUCENE-4370:
-

> On one hand I think a Collector.finish() would be nice, but the argument 
> could be made that you can handle this yourself (it's done when 
> IndexSearcher.search returns).

Such a technique does not compose easily, e.g. when you want to wrap 
collectors in other collectors, unless you customize each and every one in 
the chain.
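
A small sketch of the composition argument, using a simplified stand-in 
{{Collector}} interface rather than the real Lucene one (all class names here 
are hypothetical). The caller only holds the outermost wrapper, so a 
{{finish()}} hook that every wrapper forwards is what lets the innermost 
collector flush its buffer:

```java
import java.util.ArrayList;
import java.util.List;

class FinishCompositionSketch {

  /** Simplified stand-in for a collector with a finish() hook. */
  interface Collector {
    void collect(int doc);
    void finish();
  }

  /** Buffers docs and only publishes them when finish() is called. */
  static class BufferingCollector implements Collector {
    private final List<Integer> buffer = new ArrayList<>();
    final List<Integer> flushed = new ArrayList<>();
    public void collect(int doc) { buffer.add(doc); }
    public void finish() {
      flushed.addAll(buffer);
      buffer.clear();
    }
  }

  /** Generic wrapper: forwards every call, including finish(). */
  static class FilterCollector implements Collector {
    protected final Collector in;
    FilterCollector(Collector in) { this.in = in; }
    public void collect(int doc) { in.collect(doc); }
    public void finish() { in.finish(); }
  }

  /** A wrapper that passes only even doc IDs; finish() is inherited. */
  static class EvenDocsCollector extends FilterCollector {
    EvenDocsCollector(Collector in) { super(in); }
    @Override
    public void collect(int doc) {
      if (doc % 2 == 0) in.collect(doc);
    }
  }

  /** Stand-in for the search loop: the single place that calls finish(). */
  static void search(int[] docs, Collector collector) {
    for (int doc : docs) collector.collect(doc);
    collector.finish();
  }

  /** Wraps the buffering collector twice and returns what it flushed. */
  static List<Integer> demo() {
    BufferingCollector inner = new BufferingCollector();
    Collector chain = new FilterCollector(new EvenDocsCollector(inner));
    search(new int[] {1, 2, 3, 4}, chain);
    return inner.flushed;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Without the hook, the code calling search would need its own reference to 
every stateful collector buried in the chain in order to flush it afterwards.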




[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected

2014-05-13 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995070#comment-13995070
 ] 

Shikhar Bhushan commented on LUCENE-4370:
-

Umm, I totally forgot about the callers. Updated patch coming.




[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected

2014-04-06 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961394#comment-13961394
 ] 

Shikhar Bhushan commented on LUCENE-4370:
-

Proposal: {{LeafCollector}} (in trunk via LUCENE-5527) gets a method {{void 
finish();}}

Semantics: It is invoked when collection with that leaf has completed. It is 
not invoked if collection terminates due to an exception.

I know this ticket was originally about having such a method on {{Collector}} 
rather than at the segment level; however, I think all use cases can be 
cleanly modelled in this manner.

As naming goes, I think {{finish()}} or {{done()}} or similar is better than 
{{close()}}, which implies a try-finally-esque construct.
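
As an illustration of modelling a collector-level use case at the leaf level, 
here is a minimal sketch in which each leaf collector merges its count into 
the parent when {{finish()}} is called. The interface and class names are 
simplified stand-ins, not the LUCENE-5527 API, and leaves are modelled as 
plain arrays of doc IDs.

```java
class LeafFinishSketch {

  /** Simplified stand-in for a leaf collector with the proposed finish(). */
  interface LeafCollector {
    void collect(int doc);
    void finish();
  }

  /** Parent collector that only learns per-leaf results via finish(). */
  static class CountingCollector {
    int total = 0;
    int finishedLeaves = 0;

    LeafCollector getLeafCollector() {
      return new LeafCollector() {
        private int count = 0; // leaf-local state, cheap to update per doc

        public void collect(int doc) { count++; }

        public void finish() {
          // Merge leaf-local state into the parent exactly once per leaf.
          total += count;
          finishedLeaves++;
        }
      };
    }
  }

  /** Stand-in for the search loop over all leaves. */
  static int countAll(int[][] leaves) {
    CountingCollector parent = new CountingCollector();
    for (int[] leaf : leaves) {
      LeafCollector lc = parent.getLeafCollector();
      for (int doc : leaf) {
        lc.collect(doc);
      }
      lc.finish(); // not reached if collect() throws, matching the semantics above
    }
    return parent.total;
  }

  public static void main(String[] args) {
    System.out.println(countAll(new int[][] {{1, 2, 3}, {4}, {}}));
  }
}
```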

/cc [~jpountz] [~rcmuir] [~hossman]




[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected

2012-09-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452006#comment-13452006
 ] 

Robert Muir commented on LUCENE-4370:
-

On one hand I think a Collector.finish() would be nice, but the argument could 
be made that you can handle this yourself
(it's done when IndexSearcher.search returns).

If we do this, we would have to be careful that collectors currently go 
through the workflow properly (especially delegators): 
I actually think there are bugs today.

I just looked at it out of curiosity, and it doesn't look like CachingCollector 
always does the right thing with respect to setNextReader/setScorer, 
for example.



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected

2012-09-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452009#comment-13452009
 ] 

Robert Muir commented on LUCENE-4370:
-

Specifically, on replay() CachingCollector does this:
{code}
for (SegStart seg : cachedSegs) {
  other.setNextReader(seg.readerContext);
  other.setScorer(cachedScorer);
  // ... remainder of the loop body omitted
}
{code}

That looks right, but it forwards setScorer() to the delegate and not 
setNextReader().
So if its maxRAM is exceeded in collect(), it never calls setNextReader() on 
the delegate before calling collect() on it:

{code}
if (curDocs == null) {
  // Cache was too large
  other.collect(doc);
  return;
}
{code}
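
For comparison, here is a minimal sketch of a caching delegator that always 
forwards the per-segment callback, so the delegate is in a valid state when 
the cache is abandoned. This is not the CachingCollector source: the 
{{Collector}} interface is a simplified stand-in, segments are modelled as 
strings, and a doc-count limit replaces the maxRAM check.

```java
import java.util.ArrayList;
import java.util.List;

class DelegatingCacheSketch {

  /** Simplified stand-in for the per-segment and per-doc callbacks. */
  interface Collector {
    void setNextReader(String segment);
    void collect(int doc);
  }

  /** Delegate that fails loudly if collect() arrives before setNextReader(). */
  static class RecordingCollector implements Collector {
    final List<String> hits = new ArrayList<>();
    private String segment;

    public void setNextReader(String segment) { this.segment = segment; }

    public void collect(int doc) {
      if (segment == null) {
        throw new IllegalStateException("collect() before setNextReader()");
      }
      hits.add(segment + ":" + doc);
    }
  }

  /** Caches up to maxDocs docs while forwarding every callback to the delegate. */
  static class CachingDelegator implements Collector {
    private final Collector other;
    private final int maxDocs;
    private List<Integer> curDocs = new ArrayList<>(); // null once the cache is dropped

    CachingDelegator(Collector other, int maxDocs) {
      this.other = other;
      this.maxDocs = maxDocs;
    }

    public void setNextReader(String segment) {
      other.setNextReader(segment); // forwarded whether or not we still cache
    }

    public void collect(int doc) {
      if (curDocs != null && curDocs.size() == maxDocs) {
        curDocs = null; // cache would grow too large: stop caching
      }
      if (curDocs != null) {
        curDocs.add(doc);
      }
      other.collect(doc);
    }

    boolean isCached() { return curDocs != null; }
  }

  /** Collects three docs over two segments with a cache limit of one doc. */
  static List<String> demo() {
    RecordingCollector delegate = new RecordingCollector();
    CachingDelegator caching = new CachingDelegator(delegate, 1);
    caching.setNextReader("seg0");
    caching.collect(0);
    caching.collect(1); // exceeds the limit, cache is dropped
    caching.setNextReader("seg1");
    caching.collect(0);
    return delegate.hits;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```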
