[ https://issues.apache.org/jira/browse/LUCENE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452055#comment-13452055 ]
Shai Erera commented on LUCENE-4372:
------------------------------------
CachingCollector has this in its javadocs:
{code}
* Caches all docs, and optionally also scores, coming from
* a search, and is then able to replay them to another
* collector. You specify the max RAM this class may use.
* Once the collection is done, call {@link #isCached}. If
* this returns true, you can use {@link #replay(Collector)}
* against a new collector. If it returns false, this means
* too much RAM was required and you must instead re-run the
* original search.
{code}
Notice the last sentence about isCached returning false.
Should we just fix the static create() method's documentation (even though it
points to the class's javadocs)?
I don't see any alternative -- if the user specifies too low a RAM limit, what
can we do besides discarding the docs and documenting that behavior? I'd hate
to see exceptions thrown...
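
To make the contract concrete, here is a rough sketch of the check-then-replay pattern the javadoc asks callers to follow. It is a toy stand-in, not Lucene's CachingCollector: ToyCachingCollector, collectTwice, and the limit counted in docs (rather than RAM in MB) are all assumptions made to keep the example self-contained. Having replay() throw when the cache was discarded is likewise just a choice of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

/** Toy model of a size-limited caching collector (hypothetical, not Lucene's class). */
final class ToyCachingCollector {
    private final int maxDocsToCache;              // assumed limit, in docs instead of MB
    private List<Integer> cachedDocs = new ArrayList<>();

    ToyCachingCollector(int maxDocsToCache) {
        this.maxDocsToCache = maxDocsToCache;
    }

    /** Records a hit; once the limit is exceeded the whole cache is discarded. */
    void collect(int doc) {
        if (cachedDocs == null) {
            return;                                // cache already discarded
        }
        if (cachedDocs.size() >= maxDocsToCache) {
            cachedDocs = null;                     // no exception: hits are dropped silently
            return;
        }
        cachedDocs.add(doc);
    }

    /** True only if every collected hit is still in the cache. */
    boolean isCached() {
        return cachedDocs != null;
    }

    /** Feeds the cached hits to another consumer; fails loudly if nothing is cached. */
    void replay(IntConsumer other) {
        if (!isCached()) {
            throw new IllegalStateException("cache was discarded; re-run the search");
        }
        for (int doc : cachedDocs) {
            other.accept(doc);
        }
    }

    /** Caller-side pattern: replay when possible, otherwise re-run the "search". */
    static List<Integer> collectTwice(int[] hits, int maxDocsToCache) {
        ToyCachingCollector cache = new ToyCachingCollector(maxDocsToCache);
        for (int doc : hits) {
            cache.collect(doc);                    // first pass over the hits
        }
        List<Integer> secondPass = new ArrayList<>();
        if (cache.isCached()) {
            cache.replay(secondPass::add);         // cheap path: reuse cached hits
        } else {
            for (int doc : hits) {
                secondPass.add(doc);               // stands in for re-running the query
            }
        }
        return secondPass;
    }
}
```

A caller that skips the isCached() check is exactly the case that worries me: the first pass succeeds, and nothing signals that the cache was dropped until the caller tries to use it.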
> CachingCollector.create(boolean, boolean, double) is trappy
> -----------------------------------------------------------
>
> Key: LUCENE-4372
> URL: https://issues.apache.org/jira/browse/LUCENE-4372
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Robert Muir
>
> Followup to LUCENE-3102.
> Shai proposed a method that just caches all scores so they can be replayed:
> {quote}
> Do you think we can modify this Collector to not necessarily wrap another
> Collector? We have such Collector which stores (in-memory) all matching doc
> IDs + scores (if required). Those are later fed into several processes that
> operate on them (e.g. fetch more info from the index etc.). I am thinking, we
> can make CachingCollector optionally wrap another Collector and then someone
> can reuse it by setting RAM limit to unlimited (we should have a constant for
> that) in order to simply collect all matching docs + scores.
> {quote}
> But Mike had concerns about the RAM usage:
> {quote}
> I'd actually rather not have the constant – ie, I don't want to make
> it easy to be unlimited? It seems too dangerous... I'd rather your
> code has to spell out 10*1024 so you realize you're saying 10 GB (for
> example).
> {quote}
> My concern here is what happens when you don't specify enough RAM: I think
> those hits are just silently dropped (which is worse than using lots of RAM).