[jira] [Commented] (LUCENE-4600) Explore facets aggregation during documents collection

Michael McCandless (JIRA) Mon, 10 Dec 2012 15:03:23 -0800

    [ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528402#comment-13528402 ]


Michael McCandless commented on LUCENE-4600:
--------------------------------------------

bq. Also, it's not like this is unique to facets. IIRC, IndexWriter also holds 
onto some char[] or byte[] arrays? At least, a few days ago someone asked me why 
IW never releases some 100 MB of char[]; since he had set the RAM buffer size to 
128 MB, it made sense to me ...

Actually, we stopped recycling with DWPT (DocumentsWriterPerThread) ... now we 
let GC do its job.  But, also, when IW did this, it was internal (no public API 
was affected) ... I don't like that the app can/should pass an 
IntArrayAllocator in to the public APIs.

bq. Perhaps leave it for now, and separately (new issue?) we can test whether 
allocating a new array is costly? If it turns out that this is actually 
important, we can have a cleanup thread to reclaim the unused ones?

OK I'll open a new issue.  Rather than adding a cleanup thread to the current 
impl, I think we should remove Int/FloatArrayAllocator and just do new 
int[]/float[]?  And only add it back if we can prove there's a performance 
gain?  I think we should let Java/GC do its job ...
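
To make the trade-off concrete, here is a minimal sketch of the two options
being compared. RecyclingIntArrayPool and allocateFresh are hypothetical names
invented for illustration; this is not the facet module's actual
Int/FloatArrayAllocator code, only an assumption about what a recycling
allocator generally looks like next to a plain new int[].

```java
import java.util.ArrayDeque;
import java.util.Arrays;

class ArrayAllocationSketch {

    /** Hypothetical recycling pool (illustration only, not Lucene's allocator). */
    static final class RecyclingIntArrayPool {
        private final ArrayDeque<int[]> pool = new ArrayDeque<>();
        private final int length;     // every pooled array has this length
        private final int maxPooled;  // cap on how many arrays are retained

        RecyclingIntArrayPool(int length, int maxPooled) {
            this.length = length;
            this.maxPooled = maxPooled;
        }

        /** Reuses a pooled array if one is available, otherwise allocates. */
        synchronized int[] allocate() {
            int[] array = pool.poll();
            if (array == null) {
                return new int[length];
            }
            Arrays.fill(array, 0); // a recycled array must be cleared by hand
            return array;
        }

        /** Returns an array to the pool; extra arrays are left to the GC. */
        synchronized void free(int[] array) {
            if (array.length == length && pool.size() < maxPooled) {
                pool.push(array);
            }
        }
    }

    /** The alternative proposed above: allocate and let the GC reclaim it. */
    static int[] allocateFresh(int length) {
        return new int[length]; // the JVM zero-fills new arrays
    }
}
```

The pool holds on to up to maxPooled arrays for as long as it is reachable and
needs locking plus an explicit clear on reuse, which is the cost that would
have to be justified by a measured performance gain.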
                
> Explore facets aggregation during documents collection
> ------------------------------------------------------
>
>                 Key: LUCENE-4600
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4600
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch, 
> LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with 
> a float[] to hold scores as well, if you will aggregate them) during 
> collection, and then at the end when you call getFacetsResults(), it makes a 
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't 
> have to tie up transient RAM (fairly small for the bit set but possibly big 
> for the float[]).
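
The two strategies in the issue description can be sketched roughly as
follows. This is a simplified illustration, not the facet module's code: it
assumes each document has exactly one facet ordinal stored in a plain int[]
(the hypothetical ordinals array), that hits lists each matching doc ID once,
and it ignores scores.

```java
import java.util.BitSet;

class FacetAggregationSketch {

    /** Two-pass: remember all hits during collection, aggregate afterwards. */
    static int[] twoPass(int[] hits, int[] ordinals, int numOrdinals) {
        // Pass 1 (collection): only record which documents matched.
        BitSet matched = new BitSet(ordinals.length);
        for (int doc : hits) {
            matched.set(doc);
        }
        // Pass 2 (aggregation): walk the recorded hits and count ordinals.
        int[] counts = new int[numOrdinals];
        for (int doc = matched.nextSetBit(0); doc >= 0; doc = matched.nextSetBit(doc + 1)) {
            counts[ordinals[doc]]++;
        }
        return counts;
    }

    /** One-pass: aggregate while collecting, with no per-hit state kept. */
    static int[] onePass(int[] hits, int[] ordinals, int numOrdinals) {
        int[] counts = new int[numOrdinals];
        for (int doc : hits) {
            counts[ordinals[doc]]++;
        }
        return counts;
    }
}
```

Both methods produce the same counts; the one-pass variant simply never
allocates the transient bit set (or, when scores are aggregated, the float[]).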

