[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-4600:
---------------------------------------

    Attachment: LUCENE-4600-cli.patch

bq. Also, you can obtain the right IntDecoder from the CLP for decoding the ordinals. That would remove the hard dependency on VInt+gap, and allow e.g. to use a PackedInts decoder.

I tried this, changing the CountingFacetsCollector to the attached patch (to use CategoryListIterator), but alas those abstractions are apparently costing us in this hotspot (unless I screwed something up in the patch? Eg, that null I pass is kinda spooky!):

{noformat}
        Task    QPS base   StdDev    QPS comp   StdDev    Pct diff
    HighTerm        0.86   (4.7%)        0.56   (0.4%)    -34.4% ( -37% - -30%)
     MedTerm        5.85   (1.0%)        5.04   (0.5%)    -13.9% ( -15% - -12%)
     LowTerm       11.82   (0.6%)       11.02   (0.5%)     -6.8% (  -7% -  -5%)
{noformat}

base is the original CountingFacetsCollector and comp is the new one using the CategoryListIterator API.

I think we should try to invoke specialized collectors when possible?

> Explore facets aggregation during documents collection
> ------------------------------------------------------
>
>                 Key: LUCENE-4600
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4600
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch,
>                      LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with
> a float[] to hold scores as well, if you will aggregate them) during
> collection, and then at the end when you call getFacetsResults(), it makes a
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't
> have to tie up transient RAM (fairly small for the bit set but possibly big
> for the float[]).
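For readers unfamiliar with the "VInt+gap" encoding mentioned above, here is a minimal, self-contained sketch of what a specialized counting loop might look like: each document's sorted category ordinals are stored as gaps, each gap is written as a variable-length int, and the counts are bumped while decoding, with no iterator or decoder object in between. This is only an illustration under stated assumptions. The class and method names (CountingSketch, encode, collect) are hypothetical, it does not use the Lucene facet API, and it is not the code from the attached patches.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only: not Lucene code and not the attached patch.
class CountingSketch {

    // Encodes sorted ordinals as gaps; each gap is written as a
    // variable-length int using 7 payload bits per byte, low bits first.
    static byte[] encode(int[] sortedOrdinals) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int ord : sortedOrdinals) {
            int gap = ord - prev;
            prev = ord;
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80); // high bit set: more bytes follow
                gap >>>= 7;
            }
            out.write(gap);
        }
        return out.toByteArray();
    }

    // Decodes one document's ordinals and increments the counts inline.
    static void collect(byte[] buf, int[] counts) {
        int ord = 0;   // running ordinal, rebuilt by summing the gaps
        int value = 0; // gap currently being decoded
        int shift = 0;
        for (byte b : buf) {
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) != 0) {
                shift += 7;      // continuation byte, keep reading this gap
            } else {
                ord += value;    // gap complete: recover the ordinal
                counts[ord]++;   // aggregate while collecting
                value = 0;
                shift = 0;
            }
        }
    }

    public static void main(String[] args) {
        int[] counts = new int[300];
        collect(encode(new int[] {3, 7, 200}), counts);   // first hit
        collect(encode(new int[] {7, 200, 299}), counts); // second hit
        System.out.println(counts[3] + " " + counts[7] + " " + counts[200] + " " + counts[299]);
    }
}
```

Because the counts are updated as each hit is decoded, nothing has to be buffered for a second pass. The trade-off is the one discussed above: the loop is hard-wired to one encoding, so supporting another (for example a packed-ints layout) would need either a second specialized loop or an abstraction in the hot path.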