[ https://issues.apache.org/jira/browse/LUCENE-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917304#comment-13917304 ]
Shai Erera commented on LUCENE-5476:
------------------------------------

I agree, Mike. Rob wrote though in a previous comment: _"The old style sampling indeed had a fixed sample size, which I found very useful"_, so I assumed that's something he wants to push for in this issue as well. I'm OK w/ re-introducing sampling in baby steps, but if Rob's goal is to use sampling + fixed sampling, then we should help him do that in this issue - there's no reason to break this into two issues?

Rob, if you only want to introduce a sampling ratio, then I agree with Mike that SamplingParams is overkill for this issue. And in any case I think a separate sampling package is overkill as well. If the sampling code grows, we can always refactor and move it under its own package.

> Facet sampling
> --------------
>
>                 Key: LUCENE-5476
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5476
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Rob Audenaerde
>         Attachments: LUCENE-5476.patch, SamplingFacetsCollector.java
>
>
> With LUCENE-5339 facet sampling disappeared.
> When trying to display facet counts on large datasets (>10M documents)
> counting facets is rather expensive, as all the hits are collected and
> processed.
> Sampling greatly reduced this and thus provided a nice speedup. Could it be
> brought back?
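
To make the two options in this thread concrete (a sampling ratio versus a fixed sample size), here is a minimal, self-contained sketch in plain Java. It is an illustration only: the class and method names (HitSampling, sampleByRatio, sampleFixedSize, extrapolate) are hypothetical and are not taken from the attached patch or from any Lucene API, and the hits are modelled as a plain int[] of doc ids rather than Lucene's own hit structures.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration; not Lucene code and not the attached patch.
class HitSampling {

    /** Ratio-based sampling: each hit is kept independently with probability `ratio`. */
    static int[] sampleByRatio(int[] hits, double ratio, Random random) {
        int[] kept = new int[hits.length];
        int n = 0;
        for (int doc : hits) {
            if (random.nextDouble() < ratio) {
                kept[n++] = doc;
            }
        }
        return Arrays.copyOf(kept, n);
    }

    /**
     * Fixed-size sampling via reservoir sampling: returns exactly `sampleSize`
     * hits (or all of them if there are fewer). Order is not preserved.
     */
    static int[] sampleFixedSize(int[] hits, int sampleSize, Random random) {
        if (hits.length <= sampleSize) {
            return hits.clone();
        }
        int[] reservoir = Arrays.copyOf(hits, sampleSize);
        for (int i = sampleSize; i < hits.length; i++) {
            int j = random.nextInt(i + 1);   // uniform in [0, i]
            if (j < sampleSize) {
                reservoir[j] = hits[i];
            }
        }
        return reservoir;
    }

    /** Scales a facet count measured on the sample back up to the full hit set. */
    static long extrapolate(long sampledCount, int sampleLength, int totalHits) {
        if (sampleLength == 0) {
            return 0;
        }
        return Math.round((double) sampledCount * totalHits / sampleLength);
    }

    public static void main(String[] args) {
        int[] hits = new int[1_000_000];
        for (int i = 0; i < hits.length; i++) {
            hits[i] = i;
        }
        Random random = new Random(42);
        int[] byRatio = sampleByRatio(hits, 0.01, random);
        int[] fixed = sampleFixedSize(hits, 10_000, random);
        System.out.println("ratio sample size: " + byRatio.length);
        System.out.println("fixed sample size: " + fixed.length);
    }
}
```

With a ratio, the amount of work still grows with the number of hits; with a fixed size, the facet-counting cost stays bounded no matter how many documents match, which is presumably why Rob found the old fixed sample size useful. Either way the counts come from a subset, so they would need to be extrapolated (as in the last helper) and are only estimates.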