On Feb 9, 2010, at 9:21 AM, Zacarias wrote:

> Hi,
>
> I want to work on the https://issues.apache.org/jira/browse/SOLR-1713
> improvement, but I have some questions. If somebody could give me a
> little orientation, that would be great.
>
> Is what the issue says "Query rows=10 but cluster on more"?
> If so, the idea is to solve it using the results or collection part
> of the ClusteringComponent (because the collection part uses
> DocumentEngine, which is in an experimental state).
> If the user wants to cluster on more rows, should I query twice, or just
> query for the larger number of rows and then reduce the number at the end?

I think we want to avoid querying twice. I would query by the max of rows and a new parameter (cluster_rows? internal_rows? Other?) and then reduce the number at the end. It's a little tricky, because we likely don't want to couple the QueryComponent to the ClusteringComponent, so we may want to make this just a wee bit more generic.

-Grant
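
To make the "query once by the max, then reduce at the end" idea concrete, here is a rough sketch in plain Java. It deliberately uses no Solr classes: the class name, the method names, and the clusterRows argument (standing in for whichever of cluster_rows / internal_rows gets picked) are all made up for illustration, and where this would actually hook into the components is still open.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr: it only illustrates the arithmetic.
final class ClusterRowsSketch {

    // Rows to fetch in the single query: the larger of what the user asked
    // to see (rows) and what should be clustered on (clusterRows, a
    // placeholder for the new, not-yet-named parameter).
    static int effectiveRows(int rows, int clusterRows) {
        return Math.max(rows, clusterRows);
    }

    // After clustering has run over everything that was fetched, cut the
    // list back down to the number of rows the user originally requested.
    static <T> List<T> trimToRequested(List<T> fetched, int rows) {
        int end = Math.min(Math.max(rows, 0), fetched.size());
        return new ArrayList<>(fetched.subList(0, end));
    }
}
```

With rows=10 and the new parameter set to 100, effectiveRows gives 100, so one query fetches 100 documents, clustering sees all of them, and trimToRequested hands back only the first 10. Since neither method knows anything about clustering, something along these lines could live outside both components, which may help with the coupling concern.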