On 27 March 2017 at 00:44, Emre Hasegeli <e...@hasegeli.com> wrote: > > If we want to have a variable which stores the number of ranges, then > > I think numRanges is better than numBlocks. I can't imagine many > > people would disagree there. > > I renamed it with other two. > > > At the very least please write a comment to explain this in the code. > > Right now it looks broken. If I noticed this then one day in the > > future someone else will. If you write a comment then person of the > > future will likely read it, and then not raise any questions about the > > otherwise questionable code. > > I added a sentence about it. > > > I do strongly agree that the estimates need improved here. I've > > personally had issues with bad brin estimates before, and I'd like to > > see it improved. I think the patch is not quite complete without it > > also considering stats on expression indexes. If you have time to go > > do that I'd suggest you go ahead with that. > > I copy-pasted expression index support from btcostestimate() as well, > and extended the regression test. > > I think this function can use more polishing before committing, but I > don't know where to begin. There are single functions for every > access method in here. I don't like them being in there to begin > with. selfuncs.s is quite long with all kinds of dependencies and > dependents. I think it would be better to have the access method > selectivity estimation functions under src/access. Maybe we should > start doing so by moving this to src/access/brin/brin_selfuncs.c. > What do you think? >
Looking over this again. - cond := format('%I %s %L', r.colname, r.oper, r.value); + cond := format('%s %s %L', r.colname, r.oper, r.value); Why did you change this? It seems unrelated. + indexRel = index_open(index->indexoid, AccessShareLock); + pagesPerRange = Min(BrinGetPagesPerRange(indexRel), baserel->pages); + Assert(baserel->pages > 0); + Assert(pagesPerRange > 0); + rangeProportion = (double) pagesPerRange / baserel->pages; + numRanges = 1.0 + (double) baserel->pages / pagesPerRange; + index_close(indexRel, AccessShareLock); Why do you add 1.0 to numRanges? This gives one too many ranges. I also don't really like the way you're setting pagesPerRange. I'd suggest keeping that as the raw value from the index metadata, and doing: If you want the actual number of ranges then I think this is best expressed as: numRanges = Max(ceil(baserel->pages / pagesPerRange), 1.0); The rangeProportion variable could be calculated based on that too rangeProportion = 1.0 / numRanges; That way you don't have to rely on relpages being > 0. It seems like a bad assumption to make. I see there's some hack in estimate_rel_size() making that true, but I think it's best not to rely on that hack. - qual_op_cost = cpu_operator_cost * - (list_length(indexQuals) + list_length(indexOrderBys)); - *indexStartupCost += qual_arg_cost; *indexTotalCost += qual_arg_cost; - *indexTotalCost += (numTuples * *indexSelectivity) * (cpu_index_tuple_cost + qual_op_cost); *indexPages = index->pages; - /* XXX what about pages_per_range? */ + /* + * Add index qual op costs. Unlike other indexes, we are not processing + * tuples but ranges. + */ + qual_op_cost = cpu_operator_cost * list_length(indexQuals); + *indexTotalCost += numRanges * qual_op_cost; What's the reason for removing the + list_length(indexOrderBys) here? I don't think it's the business of this patch to touch that. BRIN may one day support that by partition sorting non-overlapping ranges, so that could well be why it was there in the first place. - *indexTotalCost += (numTuples * *indexSelectivity) * (cpu_index_tuple_cost + qual_op_cost); + *indexTotalCost += selec * numTuples * cpu_index_tuple_cost; Why is this not being charged qual_op_cost anymore? + * Our starting point is that BRIN selectivity has to be less than the + * selectivity of the btree. We are using a product of logical and Can you explain why this is the case? + * class "minmax", and makes a little sense for the other one "inclusion". "and" I think should be "but" I think it would also be good to write some regression tests for this. All the changes I see that you did make to brin.sql seem unrelated to this patch. I've also attached a spreadsheet that can be used to play around with the estimates your patch is giving. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
BRIN_estimates2.ods
Description: application/vnd.oasis.opendocument.spreadsheet
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers