Thanks for having a look at this.

On Wed, 14 Oct 2020 at 04:16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> I'm on board with trying to get rid of NaN rowcount estimates more
> centrally.  I do not think it is a good idea to try to wire in a
> prohibition against zero rowcounts.  That is actually the correct
> thing in assorted scenarios --- one example recently under discussion
> was ModifyTable without RETURNING, and another is where we can prove
> that a restriction clause is constant-false.  At some point I think
> we are going to want to deal honestly with those cases instead of
> sweeping them under the rug.  So I'm disinclined to remove zero
> defenses that we'll just have to put back someday.
OK, that certainly limits the scope here.  It just means we can't get
rid of the <= 0 checks in the join costing functions.  The problem case
this was added for was a dummy Append: we still have valid cases where
the join rel won't be converted to a dummy rel despite having a dummy
Append on one side.

> I think converting Inf to DBL_MAX, in hopes of avoiding creation of
> NaNs later, is fine.  (Note that applying rint() to that is quite
> useless --- in every floating-point system, values bigger than
> 2^number-of-mantissa-bits are certainly integral.)

Good point.

> I'm not sure why you propose to map NaN to one.  Wouldn't mapping it
> to Inf (and thence to DBL_MAX) make at least as much sense?  Probably
> more in fact.  We know that unwarranted one-row estimates are absolute
> death to our chances of picking a well-chosen plan.

That came about due to what the join costing functions were doing, i.e.:

    /* Protect some assumptions below that rowcounts aren't zero or NaN */
    if (inner_path_rows <= 0 || isnan(inner_path_rows))
        inner_path_rows = 1;

[1] didn't have an example case of how the NaNs were introduced, so I
was mostly just copying the logic that was added to fix that back in
72826fb3.

> > I toyed around with the attached patch, but I'm still not that excited
> > about the clamping of infinite values to DBL_MAX.  The test case I
> > showed above with generate_Series(1,379) still ends up with NaN cost
> > estimates due to costing a sort with DBL_MAX rows.  When I was writing
> > the patch, I had it in my head that the costs per row will always be
> > lower than 1.
>
> Yeah, that is a good point.  Maybe instead of clamping to DBL_MAX,
> we should clamp rowcounts to something that provides some headroom
> for multiplication by per-row costs.  A max rowcount of say 1e100
> should serve fine, while still being comfortably more than any
> non-insane estimate.
>
> So now I'm imagining something like
>
> #define MAXIMUM_ROWCOUNT 1e100

That seems more reasonable.
We likely could push it a bit higher, but I'm not all that motivated
to: if an estimate that large were ever accurate, you could expect the
heat death of the universe to arrive before your query results, in
which case the user would likely struggle to find electrons to power
their computer.

> clamp_row_est(double nrows)
> {
>     /* Get rid of NaN, Inf, and impossibly large row counts */
>     if (isnan(nrows) || nrows >= MAXIMUM_ROWCOUNT)
>         nrows = MAXIMUM_ROWCOUNT;
>     else
>         ... existing logic ...

I've got something along those lines in the attached.

> Perhaps we should also have some sort of clamp for path cost
> estimates, at least to prevent them from being NaNs which
> is going to confuse add_path terribly.

Hmm, I'm not quite sure where to start with that one.  Many of the path
estimates will already go through clamp_row_est().  There are various
special requirements, e.g. Appends with no subpaths, so when to apply
it would depend on what path type it is.  I'd say it would need lots of
careful analysis and a scattering of new calls in pathnode.c.

I've ended up removing the NaN checks from the join costing functions,
though with some hesitation.  There was no case mentioned in [1] that
showed how we hit that reported test case, so I'm not really confident
that I'm not just reintroducing the same problem by removing them.  The
path row estimate that had the NaN might not have been through
clamp_row_est().  Many don't.

David

[1] https://www.postgresql.org/message-id/7270.1302902842%40sss.pgh.pa.us
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index cd3716d494..733f7ea543 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -107,6 +107,13 @@
  */
 #define APPEND_CPU_COST_MULTIPLIER 0.5
 
+/*
+ * Maximum value for row estimates.  We cap row estimates to this to help
+ * ensure that costs based on these estimates remain within the range of what
+ * double can represent.  add_path() wouldn't act sanely given infinite or NaN
+ * cost values.
+ */
+#define MAXIMUM_ROWCOUNT 1e100
 
 double		seq_page_cost = DEFAULT_SEQ_PAGE_COST;
 double		random_page_cost = DEFAULT_RANDOM_PAGE_COST;
@@ -189,11 +196,14 @@ double
 clamp_row_est(double nrows)
 {
 	/*
-	 * Force estimate to be at least one row, to make explain output look
-	 * better and to avoid possible divide-by-zero when interpolating costs.
-	 * Make it an integer, too.
+	 * Avoid infinite and NaN row estimates.  Costs derived from such values
+	 * are going to be useless.  Also force the estimate to be at least one
+	 * row, to make explain output look better and to avoid possible
+	 * divide-by-zero when interpolating costs.  Make it an integer, too.
 	 */
-	if (nrows <= 1.0)
+	if (nrows > MAXIMUM_ROWCOUNT || isnan(nrows))
+		nrows = MAXIMUM_ROWCOUNT;
+	else if (nrows <= 1.0)
 		nrows = 1.0;
 	else
 		nrows = rint(nrows);
@@ -2737,12 +2747,11 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	QualCost	restrict_qual_cost;
 	double		ntuples;
 
-	/* Protect some assumptions below that rowcounts aren't zero or NaN */
-	if (outer_path_rows <= 0 || isnan(outer_path_rows))
+	/* Protect some assumptions below that rowcounts aren't zero */
+	if (outer_path_rows <= 0)
 		outer_path_rows = 1;
-	if (inner_path_rows <= 0 || isnan(inner_path_rows))
+	if (inner_path_rows <= 0)
 		inner_path_rows = 1;
-
 	/* Mark the path with the correct row estimate */
 	if (path->path.param_info)
 		path->path.rows = path->path.param_info->ppi_rows;
@@ -2952,10 +2961,10 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 				innerendsel;
 	Path		sort_path;		/* dummy for result of cost_sort */
 
-	/* Protect some assumptions below that rowcounts aren't zero or NaN */
-	if (outer_path_rows <= 0 || isnan(outer_path_rows))
+	/* Protect some assumptions below that rowcounts aren't zero */
+	if (outer_path_rows <= 0)
 		outer_path_rows = 1;
-	if (inner_path_rows <= 0 || isnan(inner_path_rows))
+	if (inner_path_rows <= 0)
 		inner_path_rows = 1;
 
 	/*
@@ -3185,8 +3194,8 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 				rescannedtuples;
 	double		rescanratio;
 
-	/* Protect some assumptions below that rowcounts aren't zero or NaN */
-	if (inner_path_rows <= 0 || isnan(inner_path_rows))
+	/* Protect some assumptions below that rowcounts aren't zero */
+	if (inner_path_rows <= 0)
 		inner_path_rows = 1;
 
 	/* Mark the path with the correct row estimate */