On Fri, 23 Jan 2026 at 06:05, Hannu Krosing <[email protected]> wrote:
>
> Fixing all the warnings

I think overall this needs significantly more care and precision than
what you've given it so far. For example, you have:

+    if(dopt->max_table_segment_pages != InvalidBlockNumber)
+        appendPQExpBufferStr(query,
"pg_relation_size(c.oid)/current_setting('block_size')::int AS
relpages, ");
+    else
+        appendPQExpBufferStr(query, "c.relpages, ");

Note that pg_class.relpages is "int". Later the code in master does:

tblinfo[i].relpages = atoi(PQgetvalue(res, i, i_relpages));

If you look in vacuum.c, you'll see "pgcform->relpages = (int32)
num_pages;" that the value stored in relpages will be negative when
the table is >= 16TB (assuming 8k pages). Your pg_relation_size
expression is not going to produce an INT. It'll produce a BIGINT, per
"select pg_typeof(pg_relation_size('pg_class') /
current_setting('block_size')::int);". So the atoi() can receive a
string of digits representing an integer larger than INT_MAX in this
case. Looking at [1], I see:

"7.22.1 Numeric conversion functions 1 The functions atof, atoi, atol,
and atoll need not affect the value of the integer expression errno on
an error. If the value of the result cannot be represented, *the
behavior is undefined.*"

And testing locally, I see that my Microsoft compiler will just return
INT_MAX on overflow, whereas I see gcc does nothing to prevent
overflows and just continues to multiply by 10 regardless of what
overflows occur, which I think would just make the code work by
accident.

Aside from that, nothing in the documentation mentions that this is
for "heap" tables only. That should be mentioned as it'll just result
in people posting questions about why it's not working for some other
table access method. There's also not much care for white space.
You've introduced a bunch of whitespace changes unrelated to code
changes you've made, plus there's not much regard for following
project standard. For example, you commonly do "if(" and don't
consistently follow the bracing rules, e.g:

+ for(chkptr = optarg; *chkptr != '\0'; chkptr++)
+     if(*chkptr == '-')

Things like the following help convey the level of care that's gone into this:

+/*
+ * option_parse_int
+ *
+ * Parse integer value for an option.  If the parsing is successful, returns
+ * true and stores the result in *result if that's given; if parsing fails,
+ * returns false.
+ */
+bool
+option_parse_uint32(const char *optarg, const char *optname,

i.e zero effort gone in to modify the comments after pasting them from
option_parse_int().

Another example:

+ pg_log_error("%s musst be in range %lu..%lu",

Also, I have no comprehension of why you'd use uint64 for the valid
range when the function is for processing uint32 types in:

+bool
+option_parse_uint32(const char *optarg, const char *optname,
+ uint64 min_range, uint64 max_range,
+ uint32 *result)

In its current state, it's quite hard to take this patch seriously.
Please spend longer self-reviewing it before posting. You could
temporarily hard-code something for testing which makes at least 1
table appear to be larger than 16TB and ensure your code works. What
you have is visually broken and depends on whatever the atoi
implementation opts to do in the overflow case. These are all things
diligent commiters will be testing and it's sad to see how little
effort you're putting into this. How do you expect this community to
scale with this quality level of patch submissions? You've been around
long enough and should know and do better.  Are you just expecting the
committer to fix these things for you? That work does not get done via
magic wand. Being on v10 already, I'd have expected the patch to be
far beyond proof of concept grade. If you're withholding investing
time on this until you see more community buy-in, then I'd suggest you
write that and withhold further revisions until you're happy with the
level of buy-in.

I'm also still not liking your de-normalised TableInfo representation
for "is_segment". IMO, InvalidBlockNumber should be used to represent
open bounded ranges, and if there's no chunking, then startPage and
endPage will both be InvalidBlockNumber. IMO, what you have now
needlessly allows invalid states where is_segment == true and
startPage, endPage are not set correctly. If you want to keep the code
simple, hide the complexity in a macro or an inline function. There's
just no performance reason to materialise the more complex condition
into a dedicated boolean flag.

If the quality level of this has not improved significantly by v11,
count me out.

David

[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf


Reply via email to