Re: Cannot find a working 64-bit integer type on Illumos

Thomas Munro Fri, 29 Nov 2024 02:14:06 -0800

On Tue, Jul 23, 2024 at 2:39 AM Peter Eisentraut <pe...@eisentraut.org> wrote:
> On 14.07.24 16:51, Tom Lane wrote:
> > Peter Eisentraut <pe...@eisentraut.org> writes:
> >> On 04.07.24 03:55, Thomas Munro wrote:
> >>>> Personally, I find "PRId64" pretty unreadable. "INT64_MODIFIER" wasn't
> >>>> nice either, though, and following standards is good, so I'm sure I'll
> >>>> get used to it.
> >
> >> Using PRId64 would be very beneficial because gettext understands it,
> >> and so we would no longer need the various workarounds for not putting
> >> INT64_FORMAT into the middle of a translated string.
> >
> > Uh, really?  The translated strings live in /usr/share, which is
> > expected to be architecture-independent, so how would they make
> > that work?
>
> Gettext has some special run-time support for this.  See here:
> <https://www.gnu.org/software/gettext/manual/html_node/Preparing-Strings.html#No-string-concatenation>

Nice.  I've never really looked into gettext() very seriously, so I
wondered about the portability of it all.  In our docs, a long time
ago, you wrote "... you need an implementation of the Gettext API.
Some operating systems have this built-in (e.g., Linux, NetBSD,
Solaris),".  It looks like we might still have those implementations
of libintl.so in our build farm today: mostly the GNU one, but also
NetBSD's, illumos's, and to the unknowable extent it is different
Solaris's.  They might all be using GNU msgfmt though.

I want to move this forward, but we have to decide which way.  I'm now
leaning back towards using <inttypes.h> macros, based on your
feedback.  So here's a rebase of v2, with some improvements, and some
responses to your earlier email:

On Mon, Jul 15, 2024 at 2:47 AM Peter Eisentraut <pe...@eisentraut.org> wrote:
> On 04.07.24 03:55, Thomas Munro wrote:
> > Unfortunately, that theory turned out to be wrong.  The usual suspect,
> > Windows, uses something else: "I64" or something like that.  We could
> > teach our snprintf to grok that, but I don't like the idea anymore.
> > So let's go back to INT64_MODIFIER, with just a small amount of
> > configure time work to pick the right value.  I couldn't figure out
> > any header-only way to do that.
>
> src/port/snprintf.c used to support I64 in the past, but it was later
> removed.  We could probably put it back.

New idea: let's just redefine PRI...{32,64,PTR} on that platform,
instead of modifying snprintf.c.

> - src/include/c.h:
>
> Maybe add a comment that above all the int8_t -> int8 etc. typedefs
> that these are for backward compatibility or something like that.
> Actually, just move the comment
>
> +/* Historical names for limits in stdint.h. */
>
> up a bit to it covers the types as well.

Hmm I think it deserves a separate comment but yeah "EXACTLY N BITS"
and can be "used for numerical computations" can definitely go.  I
guess that's a memory of some pre-standard system that had similarly
named types with the "at least" meaning, but it's not helping anybody
today.

> Also, these /* == 8 bits */ comments could be removed, I think.

Yeah.

> - src/include/port/pg_bitutils.h:
> - src/port/pg_bitutils.c:
> - src/port/snprintf.c:
>
> These changes look functionally correct, but I think I like the old
> code layout better, like
>
> #if (using long)
> ...
> #elif (using long long)
> ...
> #else
> #error
> #endif
>
> rather than
>
> #if (using long)
> ...
> #else
> static assertion
> ... // long long
> #endif

Fair point, that was all a bit ugly.  Let's see...

> which seems a bit more complicated.  I think you could leave the code
> mostly alone and just change
>
> defined(HAVE_LONG_INT_64) to SIZEOF_LONG == 8
> defined(HAVE_LONG_LONG_INT_64) to SIZEOF_LONG_LONG == 8
>
> in each case.

What do you think about this approach?

            case 'z':
#if SIZEOF_SIZE_T == SIZEOF_LONG
                longflag = 1;
#elif SIZEOF_SIZE_T == SIZEOF_LONG_LONG
                longlongflag = 1;
#else
#error "cannot find integer type of the same size as size_t"
#endif

And then in the cases where the size is implied by the typename
(uint64_t -> 8) it's the same setup except with a number:

#if SIZEOF_LONG == 8
    return 63 - __builtin_clzl(word);
#elif SIZEOF_LONG_LONG == 8
    return 63 - __builtin_clzll(word);
#else
#error "cannot find integer type of the same size as uint64_t"
#endif

What I like about that is that it's very clear that the conditions
match the ...l() or ...ll() function.  It's obviously correct, and
would fail on a Deathstation 9000 that is trying to trick us.

> - src/include/postgres_ext.h:
>
> -#define OID_MAX  UINT_MAX
> -/* you will need to include <limits.h> to use the above #define */
> +#define OID_MAX  UINT32_MAX
>
> If the type Oid is unsigned int, then the OID_MAX should be UINT_MAX.
> So this should not be changed.  Also, is the comment about <limits.h>
> no longer applicable?

Right, yeah, hunk nuked.

> - src/interfaces/ecpg/ecpglib/typename.c:
> - src/interfaces/ecpg/include/sqltypes.h:
> - .../test/expected/compat_informix-sqlda.c:
>
> -#ifdef HAVE_LONG_LONG_INT_64
> +#if SIZEOF_LONG < 8
>
> These changes alter the library behavior unnecessarily.  The old code
> would always prefer to report back long long (ECPGt_long_long etc.),
> but the new code will report back long (ECPGt_long etc.) if it is
> 64-bit.  I don't know the impact of these changes, but it seems
> preferable to keep the existing behavior.

I agree that it should be better but I don't think that's right about
the preference.  The order of the tests here didn't matter, because
only HAVE_LONG_INT_64 would have been defined if sizeof(long) == 8,
not both, so it would still have preferred the first of { long, long
long } that had size 8 in that order.  But yeah that's confusing,
let's try changing it to the same form as the cases above (an
explicitly search for the integer type of the same size where the
conditions are obviously correct with a clear priority and no
default/fallback/assumption, just an error):

        case INT8OID:
#if SIZEOF_LONG == 8
            return ECPGt_long;
#elif SIZEOF_LONG_LONG == 8
            return ECPGt_long_long;
#else
#error "cannot find integer type of the same size as INT8OID"
#endif

> - src/interfaces/ecpg/include/ecpg_config.h.in:
> - src/interfaces/ecpg/include/meson.build:
>
> In the past, we have kept obsolete symbols as always defined in
> ecpg_config.h.  We ought to do the same here.

Normally we keep symbols that were once conditional and are now
constant, like ENABLE_THREAD_SAFETY.   Should that include
HAVE_LONG_INT_64 and HAVE_LONG_LONG_INT_64?  Only one would be
defined, but it might be that neither is true in the patch: it's a
typedef for int64_t, and the standard library might well have a
typedef to __internal_int64_type, which is neither long nor long long.
it seems like internal machinery for giving the client code an "int64"
type in the C89 days, and surely not part of our real API.

WIP patch attached.

(Let's leave that tzcode stuff far later.)

v4-0001-Use-types-and-support-stdint.h-and-inttypes.h.patch
Description: Binary data

Re: Cannot find a working 64-bit integer type on Illumos

Reply via email to