On Thu, Nov 03, 2022 at 09:55:22AM -0400, Tom Lane wrote:
> Peter Eisentraut <[email protected]> writes:
> > On 01.11.22 09:15, Tom Lane wrote:
> >> Agreed that the libpq manual is not the place for this, but I feel
> >> like it will also be clutter in "Data Types". Perhaps we should
> >> invent a new appendix or the like? Somewhere near the wire protocol
> >> docs seems sensible.
>
> > Would that clutter the protocol docs? ;-)
>
> I said "near", not "in". At the time I was thinking "new appendix",
> but I now recall that the wire protocol docs are not an appendix
> but a chapter in the Internals division. So that doesn't seem like
> quite the right place anyway.
>
> Perhaps a new chapter under "IV. Client Interfaces" is the right
> place?
>
> If we wanted to get aggressive, we could move most of the nitpicky details
> about datatype text formatting (e.g., the array quoting rules) there too.
> I'm not set on that, but it'd make datatype.sgml smaller which could
> hardly be a bad thing.
>
> > I suppose figuring out exactly where to put it and how to mark it up,
> > etc., in a repeatable fashion is part of the job here.
>
> Yup.
How does this look?

I've simply moved things around into a new "Binary Format" section containing
the few parts that I've started, to get some quick feedback on whether this
looks like the right landing place.
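In case it helps with reviewing the arithmetic, here is a minimal standalone
sketch (not part of the patch) of decoding the real and timestamp values
without relying on ntohll() or on the result buffer being aligned. The helper
names read_be32(), read_be64(), decode_real() and decode_timestamp() are made
up for illustration and are not libpq functions. It assumes float is a 32-bit
IEEE 754 type, and it treats the timestamp as signed so that dates before
2000 also work.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit big-endian integer byte by byte. */
static uint32_t read_be32(const char *p)
{
    const unsigned char *u = (const unsigned char *) p;

    return ((uint32_t) u[0] << 24) | ((uint32_t) u[1] << 16) |
           ((uint32_t) u[2] << 8) | (uint32_t) u[3];
}

/* Hypothetical helper: read a 64-bit big-endian integer. */
static uint64_t read_be64(const char *p)
{
    return ((uint64_t) read_be32(p) << 32) | read_be32(p + 4);
}

/* real: reinterpret the 32 bits as a float (assumes IEEE 754 single). */
static float decode_real(const char *p)
{
    uint32_t bits = read_be32(p);
    float f;

    memcpy(&f, &bits, sizeof f);
    return f;
}

/*
 * timestamp: split microseconds since 2000-01-01 into seconds since
 * 1970-01-01 plus a non-negative microsecond remainder.
 */
static void decode_timestamp(const char *p, int64_t *unix_secs, int32_t *usecs)
{
    /* Assumes two's complement for the unsigned-to-signed conversion. */
    int64_t val = (int64_t) read_be64(p);
    int64_t secs = val / 1000000;
    int64_t rem = val % 1000000;

    /* C division truncates toward zero; normalize negative remainders. */
    if (rem < 0)
    {
        rem += 1000000;
        secs -= 1;
    }

    /* (2451545 - 2440588) days * 86400 seconds = 946684800 */
    *unix_secs = secs + 946684800;
    *usecs = (int32_t) rem;
}
```

With the result of PQgetvalue() in ptr, this would be called as
decode_timestamp(ptr, &secs, &usecs), after which secs can be assigned to a
time_t and passed to gmtime() as in the patch.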
Regards,
Mark
diff --git a/doc/src/sgml/binary-format.sgml b/doc/src/sgml/binary-format.sgml
index a297ece784..779b606ec9 100644
--- a/doc/src/sgml/binary-format.sgml
+++ b/doc/src/sgml/binary-format.sgml
@@ -6,9 +6,102 @@
<indexterm zone="binary-format"><primary>pgsql binary format</primary></indexterm>
<para>
- This chapter describes the binary format used in the wire protocol. There
- are a number of C examples for the data types used in PostgreSQL. We will
- try to be as comprehensive as possible with the native data types.
+ This chapter describes the binary representation of the native PostgreSQL
+ data types and shows how to handle each type's binary format with a C code
+ example for each data type.
</para>
+ <para>
+ We will try to cover all of the native data types...
+ </para>
+
+ <sect1 id="binary-format-boolean">
+ <title><type>boolean</type></title>
+
+ <para>
+ A <type>boolean</type> is transmitted as a single byte that, when cast to an
+ <literal>int</literal>, will be <literal>0</literal> for
+ <literal>false</literal> and <literal>1</literal> for
+ <literal>true</literal>.
+ </para>
+<programlisting>
+<![CDATA[
+int value;
+
+ptr = PQgetvalue(res, row_number, column_number);
+value = (int) *ptr;
+printf("%d\n", value);
+]]>
+</programlisting>
+ </sect1>
+
+ <sect1 id="binary-format-real">
+ <title><type>real</type></title>
+
+ <para>
+ A <type>real</type> is transmitted as 4 bytes in network byte order and must
+ be converted to host byte order before use.
+ </para>
+
+<programlisting>
+<![CDATA[
+union {
+ int i;
+ float f;
+} value;
+
+ptr = PQgetvalue(res, row_number, column_number);
+value.i = ntohl(*((uint32_t *) ptr));
+printf("%f\n", value.f);
+]]>
+</programlisting>
+ </sect1>
+
+ <sect1 id="binary-format-timestamp-without-time-zone">
+ <title><type>timestamp without time zone</type></title>
+
+ <para>
+ A <type>timestamp without time zone</type> is a 64-bit integer
+ representing the number of microseconds since midnight on January 1, 2000.
+ It can be converted into a broken-down time representation by converting the
+ time into seconds and keeping the microseconds separately.
+ </para>
+
+ <para>
+ Note that C library time functions count seconds from January 1, 1970, so this
+ difference needs to be accounted for in addition to handling network byte order.
+ </para>
+
+<programlisting>
+<![CDATA[
+#define POSTGRES_EPOCH_JDATE 2451545 /* == date2j(2000, 1, 1) */
+#define UNIX_EPOCH_JDATE 2440588 /* == date2j(1970, 1, 1) */
+#define SECS_PER_DAY 86400
+
+uint64_t val;
+
+struct tm *tm;
+time_t timep;
+uint32_t mantissa;
+
+ptr = PQgetvalue(res, row_number, column_number);
+/* Note ntohll() is not implemented on all platforms. */
+val = ntohll(*((uint64_t *) ptr));
+
+timep = val / (uint64_t) 1000000 +
+ (uint64_t) (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) *
+ (uint64_t) SECS_PER_DAY;
+mantissa = val - (uint64_t) (timep -
+ (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * SECS_PER_DAY) *
+ (uint64_t) 1000000;
+
+/* Assume and print timestamps in GMT for simplicity. */
+tm = gmtime(&timep);
+
+printf("%04d-%02d-%02d %02d:%02d:%02d.%06u\n",
+ tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday, tm->tm_hour,
+ tm->tm_min, tm->tm_sec, mantissa);
+]]>
+</programlisting>
+ </sect1>
</chapter>
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 0d6be9a2fa..688f947107 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -51,6 +51,7 @@
<!ENTITY jit SYSTEM "jit.sgml">
<!-- programmer's guide -->
+<!ENTITY binary-format SYSTEM "binary-format.sgml">
<!ENTITY bgworker SYSTEM "bgworker.sgml">
<!ENTITY dfunc SYSTEM "dfunc.sgml">
<!ENTITY ecpg SYSTEM "ecpg.sgml">
diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml
index 2e271862fc..705b03f4aa 100644
--- a/doc/src/sgml/postgres.sgml
+++ b/doc/src/sgml/postgres.sgml
@@ -196,6 +196,7 @@ break is not needed in a wider output rendering.
&lobj;
&ecpg;
&infoschema;
+ &binary-format;
</part>