On Wed, 6 Mar 2019 11:27:38 +0900
Michael Paquier <[email protected]> wrote:
> On Tue, Mar 05, 2019 at 07:55:22PM -0600, Karl O. Pinc wrote:
> > Attached: doc_base64_v4.patch
>
> Details about the "escape" mode are already available within the
> description of function "encode". Wouldn't we want to consolidate a
> description for all the modes at the same place, including some words
> for hex? Your patch only includes the description of base64, which is
> a good addition, still not consistent with the rest. A paragraph
> after all the functions listed is fine I think as the description is
> long so it would bloat the table if included directly.
Makes sense. (As did hyperlinking to the RFC.)
(No matter how simple I think a patch is going to be it
always turns into a project. :)
Attached: doc_base64_v5.patch
Made index entries for hex and escape encodings.
Added word "encoding" to index entries.
Made <varlistentry> entries with terms for
base64, hex, and escape encodings.
Added documentation for hex and escape encodings,
including output formats and acceptable inputs.
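
The output formats described by the new text can be illustrated with a
rough Python sketch. It uses Python's standard library as a stand-in and
is not PostgreSQL's implementation, so it only models the documented
formats. The escape_encode helper is hypothetical, written for this
example, and the sample data is made up.

```python
import base64
import binascii


def escape_encode(data: bytes) -> str:
    """Hypothetical helper modelling the 'escape' output format."""
    out = []
    for byte in data:
        if byte == 0 or byte >= 0x80:
            out.append("\\%03o" % byte)   # zero and high-bit-set bytes -> \nnn
        elif byte == 0x5C:
            out.append("\\\\")            # a backslash is doubled
        else:
            out.append(chr(byte))         # everything else passes through
    return "".join(out)


sample = bytes(range(256))  # made-up test data covering every byte value

# hex: two lower-case digits per byte, so the length is always even.
hex_text = binascii.hexlify(sample).decode("ascii")

# base64: Python's encodebytes also breaks lines at 76 characters and
# ends each line with a bare newline rather than CRLF.
b64_text = base64.encodebytes(sample).decode("ascii")
b64_lines = b64_text.splitlines()

print(len(hex_text) % 2 == 0, hex_text == hex_text.lower())
print(bytes.fromhex(hex_text.upper()) == sample)   # upper case decodes too
print(max(len(line) for line in b64_lines), "\r" in b64_text)
print(base64.b64decode(b64_text) == sample)        # newlines are skipped
print(escape_encode(b"123\x00456"))
```

Passing an odd-length string such as "abc" to bytes.fromhex raises
ValueError, which parallels the odd-length error documented for hex
decoding.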
Regards,
Karl <[email protected]>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6765b0d584..bd337fd530 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -1752,6 +1752,9 @@
<indexterm>
<primary>decode</primary>
</indexterm>
+ <indexterm>
+ <primary>base64 encoding</primary>
+ </indexterm>
<literal><function>decode(<parameter>string</parameter> <type>text</type>,
<parameter>format</parameter> <type>text</type>)</function></literal>
</entry>
@@ -1769,16 +1772,25 @@
<indexterm>
<primary>encode</primary>
</indexterm>
+ <indexterm>
+ <primary>base64 encoding</primary>
+ </indexterm>
+ <indexterm>
+ <primary>hex encoding</primary>
+ </indexterm>
+ <indexterm>
+ <primary>escape encoding</primary>
+ </indexterm>
<literal><function>encode(<parameter>data</parameter> <type>bytea</type>,
<parameter>format</parameter> <type>text</type>)</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>
Encode binary data into a textual representation. Supported
- formats are: <literal>base64</literal>, <literal>hex</literal>, <literal>escape</literal>.
- <literal>escape</literal> converts zero bytes and high-bit-set bytes to
- octal sequences (<literal>\</literal><replaceable>nnn</replaceable>) and
- doubles backslashes.
+ formats are:
+ <link linkend="base64-encoding"><literal>base64</literal></link>,
+ <link linkend="hex-encoding"><literal>hex</literal></link>,
+ <link linkend="escape-encoding"><literal>escape</literal></link>.
</entry>
<entry><literal>encode('123\000\001', 'base64')</literal></entry>
<entry><literal>MTIzAAE=</literal></entry>
@@ -2365,6 +2377,84 @@
<function>format</function> treats a NULL as a zero-element array.
</para>
+ <indexterm>
+ <primary>base64 encoding</primary>
+ </indexterm>
+ <indexterm>
+ <primary>hex encoding</primary>
+ </indexterm>
+ <indexterm>
+ <primary>escape encoding</primary>
+ </indexterm>
+
+ <para>
+ The string and the binary <function>encode</function>
+ and <function>decode</function> functions support the following
+ encodings:
+
+ <variablelist>
+ <varlistentry id="base64-encoding">
+ <term>base64</term>
+ <listitem>
+ <para>
+ The <literal>base64</literal> encoding is that
+ of <ulink url="https://tools.ietf.org/html/rfc2045#section-6.8">RFC
+ 2045 section 6.8</ulink>. As per the RFC, encoded lines are
+ broken at 76 characters. However, instead of the MIME CRLF
+ end-of-line marker, only a newline is used for end-of-line.
+ </para>
+ <para>
+ The carriage-return, newline, space, and tab characters are
+ ignored by <function>decode</function>. Otherwise, an error is
+ raised when <function>decode</function> is supplied invalid
+ base64 data, including when trailing padding is incorrect.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="hex-encoding">
+ <term>hex</term>
+ <listitem>
+ <para>
+ <literal>hex</literal> represents each 4 bits of data as a single
+ hexadecimal digit, <literal>0</literal>
+ through <literal>f</literal>. Encoding outputs
+ the <literal>a</literal>-<literal>f</literal> hex digits in lower
+ case. Because the smallest unit of data is 8 bits,
+ <function>encode</function> always returns an even number of
+ characters.
+ </para>
+ <para>
+ The <function>decode</function> function
+ accepts <literal>a</literal>-<literal>f</literal> characters in
+ either upper or lower case. An error is raised
+ when <function>decode</function> is supplied invalid hex data,
+ including when given an odd number of characters.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="escape-encoding">
+ <term>escape</term>
+ <listitem>
+ <para>
+ <literal>escape</literal> converts zero bytes and high-bit-set
+ bytes to octal sequences
+ (<literal>\</literal><replaceable>nnn</replaceable>) and doubles
+ backslashes. Encoding always produces 4 characters for each
+ high-bit-set input byte.
+ </para>
+ <para>
+ The <function>decode</function> function accepts a
+ <literal>\</literal> only when it is followed by exactly three
+ octal digits or by a second <literal>\</literal>. An error is
+ raised for any other use of <literal>\</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+
<para>
See also the aggregate function <function>string_agg</function> in
<xref linkend="functions-aggregate"/>.
@@ -3577,16 +3667,25 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<indexterm>
<primary>encode</primary>
</indexterm>
+ <indexterm>
+ <primary>base64 encoding</primary>
+ </indexterm>
+ <indexterm>
+ <primary>hex encoding</primary>
+ </indexterm>
+ <indexterm>
+ <primary>escape encoding</primary>
+ </indexterm>
<literal><function>encode(<parameter>data</parameter> <type>bytea</type>,
<parameter>format</parameter> <type>text</type>)</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>
Encode binary data into a textual representation. Supported
- formats are: <literal>base64</literal>, <literal>hex</literal>, <literal>escape</literal>.
- <literal>escape</literal> converts zero bytes and high-bit-set bytes to
- octal sequences (<literal>\</literal><replaceable>nnn</replaceable>) and
- doubles backslashes.
+ formats are:
+ <link linkend="base64-encoding"><literal>base64</literal></link>,
+ <link linkend="hex-encoding"><literal>hex</literal></link>,
+ <link linkend="escape-encoding"><literal>escape</literal></link>.
</entry>
<entry><literal>encode('123\000456'::bytea, 'escape')</literal></entry>
<entry><literal>123\000456</literal></entry>