Re: SHA-2 functions

2018-02-22 Thread Peter Eisentraut
On 2/22/18 01:05, Michael Paquier wrote:
> On Wed, Feb 21, 2018 at 03:45:17PM -0500, Peter Eisentraut wrote:
>> On 2/20/18 23:04, Michael Paquier wrote:
>>> I think that crypto_hash.c or hash_crypt.c would be adapted as well.
>>> crypt.c is too much generic, so including both concepts in the name is
>>> the way to go.  The name given by Tom here sounds actually nice.
>>
>> Updated patches
> 
> I have been reviewing both patches, and those look good to me.

Committed, thanks

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: SHA-2 functions

2018-02-21 Thread Michael Paquier
On Wed, Feb 21, 2018 at 03:45:17PM -0500, Peter Eisentraut wrote:
> On 2/20/18 23:04, Michael Paquier wrote:
>> I think that crypto_hash.c or hash_crypt.c would be adapted as well.
>> crypt.c is too much generic, so including both concepts in the name is
>> the way to go.  The name given by Tom here sounds actually nice.
> 
> Updated patches

I have been reviewing both patches, and those look good to me.

git diff --check has one complain:
src/backend/utils/adt/cryptohashes.c:170: new blank line at EOF.
--
Michael


signature.asc
Description: PGP signature


Re: SHA-2 functions

2018-02-21 Thread Peter Eisentraut
On 2/20/18 23:04, Michael Paquier wrote:
> I think that crypto_hash.c or hash_crypt.c would be adapted as well.
> crypt.c is too much generic, so including both concepts in the name is
> the way to go.  The name given by Tom here sounds actually nice.

Updated patches

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 2debe6cf7eec396e269a2c3d89ee56f5aea711e2 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pete...@gmx.net>
Date: Tue, 6 Feb 2018 21:46:46 -0500
Subject: [PATCH v2 1/2] Add user-callable SHA-2 functions

Add the user-callable functions sha224, sha256, sha384, sha512.  We
already had these in the C code to support SCRAM, but there was no test
coverage outside of the SCRAM tests.  Adding these as user-callable
functions allows writing some tests.  Also, we have a user-callable md5
function but no more modern alternative, which led to wide use of md5 as
a general-purpose hash function, which leads to occasional complaints
about using md5.

Also mark the existing md5 functions as leak-proof.
---
 doc/src/sgml/func.sgml   |  71 -
 src/backend/utils/adt/Makefile   |   3 +-
 src/backend/utils/adt/cryptohashes.c | 170 +++
 src/backend/utils/adt/varlena.c  |  48 -
 src/include/catalog/pg_proc.h|  12 ++-
 src/test/regress/expected/opr_sanity.out |   6 ++
 src/test/regress/expected/strings.out|  53 ++
 src/test/regress/sql/strings.sql |  18 
 8 files changed, 329 insertions(+), 52 deletions(-)
 create mode 100644 src/backend/utils/adt/cryptohashes.c

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1e535cf215..2f59af25a6 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3640,7 +3640,7 @@ Other Binary String Functions
returning the result in hexadecimal
   
   md5(E'Th\\000omas'::bytea)
-  8ab2d3c9689aaf18 b4958c334c82d8b1
+  
8ab2d3c9689aaf18b4958c334c82d8b1
  
 
   
@@ -3674,6 +3674,66 @@ Other Binary String Functions
set_byte(E'Th\\000omas'::bytea, 4, 64)
Th\000o@as
   
+
+  
+   
+
+ sha224
+
+sha224(bytea)
+   
+   bytea
+   
+SHA-224 hash
+   
+   sha224('abc')
+   
\x23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7
+  
+
+  
+   
+
+ sha256
+
+sha256(bytea)
+   
+   bytea
+   
+SHA-256 hash
+   
+   sha256('abc')
+   
\xba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
+  
+
+  
+   
+
+ sha384
+
+sha384(bytea)
+   
+   bytea
+   
+SHA-384 hash
+   
+   sha384('abc')
+   
\xcb00753f45a35e8bb5a03d699ac65007272c32ab0eded1631a8b605a43ff5bed8086072ba1e7cc2358baeca134c825a7
+  
+
+  
+   
+
+ sha512
+
+sha512(bytea)
+   
+   bytea
+   
+SHA-512 hash
+   
+   sha512('abc')
+   
\xddaf35a193617abacc417349ae20413112e6fa4e89a97ea20a964b55d39a2192992a274fc1a836ba3c23a3feebbd454d4423643ce80e2a9ac94fa54ca49f
+  
 

   
@@ -3686,6 +3746,15 @@ Other Binary String Functions
the first byte, and bit 15 is the most significant bit of the second byte.
   
 
+  
+   Note that for historic reasons, the function md5
+   returns a hex-encoded value of type text whereas the SHA-2
+   functions return type bytea.  Use the functions
+   encode and decode to convert
+   between the two, for example encode(sha256('abc'),
+   'hex') to get a hex-encoded text representation.
+  
+
   
See also the aggregate function string_agg in
 and the large object functions
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 61ca90312f..4b35dbb8bb 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -11,7 +11,8 @@ include $(top_builddir)/src/Makefile.global
 # keep this list arranged alphabetically or it gets to be a mess
 OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \
array_typanalyze.o array_userfuncs.o arrayutils.o ascii.o \
-   bool.o cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \
+   bool.o cash.o char.o cryptohashes.o \
+   date.o datetime.o datum.o dbsize.o domains.o \
encode.o enum.o expandeddatum.o expandedrecord.o \
float.o format_type.o formatting.o genfile.o \
geo_ops.o geo_selfuncs.o geo_spgist.o inet_cidr_ntop.o inet_net_pton.o \
diff --git a/src/backend/utils/adt/cryptohashes.c 
b/src/backend/utils/adt/cryptohashes.c
new file mode 100644
index 00..1f242eba71
--- /dev/null
+++ b/src/backend/utils/adt/cryptohashes.c
@@ -0,0 +1,170 @@
+/*-

Re: SHA-2 functions

2018-02-20 Thread Michael Paquier
On Tue, Feb 20, 2018 at 05:09:48PM -0500, Tom Lane wrote:
> Peter Eisentraut  writes:
>> On 2/19/18 21:07, Michael Paquier wrote:
>>> varlena.c is already large and messy.  I would suggest to split into a
>>> new file all the user-facing cryptographic functions, including md5 and
>>> hex functions, say in src/backend/utils/adt/crypt.c.
> 
>> I had originally started a new file called hash.c, but I figured that
>> would be quite confusing.  I can use crypt.c or a similar name.
>> Although crypt.c sounds a lot like crypt().
> 
> cryptohashes.c or some such?  I concur with Michael that dropping this
> into varlena.c isn't a great plan.

I think that crypto_hash.c or hash_crypt.c would be adapted as well.
crypt.c is too much generic, so including both concepts in the name is
the way to go.  The name given by Tom here sounds actually nice.
--
Michael


signature.asc
Description: PGP signature


Re: SHA-2 functions

2018-02-20 Thread Tom Lane
Peter Eisentraut  writes:
> On 2/19/18 21:07, Michael Paquier wrote:
>> varlena.c is already large and messy.  I would suggest to split into a
>> new file all the user-facing cryptographic functions, including md5 and
>> hex functions, say in src/backend/utils/adt/crypt.c.

> I had originally started a new file called hash.c, but I figured that
> would be quite confusing.  I can use crypt.c or a similar name.
> Although crypt.c sounds a lot like crypt().

cryptohashes.c or some such?  I concur with Michael that dropping this
into varlena.c isn't a great plan.

regards, tom lane



Re: SHA-2 functions

2018-02-20 Thread Peter Eisentraut
On 2/19/18 21:07, Michael Paquier wrote:
> +   sha224('abc')
> +   
> \x23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7
> Some bytea characters from the hash are not able to show up correctly?
> This does not result in spaces.

U+200B is a zero-width space, used here to hint for possible line breaks.

> +   Note that for historic reasons, the function md5
> +   returns a hex-encoded value of type text whereas the SHA-2
> +   functions return type bytea.  Use the functions
> +   encode and decode to convert
> +   between the two.
> Adding an example would be nice.

OK

> varlena.c is already large and messy.  I would suggest to split into a
> new file all the user-facing cryptographic functions, including md5 and
> hex functions, say in src/backend/utils/adt/crypt.c.

I had originally started a new file called hash.c, but I figured that
would be quite confusing.  I can use crypt.c or a similar name.
Although crypt.c sounds a lot like crypt().

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: SHA-2 functions

2018-02-19 Thread Michael Paquier
On Mon, Feb 19, 2018 at 08:43:44AM -0500, Peter Eisentraut wrote:
> I also noticed while working on some SSL code that we have perfectly
> good SHA-2 functionality in the server already, but it has no test
> coverage outside the SCRAM tests.

Yep, the refactoring in src/common/ has been done for SCRAM.  As The
first versions of the patch were for SCRAM-SHA-1, only SHA-1 functions
were moved.  With SCRAM-SHA-256, the full set of APIs for 256, 386 and
512 has been moved as they share much code.

> So I suggest these patches that expose the new functions sha224(),
> sha256(), sha384(), sha512().  That allows us to make the SSL and SCRAM
> tests more robust, and it will allow them to be used in general purpose
> contexts over md5().

Huge +1 as well.  This also makes sure that the non-SSL implementation
carried in Postgres is consistent what what OpenSSL has.  This would
matter as well if new SSL implementations are added in the future.

+   sha224('abc')
+   
\x23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7
Some bytea characters from the hash are not able to show up correctly?
This does not result in spaces.

+   Note that for historic reasons, the function md5
+   returns a hex-encoded value of type text whereas the SHA-2
+   functions return type bytea.  Use the functions
+   encode and decode to convert
+   between the two.
Adding an example would be nice.

varlena.c is already large and messy.  I would suggest to split into a
new file all the user-facing cryptographic functions, including md5 and
hex functions, say in src/backend/utils/adt/crypt.c.
--
Michael


signature.asc
Description: PGP signature


Re: SHA-2 functions

2018-02-19 Thread Michael Paquier
On Mon, Feb 19, 2018 at 03:02:02PM -0500, Peter Eisentraut wrote:
> On 2/19/18 09:06, Aleksander Alekseev wrote:
>>> So I suggest these patches that expose the new functions sha224(),
>>> sha256(), sha384(), sha512().  That allows us to make the SSL and SCRAM
>>> tests more robust, and it will allow them to be used in general purpose
>>> contexts over md5().
>> 
>> Nice patch. I wonder though whether tests should include hashing
>> non-ASCII characters as well.
> 
> The input is of type bytea, so the concept of non-ASCII characters
> doesn't quite apply.

Encoding issues is a reason to use bytea output in some cases.  For
logical decoding this is for example also why an interface like
pg_logical_slot_peek_binary_changes() exists...  Back to the patch.
--
Michael


signature.asc
Description: PGP signature


Re: SHA-2 functions

2018-02-19 Thread Peter Eisentraut
On 2/19/18 09:06, Aleksander Alekseev wrote:
>> So I suggest these patches that expose the new functions sha224(),
>> sha256(), sha384(), sha512().  That allows us to make the SSL and SCRAM
>> tests more robust, and it will allow them to be used in general purpose
>> contexts over md5().
> 
> Nice patch. I wonder though whether tests should include hashing
> non-ASCII characters as well.

The input is of type bytea, so the concept of non-ASCII characters
doesn't quite apply.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: SHA-2 functions

2018-02-19 Thread Joe Conway
On 02/19/2018 08:43 AM, Peter Eisentraut wrote:
> I also noticed while working on some SSL code that we have perfectly
> good SHA-2 functionality in the server already, but it has no test
> coverage outside the SCRAM tests.
> 
> So I suggest these patches that expose the new functions sha224(),
> sha256(), sha384(), sha512().  That allows us to make the SSL and SCRAM
> tests more robust, and it will allow them to be used in general purpose
> contexts over md5().

I didn't look closely at the patch itself, but +1 on the concept.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: SHA-2 functions

2018-02-19 Thread Aleksander Alekseev
Hello Peter,

> So I suggest these patches that expose the new functions sha224(),
> sha256(), sha384(), sha512().  That allows us to make the SSL and SCRAM
> tests more robust, and it will allow them to be used in general purpose
> contexts over md5().

Nice patch. I wonder though whether tests should include hashing
non-ASCII characters as well.

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


SHA-2 functions

2018-02-19 Thread Peter Eisentraut
There was a complaint recently about the documentation using the widely
frowned-upon md5() function in an unrelated context as an example hash
function.  This is quite common in many examples, such as hashing row
values to compare them, or hashing datums if they don't fit into an
index.  But there is nothing we can easily replace md5 with, without
going to things like pgcrypto.

I also noticed while working on some SSL code that we have perfectly
good SHA-2 functionality in the server already, but it has no test
coverage outside the SCRAM tests.

So I suggest these patches that expose the new functions sha224(),
sha256(), sha384(), sha512().  That allows us to make the SSL and SCRAM
tests more robust, and it will allow them to be used in general purpose
contexts over md5().

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 52edfab8f1175c69ba791139fde26feb9279364a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pete...@gmx.net>
Date: Tue, 6 Feb 2018 21:46:46 -0500
Subject: [PATCH 1/2] Add user-callable SHA-2 functions

Add the user-callable functions sha224, sha256, sha384, sha512.  We
already had these in the C code to support SCRAM, but there was no test
coverage outside of the SCRAM tests.  Adding these as user-callable
functions allows writing some tests.  Also, we have a user-callable md5
function but no more modern alternative, which led to wide use of md5 as
a general-purpose hash function, which leads to occasional complaints
about using md5.

Also mark the existing md5 functions as leak-proof.
---
 doc/src/sgml/func.sgml   |  70 -
 src/backend/utils/adt/varlena.c  | 101 +++
 src/include/catalog/pg_proc.h|  12 +++-
 src/test/regress/expected/opr_sanity.out |   6 ++
 src/test/regress/expected/strings.out|  53 
 src/test/regress/sql/strings.sql |  18 ++
 6 files changed, 257 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1e535cf215..bab828d507 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3640,7 +3640,7 @@ Other Binary String Functions
returning the result in hexadecimal
   
   md5(E'Th\\000omas'::bytea)
-  8ab2d3c9689aaf18 b4958c334c82d8b1
+  
8ab2d3c9689aaf18b4958c334c82d8b1
  
 
   
@@ -3674,6 +3674,66 @@ Other Binary String Functions
set_byte(E'Th\\000omas'::bytea, 4, 64)
Th\000o@as
   
+
+  
+   
+
+ sha224
+
+sha224(bytea)
+   
+   bytea
+   
+SHA-224 hash
+   
+   sha224('abc')
+   
\x23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7
+  
+
+  
+   
+
+ sha256
+
+sha256(bytea)
+   
+   bytea
+   
+SHA-256 hash
+   
+   sha256('abc')
+   
\xba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
+  
+
+  
+   
+
+ sha384
+
+sha384(bytea)
+   
+   bytea
+   
+SHA-384 hash
+   
+   sha384('abc')
+   
\xcb00753f45a35e8bb5a03d699ac65007272c32ab0eded1631a8b605a43ff5bed8086072ba1e7cc2358baeca134c825a7
+  
+
+  
+   
+
+ sha512
+
+sha512(bytea)
+   
+   bytea
+   
+SHA-512 hash
+   
+   sha512('abc')
+   
\xddaf35a193617abacc417349ae20413112e6fa4e89a97ea20a964b55d39a2192992a274fc1a836ba3c23a3feebbd454d4423643ce80e2a9ac94fa54ca49f
+  
 

   
@@ -3686,6 +3746,14 @@ Other Binary String Functions
the first byte, and bit 15 is the most significant bit of the second byte.
   
 
+  
+   Note that for historic reasons, the function md5
+   returns a hex-encoded value of type text whereas the SHA-2
+   functions return type bytea.  Use the functions
+   encode and decode to convert
+   between the two.
+  
+
   
See also the aggregate function string_agg in
 and the large object functions
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 304cb26691..d1266f26f5 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -23,6 +23,7 @@
 #include "catalog/pg_type.h"
 #include "common/int.h"
 #include "common/md5.h"
+#include "common/sha2.h"
 #include "lib/hyperloglog.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
@@ -4611,6 +4612,106 @@ md5_bytea(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(hexsum));
 }
 
+/*
+ * SHA-2 variants
+ */
+
+Datum
+sha224_bytea(PG_FUNCTION_ARGS)
+{
+   bytea  *in = PG_GETARG_BYTEA_PP(0);
+   const uint8 *data;
+   size_t  len;
+   pg_sha224_ctx ctx;
+   unsigned char buf[PG_SHA224_DIGEST_LENGTH];
+   bytea  *result;
+
+