Hi,
So, we have various sha* functions.
And I recently got asked about using them as a basis for a unique index
over long texts.
Normally one would do it with md5(text), but the person asking wanted to
use sha*(), and these functions work only on bytea.
And apparently - we can't.
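'text-value'::bytea won't work for some specific text values (for
example, ones containing a backslash).
convert_to() isn't immutable (it is marked stable), so it can't be used
in an index expression.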
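
To make the two problems above concrete, here is a small sketch one can
paste into psql. It assumes a PostgreSQL version that has the core
sha256(bytea) function and the default standard_conforming_strings = on;
the literal values are made up for illustration.

```sql
-- Text without backslashes goes through the bytea input function unchanged:
SELECT sha256('abc'::text::bytea);

-- A backslash is parsed as the start of a bytea escape sequence, so this
-- fails with "invalid input syntax for type bytea":
SELECT sha256('a\b'::text::bytea);

-- A leading \x is parsed as hex input, so these two different texts end up
-- as the same bytea value (and therefore the same hash):
SELECT '\x41'::text::bytea = 'A'::text::bytea AS same_bytes;

-- convert_to() copes with any text, but its volatility is 's' (stable),
-- which is why it is rejected in index expressions:
SELECT proname, provolatile
  FROM pg_proc
 WHERE proname = 'convert_to';
```

So the plain cast is both unsafe (errors) and ambiguous (collisions) for
arbitrary input, on top of convert_to() not being usable in an index.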
I figured out that I can do something like:
    SELECT
        sha256(
            string_agg( ascii( t )::text, ',' ORDER BY idx )::bytea
        )
    FROM
        regexp_split_to_table( 'INPUT_STRING', '' ) WITH ORDINALITY AS x ( t, idx );
But that's hardly a sane solution.
I've read the bug report from 2008:
https://www.postgresql.org/message-id/flat/48D20645.1090503%40gmx.net#ce27df4802c9854a9eb77066a5c7cb05
And while I kinda understand the create-conversion and server-encoding
issues, I don't really *grok* why we can't have an immutable conversion
to bytea, and/or versions of the sha* functions that simply work on text.
Is it doable? How does it work in md5()? Apparently it also works in
pgcrypto's digest(), so there should be a way to get it in the core sha*
functions?
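
For comparison, this is roughly what already works today with md5() and
with pgcrypto. It is only a sketch: the table and index names are
hypothetical, and it assumes the pgcrypto extension is available for
installation.

```sql
-- Hypothetical table holding the long texts.
CREATE TABLE long_texts (
    body text NOT NULL
);

-- Core md5() accepts text directly and is immutable, so it can be indexed:
CREATE UNIQUE INDEX long_texts_body_md5
    ON long_texts ( md5( body ) );

-- pgcrypto's digest(text, text) also accepts text and is immutable:
CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE UNIQUE INDEX long_texts_body_sha256
    ON long_texts ( digest( body, 'sha256' ) );
```

What is missing is the equivalent of the second index using only the
core sha256() and friends.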
Best regards,
depesz