Alex et al,
I believe many of you are tying HASH too narrowly to the question of
whether an algorithm is appropriate for cryptography.
As for CRC32 in particular: it produces an output value just 4 bytes
long. Even SHA1, which returns 20 bytes, was judged a security risk in
some places (like SRP), making us move to hashes with longer output.
Sean, I do not know under what conditions it could be used as a
cryptographic hash. In our favorite tomcrypt, hashes usable for crypto
purposes are collected together and may be used interchangeably in
other (higher-level) crypto functions, such as RSA message signing. But
CRC32 is standalone; it cannot be used for something like signing
messages.
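To make the size difference concrete, here is a small sketch in Python
(not Firebird or tomcrypt code; the input string is arbitrary and the
big-endian byte order chosen for the CRC32 value is only an assumption
for illustration):

```python
import hashlib
import zlib

data = b"firebird"  # arbitrary sample input

# CRC32 is a 32-bit checksum; zlib returns it as an unsigned integer,
# so it always fits in exactly 4 bytes.
crc_value = zlib.crc32(data)
crc_bytes = crc_value.to_bytes(4, "big")  # byte order is an assumption

# Cryptographic hashes return fixed-size byte strings.
sha1_bytes = hashlib.sha1(data).digest()
sha256_bytes = hashlib.sha256(data).digest()

print(len(crc_bytes), len(sha1_bytes), len(sha256_bytes))  # 4 20 32
```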
The point here is that for [99% of] SQL users HASH means simply a
*hash*, not necessarily a *cryptographic hash*. And it's
counter-intuitive to think about HASH only from the cryptography POV.
Referring to tomcrypt is just an implementation detail, it should not
affect the design (API). Given that HASH supports non-crypto hashes
(legacy one and not-so-crypto-anymore MD5), adding CRC32 to this list
looks logical.
However, when I agreed to Mark's suggestion, I didn't pay attention to
the output type. While it's technically doable to return different
output types, at least as long as the hash name is an identifier, this is not
really desirable. But before rejecting this, we have to consider future
extensions to the HASH function.
(1) One question was already raised in this thread -- could the hash name
become an expression one day? If yes, it's a showstopper and all HASH
(USING) calls should return VARBINARY. But so far I haven't seen good
reasons for expression-based hash names.
(2) Another question is whether we're going to introduce other
algorithms not backed by the tomcrypt library and possibly returning
a numeric result? CRC64? MurmurHash? If so, it gives us a green light to
add CRC32 into HASH. But at the same time it conflicts with (1). So we
choose either (1) or (2), but not both.
Or we go the compromise route and add CRC32, CRC64, MurmurHash, etc. as
separate functions, thus polluting the grammar (which is what we tried
to avoid with the unified HASH function).
Or we make HASH(USING CRC32) return VARBINARY(4), HASH(USING CRC64)
return VARBINARY(8), etc., and allow casting from
VARBINARY(power-of-two) to exact integers of a large-enough size. Not
very user-friendly, I'd say.
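A rough sketch of what that last option would mean for the user, written
in Python rather than SQL. The helper names crc32_as_varbinary and
varbinary_to_int are hypothetical, and big-endian byte order is my
assumption; nothing here is an existing Firebird API:

```python
import zlib


def crc32_as_varbinary(data: bytes) -> bytes:
    """Hypothetical stand-in for a HASH call that returns VARBINARY(4)."""
    # Assumption: the 32-bit value is stored most significant byte first.
    return zlib.crc32(data).to_bytes(4, "big")


def varbinary_to_int(raw: bytes) -> int:
    """Hypothetical stand-in for the extra cast the user would have to write."""
    if len(raw) not in (1, 2, 4, 8):
        raise ValueError("only 1-, 2-, 4- or 8-byte values can be cast")
    # The value is read as unsigned, so a 4-byte result needs a signed
    # 64-bit integer (BIGINT) to hold every possible CRC32 value.
    return int.from_bytes(raw, "big", signed=False)


raw = crc32_as_varbinary(b"123456789")
value = varbinary_to_int(raw)
print(raw.hex(), value)
```

The user always has to take two steps (hash, then cast) and has to pick
an integer type wide enough for the unsigned value, which is the
user-unfriendly part.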
Dmitry
Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel