On Wed, Aug 31, 2011 at 11:12 AM, Ross J. Reedstrom <reeds...@rice.edu> wrote:
> Hmm, this thread seems to have petered out without a conclusion. Just
> wanted to comment that there _are_ non-password storage uses for these
> digests: I use them in a context of storing large files in a bytea
> column, as a means to doing data deduplication, and avoiding pushing
> files from clients to server and back.
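The deduplication pattern described in the quoted message can be illustrated with a minimal sketch. It is written in Python rather than SQL so that it is self-contained, and it is not how either poster implemented it: the `ContentStore` class, its method names, and the choice of SHA-256 are all assumptions made for illustration. The idea is that a blob is keyed by its digest, so a client can ask whether a digest is already present before pushing the file.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (hypothetical): blobs keyed by digest."""

    def __init__(self):
        self._blobs = {}  # hex digest -> bytes

    @staticmethod
    def digest(data: bytes) -> str:
        # SHA-256 is an assumption here; the thread discusses sha1 and md5.
        return hashlib.sha256(data).hexdigest()

    def has(self, key: str) -> bool:
        # A client can call this first and skip the upload if it returns True.
        return key in self._blobs

    def put(self, data: bytes) -> str:
        key = self.digest(data)
        # setdefault keeps the existing blob, so duplicates cost nothing extra.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


store = ContentStore()
first = store.put(b"large file contents")
second = store.put(b"large file contents")  # duplicate: nothing new is stored
```

Only equality on the digest is ever needed, which is why a plain index on the digest column is enough in the database version of this pattern.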
Yes, agreed: there is no decent content-addressing type in PostgreSQL, so one rolls one's own using shas and joins; I've seen this more than once. It's a useful way to get a non-bloated index on a series of (larger than sha1) values where one only cares about the equality operator. (Hash indexes, unattractive as they already were in PostgreSQL's implementation, are even less attractive now with streaming replication, because they are not WAL-logged.)

When the content to be addressed can be submitted from another source, anything with md5 is correctly met with suspicion. We have gone to the trouble of using pgcrypto to get sha1 access, but I know of other applications that would have preferred to use sha and settled for md5 simply because it's known to be bundled in core everywhere.

CREATE EXTENSION could ablate this issue, particularly if there is *any* way (is there? even with ugliness like utility statement hooks) to configure it on the provider end so that common extensions like 'pgcrypto' do not require superuser. Core could then get off the hash "treadmill" entirely, md5 included. But I think that would be a mistake. Applications need a high-quality digest to enable any kind of principled content-addressing use case, and I think making that any harder than a builtin is going to negatively impact the state of things at large.

As a compromise, I'd also be happy with making CREATE EXTENSION so trivial that everyone who has that use case can get pgcrypto on any hosting provider.

--
fdr

--
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers