On Thu, Jun 13, 2013 at 12:35 PM, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
> Attached is a patch implementing a new aggregate function md5_agg() to
> compute the aggregate MD5 sum across a number of rows. This is
> something I've wished for a number of times. I think the primary use
> case is to do a quick check that 2 tables, possibly on different
> servers, contain the same data, using a query like
>
> SELECT md5_agg(foo.*::text) FROM (SELECT * FROM foo ORDER BY id) foo;
>
> or
>
> SELECT md5_agg(foo.*::text ORDER BY id) FROM foo;
>
> these would be equivalent to
>
> SELECT md5(string_agg(foo.*::text, '' ORDER BY id)) FROM foo;
>
> but without the excessive memory consumption for the intermediate
> concatenated string, and the resulting 1GB table size limit.

It's more efficient to calculate per-row md5, and then sum() them.
This avoids the need for ORDER BY.

--
marko
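
To illustrate why summing per-row digests does not need ORDER BY, here is a minimal Python sketch (not SQL, and not the patch's code). It assumes each row has already been rendered as text, in the spirit of foo.*::text; the sample rows and the helper names row_md5_sum and concat_md5 are made up for the illustration. Addition is commutative, so the summed value is the same for any row order, whereas the MD5 of the concatenated rows depends on the order.

```python
import hashlib


def row_md5_sum(rows):
    """Sum the MD5 digests of the individual rows, read as 128-bit integers."""
    total = 0
    for row in rows:
        digest = hashlib.md5(row.encode("utf-8")).digest()
        total += int.from_bytes(digest, "big")
    return total


def concat_md5(rows):
    """MD5 of all rows concatenated, similar to md5(string_agg(..., ''))."""
    return hashlib.md5("".join(rows).encode("utf-8")).hexdigest()


# Hypothetical row texts, standing in for foo.*::text.
rows = ["(1,alice)", "(2,bob)", "(3,carol)"]
shuffled = [rows[2], rows[0], rows[1]]

# The per-row sum ignores row order; the concatenated hash does not.
sum_same = row_md5_sum(rows) == row_md5_sum(shuffled)
concat_same = concat_md5(rows) == concat_md5(shuffled)
print(sum_same, concat_same)
```

Note that a plain sum is only an equality check on the multiset of rows; it is not a cryptographic hash of the table.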