Re: [GENERAL] md5 of table
I thought about using dblink and the EXCEPT query, but then I need to know the field list of each query result, which is a pain in the butt. That is not correct. As long as the table definitions are precisely the same, you can move records across dblink without specifying fields. You do this by using record type for the composite which dblink sends across as text. merlin Do you have a quick example? This is what I have tried: select * from tbla except select * from dblink('host=dbhost dbname=otherdb user=myuser password=mypwd'::text, 'select * from tbla') The error I get back is: ERROR: a column definition list is required for functions returning record sure: select tbla from tbla except select t::tbla from dblink('host=dbhost dbname=otherdb user=myuser password=mypwd'::text, 'select tbla::text from tbla') R(t text); We tried something like that. Unfortunately, in 8.2 you can't cast a row type as text. there's a bunch of ways to do that -- you can also do the md5 on the remote side so you can just send the digests. select * from tbla except select (t::tbla).* from dblink('host=dbhost dbname=otherdb user=myuser password=mypwd'::text, 'select tbla::text from tbla') R(t text); should also work. This *might* work -- I didn't try. It's been a while since I've used stock dblink. select * from tbla except select (t).* from dblink('host=dbhost dbname=otherdb user=myuser password=mypwd'::text, 'select tbla from tbla') R(t tbla); merlin This looks like it might work for us. At least I would only need the table name for the field list instead of the entire column list. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] md5 of table
On Thursday 01 September 2011 11:47:24 Sim Zacks wrote: Is there a way to get an md5 or other hash of an entire table? I want to be able to easily compare 2 tables in different databases. I thought about using dblink and the EXCEPT query, but then I need to know the field list of each query result, which is a pain in the butt. If I could return an md5 of the entire table, then I could check if the tables have the same hash and be confident enough that the tables were identical. Thanks Sim You might also want to take a look at http://pgfoundry.org/projects/pg-comparator/ which can give a more nuanced view of db differences and tries to be smart about performance. It looks a bit stale; I haven't used it in ages, but it used to be a trusty part of our test suite. -- Vincent de Phily -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] md5 of table
Is there a way to get an md5 or other hash of an entire table? I want to be able to easily compare 2 tables in different databases. I thought about using dblink and the EXCEPT query, but then I need to know the field list of each query result, which is a pain in the butt. If I could return an md5 of the entire table, then I could check if the tables have the same hash and be confident enough that the tables were identical. Thanks Sim -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] md5 of table
On Thu, Sep 01, 2011 at 11:47:24AM +0300, Sim Zacks wrote: Is there a way to get an md5 or other hash of an entire table? I want to be able to easily compare 2 tables in different databases. I thought about using dblink and the EXCEPT query, but then I need to know the field list of each query result, which is a pain in the butt. If I could return an md5 of the entire table, then I could check if the tables have the same hash and be confident enough that the tables were identical. One option might be to pg_dump in an appropriate format and md5-compare the output ? Another option might be to - cross-check columns/column types - query from information_schema - compare row counts - may need a lock - compare table sizes - may need vaccum ? If all three match that may be good enough ? Karsten -- GPG key ID E4071346 @ gpg-keyserver.de E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] md5 of table
md5 has size limitations, the second approach seems more practical. Στις Thursday 01 September 2011 12:30:45 ο/η Karsten Hilbert έγραψε: On Thu, Sep 01, 2011 at 11:47:24AM +0300, Sim Zacks wrote: Is there a way to get an md5 or other hash of an entire table? I want to be able to easily compare 2 tables in different databases. I thought about using dblink and the EXCEPT query, but then I need to know the field list of each query result, which is a pain in the butt. If I could return an md5 of the entire table, then I could check if the tables have the same hash and be confident enough that the tables were identical. One option might be to pg_dump in an appropriate format and md5-compare the output ? Another option might be to - cross-check columns/column types - query from information_schema - compare row counts - may need a lock - compare table sizes - may need vaccum ? If all three match that may be good enough ? Karsten -- GPG key ID E4071346 @ gpg-keyserver.de E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346 -- Achilleas Mantzios -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] md5 of table
On 09/01/2011 12:26 PM, Pavel Stehule wrote: Hello postgres=# create table tt(a int, b varchar); CREATE TABLE postgres=# insert into tt values(10,'hello'); INSERT 0 1 postgres=# select md5(array_to_string(array_agg(md5(tt::text)),'')) from tt; md5 -- 20a92a676f52699e613da1bb114bd6f0 (1 row) Regards Pavel Stehule Would be perfect, but 8.2 can't cast a UDT to text. ERROR: cannot cast type tt to text LINE 1: select tt::text from tt Thanks Sim -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] md5 of table
On Thu, Sep 1, 2011 at 11:14 AM, Sim Zacks s...@compulab.co.il wrote: On 09/01/2011 12:26 PM, Pavel Stehule wrote: Hello postgres=# create table tt(a int, b varchar); CREATE TABLE postgres=# insert into tt values(10,'hello'); INSERT 0 1 postgres=# select md5(array_to_string(array_agg(md5(tt::text)),'')) from I do that as well, but it might have questionable performance when your table has 16M rows, and is 50GB + -- GJ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] md5 of table
I am not sure if this will work, but you can try it http://www.pgsql.cz/index.php/PostgreSQL_SQL_Tricks#Cast_to_varchar Pavel I appreciate your help, but UDTs don't have input/ouput functions unless you define them manually and I need this for all of my tables. Thanks Sim
Re: [GENERAL] md5 of table
On 09/01/2011 01:35 PM, Grzegorz Jaśkiewicz wrote: On Thu, Sep 1, 2011 at 11:14 AM, Sim Zacks s...@compulab.co.il wrote: On 09/01/2011 12:26 PM, Pavel Stehule wrote: Hello postgres=# create table tt(a int, b varchar); CREATE TABLE postgres=# insert into tt values(10,'hello'); INSERT 0 1 postgres=# select md5(array_to_string(array_agg(md5(tt::text)),'')) from I do that as well, but it might have questionable performance when your table has 16M rows, and is 50GB + I don't need performance. This is for regression testing for a new database version. I want to run my functions in the old db and the new db and when it modifies a table, I want to be able to check that the tables are the same. Sim
Re: [GENERAL] md5 of table
On Thu, Sep 1, 2011 at 3:48 AM, Achilleas Mantzios ach...@matrix.gatewaynet.com wrote: md5 has size limitations, the second approach seems more practical. Really? I was not aware of size limits of md5, what are they? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] md5 of table
Στις Thursday 01 September 2011 15:50:21 ο/η Scott Marlowe έγραψε: On Thu, Sep 1, 2011 at 3:48 AM, Achilleas Mantzios ach...@matrix.gatewaynet.com wrote: md5 has size limitations, the second approach seems more practical. Really? I was not aware of size limits of md5, what are they? sorry, i was wrong. i dont know why i had this impression, just checked with a 43GB table on a freebsd machine and went fine. -- Achilleas Mantzios -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] md5 of table
2011/9/1 Grzegorz Jaśkiewicz gryz...@gmail.com: On Thu, Sep 1, 2011 at 11:14 AM, Sim Zacks s...@compulab.co.il wrote: On 09/01/2011 12:26 PM, Pavel Stehule wrote: Hello postgres=# create table tt(a int, b varchar); CREATE TABLE postgres=# insert into tt values(10,'hello'); INSERT 0 1 postgres=# select md5(array_to_string(array_agg(md5(tt::text)),'')) from I do that as well, but it might have questionable performance when your table has 16M rows, and is 50GB + you need order by for that to work. I would do it like this: select md5(array(select foo from foo order by foo_pkey)::text); it's great quick'n'dirty, but not much scalable beyond millions. OP: I thought about using dblink and the EXCEPT query, but then I need to know the field list of each query result, which is a pain in the butt. That is not correct. As long as the table definitions are precisely the same, you can move records across dblink without specifying fields. You do this by using record type for the composite which dblink sends across as text. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] md5 of table
OP: I thought about using dblink and the EXCEPT query, but then I need to know the field list of each query result, which is a pain in the butt. That is not correct. As long as the table definitions are precisely the same, you can move records across dblink without specifying fields. You do this by using record type for the composite which dblink sends across as text. merlin Do you have a quick example? This is what I have tried: select * from tbla except select * from dblink('host=dbhost dbname=otherdb user=myuser password=mypwd'::text, 'select * from tbla') The error I get back is: ERROR: a column definition list is required for functions returning record Sim -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] md5 of table
2011/9/1 Merlin Moncure mmonc...@gmail.com: 2011/9/1 Grzegorz Jaśkiewicz gryz...@gmail.com: On Thu, Sep 1, 2011 at 11:14 AM, Sim Zacks s...@compulab.co.il wrote: On 09/01/2011 12:26 PM, Pavel Stehule wrote: Hello postgres=# create table tt(a int, b varchar); CREATE TABLE postgres=# insert into tt values(10,'hello'); INSERT 0 1 postgres=# select md5(array_to_string(array_agg(md5(tt::text)),'')) from I do that as well, but it might have questionable performance when your table has 16M rows, and is 50GB + you need order by for that to work. I would do it like this: select md5(array(select foo from foo order by foo_pkey)::text); it's great quick'n'dirty, but not much scalable beyond millions. I've always liked doing this with my pager: [robert@client-168] export PAGER=md5 -=[11:40:25 Thu Sep 01]=---=[ pagila-0.10.1 ]=- [robert@client-168] psql -hlocalhost -dpagila psql (9.0.4, server 9.1beta3) WARNING: psql version 9.0, server version 9.1. Some psql features might not work. Type help for help. pagila=# select * from actor order by actor_id; f381ebdefe0aada9c0bc14e657962c1f Robert Treat conjecture: xzilla.net consulting: omniti.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] md5 of table
2011/9/1 Sim Zacks s...@compulab.co.il: OP: I thought about using dblink and the EXCEPT query, but then I need to know the field list of each query result, which is a pain in the butt. That is not correct. As long as the table definitions are precisely the same, you can move records across dblink without specifying fields. You do this by using record type for the composite which dblink sends across as text. merlin Do you have a quick example? This is what I have tried: select * from tbla except select * from dblink('host=dbhost dbname=otherdb user=myuser password=mypwd'::text, 'select * from tbla') The error I get back is: ERROR: a column definition list is required for functions returning record sure: select tbla from tbla except select t::tbla from dblink('host=dbhost dbname=otherdb user=myuser password=mypwd'::text, 'select tbla::text from tbla') R(t text); there's a bunch of ways to do that -- you can also do the md5 on the remote side so you can just send the digests. select * from tbla except select (t::tbla).* from dblink('host=dbhost dbname=otherdb user=myuser password=mypwd'::text, 'select tbla::text from tbla') R(t text); should also work. This *might* work -- I didn't try. It's been a while since I've used stock dblink. select * from tbla except select (t).* from dblink('host=dbhost dbname=otherdb user=myuser password=mypwd'::text, 'select tbla from tbla') R(t tbla); merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] md5 of table
On Thu, Sep 1, 2011 at 7:56 AM, Achilleas Mantzios ach...@matrix.gatewaynet.com wrote: Στις Thursday 01 September 2011 15:50:21 ο/η Scott Marlowe έγραψε: Really? I was not aware of size limits of md5, what are they? sorry, i was wrong. i dont know why i had this impression, just checked with a 43GB table on a freebsd machine and went fine. Well, it might have been an older version or a 32 bit version or something you were thinking of. Sometimes a long memory helps us get the wrong answer. :) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general