On 15 August 2015 at 02:32, David Nelson <dlnelson77...@outlook.com> wrote:
> Hello list,<br><br>Apologies if this has been asked before. My search only > turned up ways to list the total non-null values for all columns as a > single number. I want the count for each column by column.<br><br>I have > inherited a database consisting of two related huge monolithic tables that > lack referential integrity between them, or even basic data constraints. > One of the problems these tables have is every single non-PK column is > NULLable. I am trying to understand the information that is actually stored > and used so that I can implement a (hopefully) better design. Towards that > end I would like to know the count of non-null values in each column per > column. In other words I would like to get the following output from a > table (the numbers are totally made up):<br><br>column_name | > num_values<br>------------+-----------<br>col1 > | > 5787<br>------------+-----------<br>col2 > | > 17<br>------------+-----------<br>col3 > | > 567<br>------------+-----------<br>col4 > | > 5787<br>------------+-----------<br>col5 > | > 143<br>------------+-----------<br>col6 > | > 1<br>------------+-----------<br>...<br><br>Is this possible through one or > more of the system views, or will I need to write a function to do this? > Obviously I can just issue multiple SELECT COUNT(column)... statements, but > I'd rather not.<br><br>Thanks,<br>David > > I didn't quite catch a Postgres version number in that mess :) I assume the tables are quite large if you don't want to just issue a: SELECT COUNT(a) FILTER(WHERE a IS NULL),COUNT(b) FILTER(WHERE b IS NULL) ... (assuming you're on a version new enough to support agg FILTER) On the other hand if you were happy with just an approximation then you could look at pg_stats; create table abc(a int, b int,c int); insert into abc values(1, 1, NULL),(1, NULL, NULL),(NULL, NULL, NULL); analyze abc; select attname, null_frac from pg_stats where tablename = 'abc'; attname | null_frac ---------+----------- a | 0.333333 b | 0.666667 c | 1 Keep in mind though that this is an *approximation* and possibly could be inaccurate. If you want exact do the COUNT(col). If you're not on a version new enough for COUNT(col) FILTER WHERE ..., you could just SUM(CASE WHEN col IS NULL THEN 1 ELSE 0 END) Regards David Rowley -- David Rowley http://www.2ndQuadrant.com/ <http://www.2ndquadrant.com/> PostgreSQL Development, 24x7 Support, Training & Services