Re: [HACKERS] strange row number estimates in pg9.1rc1

2011-09-01 Thread Bruce Momjian
Tom Lane wrote:
 I wrote:
  Sergey E. Koposov m...@sai.msu.ru writes:
  I'm seeing something weird which looks like a bug in 9.1rc1 after the 
  upgrade 8.4-9.0-9.1 done using pg_upgrade.
 
  Hm, I wonder what pg_upgrade left relpages/reltuples set to ...
 
 Sure enough, that's the problem.  pg_upgrade leaves relpages/reltuples
 set to zero, but it also imports the visibility map pages from the old
 cluster.  If the old visibility map shows the table as all-visible,
 then this happens when you try to VACUUM ANALYZE the table:
 
 1. VACUUM doesn't process any pages, so it has no tuple density
 estimate.  It leaves reltuples set to zero, but it does set relpages.
 
 2. ANALYZE scans some part of the table.  It gets a tuple density
 estimate for those pages ... but if that's only a small fraction of
 the table, it believes the zero estimate of tuple density elsewhere.
 So you get only a small update of reltuples.
 
 (The above behavior is new as of commit
 b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8, BTW.)
 
 Basically, step 1 is buggy here: if we aren't making an update to
 reltuples, we shouldn't set relpages either.  Setting it nonzero
 changes the implied tuple density from unknown to known zero,
 which is wrong.
 
 I'll go fix that, but I think it might be a good idea for pg_upgrade
 to think about preserving the relpages/reltuples columns ...

pg_upgrade currently only restores some oids and frozenxids.  We would
need to modify pg_dump --binary-upgrade mode to restore those values ---
it isn't hard to do.

 PS: right now, you cannot reproduce this in a 9.0 - HEAD upgrade,
 because of this patch:
 
 commit 00a7c9014a8fbb7388a807daeba3e0a85b49a747
 Author: Bruce Momjian br...@momjian.us
 Date:   Fri Aug 19 11:20:30 2011 -0400
 
 In pg_upgrade, don't copy visibility map files from clusters that did not
 have crash-safe visibility maps to clusters that expect crash-safety.
 
 Request from Robert Haas.
 
 I did reproduce it in a 9.0-9.1 test.

Right, that is expected.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange row number estimates in pg9.1rc1

2011-08-30 Thread Tom Lane
I wrote:
 Sergey E. Koposov m...@sai.msu.ru writes:
 I'm seeing something weird which looks like a bug in 9.1rc1 after the 
 upgrade 8.4-9.0-9.1 done using pg_upgrade.

 Hm, I wonder what pg_upgrade left relpages/reltuples set to ...

Sure enough, that's the problem.  pg_upgrade leaves relpages/reltuples
set to zero, but it also imports the visibility map pages from the old
cluster.  If the old visibility map shows the table as all-visible,
then this happens when you try to VACUUM ANALYZE the table:

1. VACUUM doesn't process any pages, so it has no tuple density
estimate.  It leaves reltuples set to zero, but it does set relpages.

2. ANALYZE scans some part of the table.  It gets a tuple density
estimate for those pages ... but if that's only a small fraction of
the table, it believes the zero estimate of tuple density elsewhere.
So you get only a small update of reltuples.

(The above behavior is new as of commit
b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8, BTW.)

Basically, step 1 is buggy here: if we aren't making an update to
reltuples, we shouldn't set relpages either.  Setting it nonzero
changes the implied tuple density from unknown to known zero,
which is wrong.

I'll go fix that, but I think it might be a good idea for pg_upgrade
to think about preserving the relpages/reltuples columns ...

regards, tom lane

PS: right now, you cannot reproduce this in a 9.0 - HEAD upgrade,
because of this patch:

commit 00a7c9014a8fbb7388a807daeba3e0a85b49a747
Author: Bruce Momjian br...@momjian.us
Date:   Fri Aug 19 11:20:30 2011 -0400

In pg_upgrade, don't copy visibility map files from clusters that did not
have crash-safe visibility maps to clusters that expect crash-safety.

Request from Robert Haas.

I did reproduce it in a 9.0-9.1 test.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange row number estimates in pg9.1rc1

2011-08-30 Thread Sergey E. Koposov

On Tue, 30 Aug 2011, Tom Lane wrote:


Sure enough, that's the problem.  pg_upgrade leaves relpages/reltuples
set to zero, but it also imports the visibility map pages from the old
cluster.  If the old visibility map shows the table as all-visible,
then this happens when you try to VACUUM ANALYZE the table:

1. VACUUM doesn't process any pages, so it has no tuple density
estimate.  It leaves reltuples set to zero, but it does set relpages.

2. ANALYZE scans some part of the table.  It gets a tuple density
estimate for those pages ... but if that's only a small fraction of
the table, it believes the zero estimate of tuple density elsewhere.
So you get only a small update of reltuples.


Thanks for figuring this out.
I wonder what should be the best way to proceed for already migrated 
databases -- running analyze repeatedly may not be the best way for 
very large clusters with large tables...


S

***
Sergey E. Koposov, PhD
Institute for Astronomy, Cambridge/Sternberg Astronomical Institute
Web: http://lnfm1.sai.msu.ru/~math
E-mail: m...@sai.msu.ru

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange row number estimates in pg9.1rc1

2011-08-30 Thread Tom Lane
Sergey E. Koposov m...@sai.msu.ru writes:
 I wonder what should be the best way to proceed for already migrated 
 databases -- running analyze repeatedly may not be the best way for 
 very large clusters with large tables...

You can just manually update pg_class.reltuples to a more reasonable
value.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange row number estimates in pg9.1rc1

2011-08-29 Thread Tom Lane
Sergey E. Koposov m...@sai.msu.ru writes:
 I'm seeing something weird which looks like a bug in 9.1rc1 after the 
 upgrade 8.4-9.0-9.1 done using pg_upgrade.

Hm, I wonder what pg_upgrade left relpages/reltuples set to ...

 INFO:  lassource: found 0 removable, 0 nonremovable row versions in 0 
 out of 6451184 pages

VACUUM is deciding that the table is all-visible and it need not scan
any of it ...

 INFO:  lassource: scanned 3 of 6451184 pages, containing 27 live 
 rows and 0 dead rows; 3 rows in sample, 806239 estimated total rows

... and then ANALYZE is scanning only 3 of 6451184 pages, so even
though it knows the correct tuple density in those pages, it's not
licensed to change the reltuples total very much.  If reltuples is way
off base to start with, then it will take a lot of ANALYZE cycles to
converge on a saner value.  However, this doesn't leave us much closer
to understanding why it's off base to start with.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] strange row number estimates in pg9.1rc1

2011-08-28 Thread Sergey E. Koposov

Hello hackers,

I'm seeing something weird which looks like a bug in 9.1rc1 after the 
upgrade 8.4-9.0-9.1 done using pg_upgrade.


I have a set of *static* tables for which explain select *  gives 
row number estimates which are an order of magnitude lower than the actual 
number of rows in a table (despite the vacuum analyze executed  on a 
table immediately before). See:


wsdb= vacuum verbose analyze ukidssdr7.lassource;
INFO:  vacuuming ukidssdr7.lassource
INFO:  index ukidssdr7lassource_q3c_idx now contains 58060655 row 
versions in

143515 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.55s/0.19u sec elapsed 3.23 sec.
.
INFO:  lassource: found 0 removable, 0 nonremovable row versions in 0 
out of 6451184 pages

DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 2.66s/0.94u sec elapsed 17.92 sec.
INFO:  analyzing ukidssdr7.lassource
INFO:  lassource: scanned 3 of 6451184 pages, containing 27 live 
rows and 0 dead rows; 3 rows in sample, 806239 estimated total rows


wsdb= explain select * from ukidssdr7.lassource ;
  QUERY PLAN
--
 Seq Scan on lassource  (cost=0.00..6459246.39 rows=806239 width=766)

wsdb= select count(*) from ukidssdr7.lassource ;
  count
--
 58060655
(1 row)

All the columns in that table have fixed width types (e.g. 
real,int,bigint etc; no varchars, texts). So I don't see the reason why 
the row number estimate must be so much off. I also checked that the size 
of the relation is almost exactly equal to  width * count(*) = 
766*58060655. So there is no empty space anywhere in the relation ( as it 
should be because there was completely no write activity on the table).


And I noticed that at least for several tables with hundreds of millions 
rows, explain select * shows ridiculously small number of expected rows:

wsdb= explain select * from sdssdr7.phototag ;
  QUERY PLAN
--
 Seq Scan on phototag  (cost=0.00..24408626.00 rows=72 width=288)

I guess it may be important that I did upgrade the cluster from 8.4 to 
9.0 and to 9.1 using pg_upgrade. vacuum analyze have been run on the cluster. 
after the ugprades.


Am i missing something or is it a bug ? it looks to me like some 
arithmetic error in the computation of the number of rows in the tables.

At least before when I was using PG 8.4 for the same data, I was used to
do explain select * to get the number of rows in the tables, instead of 
count(*) (my tables are very large), now it seems that there is a huge 
discrepancy between the numbers.


Thanks,
Sergey

***
Sergey E. Koposov, PhD
Institute for Astronomy, Cambridge/Sternberg Astronomical Institute
Web: http://lnfm1.sai.msu.ru/~math
E-mail: m...@sai.msu.ru

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers