Re: [PERFORM] Curious about dead rows.

2007-11-16 Thread Brad Nicholson
On Wed, 2007-11-14 at 17:46 -0500, Tom Lane wrote:
 Russell Smith [EMAIL PROTECTED] writes:
  Is it possible that analyze is not getting the number of dead rows right?
 
 Hah, I think you are on to something.  ANALYZE is telling the truth
 about how many dead rows it saw, but its notion of "dead" is "not good
 according to SnapshotNow".  Thus, rows inserted by a not-yet-committed
 transaction would be counted as dead.  So if these are background
 auto-analyzes being done in parallel with inserting transactions that
 run for awhile, seeing a few not-yet-committed rows would be
 unsurprising.
 
 I wonder if that is worth fixing?  I'm not especially concerned about
 the cosmetic aspect of it, but if we mistakenly launch an autovacuum
 on the strength of an inflated estimate of dead rows, that could be
 costly.

Sounds to me like that could result in autovacuum kicking off while
doing large data loads.  This sounds suspiciously like the problem someone
on -novice was having: tripping over a Windows autovac bug while doing
a data load:

http://archives.postgresql.org/pgsql-novice/2007-11/msg00025.php
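For anyone who wants to rule rollbacks in or out while a load is running, a quick check (a sketch, using only the stats views already quoted elsewhere in this thread):

```sql
-- xact_rollback should stay flat during a clean bulk load;
-- if it climbs, something is aborting transactions behind your back.
SELECT datname, xact_commit, xact_rollback
  FROM pg_stat_database
 WHERE datname = current_database();
```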

-- 
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Curious about dead rows.

2007-11-16 Thread Jean-David Beyer
Craig James wrote:
 Alvaro Herrera wrote:
 To recap:

 - your app only does inserts
 - there has been no rollback lately
 - there are no updates
 - there are no deletes

 The only other source of dead rows I can think of is triggers ... do you
 have any?  (Not necessarily on this table -- perhaps triggers on other
 tables can cause updates on this one).

 Oh, rolled back COPY can cause dead rows too.
 
 
 What about an unreliable network that causes a lot of disconnects? 
 Wouldn't the server process do a rollback?
 
Perhaps in theory, but in practice my client and the PostgreSQL server are
on the same machine, and 127.0.0.1 is pretty reliable:

loLink encap:Local Loopback
  inet addr:127.0.0.1  Mask:255.0.0.0
  inet6 addr: ::1/128 Scope:Host
  UP LOOPBACK RUNNING  MTU:16436  Metric:1
  RX packets:30097919 errors:0 dropped:0 overruns:0 frame:0
  TX packets:30097919 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:931924602 (888.7 MiB)  TX bytes:931924602 (888.7 MiB)



-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 22:10:01 up 22 days, 15:28, 0 users, load average: 4.25, 4.21, 4.12



Re: [PERFORM] Curious about dead rows.

2007-11-16 Thread Merlin Moncure
On Nov 16, 2007 10:56 AM, Brad Nicholson [EMAIL PROTECTED] wrote:
 On Wed, 2007-11-14 at 17:46 -0500, Tom Lane wrote:
  Russell Smith [EMAIL PROTECTED] writes:
   Is it possible that analyze is not getting the number of dead rows right?
 
  Hah, I think you are on to something.  ANALYZE is telling the truth
  about how many dead rows it saw, but its notion of "dead" is "not good
  according to SnapshotNow".  Thus, rows inserted by a not-yet-committed
  transaction would be counted as dead.  So if these are background
  auto-analyzes being done in parallel with inserting transactions that
  run for awhile, seeing a few not-yet-committed rows would be
  unsurprising.
 
  I wonder if that is worth fixing?  I'm not especially concerned about
  the cosmetic aspect of it, but if we mistakenly launch an autovacuum
  on the strength of an inflated estimate of dead rows, that could be
  costly.

 Sounds to me like that could result in autovacuum kicking off while
 doing large data loads.  This sounds suspiciously like the problem someone
 on -novice was having: tripping over a Windows autovac bug while doing
 a data load:

 http://archives.postgresql.org/pgsql-novice/2007-11/msg00025.php

I am almost 100% sure I've seen this behavior in the field...

merlin



Re: [PERFORM] Curious about dead rows.

2007-11-16 Thread Kevin Grittner
>>> On Fri, Nov 16, 2007 at 4:01 PM, in message
[EMAIL PROTECTED], Merlin Moncure
[EMAIL PROTECTED] wrote: 
 On Nov 16, 2007 10:56 AM, Brad Nicholson [EMAIL PROTECTED] wrote:
 On Wed, 2007-11-14 at 17:46 -0500, Tom Lane wrote:
  Russell Smith [EMAIL PROTECTED] writes:
   Is it possible that analyze is not getting the number of dead rows right?
 
  Hah, I think you are on to something.  ANALYZE is telling the truth
  about how many dead rows it saw, but its notion of "dead" is "not good
  according to SnapshotNow".  Thus, rows inserted by a not-yet-committed
  transaction would be counted as dead.  So if these are background
  auto-analyzes being done in parallel with inserting transactions that
  run for awhile, seeing a few not-yet-committed rows would be
  unsurprising.
 
  I wonder if that is worth fixing?  I'm not especially concerned about
  the cosmetic aspect of it, but if we mistakenly launch an autovacuum
  on the strength of an inflated estimate of dead rows, that could be
  costly.

 Sounds to me like that could result in autovacuum kicking off while
 doing large data loads.  This sounds suspiciously like the problem someone
 on -novice was having: tripping over a Windows autovac bug while doing
 a data load:

 http://archives.postgresql.org/pgsql-novice/2007-11/msg00025.php
 
 I am almost 100% sure I've seen this behavior in the field...
 
I know I've seen bulk loads go significantly faster with autovacuum
turned off.  It always seemed like a bigger difference than what the
ANALYZE would cause.  I bet this explains it.
 
-Kevin
 




Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Alvaro Herrera
Jean-David Beyer wrote:

 How do I reset the counters in pg_stat_database and pg_stat_all_tables?
 I tried just restarting postgres, but it seems to be saved in the database,
 not just in the RAM of the server.

There is a function called pg_stat_reset() or some such.

 I suppose that blks_read and blks_hit are zero because there are 8 GBytes
 RAM on this machine and I give 2GBytes to shared_buffers = 253000 so that
 all sits in RAM.

Perhaps you have stats_block_level set to off?

 I know there have been rollbacks but I do a REINDEX, CLUSTER, and VACUUM
 ANALYZE before starting the inserts in question.

You do all three on the same tables?  That seems pretty pointless.  A
sole CLUSTER has the same effect.

 Do I need to do a VACUUM FULL ANALYZE instead?

No.
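A sketch of the suggestions above (pg_stat_reset() requires superuser; the trailing ANALYZE is our addition, since CLUSTER alone does not refresh planner statistics):

```sql
SHOW stats_block_level;   -- must be 'on' for blks_read/blks_hit to be counted
SELECT pg_stat_reset();   -- zero the stats collector's counters
CLUSTER;                  -- re-clusters every previously clustered table
ANALYZE;                  -- CLUSTER does not update planner statistics
```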

-- 
Alvaro Herrera   Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
There was no reply (Kernel Traffic)



Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Jean-David Beyer
Merlin Moncure wrote:
 On Nov 13, 2007 9:26 PM, Jean-David Beyer [EMAIL PROTECTED] wrote:
 Merlin Moncure wrote:
 what does pg_stat_all_tables say (assuming row level stats are on)?
 It says stuff like this:

  relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
 ---------+----------+--------------+----------+---------------+-----------+-----------+-----------
  ibd     |       75 |      9503850 |       11 |       2350555 |   2416845 |         0 |         0
  vl_cf   |      139 |     38722575 |       22 |       5392609 |   5692814 |         0 |         0
  vl_li   |      139 |     39992838 |       22 |       5569855 |   5885516 |         0 |         0

 I removed the relid and schemaname and squeezed the other columns so it
 would not be quite so wide. Is this what you might like to know?
 
 it tells me that you aren't crazy, and that rollbacks are the likely
 the cause, although you appear to be watching the logs pretty
 carefully.  you can check pg_stat_database to confirm if your
 rollbacks are in line with your expectations.  or, you might by seeing
 some corner case conditions...are any fields in the table foreign
 keyed to another table (cascading update/delete)?  do you have any
 functions with handled exceptions or savepoints? (I'm guessing no to
 the latter).
 
How do I reset the counters in pg_stat_database and pg_stat_all_tables?
I tried just restarting postgres, but it seems to be saved in the database,
not just in the RAM of the server.

Right now I am getting:

stock=> SELECT * FROM pg_stat_database;
 datid | datname | numbackends | xact_commit | xact_rollback | blks_read | blks_hit
-------+---------+-------------+-------------+---------------+-----------+----------
 16402 | stock   |           1 |   261428429 |       3079861 |         0 |        0
(4 rows)

I just watched these as the loading program runs, and I can account for all
the new rollbacks, that come after the dead rows are found.

I suppose that blks_read and blks_hit are zero because there are 8 GBytes
RAM on this machine and I give 2GBytes to shared_buffers = 253000 so that
all sits in RAM.

I know there have been rollbacks but I do a REINDEX, CLUSTER, and VACUUM
ANALYZE before starting the inserts in question. Do I need to do a VACUUM
FULL ANALYZE instead?

When there were errors in the input data, the program just rolls back the
transaction and gives up on that input file. (The program processes hundreds
of input files and I get an additional input file each week. I then correct
the error in the input file and start over. I do not do updates because the
input file needs to be corrected anyhow. and the easiest way to check it is
to load it into the database and let the loader programs check it.)

Keeping things in perspective, the autovacuum gets these eventually, and I
do not think it is really hurting performance all that much. But I would
like to understand what is going on.

-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jerseyhttp://counter.li.org
 ^^-^^ 06:25:01 up 21 days, 23:43, 0 users, load average: 4.02, 4.01, 4.00



Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Andrew Sullivan
On Wed, Nov 14, 2007 at 07:12:45AM -0500, Jean-David Beyer wrote:
 
 I know there have been rollbacks but I do a REINDEX, CLUSTER, and VACUUM
 ANALYZE before starting the inserts in question. Do I need to do a VACUUM
 FULL ANALYZE instead?

I had another idea.  As Alvaro says, CLUSTER will do everything you need. 
But are you sure there are _no other_ transactions open when you do that? 
This could cause problems, and CLUSTER's behaviour with other open
transactions is not, um, friendly prior to the current beta.

A


-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke



Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Jean-David Beyer
Andrew Sullivan wrote:
 On Wed, Nov 14, 2007 at 07:12:45AM -0500, Jean-David Beyer wrote:
 I know there have been rollbacks but I do a REINDEX, CLUSTER, and
 VACUUM ANALYZE before starting the inserts in question. Do I need to do
 a VACUUM FULL ANALYZE instead?
 
 I had another idea.  As Alvaro says, CLUSTER will do everything you need.
  But are you sure there are _no other_ transactions open when you do
 that? This could cause problems, and CLUSTER's behaviour with other open 
 transactions is not, um, friendly prior to the current beta.
 
These were not done at exactly the same time, but as close as I can.

REINDEX
CLUSTER;
CLUSTER
 (part of a shell script that runs the other stuff)

File `/homeB/jdbeyer/stocks/DATA/valueLine/19860103.tsv' OK
File `/homeB/jdbeyer/stocks/DATA/valueLine/19860131.tsv' OK
File `/homeB/jdbeyer/stocks/DATA/valueLine/19860228.tsv' OK
File `/homeB/jdbeyer/stocks/DATA/valueLine/19860328.tsv' OK
File `/homeB/jdbeyer/stocks/DATA/valueLine/19860502.tsv' OK
File `/homeB/jdbeyer/stocks/DATA/valueLine/19860530.tsv' OK
File `/homeB/jdbeyer/stocks/DATA/valueLine/19860627.tsv' OK
(this is showing the program being run on different data).

stock=# SELECT * FROM pg_stat_database WHERE datname = 'stock';
 datid | datname | numbackends | xact_commit | xact_rollback | blks_read | blks_hit
-------+---------+-------------+-------------+---------------+-----------+----------
 16402 | stock   |           2 |         152 |             0 |     18048 | 15444563
(1 row)

stock=# SELECT * FROM pg_stat_all_tables WHERE schemaname = 'public' ORDER BY relname;
 relid | schemaname | relname  | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
-------+------------+----------+----------+--------------+----------+---------------+-----------+-----------+-----------
 89000 | public     | co_name  |        0 |            0 |        0 |             0 |         0 |         0 |         0
 89004 | public     | company  |        0 |            0 |   938764 |        938764 |         0 |         0 |         0
 89029 | public     | tick     |        0 |            0 |   189737 |        279580 |         0 |         0 |         0
 89034 | public     | vl_as    |        0 |            0 |        0 |             0 |    140840 |         0 |         0
 89036 | public     | vl_cf    |        0 |            0 |        0 |             0 |    140840 |         0 |         0
 89038 | public     | vl_in    |        0 |            0 |        0 |             0 |    185667 |         0 |         0
 89040 | public     | vl_li    |        0 |            0 |        0 |             0 |    140840 |         0 |         0
 89042 | public     | vl_mi    |        0 |            0 |        0 |             0 |    140840 |         0 |         0
 89044 | public     | vl_ranks |        0 |            0 |        0 |             0 |    189737 |         0 |         0
(18 rows)

2007-11-14 12:00:31 EST DEBUG:  analyzing public.vl_in
2007-11-14 12:00:31 EST DEBUG:  vl_in: scanned 2001 of 2001 pages,
containing 183983 live rows and 52 dead rows; 3000 rows in sample, 183983
estimated total rows
2007-11-14 12:00:31 EST DEBUG:  analyzing public.vl_cf
2007-11-14 12:00:31 EST DEBUG:  vl_cf: scanned 1064 of 1064 pages,
containing 134952 live rows and 89 dead rows; 3000 rows in sample, 134952
estimated total rows
2007-11-14 12:00:31 EST DEBUG:  analyzing public.vl_as
2007-11-14 12:00:31 EST DEBUG:  vl_as: scanned 1732 of 1732 pages,
containing 134952 live rows and 120 dead rows; 3000 rows in sample, 134952
estimated total rows
2007-11-14 12:00:31 EST DEBUG:  analyzing public.vl_ranks
2007-11-14 12:00:31 EST DEBUG:  vl_ranks: scanned 1485 of 1485 pages,
containing 188415 live rows and 162 dead rows; 3000 rows in sample, 188415
estimated total rows
2007-11-14 12:00:31 EST DEBUG:  analyzing public.vl_mi
2007-11-14 12:00:31 EST DEBUG:  vl_mi: scanned 1325 of 1325 pages,
containing 134952 live rows and 191 dead rows; 3000 rows in sample, 134952
estimated total rows
2007-11-14 12:00:31 EST DEBUG:  analyzing public.vl_li
2007-11-14 12:00:31 EST DEBUG:  vl_li: scanned 1326 of 1326 pages,
containing 134952 live rows and 218 dead rows; 3000 rows in sample, 134952
estimated total rows



-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jerseyhttp://counter.li.org
 ^^-^^ 11:55:01 up 22 days, 5:13, 3 users, load average: 5.13, 4.71, 4.74



Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Jean-David Beyer
Andrew Sullivan wrote:
 On Wed, Nov 14, 2007 at 07:12:45AM -0500, Jean-David Beyer wrote:
 I know there have been rollbacks but I do a REINDEX, CLUSTER, and VACUUM
 ANALYZE before starting the inserts in question. Do I need to do a VACUUM
 FULL ANALYZE instead?
 
 I had another idea.  As Alvaro says, CLUSTER will do everything you need. 
 But are you sure there are _no other_ transactions open when you do that? 

I am sure. I have a single-threaded program, so unless the postgres server
processes begin and end transactions on their own initiative, the only
things that would initiate transactions would be one of my applications
that I run only one at a time, or leaving psql running. But as I understand
it, psql does not bother with transactions, and besides, I normally just do
SELECTs with that. (I also do INSERTs and UPDATEs with it in shell scripts,
but I do not run those when I am running the application either.)

 This could cause problems, and CLUSTER's behaviour with other open
 transactions is not, um, friendly prior to the current beta.
 
I suppose it might.

Right now I put

// Reset statistics counters.
EXEC SQL BEGIN WORK;
EXEC SQL
SELECT pg_stat_reset();
EXEC SQL COMMIT WORK;

into my application so that the statistics counters will not count previous
UPDATEs and ROLLBACKs when the main program that I intend and believe to do
only INSERTs is running. It will make those statistics easier to read than
having to subtract previous values to get the changes.

Well, it will not work because I must be superuser (i.e., postgres) to
execute that, and if I am, I cannot read the input files. I will do it
manually with psql but that means I have to watch it run to do it at the
right time.
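One possible way around the superuser restriction (a sketch, not from the thread; the wrapper name is made up, and pg_stat_reset()'s return type may differ between releases) is a SECURITY DEFINER function created by the postgres user:

```sql
-- Runs with the rights of its owner (postgres), not of the caller.
CREATE FUNCTION reset_my_stats() RETURNS boolean AS
  'SELECT pg_stat_reset();'
  LANGUAGE sql SECURITY DEFINER;

GRANT EXECUTE ON FUNCTION reset_my_stats() TO jdbeyer;
```

The application could then call reset_my_stats() at the right moment without running as postgres.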


-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 11:20:01 up 22 days, 4:38, 4 users, load average: 6.16, 5.98, 5.62



Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Andrew Sullivan
On Wed, Nov 14, 2007 at 11:53:17AM -0500, Jean-David Beyer wrote:
 that I run only one at a time, or leaving psql running. But as I understand
 it, psql does not bother with transactions, and besides, I normally just do

No, every statement in psql is a transaction.  Even SELECT.  Every statement
under PostgreSQL runs in a transaction.  When you type "SELECT (1)", the
server implicitly adds "BEGIN;" and "END;" around it.
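In other words (a sketch; the two forms behave identically):

```sql
-- What you type:
SELECT 1;

-- What effectively runs:
BEGIN;
SELECT 1;
END;
```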

 into my application so that the statistics counters will not count previous
 UPDATEs and ROLLBACKs when the main program that I intend and believe to do
 only INSERTs is running. It will make those statistics easier to read than
 having to subtract previous values to get the changes.

Yes.
 
 Well, it will not work because I must be superuser (i.e., postgres) to
 execute that, and if I am, I cannot read the input files. I will do it

You could grant superuser status to your user (or just connect as postgres
user) for the time being, while debugging this.
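For example (a sketch; run as the postgres superuser, with jdbeyer being the role from this thread):

```sql
ALTER USER jdbeyer WITH SUPERUSER;    -- temporarily, for debugging
-- ... run the load and SELECT pg_stat_reset(); as jdbeyer ...
ALTER USER jdbeyer WITH NOSUPERUSER;  -- revoke when done
```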

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke



Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Andrew Sullivan
On Wed, Nov 14, 2007 at 11:58:23AM -0500, Andrew Sullivan wrote:
 No, every statement in psql is a transaction.  Even SELECT.  Every statement

Err, to be clearer: every statement in psql is _somehow_ part of a
transaction; if you don't start one explicitly, the statement runs on its
own as a transaction.

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke



Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Mario Weilguni

Jean-David Beyer schrieb:

I am doing lots of INSERTs on a table that starts out empty (I did a
TRUNCATE on it). I am not, AFAIK, doing DELETEs or UPDATEs. Autovacuum is
on. I moved logging up to debug2 level to see what was going on, and I get
things like this:

 vl_as: scanned 3000 of 5296 pages, containing 232944 live rows and 1033
dead rows; 3000 rows in sample, 411224 estimated total rows

A little later, it says:

vl_as: scanned 3000 of 6916 pages, containing 233507 live rows and 493
dead rows; 3000 rows in sample, 538311 estimated total rows

(I suppose that means autovacuum is working.) Is this normal, or have I got
something wrong? Why so many dead rows when just doing inserts? It is not
that I think the number is too high, considering the number of rows in the
table at the point where I copied this line. It is just that I do not
understand why there are any.

  
Did you rollback some transactions? It will generate dead rows too - at 
least I think so.




Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Alvaro Herrera
Jean-David Beyer wrote:
 Mario Weilguni wrote:

  Did you rollback some transactions? It will generate dead rows too - at
  least I think so.
  
 No, and the statistics confirm this.

To recap:

- your app only does inserts
- there has been no rollback lately
- there are no updates
- there are no deletes

The only other source of dead rows I can think of is triggers ... do you
have any?  (Not necessarily on this table -- perhaps triggers on other
tables can cause updates on this one).

Oh, rolled back COPY can cause dead rows too.
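For instance (a sketch with made-up names):

```sql
BEGIN;
COPY demo FROM '/tmp/demo.tsv';  -- suppose this errors out, or we change our minds
ROLLBACK;
-- Every row COPY already wrote is now a dead tuple awaiting VACUUM.
```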

-- 
Alvaro Herrera   http://www.PlanetPostgreSQL.org/
Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are.  -- Charles J. Sykes' advice to teenagers



Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Jean-David Beyer
Alvaro Herrera wrote:
 Jean-David Beyer wrote:
 Mario Weilguni wrote:
 
 Did you rollback some transactions? It will generate dead rows too - at
 least I think so.

 No, and the statistics confirm this.
 
 To recap:
 
 - your app only does inserts

True.

 - there has been no rollback lately

True.

 - there are no updates

True

 - there are no deletes

True.
 
 The only other source of dead rows I can think is triggers ... do you
 have any? 

No triggers at all. I have sequences that were not in the IBM DB2 version of
this stuff. But they are all done earlier, before the CLUSTER of the entire
database. Furthermore, they are only for two tables, not the ones that
attracted my notice in the first place.

 (Not necessarily on this table -- perhaps triggers on other
 tables can cause updates on this one).
 
 Oh, rolled back COPY can cause dead rows too.
 
The only copies I ever do are those inside dbdump -- dbrestore, and they
come after all this stuff. And they do not roll back -- though I suppose
they could in principle.

-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 16:45:01 up 22 days, 10:03, 1 user, load average: 4.20, 4.22, 4.17



Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Jean-David Beyer
Mario Weilguni wrote:
 Jean-David Beyer schrieb:
 I am doing lots of INSERTs on a table that starts out empty (I did a
 TRUNCATE on it). I am not, AFAIK, doing DELETEs or UPDATEs. Autovacuum is
 on. I moved logging up to debug2 level to see what was going on, and I
 get
 things like this:

  vl_as: scanned 3000 of 5296 pages, containing 232944 live rows and
 1033
 dead rows; 3000 rows in sample, 411224 estimated total rows

 A little later, it says:

 vl_as: scanned 3000 of 6916 pages, containing 233507 live rows and 493
 dead rows; 3000 rows in sample, 538311 estimated total rows

 (I suppose that means autovacuum is working.) Is this normal, or have
 I got
 something wrong? Why so many dead rows when just doing inserts? It is not
 that I think the number is too high, considering the number of rows in
 the
 table at the point where I copied this line. It is just that I do not
 understand why there are any.

   
 Did you rollback some transactions? It will generate dead rows too - at
 least I think so.
 
No, and the statistics confirm this.

stock=> SELECT * FROM pg_stat_database WHERE datname = 'stock';
 datid | datname | numbackends | xact_commit | xact_rollback | blks_read |  blks_hit
-------+---------+-------------+-------------+---------------+-----------+-----------
 16402 | stock   |           1 |        1267 |             0 |    232234 | 146426135
(1 row)

stock=> SELECT * FROM pg_stat_all_tables WHERE schemaname = 'public' ORDER BY relname;
 relid | schemaname | relname  | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
-------+------------+----------+----------+--------------+----------+---------------+-----------+-----------+-----------
 89000 | public     | co_name  |        7 |       215873 |        1 |         30839 |         0 |         0 |         0
 89004 | public     | company  |        9 |       219519 |  5624483 |       5648873 |         0 |         0 |         0
 89008 | public     | div      |        7 |            0 |        1 |             0 |         0 |         0 |         0
 89010 | public     | djia     |        4 |         2044 |        0 |             0 |         0 |         0 |         0
 89012 | public     | earn     |        2 |            0 |        0 |             0 |         0 |         0 |         0
 89014 | public     | ibd      |        5 |            0 |        1 |             0 |         0 |         0 |         0
 89016 | public     | merg     |        2 |            0 |        0 |             0 |         0 |         0 |         0
 89018 | public     | price    |        9 |            0 |        1 |             0 |         0 |         0 |         0
 89022 | public     | source   |        3 |           27 |        0 |             0 |         0 |         0 |         0
 89025 | public     | sp_500   |        2 |            0 |        0 |             0 |         0 |         0 |         0
 89027 | public     | split    |        3 |            0 |        1 |             0 |         0 |         0 |         0
 89029 | public     | tick     |       13 |       400946 |   980983 |       1510922 |         0 |         0 |         0
 89034 | public     | vl_as    |        7 |      6524595 |        1 |        932085 |    932085 |         0 |         0
 89036 | public     | vl_cf    |        7 |      6317808 |        1 |        902544 |    902544 |         0 |         0
 89038 | public     | vl_in    |        7 |      6798351 |        1 |        971193 |    966989 |         0 |         0
 89040 | public     | vl_li    |        7 |      6524595 |        1 |        932085 |    932085 |         0 |         0
 89042 | public     | vl_mi    |        7 |      6368579 |        1 |        909797 |    909797 |         0 |         0
 89044 | public     | vl_ranks |        8 |      7624818 |        1 |        985548 |    980982 |         0 |         0


-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 16:05:01 up 22 days, 9:23, 0 users, load average: 4.45, 4.11, 4.03



Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Craig James

Alvaro Herrera wrote:

To recap:

- your app only does inserts
- there has been no rollback lately
- there are no updates
- there are no deletes

The only other source of dead rows I can think of is triggers ... do you
have any?  (Not necessarily on this table -- perhaps triggers on other
tables can cause updates on this one).

Oh, rolled back COPY can cause dead rows too.



What about an unreliable network that causes a lot of disconnects?  Wouldn't the 
server process do a rollback?

Craig





Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Jean-David Beyer
Tom Lane wrote:
 Russell Smith [EMAIL PROTECTED] writes:
 Is it possible that analyze is not getting the number of dead rows
 right?
 
 Hah, I think you are on to something.  ANALYZE is telling the truth about
 how many dead rows it saw, but its notion of "dead" is "not good 
 according to SnapshotNow".  Thus, rows inserted by a not-yet-committed 
 transaction would be counted as dead.  So if these are background 
 auto-analyzes being done in parallel with inserting transactions that run
 for awhile,

They are.

 seeing a few not-yet-committed rows would be unsurprising.

That is a very interesting possibility. I can see that it is certainly a
possible explanation, since my insert transactions take between 0.04 and 0.1
minutes (sorry, decimal stopwatch) of real time, typically putting 1700 rows
into about a half dozen tables. And the ANALYZE is whatever autovacuum
chooses to do. So if new not-yet-committed rows are considered dead, that
would be a sufficient explanation.

So I am, retroactively, unsurprised.

 I wonder if that is worth fixing?  I'm not especially concerned about the
 cosmetic aspect of it, but if we mistakenly launch an autovacuum on the
 strength of an inflated estimate of dead rows, that could be costly.
 
Well, since I was more interested in the explanation than in the fixing, in
that sense I do not care if it is fixed or not. While it may create a slight
slowdown (if it is an error), the applications run fast enough.

I would not even get the fix until Red Hat gets around to putting it in (I
run postgresql-8.1.9-1.el5 from their RHEL5 distribution); that probably
will not be until RHEL6 at the soonest, and I will probably skip
that one and wait until RHEL7 comes out in about 3 years.

But somewhere perhaps a reminder of this should be placed where someone like
me would find it, so we would not have to go through this again for someone
else.

-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 22:05:01 up 22 days, 15:23, 0 users, load average: 4.16, 4.22, 4.10



Re: [PERFORM] Curious about dead rows.

2007-11-14 Thread Tom Lane
Russell Smith [EMAIL PROTECTED] writes:
 Is it possible that analyze is not getting the number of dead rows right?

Hah, I think you are on to something.  ANALYZE is telling the truth
about how many dead rows it saw, but its notion of "dead" is "not good
according to SnapshotNow".  Thus, rows inserted by a not-yet-committed
transaction would be counted as dead.  So if these are background
auto-analyzes being done in parallel with inserting transactions that
run for awhile, seeing a few not-yet-committed rows would be
unsurprising.
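The effect is easy to reproduce with two sessions (a sketch; "demo" is a stand-in table):

```sql
-- Session A: insert and leave the transaction open.
BEGIN;
INSERT INTO demo VALUES (1);
-- (no COMMIT yet)

-- Session B, meanwhile:
ANALYZE VERBOSE demo;
-- Session A's uncommitted row is counted among the dead rows, because it
-- is "not good" under SnapshotNow; once A commits, it counts as live.
```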

I wonder if that is worth fixing?  I'm not especially concerned about
the cosmetic aspect of it, but if we mistakenly launch an autovacuum
on the strength of an inflated estimate of dead rows, that could be
costly.

regards, tom lane



Re: [PERFORM] Curious about dead rows.

2007-11-13 Thread Andrew Sullivan
Please don't drop the list, as someone else may see something.

On Tue, Nov 13, 2007 at 10:06:13AM -0500, Jean-David Beyer wrote:
 OK. I turned logging from "none" to "mod" and got a gawdawful lot of stuff.

Yes.

 Then I ran it and got all the inserts. Using
 grep -i delete file
 grep -i update file
 grep -i rollback file

How about ERROR?

 2007-11-13 08:11:20 EST DEBUG:  vl_ranks: scanned 540 of 540 pages,
 containing 67945 live rows and 554 dead rows; 3000 rows in sample, 67945
 estimated total rows

If there are dead rows, something is producing them.  Either INSERT is
firing a trigger that is doing something there (you won't see an UPDATE in
that case), or else something else is causing INSERTs to fail.
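A self-contained version of that triage (the sample log lines are invented stand-ins for a real postmaster log):

```shell
# Build a tiny fake log to grep against.
cat > /tmp/pg_sample.log <<'EOF'
2007-11-13 08:11:20 EST DEBUG:  vl_ranks: scanned 540 of 540 pages
2007-11-13 08:11:21 EST ERROR:  duplicate key violates unique constraint "vl_ranks_pkey"
2007-11-13 08:11:21 EST STATEMENT:  INSERT INTO vl_ranks VALUES (1)
EOF

# Count the statement types that could explain dead rows.
for pat in delete update rollback error; do
    printf '%s: %d\n' "$pat" "$(grep -ci "$pat" /tmp/pg_sample.log)"
done
```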

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke



Re: [PERFORM] Curious about dead rows.

2007-11-13 Thread Jean-David Beyer

Andrew Sullivan wrote:
 Please don't drop the list, as someone else may see something.
 
 On Tue, Nov 13, 2007 at 10:06:13AM -0500, Jean-David Beyer wrote:
  OK. I turned logging from "none" to "mod" and got a gawdawful lot of stuff.
 
 Yes.
 
 Then I ran it and got all the inserts. Using
 grep -i delete file
 grep -i update file
 grep -i rollback file
 
 How about ERROR?

$ grep -i error Tue.log
$
 
 2007-11-13 08:11:20 EST DEBUG:  vl_ranks: scanned 540 of 540 pages,
 containing 67945 live rows and 554 dead rows; 3000 rows in sample, 67945
 estimated total rows
 
 If there are dead rows, something is producing them.  Either INSERT is
 firing a trigger that is doing something there (you won't see an UPDATE in
 that case), or else something else is causing INSERTs to fail.

I have no triggers in that database. I do have two sequences.

                  List of relations
 Schema |          Name          |   Type   |  Owner
--------+------------------------+----------+---------
 public | company_company_id_seq | sequence | jdbeyer
 public | source_source_id_seq   | sequence | jdbeyer
stock=> \d company_company_id_seq
Sequence "public.company_company_id_seq"
    Column     |  Type
---------------+---------
 sequence_name | name
 last_value    | bigint
 increment_by  | bigint
 max_value     | bigint
 min_value     | bigint
 cache_value   | bigint
 log_cnt       | bigint
 is_cycled     | boolean
 is_called     | boolean

stock=> \d source_source_id_seq
Sequence "public.source_source_id_seq"
    Column     |  Type
---------------+---------
 sequence_name | name
 last_value    | bigint
 increment_by  | bigint
 max_value     | bigint
 min_value     | bigint
 cache_value   | bigint
 log_cnt       | bigint
 is_cycled     | boolean
 is_called     | boolean

but they are not used after the last VACUUM FULL ANALYZE


--
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jerseyhttp://counter.li.org
 ^^-^^ 14:40:01 up 21 days, 7:58, 2 users, load average: 4.33, 4.43, 4.39



Re: [PERFORM] Curious about dead rows.

2007-11-13 Thread Andrew Sullivan
On Tue, Nov 13, 2007 at 02:50:59PM -0500, Jean-David Beyer wrote:
  How about ERROR?
 
 $ grep -i error Tue.log
 $

Well, without actually logging into the machine and looking at the
application, I confess I am stumped.  Oh, wait.  You do have the log level
high enough that you should see errors in the log, right?  That's not
controlled by the statement parameter. 

 I have no triggers in that database. I do have two sequences.

Sequences should not produce any dead rows on the table, unless they're used
as keys and you're attempting inserts that conflict with used sequence
values.  That should cause errors that you'd get in the log, presuming that
you have the log level set correctly.
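
A minimal sketch of that failure mode (table and values made up):

  CREATE TABLE t (id serial PRIMARY KEY, v text);
  INSERT INTO t (id, v) VALUES (1, 'manual');  -- bypasses the sequence
  INSERT INTO t (v) VALUES ('auto');           -- nextval() also yields 1:
                                               -- ERROR: duplicate key ...

The second INSERT writes its heap tuple before the unique-index check rejects
it, so it leaves behind both a dead row and an ERROR line in the log.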

A


-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke



Re: [PERFORM] Curious about dead rows.

2007-11-13 Thread Andrew Sullivan
I'm not a private support organisation; please send your replies to the
list, not me.

On Tue, Nov 13, 2007 at 04:57:23PM -0500, Jean-David Beyer wrote:
 What is it controlled by? The following are the non-default values in
 postgresql.conf:
 
 redirect_stderr = on
 log_directory = '/srv/dbms/dataB/pgsql/pg_log'
 log_filename = 'postgresql-%a.log'
 log_truncate_on_rotation = on
 log_rotation_age = 1440
 log_rotation_size = 0
 log_min_messages = debug2

This will certainly include error messages, then.  Or it ought to.  You do
see errors in the log when you create one, right?  (Try causing an error in
psql to make sure.)
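
Any statement that fails will do for the test; for instance:

  stock=> SELECT 1/0;
  ERROR:  division by zero

If that ERROR does not show up in the log file afterwards, the logging
configuration is the problem rather than the application.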

 log_line_prefix = '%t '
 log_statement = 'none'   (this was 'mod', but it uses too much
   disk to leave it turned on -- only
   4 GBytes in that partition)
 
  
 They are; they are the primary keys of two tables. But those are all done
 before the last VACUUM FULL ANALYZE runs, so the dead rows should have been
 eliminated. And the output of the sequence is the only way of generating a
 primary key, so it should be impossible anyhow.

I thought you were doing INSERTs?  It's not true that the output of the
sequence is the only way -- if you insert directly, it will happily insert
into that column.  But it should cause an error to show in the log, which is
what's puzzling me.
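
One cheap sanity check (assuming the key column is company_id, as the
sequence name suggests) is to compare the sequence against the table:

  SELECT last_value FROM company_company_id_seq;
  SELECT max(company_id) FROM company;

If max() is ahead of last_value, something has inserted keys without going
through the sequence.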

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke



Re: [PERFORM] Curious about dead rows.

2007-11-13 Thread Jean-David Beyer
Andrew Sullivan wrote:
 I'm not a private support organisation; please send your replies to the
 list, not me.

Sorry. Most of the lists I send to have ReplyTo set, but a few do not.
And then I forget.
 
 On Tue, Nov 13, 2007 at 04:57:23PM -0500, Jean-David Beyer wrote:
 What is it controlled by? The following are the non-default values in
 postgresql.conf:

 redirect_stderr = on
 log_directory = '/srv/dbms/dataB/pgsql/pg_log'
 log_filename = 'postgresql-%a.log'
 log_truncate_on_rotation = on
 log_rotation_age = 1440
 log_rotation_size = 0
 log_min_messages = debug2
 
 This will certainly include error messages, then.  Or it ought to.  You do
 see errors in the log when you create one, right?  (Try causing an error in
 psql to make sure.)
Right: I do see an error message when I try to insert a duplicate entry. It
happens to violate the (company_name, company_permno) uniqueness constraint.

2007-11-13 17:58:30 EST ERROR:  duplicate key violates unique constraint
company_name_x

(I tried to insert a duplicate entry in the company_name field of relation
_company_. company_name_x is defined as:
company_name_x UNIQUE, btree (company_name, company_permno), tablespace
stockd )
 
 log_line_prefix = '%t '
 log_statement = 'none'   (this was 'mod', but it uses too much
   disk to leave it turned on -- only
   4 GBytes in that partition)

 They are; they are the primary keys of two tables. But those are all done
 before the last VACUUM FULL ANALYZE runs, so the dead rows should have been
 eliminated. And the output of the sequence is the only way of generating a
 primary key, so it should be impossible anyhow.
 
 I thought you were doing INSERTs? 

Yes.

 It's not true that the output of the
 sequence is the only way -- if you insert directly, it will happily insert
 into that column. 

Yes, but I get those keys from a sequence only. I never enter them manually
or from a data file.

 But it should cause an error to show in the log, which is
 what's puzzling me.
 
Me too.


-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jerseyhttp://counter.li.org
 ^^-^^ 17:50:01 up 21 days, 11:08, 4 users, load average: 5.12, 4.77, 4.68



Re: [PERFORM] Curious about dead rows.

2007-11-13 Thread Alvaro Herrera
Jean-David Beyer wrote:
 Andrew Sullivan wrote:
  I'm not a private support organisation; please send your replies to the
  list, not me.
 
 Sorry. Most of the lists I send to have ReplyTo set, but a few do not.
 And then I forget.

If you use reply to all, it works wonderfully in both cases.
(Actually it works even when you're not using mailing lists at all).


-- 
Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
If it wasn't for my companion, I believe I'd be having
the time of my life  (John Dunbar)



Re: [PERFORM] Curious about dead rows.

2007-11-13 Thread Trevor Talbot
On 11/13/07, Alvaro Herrera [EMAIL PROTECTED] wrote:
 Jean-David Beyer wrote:
  Andrew Sullivan wrote:
   I'm not a private support organisation; please send your replies to the
   list, not me.
 
  Sorry. Most of the lists I send to have ReplyTo set, but a few do not.
  And then I forget.

 If you use reply to all, it works wonderfully in both cases.

Then it upsets the people who don't want to get private copies, only
list copies, on most of the Reply-To lists.

There's no winning :(



Re: [PERFORM] Curious about dead rows.

2007-11-13 Thread Alvaro Herrera
Trevor Talbot wrote:
 On 11/13/07, Alvaro Herrera [EMAIL PROTECTED] wrote:
  Jean-David Beyer wrote:
   Andrew Sullivan wrote:
I'm not a private support organisation; please send your replies to the
list, not me.
  
   Sorry. Most of the lists I send to have ReplyTo set, but a few do not.
   And then I forget.
 
  If you use reply to all, it works wonderfully in both cases.
 
 Then it upsets the people who don't want to get private copies, only
 list copies, on most of the Reply-To lists.
 
 There's no winning :(

I am on a couple of mailing lists with Reply-To set, and what my MUA
does is put only the list on the To:, so there is no extra private copy.
I use reply-to-group all the time and it works perfectly well.

-- 
Alvaro Herrera http://www.flickr.com/photos/alvherre/
PHP is what I call the Dumb Monkey language. [A]ny dumb monkey can code
something in PHP. Python takes actual thought to produce something useful.
   (J. Drake)



Re: [PERFORM] Curious about dead rows.

2007-11-13 Thread Ansgar -59cobalt- Wiechers
On 2007-11-13 Trevor Talbot wrote:
 On 11/13/07, Alvaro Herrera [EMAIL PROTECTED] wrote:
 Jean-David Beyer wrote:
 Sorry. Most of the lists I send to have ReplyTo set, but a few do
 not. And then I forget.

 If you use reply to all, it works wonderfully in both cases.
 
 Then it upsets the people who don't want to get private copies, only
 list copies, on most of the Reply-To lists.
 
 There's no winning :(

Unless you use a mailer that supports Reply, Group-Reply, *and*
List-Reply. ;)

Regards
Ansgar Wiechers
-- 
The Mac OS X kernel should never panic because, when it does, it
seriously inconveniences the user.
--http://developer.apple.com/technotes/tn2004/tn2118.html



Re: [PERFORM] Curious about dead rows.

2007-11-13 Thread Merlin Moncure
On Nov 10, 2007 1:38 PM, Jean-David Beyer [EMAIL PROTECTED] wrote:
 Tom Lane wrote:
  Jean-David Beyer [EMAIL PROTECTED] writes:
  I am doing lots of INSERTs on a table that starts out empty (I did a
  TRUNCATE on it). I am not, AFAIK, doing DELETEs or UPDATEs. Autovacuum is
  on. I moved logging up to debug2 level to see what was going on, and I get
  things like this:
 
   vl_as: scanned 3000 of 5296 pages, containing 232944 live rows and 1033
  dead rows; 3000 rows in sample, 411224 estimated total rows
 
  A little later, it says:
 
  vl_as: scanned 3000 of 6916 pages, containing 233507 live rows and 493
  dead rows; 3000 rows in sample, 538311 estimated total rows
 
  Well, *something* is doing deletes or updates in that table.  Better
  look a bit harder at your application ...
 
 OK, you agree that if I am doing only INSERTs, that there should not be any
 dead rows. Therefore, I _must_ be doing deletes or updates.

 But the program is pretty simple, and I see no UPDATEs or DELETEs. I
 searched all the program source files (that contain none of them) and all
 the libraries I have written, and they have none either. Right now the
 programs are not to the state where UPDATEs or DELETEs are required (though
 they will be later). I am still developing them and it is easier to just
 restore from backup or start over from the beginning since most of the
 changes are data laundering from an ever-increasing number of spreadsheets.

 Am I right that TRUNCATE deletes all the rows of a table. They may then be
 still there, but would not autovacuum clean out the dead rows? Or maybe it
 has not gotten to them yet? I could do an explicit one earlier.

what does pg_stat_all_tables say (assuming row level stats are on)?
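
That is, something along these lines (columns trimmed to the interesting
counters):

  SELECT relname, n_tup_ins, n_tup_upd, n_tup_del
    FROM pg_stat_all_tables
   WHERE schemaname = 'public';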

merlin



Re: [PERFORM] Curious about dead rows.

2007-11-13 Thread Jean-David Beyer
Merlin Moncure wrote:
 On Nov 10, 2007 1:38 PM, Jean-David Beyer [EMAIL PROTECTED] wrote:
 Tom Lane wrote:
 Jean-David Beyer [EMAIL PROTECTED] writes:
 I am doing lots of INSERTs on a table that starts out empty (I did a
 TRUNCATE on it). I am not, AFAIK, doing DELETEs or UPDATEs. Autovacuum is
 on. I moved logging up to debug2 level to see what was going on, and I get
 things like this:
  vl_as: scanned 3000 of 5296 pages, containing 232944 live rows and 1033
 dead rows; 3000 rows in sample, 411224 estimated total rows
 A little later, it says:
 vl_as: scanned 3000 of 6916 pages, containing 233507 live rows and 493
 dead rows; 3000 rows in sample, 538311 estimated total rows
 Well, *something* is doing deletes or updates in that table.  Better
 look a bit harder at your application ...

 OK, you agree that if I am doing only INSERTs, that there should not be any
 dead rows. Therefore, I _must_ be doing deletes or updates.

 But the program is pretty simple, and I see no UPDATEs or DELETEs. I
 searched all the program source files (that contain none of them) and all
 the libraries I have written, and they have none either. Right now the
 programs are not to the state where UPDATEs or DELETEs are required (though
 they will be later). I am still developing them and it is easier to just
 restore from backup or start over from the beginning since most of the
 changes are data laundering from an ever-increasing number of spreadsheets.

 Am I right that TRUNCATE deletes all the rows of a table. They may then be
 still there, but would not autovacuum clean out the dead rows? Or maybe it
 has not gotten to them yet? I could do an explicit one earlier.
 
 what does pg_stat_all_tables say (assuming row level stats are on)?

# - Query/Index Statistics Collector -

#stats_start_collector = on
stats_start_collector = on

#stats_command_string = off
#stats_block_level = off

#stats_row_level = off
stats_row_level = on

#stats_reset_on_server_start = off

 
It says stuff like this:

 relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
---------+----------+--------------+----------+---------------+-----------+-----------+-----------
 ibd     |       75 |      9503850 |       11 |       2350555 |   2416845 |         0 |         0
 vl_cf   |      139 |     38722575 |       22 |       5392609 |   5692814 |         0 |         0
 vl_li   |      139 |     39992838 |       22 |       5569855 |   5885516 |         0 |         0

I removed the relid and schemaname and squeezed the other columns so it
would not be quite so wide. Is this what you might like to know?

-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jerseyhttp://counter.li.org
 ^^-^^ 21:10:01 up 21 days, 14:28, 3 users, load average: 6.20, 5.69, 5.11



Re: [PERFORM] Curious about dead rows.

2007-11-13 Thread Merlin Moncure
On Nov 13, 2007 9:26 PM, Jean-David Beyer [EMAIL PROTECTED] wrote:
 Merlin Moncure wrote:
  what does pg_stat_all_tables say (assuming row level stats are on)?
 It says stuff like this:

  relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
 ---------+----------+--------------+----------+---------------+-----------+-----------+-----------
  ibd     |       75 |      9503850 |       11 |       2350555 |   2416845 |         0 |         0
  vl_cf   |      139 |     38722575 |       22 |       5392609 |   5692814 |         0 |         0
  vl_li   |      139 |     39992838 |       22 |       5569855 |   5885516 |         0 |         0

 I removed the relid and schemaname and squeezed the other columns so it
 would not be quite so wide. Is this what you might like to know?

it tells me that you aren't crazy, and that rollbacks are likely the
cause, although you appear to be watching the logs pretty
carefully.  you can check pg_stat_database to confirm whether your
rollbacks are in line with your expectations.  or, you might be seeing
some corner case conditions...are any fields in the table foreign
keyed to another table (cascading update/delete)?  do you have any
functions with handled exceptions or savepoints? (I'm guessing no to
the latter).
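
For example, to see the commit/rollback counters for the current database:

  SELECT datname, xact_commit, xact_rollback
    FROM pg_stat_database
   WHERE datname = current_database();

A nonzero xact_rollback would account for dead rows even with no DELETEs or
UPDATEs anywhere in the application.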

merlin



[PERFORM] Curious about dead rows.

2007-11-10 Thread Jean-David Beyer
I am doing lots of INSERTs on a table that starts out empty (I did a
TRUNCATE on it). I am not, AFAIK, doing DELETEs or UPDATEs. Autovacuum is
on. I moved logging up to debug2 level to see what was going on, and I get
things like this:

 vl_as: scanned 3000 of 5296 pages, containing 232944 live rows and 1033
dead rows; 3000 rows in sample, 411224 estimated total rows

A little later, it says:

vl_as: scanned 3000 of 6916 pages, containing 233507 live rows and 493
dead rows; 3000 rows in sample, 538311 estimated total rows

(I suppose that means autovacuum is working.) Is this normal, or have I got
something wrong? Why so many dead rows when just doing inserts? It is not
that I think the number is too high, considering the number of rows in the
table at the point where I copied this line. It is just that I do not
understand why there are any.

I could easily understand it if I were doing UPDATEs.

postgresql-8.1.9-1.el5

-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jerseyhttp://counter.li.org
 ^^-^^ 11:15:01 up 18 days, 4:33, 4 users, load average: 6.18, 5.76, 5.26



Re: [PERFORM] Curious about dead rows.

2007-11-10 Thread Tom Lane
Jean-David Beyer [EMAIL PROTECTED] writes:
 I am doing lots of INSERTs on a table that starts out empty (I did a
 TRUNCATE on it). I am not, AFAIK, doing DELETEs or UPDATEs. Autovacuum is
 on. I moved logging up to debug2 level to see what was going on, and I get
 things like this:

  vl_as: scanned 3000 of 5296 pages, containing 232944 live rows and 1033
 dead rows; 3000 rows in sample, 411224 estimated total rows

 A little later, it says:

 vl_as: scanned 3000 of 6916 pages, containing 233507 live rows and 493
 dead rows; 3000 rows in sample, 538311 estimated total rows

Well, *something* is doing deletes or updates in that table.  Better
look a bit harder at your application ...

regards, tom lane



Re: [PERFORM] Curious about dead rows.

2007-11-10 Thread Jean-David Beyer
Tom Lane wrote:
 Jean-David Beyer [EMAIL PROTECTED] writes:
 I am doing lots of INSERTs on a table that starts out empty (I did a
 TRUNCATE on it). I am not, AFAIK, doing DELETEs or UPDATEs. Autovacuum is
 on. I moved logging up to debug2 level to see what was going on, and I get
 things like this:
 
  vl_as: scanned 3000 of 5296 pages, containing 232944 live rows and 1033
 dead rows; 3000 rows in sample, 411224 estimated total rows
 
 A little later, it says:
 
 vl_as: scanned 3000 of 6916 pages, containing 233507 live rows and 493
 dead rows; 3000 rows in sample, 538311 estimated total rows
 
 Well, *something* is doing deletes or updates in that table.  Better
 look a bit harder at your application ...
 
OK, you agree that if I am doing only INSERTs, that there should not be any
dead rows. Therefore, I _must_ be doing deletes or updates.

But the program is pretty simple, and I see no UPDATEs or DELETEs. I
searched all the program source files (that contain none of them) and all
the libraries I have written, and they have none either. Right now the
programs are not to the state where UPDATEs or DELETEs are required (though
they will be later). I am still developing them and it is easier to just
restore from backup or start over from the beginning since most of the
changes are data laundering from an ever-increasing number of spreadsheets.

Am I right that TRUNCATE deletes all the rows of a table? They may then be
still there, but wouldn't autovacuum clean out the dead rows? Or maybe it
has not gotten to them yet? I could do an explicit one earlier.
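
For comparison, the two operations behave differently here (sketch with a
made-up table):

  DELETE FROM t;   -- marks every row dead; VACUUM must reclaim them later
  TRUNCATE t;      -- swaps in new, empty storage for the relation,
                   -- leaving no dead rows to vacuum at all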

-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jerseyhttp://counter.li.org
 ^^-^^ 13:10:01 up 18 days, 6:28, 7 users, load average: 4.46, 4.34, 4.23



Re: [PERFORM] Curious about dead rows.

2007-11-10 Thread Joshua D. Drake

On Sat, 10 Nov 2007 13:38:23 -0500
Jean-David Beyer [EMAIL PROTECTED] wrote:

 Tom Lane wrote:
  Jean-David Beyer [EMAIL PROTECTED] writes:
  I am doing lots of INSERTs on a table that starts out empty (I did
  a TRUNCATE on it). I am not, AFAIK, doing DELETEs or UPDATEs.
  Autovacuum is on. I moved logging up to debug2 level to see what
  was going on, and I get things like this:
  
   vl_as: scanned 3000 of 5296 pages, containing 232944 live rows
  and 1033 dead rows; 3000 rows in sample, 411224 estimated total
  rows
  
  A little later, it says:
  
  vl_as: scanned 3000 of 6916 pages, containing 233507 live rows
  and 493 dead rows; 3000 rows in sample, 538311 estimated total rows
  
  Well, *something* is doing deletes or updates in that table.  Better
  look a bit harder at your application ...
  
 OK, you agree that if I am doing only INSERTs, that there should not
 be any dead rows. Therefore, I _must_ be doing deletes or updates.
 
 But the program is pretty simple, and I see no UPDATEs or DELETEs. I
 searched all the program source files (that contain none of them) and
 all the libraries I have written, and they have none either. Right
 now the programs are not to the state where UPDATEs or DELETEs are
 required (though they will be later). I am still developing them and
 it is easier to just restore from backup or start over from the
 beginning since most of the changes are data laundering from an
 ever-increasing number of spreadsheets.
 
 Am I right that TRUNCATE deletes all the rows of a table. They may
 then be still there, but would not autovacuum clean out the dead
 rows? Or maybe it has not gotten to them yet? I could do an explicit
 one earlier.

Truncate will not create dead rows. However ROLLBACK will. Are you
getting any duplicate key errors or anything like that when you insert?

Sincerely,

Joshua D. Drake

 


--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [PERFORM] Curious about dead rows.

2007-11-10 Thread Jean-David Beyer
Joshua D. Drake wrote:
 On Sat, 10 Nov 2007 13:38:23 -0500 Jean-David Beyer
 [EMAIL PROTECTED] wrote:
 
 Tom Lane wrote:
 Jean-David Beyer [EMAIL PROTECTED] writes:
 I am doing lots of INSERTs on a table that starts out empty (I
 did a TRUNCATE on it). I am not, AFAIK, doing DELETEs or UPDATEs.
  Autovacuum is on. I moved logging up to debug2 level to see what
  was going on, and I get things like this: vl_as: scanned 3000
 of 5296 pages, containing 232944 live rows and 1033 dead rows;
 3000 rows in sample, 411224 estimated total rows A little later,
 it says: vl_as: scanned 3000 of 6916 pages, containing 233507
 live rows and 493 dead rows; 3000 rows in sample, 538311
 estimated total rows
 Well, *something* is doing deletes or updates in that table.
 Better look a bit harder at your application ...
 
 OK, you agree that if I am doing only INSERTs, that there should not 
 be any dead rows. Therefore, I _must_ be doing deletes or updates.
 
 But the program is pretty simple, and I see no UPDATEs or DELETEs. I 
 searched all the program source files (that contain none of them) and
  all the libraries I have written, and they have none either. Right 
 now the programs are not to the state where UPDATEs or DELETEs are 
 required (though they will be later). I am still developing them and 
 it is easier to just restore from backup or start over from the 
 beginning since most of the changes are data laundering from an 
 ever-increasing number of spreadsheets.
 
 Am I right that TRUNCATE deletes all the rows of a table. They may 
 then be still there, but would not autovacuum clean out the dead 
 rows? Or maybe it has not gotten to them yet? I could do an explicit 
 one earlier.
 
 Truncate will not create dead rows. However ROLLBACK will. Are you 
 getting any duplicate key errors or anything like that when you insert?
 
On the mistaken assumption that TRUNCATE left dead rows, I did a
VACUUM FULL ANALYZE before running the program full of INSERTs. This did not
make any difference.

As far as ROLLBACKs are concerned, every one is immediately preceded by a
message output to the standard error file, and no such messages are produced.

-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jerseyhttp://counter.li.org
 ^^-^^ 14:50:01 up 18 days, 8:08, 5 users, load average: 5.23, 5.35, 5.34



Re: [PERFORM] Curious about dead rows.

2007-11-10 Thread Scott Marlowe
On Nov 10, 2007 1:57 PM, Jean-David Beyer [EMAIL PROTECTED] wrote:

 Joshua D. Drake wrote:
 
  Truncate will not create dead rows. However ROLLBACK will. Are you
  getting any duplicate key errors or anything like that when you insert?
 
 On the mistaken assumption that TRUNCATE left dead rows, I did a
 VACUUM FULL ANALYZE before running the program full of INSERTs. This did not
 make any difference.

 As far as ROLLBACK are concerned, every one is immediately preceded by a
 message output to the standard error file, and no such messages are produced.

So, there are NO failed inserts, and no updates?  Cause that's what
I'd expect to create the dead rows.



Re: [PERFORM] Curious about dead rows.

2007-11-10 Thread Jean-David Beyer

Scott Marlowe wrote:
 On Nov 10, 2007 1:57 PM, Jean-David Beyer [EMAIL PROTECTED] wrote:
 Joshua D. Drake wrote:
 Truncate will not create dead rows. However ROLLBACK will. Are you
 getting any duplicate key errors or anything like that when you insert?

 On the mistaken assumption that TRUNCATE left dead rows, I did a
 VACUUM FULL ANALYZE before running the program full of INSERTs. This did not
 make any difference.

 As far as ROLLBACK are concerned, every one is immediately preceded by a
 message output to the standard error file, and no such messages are produced.
 
 So, there are NO failed inserts, and no updates?  Cause that's what
 I'd expect to create the dead rows.
 
So would I. Hence the original question.

--
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jerseyhttp://counter.li.org
 ^^-^^ 21:20:01 up 18 days, 14:38, 0 users, load average: 4.38, 4.40, 4.31
