Re: [HACKERS] [COMMITTERS] pgsql: Reduce the size of memory allocations by lazy vacuum when

2007-09-24 Thread Heikki Linnakangas
Simon Riggs wrote:
 On Mon, 2007-09-24 at 10:02 +0100, Heikki Linnakangas wrote:
 How about just using MaxHeapTuplesPerPage? With the default 8K block
 size, it's not that much more than 200, but makes the above gripes
 completely go away. That seems like the safest option at this point.
 
 It would be much better to use a value for each table. Any constant
 value will be sub-optimal in many cases. 

Allocating extra memory doesn't usually do much harm, as long as you
don't actually use it. The reason we're now limiting it is to avoid Out
Of Memory errors if you're running with overcommit turned off, and
autovacuum triggers a vacuum on multiple tables at the same time.

Let's keep it simple. A per-table setting would be much more complex and
would be something that the DBA would need to calculate and set. If you
really do run into this problem, you can just dial down
maintenance_work_mem.
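
To make that concrete, here is a standalone sketch, not the committed
vacuumlazy.c code, of the kind of sizing logic under discussion: the
dead-TID array is bounded by maintenance_work_mem, but also capped at an
assumed number of dead tuples per heap page times the table's size in
pages. All names below are illustrative only.

/*
 * Standalone sketch (not PostgreSQL source) of sizing the dead-tuple
 * array for lazy vacuum.  The open question in this thread is what to
 * use for tuples_per_page: 200, MaxHeapTuplesPerPage, or a per-table
 * estimate.
 */
#include <stdio.h>

#define BLCKSZ             8192
#define SIZEOF_ITEMPOINTER 6    /* ItemPointerData: 4-byte block + 2-byte offset */
/* MaxHeapTuplesPerPage for 8K blocks: usable space divided by a
 * MAXALIGN'd 23-byte tuple header plus a 4-byte line pointer */
#define MAX_HEAP_TUPLES_PER_PAGE ((BLCKSZ - 24) / (24 + 4))    /* = 291 */

static long
dead_tuple_slots(long maintenance_work_mem_kb, long rel_blocks,
                 long tuples_per_page)
{
    /* upper bound: how many TIDs fit in maintenance_work_mem */
    long slots = maintenance_work_mem_kb * 1024L / SIZEOF_ITEMPOINTER;

    /* never allocate more slots than the table could possibly need */
    if (slots > rel_blocks * tuples_per_page)
        slots = rel_blocks * tuples_per_page;

    /* always leave room for at least one page's worth of dead tuples */
    if (slots < tuples_per_page)
        slots = tuples_per_page;

    return slots;
}

int
main(void)
{
    /* a 1000-page table vacuumed with maintenance_work_mem = 128MB */
    printf("200 per page: %ld slots\n",
           dead_tuple_slots(128 * 1024, 1000, 200));
    printf("291 per page: %ld slots\n",
           dead_tuple_slots(128 * 1024, 1000, MAX_HEAP_TUPLES_PER_PAGE));
    return 0;
}

For small tables the per-page cap, not maintenance_work_mem, determines
the allocation, which is the point of the commit being discussed; moving
from 200 to 291 only raises that cap by about 45%.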

 Let's use our knowledge of the table to calculate a sensible value. We
 often have average row length available from last VACUUM, don't we? Use
 that, plus 10%.

If there are dead line pointers in the table, left behind by HOT pruning,
the average row length would be completely bogus. Using the dead_tuples
stats directly would be more reasonable, but I would like us to keep
this even more conservative than that.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [COMMITTERS] pgsql: Reduce the size of memory allocations by lazy vacuum when

2007-09-24 Thread Simon Riggs
On Mon, 2007-09-24 at 10:39 +0100, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  On Mon, 2007-09-24 at 10:02 +0100, Heikki Linnakangas wrote:
  How about just using MaxHeapTuplesPerPage? With the default 8K block
  size, it's not that much more than 200, but makes the above gripes
  completely go away. That seems like the safest option at this point.
  
  It would be much better to use a value for each table. Any constant
  value will be sub-optimal in many cases. 
 
 Allocating extra memory doesn't usually do much harm, as long as you
 don't actually use it. The reason we're now limiting it is to avoid Out
 Of Memory errors if you're running with overcommit turned off, and
 autovacuum triggers a vacuum on multiple tables at the same time.
 
 Let's keep it simple. A per-table setting would be much more complex and
 would be something that the DBA would need to calculate and set. If you
 really do run into this problem, you can just dial down
 maintenance_work_mem.

Much more complex? RelationGetAvgFSM(relation)??
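
(RelationGetAvgFSM does not exist in this form; the following is a
hypothetical sketch of the per-table idea as quoted below, deriving a
tuples-per-page figure from the average row width recorded by the last
VACUUM/ANALYZE plus 10%. All names and the avg_row_width input are
assumptions for illustration.)

/*
 * Hypothetical sketch only (no such helper exists): estimate dead
 * tuples per page from a table's average row width instead of using a
 * single constant such as 200 or MaxHeapTuplesPerPage.
 */
#include <stdio.h>

#define BLCKSZ                   8192
#define PAGE_HEADER_SIZE         24
#define LINE_POINTER_SIZE        4
#define MAX_HEAP_TUPLES_PER_PAGE ((BLCKSZ - PAGE_HEADER_SIZE) / (24 + LINE_POINTER_SIZE))

static int
est_tuples_per_page(int avg_row_width)   /* e.g. width seen by the last VACUUM */
{
    int per_page = (BLCKSZ - PAGE_HEADER_SIZE)
                   / (avg_row_width + LINE_POINTER_SIZE);

    per_page += per_page / 10;           /* the proposed 10% headroom */

    /* clamp to the physical limits of a page */
    if (per_page > MAX_HEAP_TUPLES_PER_PAGE)
        per_page = MAX_HEAP_TUPLES_PER_PAGE;
    if (per_page < 1)
        per_page = 1;
    return per_page;
}

int
main(void)
{
    printf("avg width 100 -> %d tuples/page\n", est_tuples_per_page(100));
    printf("avg width  40 -> %d tuples/page\n", est_tuples_per_page(40));
    return 0;
}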

  Let's use our knowledge of the table to calculate a sensible value. We
  often have average row length available from last VACUUM, don't we? Use
  that, plus 10%.
 
 If there are dead line pointers in the table, left behind by HOT pruning,
 the average row length would be completely bogus. Using the dead_tuples
 stats directly would be more reasonable, but I would like us to keep
 this even more conservative than that.

That's a better argument.

Since we have committed HOT, I would suggest we move the default of
autovacuum_max_workers down to 2. That will limit the default amount of
memory used by VACUUMs much better than trying to get the estimates
exactly right, and we expect vacuums to run much less frequently now anyway.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com




Re: [HACKERS] [COMMITTERS] pgsql: Reduce the size of memory allocations by lazy vacuum when

2007-09-24 Thread Gregory Stark
Heikki Linnakangas [EMAIL PROTECTED] writes:

 Simon Riggs wrote:
 On Mon, 2007-09-24 at 10:02 +0100, Heikki Linnakangas wrote:
 How about just using MaxHeapTuplesPerPage? With the default 8K block
 size, it's not that much more than 200, but makes the above gripes
 completely go away. That seems like the safest option at this point.
 
 It would be much better to use a value for each table. Any constant
 value will be sub-optimal in many cases. 

 Allocating extra memory doesn't usually do much harm, as long as you
 don't actually use it. The reason we're now limiting it is to avoid Out
 Of Memory errors if you're running with overcommit turned off, and
 autovacuum triggers a vacuum on multiple tables at the same time.

For reference, MaxHeapTuplesPerPage on an 8k block is 291. If your tuples
contain any actual data (meaning they are not HOT updates that have already
been pruned, nor rows with 8 or fewer columns all of which are null), then
the most you can have is 255 rows.

Given the small difference between 200 and 291, it seems safer to just use
MaxHeapTuplesPerPage.


Block size   MaxHeapTuplesPerPage   Max w/data
----------   --------------------   ----------
      4096                    145          127
      8192                    291          255
     16384                    584          511
     32768                   1169         1023
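
A small standalone program, not PostgreSQL source, that reproduces the
figures above. The MaxHeapTuplesPerPage column follows that macro's
definition (usable page space divided by a MAXALIGN'd 23-byte tuple
header plus a 4-byte line pointer); the "max with data" column assumes a
32-byte minimum per-tuple footprint, which is an assumption here about
how those numbers were derived.

/*
 * Reproduce the table above (sketch, not PostgreSQL source).
 */
#include <stdio.h>

#define PAGE_HEADER_SIZE  24   /* SizeOfPageHeaderData */
#define TUPLE_HEADER_SIZE 24   /* MAXALIGN of the 23-byte heap tuple header */
#define LINE_POINTER_SIZE 4    /* sizeof(ItemIdData) */

int
main(void)
{
    int blcksz[] = {4096, 8192, 16384, 32768};
    int i;

    printf("%10s %22s %12s\n", "Block size", "MaxHeapTuplesPerPage", "Max w/data");
    for (i = 0; i < 4; i++)
    {
        int usable   = blcksz[i] - PAGE_HEADER_SIZE;
        int mhtpp    = usable / (TUPLE_HEADER_SIZE + LINE_POINTER_SIZE); /* 28 bytes/tuple */
        int withdata = usable / 32;   /* assumed 32-byte minimum once there is data */

        printf("%10d %22d %12d\n", blcksz[i], mhtpp, withdata);
    }
    return 0;
}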

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
