Re: [HACKERS] Additional stats for Relations

2006-10-19 Thread NikhilS
Hi Jim, On 10/18/06, Jim C. Nasby [EMAIL PROTECTED] wrote:
Also how many times a relation has been vacuumed (which puts all theother numbers in more perspective... good catch Simon). And I thinknumber of pages that could not be added to the FSM would also beextremely valuable.

By the above, do you mean the number of pages that could not be added
to the FSM because they had freespace which was less than the threshold
for this particular relation?

Regards,
Nikhils
EnterpriseDB http://www.enterprisedb.com
On Wed, Oct 18, 2006 at 11:27:39AM +0530, NikhilS wrote: Hi,
 So: heap_blks_reused (with Jim's semantics), heap_blks_extend, heap_blks_truncate are the interesting stats? Will try to work up a patch for this. Regards, Nikhils
 EnterpriseDB http://www.enterprisedb.com On 10/15/06, Simon Riggs [EMAIL PROTECTED] wrote: 
 On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote:   On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote:I'm also not sure if this metric is what you actually want,
  since a  single page can be returned many times from the FSM even  between  vacuums. Tracking how many pages for a relation have been put
  into the  FSM might be more useful...   Nikhils  Pages might be put into the FSM, but by this metric don't we get the
  actual usage of the pages from the FSM? Agreed a single page can be  returned multiple times, but since it serves a new tuple, shouldn't we  track it?  Nikhils
  This makes sense for indexes, but only makes sense for heaps when we know that the backend will keep re-accessing the block until it is full - so only of interest in steady-state workloads.
  IMHO Jim's proposal makes more sense for general use.heap_blks_extend: The number of times file extend was  invoked on the   relation
  Sounds goodheap_blks_truncate: The total number of blocks that have  been truncated due   to vacuum activity 
e.g.  Sounds goodAs an addendum to the truncate stats above, we can also have  the additional   following stats:
 heap_blks_maxtruncate: The max block of buffers truncated in  one go heap_blks_ntruncate: The number of times truncate was called
  on this   relation  Those last 2 sound too complex for normal use and ntruncate is most likely the same as number of vacuums anyway. Hmmm...Perhaps nvacuums is
 a more interesting metric? We've got last vacuum date, but no indication of how frequently a vacuum has run.   Do you have a use-case for this info? I can see where it might
  be neat  to know, but I'm not sure how you'd actually use it in the  real world.   Nikhils  The use-case according to me is that these stats help prove the
  effectiveness of autovacuum/vacuum operations. By varying some autovac  guc variables, and doing subsequent (pgbench e.g.) runs, one can find  out the optimum values for these variables using these stats.
  Nikhils  This should be useful for tuning space allocation/deallocation. If we get this patch in early it should help get feedback on this area. 
 -- Simon Riggs EnterpriseDB http://www.enterprisedb.com-- All the world's a stage, and most of us are desperately unrehearsed.
--Jim
Nasby[EMAIL PROTECTED]EnterpriseDBhttp://enterprisedb.com512.569.9461 (cell)
-- All the world's a stage, and most of us are desperately unrehearsed.


Re: [HACKERS] Additional stats for Relations

2006-10-19 Thread Jim C. Nasby
On Thu, Oct 19, 2006 at 04:10:46PM +0530, NikhilS wrote:
 Hi Jim,
 
 On 10/18/06, Jim C. Nasby [EMAIL PROTECTED] wrote:
 
 Also how many times a relation has been vacuumed (which puts all the
 other numbers in more perspective... good catch Simon). And I think
 number of pages that could not be added to the FSM would also be
 extremely valuable.
 
 
 By the above, do you mean the number of pages that could not be added to the
 FSM because they had freespace which was less than the threshold for this
 particular relation?

Yes... but... :)

We want to ignore pages that have less than the average request size,
because vacuum will never try and put them in the FSM anyway. We only
care about pages that were dropped because MaxFSMPages was less than
DesiredFSMPages (see freespace.c for more info).

It would also be useful to keep track of what relations have been bumped
out of the FSM (or never got recorded, though I'm not sure if that's
possible) because we've run into the MaxFSMRelations limit.

BTW, if we add these counters we'll be up to 7 stats dealing with vacuum
and analyze, and NikhilS has a patch we're finalizing that would add 3
more. Right now there's 4 slated to go into pg_stat_* in 8.2, but maybe
we should have a separate view for vacuum/FSM statistics?
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Additional stats for Relations

2006-10-19 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 BTW, if we add these counters we'll be up to 7 stats dealing with vacuum
 and analyze, and NikhilS has a patch we're finalizing that would add 3
 more. Right now there's 4 slated to go into pg_stat_* in 8.2, but maybe
 we should have a separate view for vacuum/FSM statistics?

I've seen no demonstration of a need for *any* of them, actually,
and am pretty dubious that we want to add so much collection overhead.
At least not without a major redesign of the stats reporting mechanism.
If we just drop in another seven counters, we'll create an immediate 50%
increase in the stats-file I/O volume, even when no vacuuming is
happening at all.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Additional stats for Relations

2006-10-19 Thread Jim C. Nasby
On Thu, Oct 19, 2006 at 11:47:53AM -0400, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  BTW, if we add these counters we'll be up to 7 stats dealing with vacuum
  and analyze, and NikhilS has a patch we're finalizing that would add 3
  more. Right now there's 4 slated to go into pg_stat_* in 8.2, but maybe
  we should have a separate view for vacuum/FSM statistics?
 
 I've seen no demonstration of a need for *any* of them, actually,
 and am pretty dubious that we want to add so much collection overhead.
 At least not without a major redesign of the stats reporting mechanism.
 If we just drop in another seven counters, we'll create an immediate 50%
 increase in the stats-file I/O volume, even when no vacuuming is
 happening at all.

Yeah, for stuff like vacuuming the current stats system may not make any
sense. Almost anything dealing with vacuum can really just be put into a
table, because it doesn't happen all that often.

My concern is that there's enough useful data to collect about vacuuming
and the FSM that it should probably get it's own set of tables/views,
rather than piggy-backing on pg_stat_*. But that ship has pretty much
sailed, so we're probably stuck with at least the last_* stuff in
pg_stat_* for the immediate future.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Additional stats for Relations

2006-10-18 Thread Jim C. Nasby
Also how many times a relation has been vacuumed (which puts all the
other numbers in more perspective... good catch Simon). And I think
number of pages that could not be added to the FSM would also be
extremely valuable.

On Wed, Oct 18, 2006 at 11:27:39AM +0530, NikhilS wrote:
 Hi,
 
 So:
 heap_blks_reused (with Jim's semantics), heap_blks_extend,
 heap_blks_truncate are the interesting stats? Will try to work up a patch
 for this.
 
 Regards,
 Nikhils
 EnterpriseDB   http://www.enterprisedb.com
 On 10/15/06, Simon Riggs [EMAIL PROTECTED] wrote:
 
 On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote:
 
  On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote:
 
 
  I'm also not sure if this metric is what you actually want,
  since a
  single page can be returned many times from the FSM even
  between
  vacuums. Tracking how many pages for a relation have been put
  into the
  FSM might be more useful...
 
  Nikhils
  Pages might be put into the FSM, but by this metric don't we get the
  actual usage of the pages from the FSM? Agreed a single page can be
  returned multiple times, but since it serves a new tuple, shouldn't we
  track it?
  Nikhils
 
 This makes sense for indexes, but only makes sense for heaps when we
 know that the backend will keep re-accessing the block until it is full
 - so only of interest in steady-state workloads.
 
 IMHO Jim's proposal makes more sense for general use.
 
   heap_blks_extend: The number of times file extend was
  invoked on the
   relation
 
 Sounds good
 
   heap_blks_truncate: The total number of blocks that have
  been truncated due
   to vacuum activity e.g.
 
 Sounds good
 
   As an addendum to the truncate stats above, we can also have
  the additional
   following stats:
  
   heap_blks_maxtruncate: The max block of buffers truncated in
  one go
  
   heap_blks_ntruncate: The number of times truncate was called
  on this
   relation
 
 Those last 2 sound too complex for normal use and ntruncate is most
 likely the same as number of vacuums anyway. Hmmm...Perhaps nvacuums is
 a more interesting metric? We've got last vacuum date, but no indication
 of how frequently a vacuum has run.
 
  Do you have a use-case for this info? I can see where it might
  be neat
  to know, but I'm not sure how you'd actually use it in the
  real world.
 
  Nikhils
  The use-case according to me is that these stats help prove the
  effectiveness of autovacuum/vacuum operations. By varying some autovac
  guc variables, and doing subsequent (pgbench e.g.) runs, one can find
  out the optimum values for these variables using these stats.
  Nikhils
 
 This should be useful for tuning space allocation/deallocation. If we
 get this patch in early it should help get feedback on this area.
 
 --
   Simon Riggs
   EnterpriseDB   http://www.enterprisedb.com
 
 
 
 
 
 -- 
 All the world's a stage, and most of us are desperately unrehearsed.

-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Additional stats for Relations

2006-10-17 Thread Simon Riggs
On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote:

 On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote:


 I'm also not sure if this metric is what you actually want,
 since a
 single page can be returned many times from the FSM even
 between 
 vacuums. Tracking how many pages for a relation have been put
 into the
 FSM might be more useful...
  
 Nikhils
 Pages might be put into the FSM, but by this metric don't we get the
 actual usage of the pages from the FSM? Agreed a single page can be
 returned multiple times, but since it serves a new tuple, shouldn't we
 track it?
 Nikhils

This makes sense for indexes, but only makes sense for heaps when we
know that the backend will keep re-accessing the block until it is full
- so only of interest in steady-state workloads.

IMHO Jim's proposal makes more sense for general use.

  heap_blks_extend: The number of times file extend was
 invoked on the
  relation

Sounds good

  heap_blks_truncate: The total number of blocks that have
 been truncated due 
  to vacuum activity e.g.

Sounds good

  As an addendum to the truncate stats above, we can also have
 the additional
  following stats:
 
  heap_blks_maxtruncate: The max block of buffers truncated in
 one go 
 
  heap_blks_ntruncate: The number of times truncate was called
 on this
  relation

Those last 2 sound too complex for normal use and ntruncate is most
likely the same as number of vacuums anyway. Hmmm...Perhaps nvacuums is
a more interesting metric? We've got last vacuum date, but no indication
of how frequently a vacuum has run.

 Do you have a use-case for this info? I can see where it might
 be neat 
 to know, but I'm not sure how you'd actually use it in the
 real world.
  
 Nikhils
 The use-case according to me is that these stats help prove the
 effectiveness of autovacuum/vacuum operations. By varying some autovac
 guc variables, and doing subsequent (pgbench e.g.) runs, one can find
 out the optimum values for these variables using these stats. 
 Nikhils

This should be useful for tuning space allocation/deallocation. If we
get this patch in early it should help get feedback on this area.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Additional stats for Relations

2006-10-17 Thread NikhilS
Hi, 
So: 
heap_blks_reused (with Jim's semantics), heap_blks_extend,
heap_blks_truncate are the interesting stats? Will try to work up a
patch for this. 

Regards,
Nikhils
 EnterpriseDB http://www.enterprisedb.com
On 10/15/06, Simon Riggs [EMAIL PROTECTED] wrote:
On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote: On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote: I'm also not sure if this metric is what you actually want,
 since a single page can be returned many times from the FSM even between vacuums. Tracking how many pages for a relation have been put into the
 FSM might be more useful... Nikhils Pages might be put into the FSM, but by this metric don't we get the actual usage of the pages from the FSM? Agreed a single page can be
 returned multiple times, but since it serves a new tuple, shouldn't we track it? NikhilsThis makes sense for indexes, but only makes sense for heaps when weknow that the backend will keep re-accessing the block until it is full
- so only of interest in steady-state workloads.IMHO Jim's proposal makes more sense for general use.  heap_blks_extend: The number of times file extend was invoked on the
  relationSounds good  heap_blks_truncate: The total number of blocks that have been truncated due  to vacuum activity e.g.Sounds good
  As an addendum to the truncate stats above, we can also have the additional  following stats:   heap_blks_maxtruncate: The max block of buffers truncated in
 one go   heap_blks_ntruncate: The number of times truncate was called on this  relationThose last 2 sound too complex for normal use and ntruncate is most
likely the same as number of vacuums anyway. Hmmm...Perhaps nvacuums isa more interesting metric? We've got last vacuum date, but no indicationof how frequently a vacuum has run. Do you have a use-case for this info? I can see where it might
 be neat to know, but I'm not sure how you'd actually use it in the real world. Nikhils The use-case according to me is that these stats help prove the
 effectiveness of autovacuum/vacuum operations. By varying some autovac guc variables, and doing subsequent (pgbench e.g.) runs, one can find out the optimum values for these variables using these stats.
 NikhilsThis should be useful for tuning space allocation/deallocation. If weget this patch in early it should help get feedback on this area.--Simon RiggsEnterpriseDB 
http://www.enterprisedb.com-- All the world's a stage, and most of us are desperately unrehearsed.


Re: [HACKERS] Additional stats for Relations

2006-10-14 Thread NikhilS
Hi Jim,
On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote:
On Fri, Oct 13, 2006 at 06:17:47PM +0530, NikhilS wrote: Currently a select * from pg_statio_user_tables; displays only
 heap_blks_read, heap_blks_hit stats amongst others for the main relation. It would be good to have the following stats collected too. I think these stats can be used to better statistically analyze/understand the block I/O
 activity on the relation: heap_blks_reused: The number of buffers returned by the FSM for use to store a new tuple inThe description on this is misleading... FSM doesn't return buffers, it
returns pages that have free space on them.

Nikhils
FSM returns the block number from which we fetch the buffer. This is similar to the way we track buffer_read stats in ReadBuffer.
Nikhils

I'm also not sure if this metric is what you actually want, since asingle page can be returned many times from the FSM even between
vacuums. Tracking how many pages for a relation have been put into theFSM might be more useful...

Nikhils
Pages might be put into the FSM, but by this metric don't we get the actual usage of the pages from the FSM? Agreed a single page can be returned multiple times, but since it serves a new tuple, shouldn't we track it?

Nikhils

 heap_blks_extend: The number of times file extend was invoked on the relation heap_blks_truncate: The total number of blocks that have been truncated due
 to vacuum activity e.g. As an addendum to the truncate stats above, we can also have the additional following stats: heap_blks_maxtruncate: The max block of buffers truncated in one go
 heap_blks_ntruncate: The number of times truncate was called on this relation I can come up with a patch (already have one) for the above. Any thought/comments?Do you have a use-case for this info? I can see where it might be neat
to know, but I'm not sure how you'd actually use it in the real world.

Nikhils
The use-case according to me is that these stats help prove the effectiveness of autovacuum/vacuum operations. By varying some autovac guc variables, and doing subsequent (pgbench e.g.) runs, one can find out the optimum values for these variables using these stats. 

Nikhils

Regards, 
Nikhils 
EnterpriseDB http://www.enterprisedb.com-- All the world's a stage, and most of us are desperately unrehearsed.


[HACKERS] Additional stats for Relations

2006-10-13 Thread NikhilS
Hi, 

Currently a select * from pg_statio_user_tables; displays only
heap_blks_read, heap_blks_hit stats amongst others for the main
relation. It would be good to have the following stats collected too. I
think these stats can be used to better statistically
analyze/understand the block I/O activity on the relation:

heap_blks_reused: The number of buffers returned by the FSM for use to store a new tuple in

heap_blks_extend: The number of times file extend was invoked on the relation

heap_blks_truncate: The total number of blocks that have been truncated due to vacuum activity e.g.

As an addendum to the truncate stats above, we can also have the additional following stats:

heap_blks_maxtruncate: The max block of buffers truncated in one go

heap_blks_ntruncate: The number of times truncate was called on this relation
I can come up with a patch (already have one) for the above. Any thought/comments?

Regards,
Nikhils 
(www.enterprisedb.com)-- All the world's a stage, and most of us are desperately unrehearsed.


Re: [HACKERS] Additional stats for Relations

2006-10-13 Thread Jim C. Nasby
On Fri, Oct 13, 2006 at 06:17:47PM +0530, NikhilS wrote:
 Currently a select * from pg_statio_user_tables; displays only
 heap_blks_read, heap_blks_hit stats amongst others for the main relation. It
 would be good to have the following stats collected too. I think these stats
 can be used to better statistically analyze/understand the block I/O
 activity on the relation:
 
 heap_blks_reused: The number of buffers returned by the FSM for use to store
 a new tuple in
 
The description on this is misleading... FSM doesn't return buffers, it
returns pages that have free space on them.

I'm also not sure if this metric is what you actually want, since a
single page can be returned many times from the FSM even between
vacuums. Tracking how many pages for a relation have been put into the
FSM might be more useful...

 heap_blks_extend: The number of times file extend was invoked on the
 relation
 
 heap_blks_truncate: The total number of blocks that have been truncated due
 to vacuum activity e.g.
 
 As an addendum to the truncate stats above, we can also have the additional
 following stats:
 
 heap_blks_maxtruncate: The max block of buffers truncated in one go
 
 heap_blks_ntruncate: The number of times truncate was called on this
 relation
 
 I can come up with a patch (already have one) for the above. Any
 thought/comments?

Do you have a use-case for this info? I can see where it might be neat
to know, but I'm not sure how you'd actually use it in the real world.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 6: explain analyze is your friend