Re: [HACKERS] Additional stats for Relations
Hi Jim, On 10/18/06, Jim C. Nasby [EMAIL PROTECTED] wrote: Also how many times a relation has been vacuumed (which puts all theother numbers in more perspective... good catch Simon). And I thinknumber of pages that could not be added to the FSM would also beextremely valuable. By the above, do you mean the number of pages that could not be added to the FSM because they had freespace which was less than the threshold for this particular relation? Regards, Nikhils EnterpriseDB http://www.enterprisedb.com On Wed, Oct 18, 2006 at 11:27:39AM +0530, NikhilS wrote: Hi, So: heap_blks_reused (with Jim's semantics), heap_blks_extend, heap_blks_truncate are the interesting stats? Will try to work up a patch for this. Regards, Nikhils EnterpriseDB http://www.enterprisedb.com On 10/15/06, Simon Riggs [EMAIL PROTECTED] wrote: On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote: On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote:I'm also not sure if this metric is what you actually want, since a single page can be returned many times from the FSM even between vacuums. Tracking how many pages for a relation have been put into the FSM might be more useful... Nikhils Pages might be put into the FSM, but by this metric don't we get the actual usage of the pages from the FSM? Agreed a single page can be returned multiple times, but since it serves a new tuple, shouldn't we track it? Nikhils This makes sense for indexes, but only makes sense for heaps when we know that the backend will keep re-accessing the block until it is full - so only of interest in steady-state workloads. IMHO Jim's proposal makes more sense for general use.heap_blks_extend: The number of times file extend was invoked on the relation Sounds goodheap_blks_truncate: The total number of blocks that have been truncated due to vacuum activity e.g. Sounds goodAs an addendum to the truncate stats above, we can also have the additional following stats: heap_blks_maxtruncate: The max block of buffers truncated in one go heap_blks_ntruncate: The number of times truncate was called on this relation Those last 2 sound too complex for normal use and ntruncate is most likely the same as number of vacuums anyway. Hmmm...Perhaps nvacuums is a more interesting metric? We've got last vacuum date, but no indication of how frequently a vacuum has run. Do you have a use-case for this info? I can see where it might be neat to know, but I'm not sure how you'd actually use it in the real world. Nikhils The use-case according to me is that these stats help prove the effectiveness of autovacuum/vacuum operations. By varying some autovac guc variables, and doing subsequent (pgbench e.g.) runs, one can find out the optimum values for these variables using these stats. Nikhils This should be useful for tuning space allocation/deallocation. If we get this patch in early it should help get feedback on this area. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com-- All the world's a stage, and most of us are desperately unrehearsed. --Jim Nasby[EMAIL PROTECTED]EnterpriseDBhttp://enterprisedb.com512.569.9461 (cell) -- All the world's a stage, and most of us are desperately unrehearsed.
Re: [HACKERS] Additional stats for Relations
On Thu, Oct 19, 2006 at 04:10:46PM +0530, NikhilS wrote: Hi Jim, On 10/18/06, Jim C. Nasby [EMAIL PROTECTED] wrote: Also how many times a relation has been vacuumed (which puts all the other numbers in more perspective... good catch Simon). And I think number of pages that could not be added to the FSM would also be extremely valuable. By the above, do you mean the number of pages that could not be added to the FSM because they had freespace which was less than the threshold for this particular relation? Yes... but... :) We want to ignore pages that have less than the average request size, because vacuum will never try and put them in the FSM anyway. We only care about pages that were dropped because MaxFSMPages was less than DesiredFSMPages (see freespace.c for more info). It would also be useful to keep track of what relations have been bumped out of the FSM (or never got recorded, though I'm not sure if that's possible) because we've run into the MaxFSMRelations limit. BTW, if we add these counters we'll be up to 7 stats dealing with vacuum and analyze, and NikhilS has a patch we're finalizing that would add 3 more. Right now there's 4 slated to go into pg_stat_* in 8.2, but maybe we should have a separate view for vacuum/FSM statistics? -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Additional stats for Relations
Jim C. Nasby [EMAIL PROTECTED] writes: BTW, if we add these counters we'll be up to 7 stats dealing with vacuum and analyze, and NikhilS has a patch we're finalizing that would add 3 more. Right now there's 4 slated to go into pg_stat_* in 8.2, but maybe we should have a separate view for vacuum/FSM statistics? I've seen no demonstration of a need for *any* of them, actually, and am pretty dubious that we want to add so much collection overhead. At least not without a major redesign of the stats reporting mechanism. If we just drop in another seven counters, we'll create an immediate 50% increase in the stats-file I/O volume, even when no vacuuming is happening at all. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Additional stats for Relations
On Thu, Oct 19, 2006 at 11:47:53AM -0400, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: BTW, if we add these counters we'll be up to 7 stats dealing with vacuum and analyze, and NikhilS has a patch we're finalizing that would add 3 more. Right now there's 4 slated to go into pg_stat_* in 8.2, but maybe we should have a separate view for vacuum/FSM statistics? I've seen no demonstration of a need for *any* of them, actually, and am pretty dubious that we want to add so much collection overhead. At least not without a major redesign of the stats reporting mechanism. If we just drop in another seven counters, we'll create an immediate 50% increase in the stats-file I/O volume, even when no vacuuming is happening at all. Yeah, for stuff like vacuuming the current stats system may not make any sense. Almost anything dealing with vacuum can really just be put into a table, because it doesn't happen all that often. My concern is that there's enough useful data to collect about vacuuming and the FSM that it should probably get it's own set of tables/views, rather than piggy-backing on pg_stat_*. But that ship has pretty much sailed, so we're probably stuck with at least the last_* stuff in pg_stat_* for the immediate future. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Additional stats for Relations
Also how many times a relation has been vacuumed (which puts all the other numbers in more perspective... good catch Simon). And I think number of pages that could not be added to the FSM would also be extremely valuable. On Wed, Oct 18, 2006 at 11:27:39AM +0530, NikhilS wrote: Hi, So: heap_blks_reused (with Jim's semantics), heap_blks_extend, heap_blks_truncate are the interesting stats? Will try to work up a patch for this. Regards, Nikhils EnterpriseDB http://www.enterprisedb.com On 10/15/06, Simon Riggs [EMAIL PROTECTED] wrote: On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote: On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote: I'm also not sure if this metric is what you actually want, since a single page can be returned many times from the FSM even between vacuums. Tracking how many pages for a relation have been put into the FSM might be more useful... Nikhils Pages might be put into the FSM, but by this metric don't we get the actual usage of the pages from the FSM? Agreed a single page can be returned multiple times, but since it serves a new tuple, shouldn't we track it? Nikhils This makes sense for indexes, but only makes sense for heaps when we know that the backend will keep re-accessing the block until it is full - so only of interest in steady-state workloads. IMHO Jim's proposal makes more sense for general use. heap_blks_extend: The number of times file extend was invoked on the relation Sounds good heap_blks_truncate: The total number of blocks that have been truncated due to vacuum activity e.g. Sounds good As an addendum to the truncate stats above, we can also have the additional following stats: heap_blks_maxtruncate: The max block of buffers truncated in one go heap_blks_ntruncate: The number of times truncate was called on this relation Those last 2 sound too complex for normal use and ntruncate is most likely the same as number of vacuums anyway. Hmmm...Perhaps nvacuums is a more interesting metric? We've got last vacuum date, but no indication of how frequently a vacuum has run. Do you have a use-case for this info? I can see where it might be neat to know, but I'm not sure how you'd actually use it in the real world. Nikhils The use-case according to me is that these stats help prove the effectiveness of autovacuum/vacuum operations. By varying some autovac guc variables, and doing subsequent (pgbench e.g.) runs, one can find out the optimum values for these variables using these stats. Nikhils This should be useful for tuning space allocation/deallocation. If we get this patch in early it should help get feedback on this area. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com -- All the world's a stage, and most of us are desperately unrehearsed. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Additional stats for Relations
On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote: On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote: I'm also not sure if this metric is what you actually want, since a single page can be returned many times from the FSM even between vacuums. Tracking how many pages for a relation have been put into the FSM might be more useful... Nikhils Pages might be put into the FSM, but by this metric don't we get the actual usage of the pages from the FSM? Agreed a single page can be returned multiple times, but since it serves a new tuple, shouldn't we track it? Nikhils This makes sense for indexes, but only makes sense for heaps when we know that the backend will keep re-accessing the block until it is full - so only of interest in steady-state workloads. IMHO Jim's proposal makes more sense for general use. heap_blks_extend: The number of times file extend was invoked on the relation Sounds good heap_blks_truncate: The total number of blocks that have been truncated due to vacuum activity e.g. Sounds good As an addendum to the truncate stats above, we can also have the additional following stats: heap_blks_maxtruncate: The max block of buffers truncated in one go heap_blks_ntruncate: The number of times truncate was called on this relation Those last 2 sound too complex for normal use and ntruncate is most likely the same as number of vacuums anyway. Hmmm...Perhaps nvacuums is a more interesting metric? We've got last vacuum date, but no indication of how frequently a vacuum has run. Do you have a use-case for this info? I can see where it might be neat to know, but I'm not sure how you'd actually use it in the real world. Nikhils The use-case according to me is that these stats help prove the effectiveness of autovacuum/vacuum operations. By varying some autovac guc variables, and doing subsequent (pgbench e.g.) runs, one can find out the optimum values for these variables using these stats. Nikhils This should be useful for tuning space allocation/deallocation. If we get this patch in early it should help get feedback on this area. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Additional stats for Relations
Hi, So: heap_blks_reused (with Jim's semantics), heap_blks_extend, heap_blks_truncate are the interesting stats? Will try to work up a patch for this. Regards, Nikhils EnterpriseDB http://www.enterprisedb.com On 10/15/06, Simon Riggs [EMAIL PROTECTED] wrote: On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote: On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote: I'm also not sure if this metric is what you actually want, since a single page can be returned many times from the FSM even between vacuums. Tracking how many pages for a relation have been put into the FSM might be more useful... Nikhils Pages might be put into the FSM, but by this metric don't we get the actual usage of the pages from the FSM? Agreed a single page can be returned multiple times, but since it serves a new tuple, shouldn't we track it? NikhilsThis makes sense for indexes, but only makes sense for heaps when weknow that the backend will keep re-accessing the block until it is full - so only of interest in steady-state workloads.IMHO Jim's proposal makes more sense for general use. heap_blks_extend: The number of times file extend was invoked on the relationSounds good heap_blks_truncate: The total number of blocks that have been truncated due to vacuum activity e.g.Sounds good As an addendum to the truncate stats above, we can also have the additional following stats: heap_blks_maxtruncate: The max block of buffers truncated in one go heap_blks_ntruncate: The number of times truncate was called on this relationThose last 2 sound too complex for normal use and ntruncate is most likely the same as number of vacuums anyway. Hmmm...Perhaps nvacuums isa more interesting metric? We've got last vacuum date, but no indicationof how frequently a vacuum has run. Do you have a use-case for this info? I can see where it might be neat to know, but I'm not sure how you'd actually use it in the real world. Nikhils The use-case according to me is that these stats help prove the effectiveness of autovacuum/vacuum operations. By varying some autovac guc variables, and doing subsequent (pgbench e.g.) runs, one can find out the optimum values for these variables using these stats. NikhilsThis should be useful for tuning space allocation/deallocation. If weget this patch in early it should help get feedback on this area.--Simon RiggsEnterpriseDB http://www.enterprisedb.com-- All the world's a stage, and most of us are desperately unrehearsed.
Re: [HACKERS] Additional stats for Relations
Hi Jim, On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote: On Fri, Oct 13, 2006 at 06:17:47PM +0530, NikhilS wrote: Currently a select * from pg_statio_user_tables; displays only heap_blks_read, heap_blks_hit stats amongst others for the main relation. It would be good to have the following stats collected too. I think these stats can be used to better statistically analyze/understand the block I/O activity on the relation: heap_blks_reused: The number of buffers returned by the FSM for use to store a new tuple inThe description on this is misleading... FSM doesn't return buffers, it returns pages that have free space on them. Nikhils FSM returns the block number from which we fetch the buffer. This is similar to the way we track buffer_read stats in ReadBuffer. Nikhils I'm also not sure if this metric is what you actually want, since asingle page can be returned many times from the FSM even between vacuums. Tracking how many pages for a relation have been put into theFSM might be more useful... Nikhils Pages might be put into the FSM, but by this metric don't we get the actual usage of the pages from the FSM? Agreed a single page can be returned multiple times, but since it serves a new tuple, shouldn't we track it? Nikhils heap_blks_extend: The number of times file extend was invoked on the relation heap_blks_truncate: The total number of blocks that have been truncated due to vacuum activity e.g. As an addendum to the truncate stats above, we can also have the additional following stats: heap_blks_maxtruncate: The max block of buffers truncated in one go heap_blks_ntruncate: The number of times truncate was called on this relation I can come up with a patch (already have one) for the above. Any thought/comments?Do you have a use-case for this info? I can see where it might be neat to know, but I'm not sure how you'd actually use it in the real world. Nikhils The use-case according to me is that these stats help prove the effectiveness of autovacuum/vacuum operations. By varying some autovac guc variables, and doing subsequent (pgbench e.g.) runs, one can find out the optimum values for these variables using these stats. Nikhils Regards, Nikhils EnterpriseDB http://www.enterprisedb.com-- All the world's a stage, and most of us are desperately unrehearsed.
[HACKERS] Additional stats for Relations
Hi, Currently a select * from pg_statio_user_tables; displays only heap_blks_read, heap_blks_hit stats amongst others for the main relation. It would be good to have the following stats collected too. I think these stats can be used to better statistically analyze/understand the block I/O activity on the relation: heap_blks_reused: The number of buffers returned by the FSM for use to store a new tuple in heap_blks_extend: The number of times file extend was invoked on the relation heap_blks_truncate: The total number of blocks that have been truncated due to vacuum activity e.g. As an addendum to the truncate stats above, we can also have the additional following stats: heap_blks_maxtruncate: The max block of buffers truncated in one go heap_blks_ntruncate: The number of times truncate was called on this relation I can come up with a patch (already have one) for the above. Any thought/comments? Regards, Nikhils (www.enterprisedb.com)-- All the world's a stage, and most of us are desperately unrehearsed.
Re: [HACKERS] Additional stats for Relations
On Fri, Oct 13, 2006 at 06:17:47PM +0530, NikhilS wrote: Currently a select * from pg_statio_user_tables; displays only heap_blks_read, heap_blks_hit stats amongst others for the main relation. It would be good to have the following stats collected too. I think these stats can be used to better statistically analyze/understand the block I/O activity on the relation: heap_blks_reused: The number of buffers returned by the FSM for use to store a new tuple in The description on this is misleading... FSM doesn't return buffers, it returns pages that have free space on them. I'm also not sure if this metric is what you actually want, since a single page can be returned many times from the FSM even between vacuums. Tracking how many pages for a relation have been put into the FSM might be more useful... heap_blks_extend: The number of times file extend was invoked on the relation heap_blks_truncate: The total number of blocks that have been truncated due to vacuum activity e.g. As an addendum to the truncate stats above, we can also have the additional following stats: heap_blks_maxtruncate: The max block of buffers truncated in one go heap_blks_ntruncate: The number of times truncate was called on this relation I can come up with a patch (already have one) for the above. Any thought/comments? Do you have a use-case for this info? I can see where it might be neat to know, but I'm not sure how you'd actually use it in the real world. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend