Re: [HACKERS] [PATCH] Unremovable tuple monitoring
On 17/11/2011, at 1:47 AM, Robert Haas wrote: On Wed, Nov 16, 2011 at 9:34 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Not sure about the log line, but allowing pgstattuple to distinguish between recently-dead and quite-thoroughly-dead seems useful. The dividing line is enormously unstable though. pgstattuple's idea of RecentGlobalXmin could even be significantly different from that of a concurrently running VACUUM. I can see the point of having VACUUM log what it did, but opinions from the peanut gallery aren't worth much. Hmm, you have a point. But as Yeb points out, it seems like we should at least try to be more consistent about the terminology. Thanks for the discussion so far all. Would it be worthwhile to make another patch that addresses the points from Yeb's reviews? It's not sounding like this unremovable tuple count is something that postgres wants, but I'm happy to keep the patch up to scratch if we're still not sure. Cheers, --Royce -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Unremovable tuple monitoring
On 18/11/2011, at 10:44 AM, Robert Haas wrote: On Thu, Nov 17, 2011 at 5:49 PM, Royce Ausburn royce...@inomial.com wrote: Thanks for the discussion so far all. Would it be worthwhile to make another patch that addresses the points from Yeb's reviews? It's not sounding like this unremovable tuple count is something that postgres wants, but I'm happy to keep the patch up to scratch if we're still not sure. One question to ask yourself at this point in the process is whether *you* still think the feature is useful. I'm fairly persuaded by Tom's point that the value monitored by this patch will change so quickly that it won't be useful to have VACUUM store it; it'll be obsolete by the time you look at the numbers. If you are also persuaded by that argument, then clearly it's time to throw in the towel! But let's suppose you're NOT persuaded by that argument and you still want the feature. In that case, don't just wait for someone else to stick up for the feature; tell us why you still think it's a good idea. Make a case for what you want. People here are usually receptive to good ideas, well-presented. Okay - thanks Robert. I think I'll drop it. Now that I know what to look for in this situation, the changes this patch includes aren't of value to me. That's a pretty good indicator that it's probably not valuable to anyone else. I've changed the status of the patch to rejected in the commit fest. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [Review] Include detailed information about a row failing a CHECK constraint into the error message
The patch applies cleanly to the current git master and is in context diff format.The patch fails the regression tests because it is outputting new DETAIL line which four of tests aren't expecting. The tests will need to be updated.Functionality:The patch works as advertised. An insert or update that fails a check condition indeed logs the row that has failed:test=# create table test (test(# a integer,test(# b integer CHECK (b 2),test(# c text,test(# d integertest(#test(#test(# );CREATE TABLEtest=#test=# insert into test select 1, 2, 'Test', 4;ERROR: new row for relation "test" violates check constraint "test_b_check"DETAIL: Failing row: (1, 2, Test, 4).One comment I have on the output is that strings are not in quotes. It's a little jarring, but might not be that big a deal. A contrived case that is pretty confusing:test=# insert into test select 1, 2, '3, 4', 4;ERROR: new row for relation "test" violates check constraint "test_b_check"DETAIL: Failing row: (1, 2, 3, 4, 4).A select inserting 4 columns seemingly results in a 5 column row ;)Another super minor thing, postgres doesn't seem to put periods at the end of log messages, yet this new detail line does.CodeI'm no C or postgres expert, but the code looks okay to me.Attached is a small script I used to test/demo the functionality.--Royce testCheckConstraints.sql Description: Binary data On 11/11/2011, at 2:56 AM, Jan Kundrát wrote:On 11/10/11 16:05, Tom Lane wrote:I agree with Jan that this is probably useful; I'm pretty sure therehave been requests for it before. We just have to make sure that thelength of the message stays in bounds.One tip for keeping the length down: there is no value in repeatinginformation from the primary error message, such as the name of theconstraint.Thanks to your comments and suggestions, I appreciate the time of thereviewers.Attached is a second version of this patch which keeps the size of theoutput at 64 characters per column (which is an arbitrary value definedas a const int, which I hope matches your style). Longer values havetheir last three characters replaced by "...", so there's no way todistinguish them from a legitimate string that ends with just that.There's also no escaping of special-string values, similar to how theBuildIndexValueDescription operates.Cheers,Jan-- Trojita, a fast e-mail client -- http://trojita.flaska.net/context_in_check_constraints-v2.patch
Re: [HACKERS] [PATCH] Unremovable tuple monitoring
On 16/11/2011, at 2:05 AM, Yeb Havinga wrote: On 2011-10-05 00:45, Royce Ausburn wrote: Attached is v2 of my patch that doesn't update CATALOG_VERSION_NO. I've also fixed the name of an argument to pgstat_report_vacuum which I don't think was particularly good, and I've replace the word tuple with row in some docs I added for consistency. I reviewed your patch. I think it is in good shape, my two main remarks (name of n_unremovable_tup and a remark about documentation at the end of this review) are highly subjective and I wouldn't spend time on it unless other people have the same opinion. Remarks: * rules regression test fails because pg_stat_all_tables is changed, pg_stat_sys_tables and pg_stat_user_tables as well, but the new expected/rules.out is not part of the patch. Doh! Thank you for spotting this. Should we decide to continue this patch I'll look in to fixing this. * I'm not sure about the name n_unremovable_tup, since it doesn't convey it is about dead tuples and judging from only the name it might as well include the live tuples. It also doesn't hint that it is a transient condition, which vacuum verbose does with the word 'yet'. What about n_locked_dead_tup? - this contains both information that it is about dead tuples, and the lock suggests that once the lock is removed, the dead tuple can be removed. Looks like we have some decent candidates later in the thread. Should this patch survive I'll look at updating it. * The number shows correctly in the pg_stat_relation. This is a testcase that gives unremovable dead rows: I was puzzled for a while that n_unremovable_tup = n_dead_tup doesn't hold, after all, the unremovable tuples are dead as well. Neither the current documentation nor the documentation added by the patch do help in explaining the meaning of n_dead_tup and n_unremovable_tup, which may be clear to seasoned vacuum hackers, but not to me. In both the case of n_dead_tup it would have been nice if the docs mentioned that dead tuples are tuples that are deleted or previous versions of updated tuples, and that only analyze updates n_dead_tup (since vacuum cleans them), in contrast with n_unremovable_tup that gets updated by vacuum. Giving an example of how unremovable dead tuples can be caused would IMHO also help understanding. Fair enough! I'll review this as well. Thanks again! --Royce -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Unremovable tuple monitoring
On 16/11/2011, at 8:04 AM, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Nov 15, 2011 at 10:29 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Robert Haas's message of mar nov 15 12:16:54 -0300 2011: I guess this is a dumb question, but why don't we remove all the dead tuples? They were deleted but there are transactions with older snapshots. Oh. I was thinking dead meant no longer visible to anyone. But it sounds what we call unremovable here is what we elsewhere call recently dead. Would have to look at the code to be sure, but I think that nonremovable is meant to count both live tuples and dead-but-still-visible-to-somebody tuples. The question that I think needs to be asked is why it would be useful to track this using the pgstats mechanisms. By definition, the difference between this and the live-tuple count is going to be extremely unstable --- I don't say small, necessarily, but short-lived. So it's debatable whether it's worth memorializing the count obtained by the last VACUUM at all. And doing it through pgstats is an expensive thing. We've already had push-back about the size of the stats table on large (lots-o-tables) databases. Adding another counter will impose a performance overhead on everybody, whether they care about this number or not. What's more, to the extent that I can think of use-cases for knowing this number, I think I would want a historical trace of it --- that is, not only the last VACUUM's result but those of previous VACUUM cycles. So pgstats seems like it's both expensive and useless for the purpose. Right now the only good solution is trawling the postmaster log. Possibly something like pgfouine could track the numbers in a more useful fashion. Thanks all for the input. Tom: My first patch attempted to log the number of unremovable tuples in this log, but it was done inappropriately -- it was included as part of the log_autovacuum_min_duration's output. You rightly objected to that patch :) Personally I think some log output, done better, would have been more useful for me at the time. At the time I was trying to diagnose an ineffective vacuum and postgres' logs weren't giving me any hints about what was wrong. I turned to the mailing list and got immediate help, but I felt that ideally postgres would be logging something to tell me that some 1 day old transactions were preventing auto vacuum from doing its job. Something, anything that I could google. Other novices in my situation probably wouldn't know to look in the pg_stats* tables, so in retrospect my patch isn't really achieving my original goal. Should we consider taking a logging approach instead? --Royce -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Unremovable tuple monitoring
Personally I think some log output, done better, would have been more useful for me at the time. At the time I was trying to diagnose an ineffective vacuum and postgres' logs weren't giving me any hints about what was wrong. I turned to the mailing list and got immediate help, but I felt that ideally postgres would be logging something to tell me that some 1 day old transactions were preventing auto vacuum from doing its job. Something, anything that I could google. Other novices in my situation probably wouldn't know to look in the pg_stats* tables, so in retrospect my patch isn't really achieving my original goal. Should we consider taking a logging approach instead? Dopey suggestion: Instead of logging around vacuum and auto_vacuum, perhaps log transactions that are open for longer than some (perhaps configurable) time? The default might be pretty large, say 6 hours. Are there common use cases for txs that run for longer than 6 hours? Seeing a message such as: WARNING: Transaction X has been open more than Y. This tx may be holding locks preventing other txs from operating and may prevent vacuum from cleaning up deleted rows. Would give a pretty clear indication of a problem :) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Unremovable tuple monitoring
On 16/11/2011, at 12:26 PM, Robert Haas wrote: On Tue, Nov 15, 2011 at 7:59 PM, Royce Ausburn royce...@inomial.com wrote: Personally I think some log output, done better, would have been more useful for me at the time. At the time I was trying to diagnose an ineffective vacuum and postgres' logs weren't giving me any hints about what was wrong. I turned to the mailing list and got immediate help, but I felt that ideally postgres would be logging something to tell me that some 1 day old transactions were preventing auto vacuum from doing its job. Something, anything that I could google. Other novices in my situation probably wouldn't know to look in the pg_stats* tables, so in retrospect my patch isn't really achieving my original goal. Should we consider taking a logging approach instead? Dopey suggestion: Instead of logging around vacuum and auto_vacuum, perhaps log transactions that are open for longer than some (perhaps configurable) time? The default might be pretty large, say 6 hours. Are there common use cases for txs that run for longer than 6 hours? Seeing a message such as: WARNING: Transaction X has been open more than Y. This tx may be holding locks preventing other txs from operating and may prevent vacuum from cleaning up deleted rows. Would give a pretty clear indication of a problem :) Well, you could that much just by periodically querying pg_stat_activity. Fair enough -- someone knowledgable could set that up if they wanted. My goal was mostly to have something helpful in the logs. If that's not something postgres wants/needs Ill drop it =) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BUG or strange behaviour of update on primary key
On 18/10/2011, at 1:00 PM, Robert Haas wrote: On Mon, Oct 17, 2011 at 7:30 PM, desmodemone desmodem...@gmail.com wrote: Seems an Oracle bug not Postgresql one! I don't think it's a bug for it to work. It'd probably work in PostgreSQL too, if you inserted (2) first and then (1). It's just that, as Tom says, if you want it to be certain to work (rather than depending on the order in which the rows are inserted), you need the checks to be deferred. Do deferred checks such as this have a memory impact for bulk updates? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Index only scan paving the way for auto clustered tables?
Hi all, I wonder, could the recent work on index only scans pave the way for auto clustered tables? Consider a wide, mostly insert table with some subset of columns that I'd like to cluster on. I'm after locality of tuples that are very frequently fetched together, but not keen on the downtime for a cluster, nor the maintenance that it requires. Would it be a stretch to have an index that branches on the subset of cluster columns, but still stores all the columns, making it a covering index? Given that we can already index concurrently, such an index would not require downtime, and would be self maintaining. From my understanding of the index-only scan implementation, I suspect that such an index would effectively give locality, with some caveats… I'd expect the overhead of inserting in to such a table would be high, perhaps prohibitive. Perhaps better ways have been discussed. Stupid idea? --Royce -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Patch for cursor calling with named parameters
On 08/10/2011, at 1:56 AM, Yeb Havinga wrote: Attach is v2 of the patch. Mixed notation now raises an error. In contrast with what I said above, named parameter related errors are thrown before any syntax errors. I tested with raising syntax errors first but the resulting code was a bit more ugly and the sql checking under a error condition (i.e. double named parameter error means there is one parameter in short) was causing serious errors. Documentation was also added, regression tests updated. I've tested this patch out and can confirm mixed notation produces an error: psql:plsqltest.sql:27: ERROR: mixing positional and named parameter assignment not allowed in cursor cur1 LINE 10: open cur1( param2 := 4, 2, 5); Just one small thing: it'd be nice to have an example for cursor declaration with named parameters. Your patch adds one for opening a cursor, but not for declaring one. Other than that, I think the patch is good. Everything works as advertised =) --Royce -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Patch for cursor calling with named parameters
On 11/10/2011, at 11:38 PM, Yeb Havinga wrote: Declaration of cursors with named parameters is already part of PostgreSQL (so it is possible to use the parameter names in the cursor query instead of $1, $2, etc.) and it also already documented with an example, just a few lines above the open examples. See curs3 on http://developer.postgresql.org/pgdocs/postgres/plpgsql-cursors.html Doh - my apologies!
[HACKERS] [REVIEW] Patch for cursor calling with named parameters
Initial Review for patch: http://archives.postgresql.org/pgsql-hackers/2011-09/msg00744.php Submission review The patch is in context diff format and applies cleanly to the git master. The patch includes an update to regression tests. The regression tests pass. The patch does not include updates to the documentation reflecting the new functionality. Usability review The patch adds a means of specifying named cursor parameter arguments in pg/plsql. The syntax is straight forward: cur1 CURSOR (param1 int, param2 int, param3 int) for select …; … open cur1(param3 := 4, param2 := 1, param1 := 5); The old syntax continues to work: cur1 CURSOR (param1 int, param2 int, param3 int) for select …; … open cur1(5, 1, 3); • Does the patch actually implement that? Yes, the feature works as advertised. • Do we want that? I very rarely use pg/plsql, so I won't speak to its utility. However there has been some discussion about the idea: http://archives.postgresql.org/pgsql-hackers/2010-09/msg01440.php • Do we already have it? Not AFAIK • Does it follow SQL spec, or the community-agreed behavior? There's some discussion about the syntax ( := or = ), I'm not sure there's a consensus yet. • Does it include pg_dump support (if applicable)? Yes -- pgplsql functions using named parameters are correctly dumped. • Are there dangers? Potentially. See below. • Have all the bases been covered? I don't think so. The new feature accepts opening a cursor with some parameter names not specified: open cur1(param3 := 4, 1, param1 := 5); It seems that if a parameter is not named, its position is used to bind to a variable. For example, the following fail: psql:plsqltest.sql:26: ERROR: cursor cur1 argument 2 param2 provided multiple times LINE 10: open cur1(param3 := 4, 1, param2 := 5); and psql:plsqltest.sql:26: ERROR: cursor cur1 argument 2 param2 provided multiple times LINE 10: open cur1(param2 := 4, 2, param1 := 5); I think that postgres ought to enforce some consistency here. Use one way or the other, never both. I can also produce some unhelpful errors when I give bad syntax. For example: psql:plsqltest.sql:28: ERROR: cursor cur1 argument 1 param1 provided multiple times LINE 11: open cur1( param3 : = 4, 2, param1 := 5); (notice the space between the : and =) -- psql:plsqltest.sql:28: ERROR: cursor cur1 argument 1 param1 provided multiple times LINE 11: open cur1( param3 = 4, 2, param1 := 5); (Wrong assignment operator) -- psql:plsqltest.sql:27: ERROR: cursor cur1 argument 3 param3 provided multiple times LINE 10: open cur1( 1 , param3 := 2, param2 = 3 ); (Wrong assignment operator) -- psql:plsqltest.sql:27: ERROR: cursor cur1 argument 3 param3 provided multiple times LINE 10: ... open cur1( param3 = param3 , param3 := 2, param2 = 3 ); -- open cur1( param3 := param3 , param2 = 3, param1 := 1 ); psql:plsqltest.sql:29: ERROR: column param2 does not exist LINE 2: ,param2 = 3 ^ QUERY: SELECT 1 ,param2 = 3 ,param3; CONTEXT: PL/pgSQL function named_para_test line 7 at OPEN I haven't been able to make something syntactically incorrect that doesn't produce an error, even if the error isn't spot on. I haven't been able to make it crash, and when the syntax is correct, it has always produced correct results. Performance review I've done some limited performance testing and I can't really see a difference between the unpatched and patched master. • Does it follow the project coding guidelines? I believe so, but someone more familiar with them will probably spot violations better than me =) • Are there portability issues? I wouldn't know -- I don't have any experience with C portability. • Will it work on Windows/BSD etc? Tested under OS X, so BSD is presumably okay. No idea about other unixes nor windows. • Are the comments sufficient and accurate? I'm happy enough with them. • Does it do what it says, correctly? Yes, excepting my comments above. • Does it produce compiler warnings? Yes: In file included from gram.y:12962: scan.c: In function ‘yy_try_NUL_trans’: scan.c:16243: warning: unused variable ‘yyg’ But this was not added by this patch -- it's also in the unpatched master. • Can you make it crash? No. --Royce
Re: [HACKERS] [REVIEW] Patch for cursor calling with named parameters
Forgive my ignorance -- do I need to be doing anything else now seeing as I started the review? On 07/10/2011, at 7:15 AM, Pavel Stehule wrote: 2011/10/6 Robert Haas robertmh...@gmail.com: On Thu, Oct 6, 2011 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: David E. Wheeler da...@kineticode.com writes: Would it then be added as an alias for := for named function parameters? Or would that come still later? Once we do that, it will be impossible not merely deprecated to use = as an operator name. I think that has to wait at least another release cycle or two past where we're using it ourselves. Okay. I kind of like := so there's no rush AFAIC. :-) Hmm ... actually, that raises another issue that I'm not sure whether there's consensus for or not. Are we intending to keep name := value syntax forever, as an alternative to the standard name = value syntax? I can't immediately see a reason not to, other than the it's not standard argument. Because if we *are* going to keep it forever, there's no very good reason why we shouldn't accept this plpgsql cursor patch now. We'd just have to remember to extend plpgsql to take = at the same time we do that for core function calls. It's hard to see adding support for = and dropping support for := in the same release. That would be a compatibility nightmare. If := is used by the standard for some other, incompatible purpose, then I suppose we would want to add support for =, wait a few releases, deprecate :=, wait a couple of releases, remove := altogether. But IIRC we picked := precisely because the standard didn't use it at all, or at least not for anything related... in which case we may as well keep it around more or less indefinitely. +1 Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Unremovable tuple monitoring (was: Addition of some trivial auto vacuum logging)
On 04/10/2011, at 11:26 PM, Robert Haas wrote: On Mon, Oct 3, 2011 at 9:17 AM, Royce Ausburn royce...@inomial.com wrote: - I'm not sure if I'm supposed to update CATALOG_VERSION_NO in catalog.h. In this patch I have. Generally that is left to the committer, as the correct value depends on the value at the time of commit, not the time you submit the patch; and including it in the patch tends to result in failing hunks, since the value changes fairly frequently. - The VACUUM FULL implementation in cluster.c doesn't do any stats updating similar to vacuumlazy.c, so I haven't don't anything there… (is this right? A vacuum full may also encounter unremovable tuples, right?) We've occasionally heard grumblings about making cluster do more stats updating, but your patch should just go along with whatever's being done now in similar cases. I think I get this stats stuff now. Unless someone here thinks it's too hard for a new postgres dev's 2nd patch, I could take a stab. I might take a look at it tonight to get a feel for how hard, and what stats we could collect. I'll start a new thread for discussion. Attached is v2 of my patch that doesn't update CATALOG_VERSION_NO. I've also fixed the name of an argument to pgstat_report_vacuum which I don't think was particularly good, and I've replace the word tuple with row in some docs I added for consistency. I'm not sure what my next step should be. I've added this patch to the open commit fest -- is that all for now until the commit fest begins review? --Royce diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index a19e3f0..8692580 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -328,7 +328,8 @@ postgres: replaceableuser/ replaceabledatabase/ replaceablehost/ re belonging to the table), number of live rows fetched by index scans, numbers of row insertions, updates, and deletions, number of row updates that were HOT (i.e., no separate index update), - numbers of live and dead rows, + numbers of live and dead rows, + the number of dead rows not removed in the last vacuum, the last time the table was non-optionFULL/ vacuumed manually, the last time it was vacuumed by the autovacuum daemon, the last time it was analyzed manually, @@ -764,6 +765,14 @@ postgres: replaceableuser/ replaceabledatabase/ replaceablehost/ re Number of dead rows in table /entry /row + + row + entryliteralfunctionpg_stat_get_unremovable_tuples/function(typeoid/type)/literal/entry + entrytypebigint/type/entry + entry + Number of dead rows not removed in the table's last vacuum + /entry + /row row entryliteralfunctionpg_stat_get_blocks_fetched/function(typeoid/type)/literal/entry diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 2253ca8..9c18dc7 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -353,6 +353,7 @@ CREATE VIEW pg_stat_all_tables AS pg_stat_get_tuples_hot_updated(C.oid) AS n_tup_hot_upd, pg_stat_get_live_tuples(C.oid) AS n_live_tup, pg_stat_get_dead_tuples(C.oid) AS n_dead_tup, +pg_stat_get_unremovable_tuples(C.oid) AS n_unremovable_tup, pg_stat_get_last_vacuum_time(C.oid) as last_vacuum, pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum, pg_stat_get_last_analyze_time(C.oid) as last_analyze, diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index cf8337b..140fe92 100644 --- a/src/backend/commands/vacuumlazy.c +++ b/src/backend/commands/vacuumlazy.c @@ -91,6 +91,7 @@ typedef struct LVRelStats double scanned_tuples; /* counts only tuples on scanned pages */ double old_rel_tuples; /* previous value of pg_class.reltuples */ double new_rel_tuples; /* new estimated total # of tuples */ + double unremovable_tuples; /* count of dead tuples not yet removable */ BlockNumber pages_removed; double tuples_deleted; BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */ @@ -245,7 +246,8 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt, /* report results to the stats collector, too */ pgstat_report_vacuum(RelationGetRelid(onerel), onerel-rd_rel-relisshared, -new_rel_tuples); +new_rel_tuples, + vacrelstats-unremovable_tuples); /* and log the action if appropriate */ if (IsAutoVacuumWorkerProcess() Log_autovacuum_min_duration = 0) @@ -829,6 +831,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats
Re: [HACKERS] [PATCH] Unremovable tuple monitoring (was: Addition of some trivial auto vacuum logging)
On 28/09/2011, at 11:17 AM, Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Royce Ausburn's message of mar sep 27 21:28:26 -0300 2011: Tom's suggestion looks like it's less trivial that I can do just yet, but I'll take a look and ask for help if I need it. It's not that difficult. Just do a search on git log src/backend/postmaster/pgstat.c for patches that add a new column somewhere. Yeah, I was just about to say the same thing. Find a previous patch that adds a feature similar to what you have in mind, and crib like mad. We've added enough stats counters over time that you should have several models to work from. This patch does as Tom suggested, adding a column to the pg_stat_all_tables view which exposes the number of unremovable tuples in the last vacuum. This patch does not include my previous work to log this information as part of auto_vacuum's logging. User visible additions: New column pg_stat_all_tables.n_unremovable_tup New function bigint pg_stat_get_unremovable_tuples(oid) A few notes / questions: - I'm not sure if I'm supposed to update CATALOG_VERSION_NO in catalog.h. In this patch I have. - I'm not sure about how I should be selecting an OID for my new stats function. I used the unused_oids script and picked one that seemed reasonable. - The VACUUM FULL implementation in cluster.c doesn't do any stats updating similar to vacuumlazy.c, so I haven't don't anything there… (is this right? A vacuum full may also encounter unremovable tuples, right?) - I'm not usually a C developer, so peeps reviewing please watch for noob mistakes. Cheers, --Royce diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index a19e3f0..af7b235 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -328,7 +328,8 @@ postgres: replaceableuser/ replaceabledatabase/ replaceablehost/ re belonging to the table), number of live rows fetched by index scans, numbers of row insertions, updates, and deletions, number of row updates that were HOT (i.e., no separate index update), - numbers of live and dead rows, + numbers of live and dead rows, + the number of dead tuples not removed in the last vacuum, the last time the table was non-optionFULL/ vacuumed manually, the last time it was vacuumed by the autovacuum daemon, the last time it was analyzed manually, @@ -764,6 +765,14 @@ postgres: replaceableuser/ replaceabledatabase/ replaceablehost/ re Number of dead rows in table /entry /row + + row + entryliteralfunctionpg_stat_get_unremovable_tuples/function(typeoid/type)/literal/entry + entrytypebigint/type/entry + entry + Number of dead rows not removed in the table's last vacuum + /entry + /row row entryliteralfunctionpg_stat_get_blocks_fetched/function(typeoid/type)/literal/entry diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 2253ca8..9c18dc7 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -353,6 +353,7 @@ CREATE VIEW pg_stat_all_tables AS pg_stat_get_tuples_hot_updated(C.oid) AS n_tup_hot_upd, pg_stat_get_live_tuples(C.oid) AS n_live_tup, pg_stat_get_dead_tuples(C.oid) AS n_dead_tup, +pg_stat_get_unremovable_tuples(C.oid) AS n_unremovable_tup, pg_stat_get_last_vacuum_time(C.oid) as last_vacuum, pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum, pg_stat_get_last_analyze_time(C.oid) as last_analyze, diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index cf8337b..140fe92 100644 --- a/src/backend/commands/vacuumlazy.c +++ b/src/backend/commands/vacuumlazy.c @@ -91,6 +91,7 @@ typedef struct LVRelStats double scanned_tuples; /* counts only tuples on scanned pages */ double old_rel_tuples; /* previous value of pg_class.reltuples */ double new_rel_tuples; /* new estimated total # of tuples */ + double unremovable_tuples; /* count of dead tuples not yet removable */ BlockNumber pages_removed; double tuples_deleted; BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */ @@ -245,7 +246,8 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt, /* report results to the stats collector, too */ pgstat_report_vacuum(RelationGetRelid(onerel), onerel-rd_rel-relisshared, -new_rel_tuples); +new_rel_tuples, + vacrelstats-unremovable_tuples); /* and log the action if appropriate */ if (IsAutoVacuumWorkerProcess()
[HACKERS] [PATCH] Addition of some trivial auto vacuum logging
Hi all, I spent a bit of today diagnosing a problem where long held transactions were preventing auto vacuum from doing its job. Eventually I had set log_autovacuum_min_duration to 0 to see what was going on. Unfortunately it wasn't until I tried a VACUUM VERBOSE that I found there were dead tuples not being removed. Had the unremoveable tuple count been included in the autovacuum log message life would have been a tiny bit easier. I've been meaning for a while to dabble in postgres, so I thought this might be a good trivial thing for me to improve. The attached patch adds extra detail the the existing autovacuum log message that is emitted when the log_autovacuum_min_duration threshold is met, exposing the unremovable dead tuple count similar to what you get from VACUUM VERBOSE. Sample log output (my addition in bold): LOG: automatic vacuum of table test.public.test: index scans: 0 pages: 0 removed, 5 remain tuples: 0 removed, 1000 remain, 999 dead but not removable system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec My patch adds another member to the LVRelStats struct named undeleteable_dead_tuples. lazy_scan_heap() sets undeleteable_dead_tuples to the value of lazy_scan_heap()'s local variable nkeep, which is the same variable that is used to emit the VACUUM VERBOSE unremoveable dead row count. As this is my first patch to postgresql, I'm expecting I've done something wrong. Please if you want me to fix something up, or just go away please say so ;) I appreciate that this is a trivial patch, and perhaps doesn't add value except for my very specific use case… feel free to ignore it =) --Royce diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index cf8337b..12f03d7 100644 --- a/src/backend/commands/vacuumlazy.c +++ b/src/backend/commands/vacuumlazy.c @@ -91,6 +91,7 @@ typedef struct LVRelStats double scanned_tuples; /* counts only tuples on scanned pages */ double old_rel_tuples; /* previous value of pg_class.reltuples */ double new_rel_tuples; /* new estimated total # of tuples */ + double undeleteable_dead_tuples; /* count of dead tuples not yet removeable */ BlockNumber pages_removed; double tuples_deleted; BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */ @@ -256,7 +257,7 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt, ereport(LOG, (errmsg(automatic vacuum of table \%s.%s.%s\: index scans: %d\n pages: %d removed, %d remain\n - tuples: %.0f removed, %.0f remain\n + tuples: %.0f removed, %.0f remain, %.0f dead but not removable\n system usage: %s, get_database_name(MyDatabaseId), get_namespace_name(RelationGetNamespace(onerel)), @@ -266,6 +267,7 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt, vacrelstats-rel_pages, vacrelstats-tuples_deleted, new_rel_tuples, + vacrelstats-undeleteable_dead_tuples, pg_rusage_show(ru0; } } @@ -829,6 +831,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, /* save stats for use later */ vacrelstats-scanned_tuples = num_tuples; vacrelstats-tuples_deleted = tups_vacuumed; + vacrelstats-undeleteable_dead_tuples = nkeep; /* now we can compute the new value for pg_class.reltuples */ vacrelstats-new_rel_tuples = vac_estimate_reltuples(onerel, false,
Re: [HACKERS] [PATCH] Addition of some trivial auto vacuum logging
Thanks for the tips guys. Just a question: is the utility great enough to warrant me working further on this? I have no real desire to implement this particular feature -- I simply saw an opportunity to cut my teeth on something easy. I'd be happy to find something on the TODO list instead if this feature isn't really worthwhile. Tom's suggestion looks like it's less trivial that I can do just yet, but I'll take a look and ask for help if I need it. Cheers! --Royce On 28/09/2011, at 4:42 AM, Kevin Grittner wrote: Royce Ausburn royce...@inomial.com wrote: As this is my first patch to postgresql, I'm expecting I've done something wrong. Please if you want me to fix something up, or just go away please say so ;) I appreciate that this is a trivial patch, and perhaps doesn't add value except for my very specific use case* feel free to ignore it =) Thanks for offering this to the community. I see you've already gotten feedback on the patch, with a suggestion for a different approach. Don't let that discourage you -- very few patches get in without needing to be modified based on review and feedback. If you haven't already done so, please review this page and its links: http://www.postgresql.org/developer/ Of particular interest is the Developer FAQ: http://wiki.postgresql.org/wiki/Developer_FAQ You should also be aware of the development cycle, which (when not in feature freeze for beta testing) involves alternating periods of focused development and code review (the latter called CommitFests): http://wiki.postgresql.org/wiki/CommitFest We are now in the midst of a CF, so it would be great if you could join in that as a reviewer. Newly submitted patches should be submitted to the open CF: http://commitfest.postgresql.org/action/commitfest_view/open You might want to consider what Tom said and submit a modified patch for the next review cycle. Welcome! -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers