Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-17 Thread Royce Ausburn

On 17/11/2011, at 1:47 AM, Robert Haas wrote:

 On Wed, Nov 16, 2011 at 9:34 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Not sure about the log line, but allowing pgstattuple to distinguish
 between recently-dead and quite-thoroughly-dead seems useful.
 
 The dividing line is enormously unstable though.  pgstattuple's idea of
 RecentGlobalXmin could even be significantly different from that of a
 concurrently running VACUUM.  I can see the point of having VACUUM log
 what it did, but opinions from the peanut gallery aren't worth much.
 
 Hmm, you have a point.
 
 But as Yeb points out, it seems like we should at least try to be more
 consistent about the terminology.


Thanks for the discussion so far all.  Would it be worthwhile to make another 
patch that addresses the points from Yeb's reviews?  It's not sounding like 
this unremovable tuple count is something that postgres wants, but I'm happy to 
keep the patch up to scratch if we're still not sure.

Cheers,

--Royce


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-17 Thread Royce Ausburn

On 18/11/2011, at 10:44 AM, Robert Haas wrote:

 On Thu, Nov 17, 2011 at 5:49 PM, Royce Ausburn royce...@inomial.com wrote:
 Thanks for the discussion so far all.  Would it be worthwhile to make 
 another patch that addresses the points from Yeb's reviews?  It's not 
 sounding like this unremovable tuple count is something that postgres wants, 
 but I'm happy to keep the patch up to scratch if we're still not sure.
 
 One question to ask yourself at this point in the process is whether
 *you* still think the feature is useful.  I'm fairly persuaded by
 Tom's point that the value monitored by this patch will change so
 quickly that it won't be useful to have VACUUM store it; it'll be
 obsolete by the time you look at the numbers.  If you are also
 persuaded by that argument, then clearly it's time to throw in the
 towel!
 
 But let's suppose you're NOT persuaded by that argument and you still
 want the feature.  In that case, don't just wait for someone else to
 stick up for the feature; tell us why you still think it's a good
 idea.  Make a case for what you want.  People here are usually
 receptive to good ideas, well-presented.

Okay - thanks Robert.  I think I'll drop it.  Now that I know what to look for 
in this situation, the changes this patch includes aren't of value to me.  
That's a pretty good indicator that it's probably not valuable to anyone else.

I've changed the status of the patch to rejected in the commit fest.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [Review] Include detailed information about a row failing a CHECK constraint into the error message

2011-11-16 Thread Royce Ausburn
The patch applies cleanly to the current git master and is in context diff format.The patch fails the regression tests because it is outputting new DETAIL line which four of tests aren't expecting. The tests will need to be updated.Functionality:The patch works as advertised. An insert or update that fails a check condition indeed logs the row that has failed:test=#  create table test (test(#   a integer,test(#   b integer CHECK (b  2),test(#   c text,test(#   d integertest(#test(#test(#  );CREATE TABLEtest=#test=#  insert into test select 1, 2, 'Test', 4;ERROR: new row for relation "test" violates check constraint "test_b_check"DETAIL: Failing row: (1, 2, Test, 4).One comment I have on the output is that strings are not in quotes. It's a little jarring, but might not be that big a deal. A contrived case that is pretty confusing:test=#  insert into test select 1, 2, '3, 4', 4;ERROR: new row for relation "test" violates check constraint "test_b_check"DETAIL: Failing row: (1, 2, 3, 4, 4).A select inserting 4 columns seemingly results in a 5 column row ;)Another super minor thing, postgres doesn't seem to put periods at the end of log messages, yet this new detail line does.CodeI'm no C or postgres expert, but the code looks okay to me.Attached is a small script I used to test/demo the functionality.--Royce

testCheckConstraints.sql
Description: Binary data
On 11/11/2011, at 2:56 AM, Jan Kundrát wrote:On 11/10/11 16:05, Tom Lane wrote:I agree with Jan that this is probably useful; I'm pretty sure therehave been requests for it before. We just have to make sure that thelength of the message stays in bounds.One tip for keeping the length down: there is no value in repeatinginformation from the primary error message, such as the name of theconstraint.Thanks to your comments and suggestions, I appreciate the time of thereviewers.Attached is a second version of this patch which keeps the size of theoutput at 64 characters per column (which is an arbitrary value definedas a const int, which I hope matches your style). Longer values havetheir last three characters replaced by "...", so there's no way todistinguish them from a legitimate string that ends with just that.There's also no escaping of special-string values, similar to how theBuildIndexValueDescription operates.Cheers,Jan-- Trojita, a fast e-mail client -- http://trojita.flaska.net/context_in_check_constraints-v2.patch

Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-15 Thread Royce Ausburn

On 16/11/2011, at 2:05 AM, Yeb Havinga wrote:

 On 2011-10-05 00:45, Royce Ausburn wrote:
 Attached is v2 of my patch that doesn't update CATALOG_VERSION_NO.  I've 
 also fixed the name of an argument to pgstat_report_vacuum which I don't 
 think was particularly good, and I've replace the word tuple with row in 
 some docs I added for consistency.
 
 
 I reviewed your patch. I think it is in good shape, my two main remarks (name 
 of n_unremovable_tup and a remark about documentation at the end of this 
 review) are highly subjective and I wouldn't spend time on it unless other 
 people have the same opinion.
 
 Remarks:
 
 * rules regression test fails because pg_stat_all_tables is changed, 
 pg_stat_sys_tables and pg_stat_user_tables as well, but the new 
 expected/rules.out is not part of the patch.

Doh!  Thank you for spotting this.  Should we decide to continue this patch 
I'll look in to fixing this.

 
 * I'm not sure about the name n_unremovable_tup, since it doesn't convey it 
 is about dead tuples and judging from only the name it might as well include 
 the live tuples. It also doesn't hint that it is a transient condition, which 
 vacuum verbose does with the word 'yet'.
 What about n_locked_dead_tup? - this contains both information that it is 
 about dead tuples, and the lock suggests that once the lock is removed, the 
 dead tuple can be removed.
 

Looks like we have some decent candidates later in the thread.  Should this 
patch survive I'll look at updating it.

 * The number shows correctly in the pg_stat_relation. This is a testcase that 
 gives unremovable dead rows:
 
 I was puzzled for a while that n_unremovable_tup = n_dead_tup doesn't hold, 
 after all, the unremovable tuples are dead as well. Neither the current 
 documentation nor the documentation added by the patch do help in explaining 
 the meaning of n_dead_tup and n_unremovable_tup, which may be clear to 
 seasoned vacuum hackers, but not to me. In both the case of n_dead_tup it 
 would have been nice if the docs mentioned that dead tuples are tuples that 
 are deleted or previous versions of updated tuples, and that only analyze 
 updates n_dead_tup (since vacuum cleans them), in contrast with 
 n_unremovable_tup that gets updated by vacuum. Giving an example of how 
 unremovable dead tuples can be caused would IMHO also help understanding.

Fair enough!  I'll review this as well.

Thanks again!

--Royce
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-15 Thread Royce Ausburn

On 16/11/2011, at 8:04 AM, Tom Lane wrote:

 Robert Haas robertmh...@gmail.com writes:
 On Tue, Nov 15, 2011 at 10:29 AM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of mar nov 15 12:16:54 -0300 2011:
 I guess this is a dumb question, but why don't we remove all the dead
 tuples?
 
 They were deleted but there are transactions with older snapshots.
 
 Oh.  I was thinking dead meant no longer visible to anyone.   But
 it sounds what we call unremovable here is what we elsewhere call
 recently dead.
 
 Would have to look at the code to be sure, but I think that
 nonremovable is meant to count both live tuples and
 dead-but-still-visible-to-somebody tuples.
 
 The question that I think needs to be asked is why it would be useful
 to track this using the pgstats mechanisms.  By definition, the
 difference between this and the live-tuple count is going to be
 extremely unstable --- I don't say small, necessarily, but short-lived.
 So it's debatable whether it's worth memorializing the count obtained
 by the last VACUUM at all.  And doing it through pgstats is an expensive
 thing.  We've already had push-back about the size of the stats table
 on large (lots-o-tables) databases.  Adding another counter will impose
 a performance overhead on everybody, whether they care about this number
 or not.
 
 What's more, to the extent that I can think of use-cases for knowing
 this number, I think I would want a historical trace of it --- that is,
 not only the last VACUUM's result but those of previous VACUUM cycles.
 So pgstats seems like it's both expensive and useless for the purpose.
 
 Right now the only good solution is trawling the postmaster log.
 Possibly something like pgfouine could track the numbers in a more
 useful fashion.


Thanks all for the input.

Tom: 

My first patch attempted to log the number of unremovable tuples in this log, 
but it was done inappropriately -- it was included as part of the 
log_autovacuum_min_duration's output.  You rightly objected to that patch :)

Personally I think some log output, done better, would have been more useful 
for me at the time.  At the time I was trying to diagnose an ineffective vacuum 
and postgres' logs weren't giving me any hints about what was wrong.  I turned 
to the mailing list and got immediate help, but I felt that ideally postgres 
would be logging something to tell me that some 1 day old transactions were 
preventing auto vacuum from doing its job.  Something, anything that I could 
google.  Other novices in my situation probably wouldn't know to look in the 
pg_stats* tables, so in retrospect my patch isn't really achieving my original 
goal.

Should we consider taking a logging approach instead?

--Royce



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-15 Thread Royce Ausburn
 
 
 Personally I think some log output, done better, would have been more useful 
 for me at the time.  At the time I was trying to diagnose an ineffective 
 vacuum and postgres' logs weren't giving me any hints about what was wrong.  
 I turned to the mailing list and got immediate help, but I felt that ideally 
 postgres would be logging something to tell me that some 1 day old 
 transactions were preventing auto vacuum from doing its job.  Something, 
 anything that I could google.  Other novices in my situation probably 
 wouldn't know to look in the pg_stats* tables, so in retrospect my patch 
 isn't really achieving my original goal.
 
 Should we consider taking a logging approach instead?

Dopey suggestion: 

Instead of logging around vacuum and auto_vacuum, perhaps log transactions that 
are open for longer than some (perhaps configurable) time?  The default might 
be pretty large, say 6 hours.  Are there common use cases for txs that run for 
longer than 6 hours?  Seeing a message such as:

WARNING: Transaction X has been open more than Y.  This tx may be holding 
locks preventing other txs from operating and may prevent vacuum from cleaning 
up deleted rows.

Would give a pretty clear indication of a problem :)



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Unremovable tuple monitoring

2011-11-15 Thread Royce Ausburn

On 16/11/2011, at 12:26 PM, Robert Haas wrote:

 On Tue, Nov 15, 2011 at 7:59 PM, Royce Ausburn royce...@inomial.com wrote:
 Personally I think some log output, done better, would have been more 
 useful for me at the time.  At the time I was trying to diagnose an 
 ineffective vacuum and postgres' logs weren't giving me any hints about 
 what was wrong.  I turned to the mailing list and got immediate help, but I 
 felt that ideally postgres would be logging something to tell me that some 
 1 day old transactions were preventing auto vacuum from doing its job.  
 Something, anything that I could google.  Other novices in my situation 
 probably wouldn't know to look in the pg_stats* tables, so in retrospect my 
 patch isn't really achieving my original goal.
 
 Should we consider taking a logging approach instead?
 
 Dopey suggestion:
 
 Instead of logging around vacuum and auto_vacuum, perhaps log transactions 
 that are open for longer than some (perhaps configurable) time?  The default 
 might be pretty large, say 6 hours.  Are there common use cases for txs that 
 run for longer than 6 hours?  Seeing a message such as:
 
 WARNING: Transaction X has been open more than Y.  This tx may be holding 
 locks preventing other txs from operating and may prevent vacuum from 
 cleaning up deleted rows.
 
 Would give a pretty clear indication of a problem :)
 
 Well, you could that much just by periodically querying pg_stat_activity.

Fair enough -- someone knowledgable could set that up if they wanted.  My goal 
was mostly to have something helpful in the logs.  If that's not something 
postgres wants/needs Ill drop it =)





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG or strange behaviour of update on primary key

2011-10-17 Thread Royce Ausburn

On 18/10/2011, at 1:00 PM, Robert Haas wrote:

 On Mon, Oct 17, 2011 at 7:30 PM, desmodemone desmodem...@gmail.com wrote:
 Seems an Oracle bug not Postgresql one!
 
 I don't think it's a bug for it to work.  It'd probably work in
 PostgreSQL too, if you inserted (2) first and then (1).  It's just
 that, as Tom says, if you want it to be certain to work (rather than
 depending on the order in which the rows are inserted), you need the
 checks to be deferred.

Do deferred checks such as this have a memory impact for bulk updates?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Index only scan paving the way for auto clustered tables?

2011-10-11 Thread Royce Ausburn
Hi all,

I wonder, could the recent work on index only scans pave the way for auto 
clustered tables?  Consider a wide, mostly insert table with some subset of 
columns that I'd like to cluster on.  I'm after locality of tuples that are 
very frequently fetched together, but not keen on the downtime for a cluster, 
nor the maintenance that it requires.  Would it be a stretch to have an index 
that branches on the subset of cluster columns, but still stores all the 
columns, making it a covering index?  Given that we can already index 
concurrently, such an index would not require downtime, and would be self 
maintaining.  From my understanding of the index-only scan implementation, I 
suspect that such an index would effectively give locality, with some caveats… 

I'd expect the overhead of inserting in to such a table would be high, perhaps 
prohibitive.  Perhaps better ways have been discussed.  Stupid idea?

--Royce


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Patch for cursor calling with named parameters

2011-10-11 Thread Royce Ausburn

On 08/10/2011, at 1:56 AM, Yeb Havinga wrote:

 Attach is v2 of the patch.
 
 Mixed notation now raises an error.
 
 In contrast with what I said above, named parameter related errors are thrown 
 before any syntax errors. I tested with raising syntax errors first but the 
 resulting code was a bit more ugly and the sql checking under a error 
 condition (i.e. double named parameter error means there is one parameter in 
 short)  was causing serious errors.
 
 Documentation was also added, regression tests updated.

I've tested this patch out and can confirm mixed notation produces an error:

psql:plsqltest.sql:27: ERROR:  mixing positional and named parameter assignment 
not allowed in cursor cur1
LINE 10:   open cur1( param2 := 4, 2, 5); 

Just one small thing: it'd be nice to have an example for cursor declaration 
with named parameters.  Your patch adds one for opening a cursor, but not for 
declaring one.

Other than that, I think the patch is good.  Everything works as advertised =)

--Royce


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Patch for cursor calling with named parameters

2011-10-11 Thread Royce Ausburn

On 11/10/2011, at 11:38 PM, Yeb Havinga wrote:

 Declaration of cursors with named parameters is already part of PostgreSQL 
 (so it is possible to use the parameter names in the cursor query instead of 
 $1, $2, etc.) and it also already documented with an example, just a few 
 lines above the open examples. See curs3 on 
 http://developer.postgresql.org/pgdocs/postgres/plpgsql-cursors.html

Doh - my apologies!

[HACKERS] [REVIEW] Patch for cursor calling with named parameters

2011-10-06 Thread Royce Ausburn
Initial Review for patch:

http://archives.postgresql.org/pgsql-hackers/2011-09/msg00744.php


Submission review

The patch is in context diff format and applies cleanly to the git master.  The 
patch includes an update to regression tests.  The regression tests pass.  The 
patch does not include updates to the documentation reflecting the new 
functionality.

Usability review

The patch adds a means of specifying named  cursor parameter arguments in 
pg/plsql.  

The syntax is straight forward:

cur1 CURSOR (param1 int, param2 int, param3 int) for select …;
…
open cur1(param3 := 4, param2 := 1, param1 := 5);


The old syntax continues to work:

cur1 CURSOR (param1 int, param2 int, param3 int) for select …;
…
open cur1(5, 1, 3);


• Does the patch actually implement that?

Yes, the feature works as advertised.

• Do we want that?

I very rarely use pg/plsql, so I won't speak to its utility.  However there has 
been some discussion about the idea:
http://archives.postgresql.org/pgsql-hackers/2010-09/msg01440.php

• Do we already have it?

Not AFAIK

• Does it follow SQL spec, or the community-agreed behavior?

There's some discussion about the syntax ( := or = ), I'm not sure there's a 
consensus yet.


• Does it include pg_dump support (if applicable)?

Yes -- pgplsql functions using named parameters are correctly dumped.

• Are there dangers?

Potentially.  See below.

• Have all the bases been covered?

I don't think so.  The new feature accepts opening a cursor with some parameter 
names not specified:

  open cur1(param3 := 4, 1, param1 := 5);

It seems that if a parameter is not named, its position is used to bind to a 
variable.  For example, the following fail:

psql:plsqltest.sql:26: ERROR:  cursor cur1 argument 2 param2 provided 
multiple times
LINE 10:   open cur1(param3 := 4, 1, param2 := 5);

and

psql:plsqltest.sql:26: ERROR:  cursor cur1 argument 2 param2 provided 
multiple times
LINE 10:   open cur1(param2 := 4, 2, param1 := 5);


I think that postgres ought to enforce some consistency here.  Use one way or 
the other, never both.





I can also produce some unhelpful errors when I give bad syntax.  For example:

psql:plsqltest.sql:28: ERROR:  cursor cur1 argument 1 param1 provided 
multiple times
LINE 11:   open cur1( param3 : = 4, 2, param1 := 5); 
(notice the space between the : and =)

--

psql:plsqltest.sql:28: ERROR:  cursor cur1 argument 1 param1 provided 
multiple times
LINE 11:   open cur1( param3 = 4, 2, param1 := 5); 
(Wrong assignment operator)

--

psql:plsqltest.sql:27: ERROR:  cursor cur1 argument 3 param3 provided 
multiple times
LINE 10:   open cur1( 1 , param3 := 2, param2 = 3 );
(Wrong assignment operator)

--

psql:plsqltest.sql:27: ERROR:  cursor cur1 argument 3 param3 provided 
multiple times
LINE 10: ... open cur1( param3 = param3 , param3 := 2, param2 = 3 );


--

  open cur1( param3 := param3 , param2 = 3, param1 := 1 );

psql:plsqltest.sql:29: ERROR:  column param2 does not exist
LINE 2: ,param2 = 3
 ^
QUERY:  SELECT 1
,param2 = 3
,param3;
CONTEXT:  PL/pgSQL function named_para_test line 7 at OPEN


I haven't been able to make something syntactically incorrect that doesn't 
produce an error, even if the error isn't spot on.  I haven't been able to make 
it crash, and when the syntax is correct, it has always produced correct 
results.

Performance review

I've done some limited performance testing and I can't really see a difference 
between the unpatched and patched master.

• Does it follow the project coding guidelines?

I believe so, but someone more familiar with them will probably spot violations 
better than me =)

• Are there portability issues?

I wouldn't know -- I don't have any experience with C portability.

• Will it work on Windows/BSD etc?

Tested under OS X, so BSD is presumably okay.  No idea about other unixes nor 
windows.

• Are the comments sufficient and accurate?

I'm happy enough with them.

• Does it do what it says, correctly?

Yes, excepting my comments above.

• Does it produce compiler warnings?

Yes:

In file included from gram.y:12962:
scan.c: In function ‘yy_try_NUL_trans’:
scan.c:16243: warning: unused variable ‘yyg’


But this was not added by this patch -- it's also in the unpatched master.

• Can you make it crash?

No.

--Royce







Re: [HACKERS] [REVIEW] Patch for cursor calling with named parameters

2011-10-06 Thread Royce Ausburn
Forgive my ignorance -- do I need to be doing anything else now seeing as I 
started the review?  


On 07/10/2011, at 7:15 AM, Pavel Stehule wrote:

 2011/10/6 Robert Haas robertmh...@gmail.com:
 On Thu, Oct 6, 2011 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 David E. Wheeler da...@kineticode.com writes:
 Would it then be added as an alias for := for named function parameters? 
 Or would that come still later?
 
 Once we do that, it will be impossible not merely deprecated to use =
 as an operator name.  I think that has to wait at least another release
 cycle or two past where we're using it ourselves.
 
 Okay. I kind of like := so there's no rush AFAIC. :-)
 
 Hmm ... actually, that raises another issue that I'm not sure whether
 there's consensus for or not.  Are we intending to keep name := value
 syntax forever, as an alternative to the standard name = value syntax?
 I can't immediately see a reason not to, other than the it's not
 standard argument.
 
 Because if we *are* going to keep it forever, there's no very good
 reason why we shouldn't accept this plpgsql cursor patch now.  We'd
 just have to remember to extend plpgsql to take = at the same time
 we do that for core function calls.
 
 It's hard to see adding support for = and dropping support for := in
 the same release.  That would be a compatibility nightmare.
 
 If := is used by the standard for some other, incompatible purpose,
 then I suppose we would want to add support for =, wait a few
 releases, deprecate :=, wait a couple of releases, remove :=
 altogether.  But IIRC we picked := precisely because the standard
 didn't use it at all, or at least not for anything related... in which
 case we may as well keep it around more or less indefinitely.
 
 +1
 
 Pavel
 
 
 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company
 
 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Unremovable tuple monitoring (was: Addition of some trivial auto vacuum logging)

2011-10-04 Thread Royce Ausburn

On 04/10/2011, at 11:26 PM, Robert Haas wrote:

 On Mon, Oct 3, 2011 at 9:17 AM, Royce Ausburn royce...@inomial.com wrote:
 - I'm not sure if I'm supposed to update CATALOG_VERSION_NO in catalog.h.  
 In this patch I have.
 
 Generally that is left to the committer, as the correct value depends
 on the value at the time of commit, not the time you submit the patch;
 and including it in the patch tends to result in failing hunks, since
 the value changes fairly frequently.

 - The VACUUM FULL implementation in cluster.c doesn't do any stats updating 
 similar to vacuumlazy.c, so I haven't don't anything there… (is this right?  
 A vacuum full may also encounter unremovable tuples, right?)
 
 We've occasionally heard grumblings about making cluster do more stats
 updating, but your patch should just go along with whatever's being
 done now in similar cases.

I think I get this stats stuff now.  Unless someone here thinks it's too hard 
for a new postgres dev's 2nd patch, I could take a stab.  I might take a look 
at it tonight to get a feel for how hard, and what stats we could collect.  
I'll start a new thread for discussion.

Attached is v2 of my patch that doesn't update CATALOG_VERSION_NO.  I've also 
fixed the name of an argument to pgstat_report_vacuum which I don't think was 
particularly good, and I've replace the word tuple with row in some docs I 
added for consistency.

I'm not sure what my next step should be.  I've added this patch to the open 
commit fest -- is that all for now until the commit fest begins review?

--Royce



diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index a19e3f0..8692580 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -328,7 +328,8 @@ postgres: replaceableuser/ replaceabledatabase/ 
replaceablehost/ re
   belonging to the table), number of live rows fetched by index
   scans, numbers of row insertions, updates, and deletions,
   number of row updates that were HOT (i.e., no separate index update),
-  numbers of live and dead rows,
+  numbers of live and dead rows, 
+  the number of dead rows not removed in the last vacuum,
   the last time the table was non-optionFULL/ vacuumed manually,
   the last time it was vacuumed by the autovacuum daemon,
   the last time it was analyzed manually,
@@ -764,6 +765,14 @@ postgres: replaceableuser/ replaceabledatabase/ 
replaceablehost/ re
Number of dead rows in table
   /entry
  /row
+ 
+ row
+  
entryliteralfunctionpg_stat_get_unremovable_tuples/function(typeoid/type)/literal/entry
+  entrytypebigint/type/entry
+  entry
+   Number of dead rows not removed in the table's last vacuum
+  /entry
+ /row
 
  row
   
entryliteralfunctionpg_stat_get_blocks_fetched/function(typeoid/type)/literal/entry
diff --git a/src/backend/catalog/system_views.sql 
b/src/backend/catalog/system_views.sql
index 2253ca8..9c18dc7 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -353,6 +353,7 @@ CREATE VIEW pg_stat_all_tables AS
 pg_stat_get_tuples_hot_updated(C.oid) AS n_tup_hot_upd,
 pg_stat_get_live_tuples(C.oid) AS n_live_tup,
 pg_stat_get_dead_tuples(C.oid) AS n_dead_tup,
+pg_stat_get_unremovable_tuples(C.oid) AS n_unremovable_tup,
 pg_stat_get_last_vacuum_time(C.oid) as last_vacuum,
 pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
 pg_stat_get_last_analyze_time(C.oid) as last_analyze,
diff --git a/src/backend/commands/vacuumlazy.c 
b/src/backend/commands/vacuumlazy.c
index cf8337b..140fe92 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -91,6 +91,7 @@ typedef struct LVRelStats
double  scanned_tuples; /* counts only tuples on scanned pages 
*/
double  old_rel_tuples; /* previous value of pg_class.reltuples 
*/
double  new_rel_tuples; /* new estimated total # of tuples */
+   double  unremovable_tuples; /* count of dead tuples not yet 
removable */
BlockNumber pages_removed;
double  tuples_deleted;
BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
@@ -245,7 +246,8 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
/* report results to the stats collector, too */
pgstat_report_vacuum(RelationGetRelid(onerel),
 onerel-rd_rel-relisshared,
-new_rel_tuples);
+new_rel_tuples,
+
vacrelstats-unremovable_tuples);
 
/* and log the action if appropriate */
if (IsAutoVacuumWorkerProcess()  Log_autovacuum_min_duration = 0)
@@ -829,6 +831,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats

Re: [HACKERS] [PATCH] Unremovable tuple monitoring (was: Addition of some trivial auto vacuum logging)

2011-10-03 Thread Royce Ausburn
On 28/09/2011, at 11:17 AM, Tom Lane wrote:

 Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Royce Ausburn's message of mar sep 27 21:28:26 -0300 2011:
 Tom's suggestion looks like it's less trivial that I can do just yet, but 
 I'll take a look and ask for help if I need it.
 
 It's not that difficult.  Just do a search on git log
 src/backend/postmaster/pgstat.c for patches that add a new column
 somewhere.
 
 Yeah, I was just about to say the same thing.  Find a previous patch
 that adds a feature similar to what you have in mind, and crib like mad.
 We've added enough stats counters over time that you should have several
 models to work from.


This patch does as Tom suggested, adding a column to the pg_stat_all_tables 
view which exposes the number of unremovable tuples in the last vacuum.  This 
patch does not include my previous work to log this information as part of 
auto_vacuum's logging.

User visible additions:
New column pg_stat_all_tables.n_unremovable_tup
New function bigint pg_stat_get_unremovable_tuples(oid)

A few notes / questions:

- I'm not sure if I'm supposed to update CATALOG_VERSION_NO in catalog.h.  In 
this patch I have.

- I'm not sure about how I should be selecting an OID for my new stats 
function.  I used the unused_oids script and picked one that seemed reasonable.

- The VACUUM FULL implementation in cluster.c doesn't do any stats updating 
similar to vacuumlazy.c, so I haven't don't anything there… (is this right?  A 
vacuum full may also encounter unremovable tuples, right?)

- I'm not usually a C developer, so peeps reviewing please watch for noob 
mistakes.

Cheers,

--Royce



diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index a19e3f0..af7b235 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -328,7 +328,8 @@ postgres: replaceableuser/ replaceabledatabase/ 
replaceablehost/ re
   belonging to the table), number of live rows fetched by index
   scans, numbers of row insertions, updates, and deletions,
   number of row updates that were HOT (i.e., no separate index update),
-  numbers of live and dead rows,
+  numbers of live and dead rows, 
+  the number of dead tuples not removed in the last vacuum,
   the last time the table was non-optionFULL/ vacuumed manually,
   the last time it was vacuumed by the autovacuum daemon,
   the last time it was analyzed manually,
@@ -764,6 +765,14 @@ postgres: replaceableuser/ replaceabledatabase/ 
replaceablehost/ re
Number of dead rows in table
   /entry
  /row
+ 
+ row
+  
entryliteralfunctionpg_stat_get_unremovable_tuples/function(typeoid/type)/literal/entry
+  entrytypebigint/type/entry
+  entry
+   Number of dead rows not removed in the table's last vacuum
+  /entry
+ /row
 
  row
   
entryliteralfunctionpg_stat_get_blocks_fetched/function(typeoid/type)/literal/entry
diff --git a/src/backend/catalog/system_views.sql 
b/src/backend/catalog/system_views.sql
index 2253ca8..9c18dc7 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -353,6 +353,7 @@ CREATE VIEW pg_stat_all_tables AS
 pg_stat_get_tuples_hot_updated(C.oid) AS n_tup_hot_upd,
 pg_stat_get_live_tuples(C.oid) AS n_live_tup,
 pg_stat_get_dead_tuples(C.oid) AS n_dead_tup,
+pg_stat_get_unremovable_tuples(C.oid) AS n_unremovable_tup,
 pg_stat_get_last_vacuum_time(C.oid) as last_vacuum,
 pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
 pg_stat_get_last_analyze_time(C.oid) as last_analyze,
diff --git a/src/backend/commands/vacuumlazy.c 
b/src/backend/commands/vacuumlazy.c
index cf8337b..140fe92 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -91,6 +91,7 @@ typedef struct LVRelStats
double  scanned_tuples; /* counts only tuples on scanned pages 
*/
double  old_rel_tuples; /* previous value of pg_class.reltuples 
*/
double  new_rel_tuples; /* new estimated total # of tuples */
+   double  unremovable_tuples; /* count of dead tuples not yet 
removable */
BlockNumber pages_removed;
double  tuples_deleted;
BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
@@ -245,7 +246,8 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
/* report results to the stats collector, too */
pgstat_report_vacuum(RelationGetRelid(onerel),
 onerel-rd_rel-relisshared,
-new_rel_tuples);
+new_rel_tuples,
+
vacrelstats-unremovable_tuples);
 
/* and log the action if appropriate */
if (IsAutoVacuumWorkerProcess()  

[HACKERS] [PATCH] Addition of some trivial auto vacuum logging

2011-09-27 Thread Royce Ausburn
Hi all,

I spent a bit of today diagnosing a problem where long held transactions were 
preventing auto vacuum from doing its job.  Eventually I had set 
log_autovacuum_min_duration to 0 to see what was going on.  Unfortunately it 
wasn't until I tried a VACUUM VERBOSE that I found there were dead tuples not 
being removed.  Had the unremoveable tuple count been included in the 
autovacuum log message life would have been a tiny bit easier.

I've been meaning for a while to dabble in postgres, so I thought this might be 
a good trivial thing for me to improve.  The attached patch adds extra detail 
the the existing autovacuum log message that is emitted when the 
log_autovacuum_min_duration threshold is met, exposing the unremovable dead 
tuple count similar to what you get from VACUUM VERBOSE.

Sample log output (my addition in bold):

LOG:  automatic vacuum of table test.public.test: index scans: 0
pages: 0 removed, 5 remain
tuples: 0 removed, 1000 remain, 999 dead but not removable
system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec


My patch adds another member to the LVRelStats struct named 
undeleteable_dead_tuples.  lazy_scan_heap() sets undeleteable_dead_tuples to 
the value of lazy_scan_heap()'s local variable nkeep, which is the same 
variable that is used to emit the VACUUM VERBOSE unremoveable dead row count.

As this is my first patch to postgresql, I'm expecting I've done something 
wrong.  Please if you want me to fix something up, or just go away please say 
so ;)  I appreciate that this is a trivial patch, and perhaps doesn't add value 
except for my very specific use case… feel free to ignore it =)

--Royce





diff --git a/src/backend/commands/vacuumlazy.c 
b/src/backend/commands/vacuumlazy.c
index cf8337b..12f03d7 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -91,6 +91,7 @@ typedef struct LVRelStats
double  scanned_tuples; /* counts only tuples on scanned pages 
*/
double  old_rel_tuples; /* previous value of pg_class.reltuples 
*/
double  new_rel_tuples; /* new estimated total # of tuples */
+   double  undeleteable_dead_tuples; /* count of dead tuples not 
yet removeable */
BlockNumber pages_removed;
double  tuples_deleted;
BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
@@ -256,7 +257,7 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
ereport(LOG,
(errmsg(automatic vacuum of table 
\%s.%s.%s\: index scans: %d\n
pages: %d removed, %d 
remain\n
-   tuples: %.0f removed, 
%.0f remain\n
+   tuples: %.0f removed, 
%.0f remain, %.0f dead but not removable\n
system usage: %s,

get_database_name(MyDatabaseId),

get_namespace_name(RelationGetNamespace(onerel)),
@@ -266,6 +267,7 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
vacrelstats-rel_pages,

vacrelstats-tuples_deleted,
new_rel_tuples,
+   
vacrelstats-undeleteable_dead_tuples,
pg_rusage_show(ru0;
}
 }
@@ -829,6 +831,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
/* save stats for use later */
vacrelstats-scanned_tuples = num_tuples;
vacrelstats-tuples_deleted = tups_vacuumed;
+   vacrelstats-undeleteable_dead_tuples = nkeep;
 
/* now we can compute the new value for pg_class.reltuples */
vacrelstats-new_rel_tuples = vac_estimate_reltuples(onerel, false,



Re: [HACKERS] [PATCH] Addition of some trivial auto vacuum logging

2011-09-27 Thread Royce Ausburn
Thanks for the tips guys. 

Just a question:  is the utility great enough to warrant me working further on 
this?  I have no real desire to implement this particular feature -- I simply 
saw an opportunity to cut my teeth on something easy.  I'd be happy to find 
something on the TODO list instead if this feature isn't really worthwhile.

Tom's suggestion looks like it's less trivial that I can do just yet, but I'll 
take a look and ask for help if I need it.

Cheers!

--Royce


On 28/09/2011, at 4:42 AM, Kevin Grittner wrote:

 Royce Ausburn royce...@inomial.com wrote:
 
 As this is my first patch to postgresql, I'm expecting I've done
  something wrong.  Please if you want me to fix something up, or
 just go away please say so ;)  I appreciate that this is a trivial
 patch, and perhaps doesn't add value except for my very specific
 use case* feel free to ignore it =)
 
 Thanks for offering this to the community.  I see you've already
 gotten feedback on the patch, with a suggestion for a different
 approach.  Don't let that discourage you -- very few patches get in
 without needing to be modified based on review and feedback.
 
 If you haven't already done so, please review this page and its
 links:
 
 http://www.postgresql.org/developer/
 
 Of particular interest is the Developer FAQ:
 
 http://wiki.postgresql.org/wiki/Developer_FAQ
 
 You should also be aware of the development cycle, which (when not
 in feature freeze for beta testing) involves alternating periods of
 focused development and code review (the latter called CommitFests):
 
 http://wiki.postgresql.org/wiki/CommitFest
 
 We are now in the midst of a CF, so it would be great if you could
 join in that as a reviewer.  Newly submitted patches should be
 submitted to the open CF:
 
 http://commitfest.postgresql.org/action/commitfest_view/open
 
 You might want to consider what Tom said and submit a modified patch
 for the next review cycle.
 
 Welcome!
 
 -Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers