date:20110716

Re: [HACKERS] proposal: a validator for configuration files

2011-07-16 Thread Tom Lane

Robert Haas  writes:
> On Sat, Jul 16, 2011 at 10:04 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> Is there any way that we could get *rid* of custom_variable_classes?

>> Well, we could just drop it and say you can set any dotted-name GUC
>> you feel like.

> ...and the fact that we've made them set an extra GUC to shoot
> themselves in the foot hardly seems like an improvement.  I was more
> thinking along the lines of having loadable modules register custom
> variable classes at load time, through some sort of C API provided for
> that purpose, rather than having the user declare a list that may or
> may not match what the .so files really care about.

Well, we *do* have a C API for that, of a sort.  The problem is, what do
you do in processes that have not loaded the relevant extension?  (And
no, I don't like the answer of "let's force the postmaster to load every
extension we want to set any parameters for".)

I agree custom_variable_classes is conceptually messy, but it's a
reasonably lightweight compromise that gives some error checking without
requiring a lot of possibly-irrelevant extensions to be loaded into
every postgres process.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: a validator for configuration files

2011-07-16 Thread Robert Haas

On Sat, Jul 16, 2011 at 10:04 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Jul 15, 2011 at 8:02 PM, Tom Lane  wrote:
>>> 2. Tentatively apply the new custom_variable_classes setting if any.
>
>> Is there any way that we could get *rid* of custom_variable_classes?
>> The idea of using a GUC to define the set of valid GUCs seems
>> intrinsically problematic.
>
> Well, we could just drop it and say you can set any dotted-name GUC
> you feel like.  The only reason to have it is to have some modicum of
> error checking ... but I'm not sure why we should bother if there's no
> checking on the second half of the name.  Not sure if that's going too
> far in the laissez-faire direction, though.  I can definitely imagine
> people complaining "I set plpqsql.variable_conflict in postgresql.conf
> and it didn't do anything, how come?"

I'm not sure that's really making anything any better.  Maybe I'm
misunderstanding, but if that's going to be a problem, then presumably
this will create the same problem:

custom_variable_classes='plpgsql'
plpgsql.variable_conflict='on'

...and the fact that we've made them set an extra GUC to shoot
themselves in the foot hardly seems like an improvement.  I was more
thinking along the lines of having loadable modules register custom
variable classes at load time, through some sort of C API provided for
that purpose, rather than having the user declare a list that may or
may not match what the .so files really care about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Is there a committer in the house?

2011-07-16 Thread Robert Haas

On Sat, Jul 16, 2011 at 10:04 PM, Bruce Momjian  wrote:
> This mostly revoles around the problem of trying to finalize 9.1 while
> applying 9.2 patches --- no surprise we don't have enough cycles to do
> that.

Well, sorta.  The fact that Josh got his head bitten off for
suggesting that we weren't going to get this CommitFest finished
without some more committer involvement could have happened at any
time in the release cycle (and has, in the past).

I think the question of whether we can overlap release cycles is a bit
of a red herring.  We had our first CommitFest for 9.1 in July 2010,
and that ran quite smoothly, though it was also a release-cycle
overlap.  One big difference this time is that Tom and Peter haven't
participated in this CommitFest very much at all (and I've done less
as well, due to other commitments), whereas they did last time around.
 So I think the real question is not "how much bandwidth do we have as
a community?" but rather "what works for the key people who make the
process function?" and maybe "how can we induce other people to make
the kind of time commitment that Tom and Peter have in the past?".

I think the fact that we've managed to get 18 patches committed - and
will probably squeeze in a few more - despite Tom and Peter being busy
is pretty good.  But it certainly emphasizes the extent to which we
depend on a relatively small number of contributors to do an awful lot
of the work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: patch review : Add ability to constrain backend temporary file space

2011-07-16 Thread Tom Lane

Tatsuo Ishii  writes:
>> Remember that what will happens is probably:
>> 
>> ERROR:  aborting due to exceeding temp file limit. Current usage 8000kB,
>> requested size 8008kB, thus it will exceed temp file limit 8kB.

> Could you please elaborate why "Current usage 8000kB" can bigger than
> "temp file limit 8kB"? I undertstand the point that temp files are
> allocated by 8kB at once, but I don't understand why those numbers you
> suggested could happen. Actually I tried with the modified patches and
> got:

> test=# CREATE TEMPORARY TABLE resourcetemp1 AS SELECT 
> generate_series(1,10);
> SELECT 10
> test=# SET temp_file_limit = 578;
> SET
> test=# SELECT count(*) FROM (select * FROM resourcetemp1  ORDER BY 1) AS a;
> ERROR:  aborting due to exceeding temp file limit, current usage 576kB, 
> requested size 8192kB, thus it will exceed temp file limit 578kB

You didn't show us how you computed those numbers, but I'm really
dubious that FileWrite() has got any ability to produce numbers that
are helpful.  Like Cedric, I think the write amount in any one call
is usually going to be one block.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: patch review : Add ability to constrain backend temporary file space

2011-07-16 Thread Tom Lane

Mark Kirkwood  writes:
> This  version moves the check *before* we write the new buffer, so 
> should take care of issues about really large write buffers, plugins 
> etc.

This logic seems pretty obviously wrong:

+ if (temp_file_limit >= 0 && VfdCache[file].fdstate & FD_TEMPORARY)
+ {
+ if (VfdCache[file].seekPos + amount >= VfdCache[file].fileSize)
+ {
+ increaseSize = true;
+ if ((temporary_files_size + (double)amount) / 1024.0 > 
(double)temp_file_limit)
+ ereport(ERROR,
+ (errcode(ERRCODE_QUERY_CANCELED),
+ errmsg("aborting due to exceeding temp file limit")));
+ }
+ }

It's supposing that the write length ("amount") is exactly the amount by
which the file length will grow, which would only be true if seekPos is
exactly equal to fileSize before the write.  I think that would often be
the case in our usage patterns, but surely code at this low level should
not be assuming that.

BTW, the definition of the GUC seems to constrain the total temp file
size to be no more than 2GB on 32-bit machines ... do we really want
it to act like that?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_class.relistemp

2011-07-16 Thread David E. Wheeler

On Jul 16, 2011, at 7:16 PM, Robert Haas wrote:

> But what happens when and if we add global temporary tables?  Now we
> might very well decide to set the faux-relistemp to true for temporary
> and global temporary tables (they do have "temporary" in the name,
> after all!) and false for unlogged and permanent tables.  Or we might
> decide that the faux-relistemp should only be true for the kind of
> temporary tables that we've always had, and false for these new global
> temporary tables, perhaps on the theory that a global temporary table
> is not really temporary at all, though its contents are.  One of these
> decisions would probably be right for David (and pgTap) and the other
> would be wrong; and the decision that was right for pgTap might be
> wrong for some other client.  So instead of breaking pgTap we might
> just quietly make it stop working correctly.

Well I think it would continue to work exactly as it has in the past. And if 
one needed to know other information about the *type* of temp table, well then 
one would have to use relpersistence.

The idea is not to try to make it adapt to future changes. The idea is to try 
to preserve the previous behavior for some period of time.

Best,

David



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_class.relistemp

2011-07-16 Thread Robert Haas

On Fri, Jul 15, 2011 at 8:17 PM, Jim Nasby  wrote:
> On Jul 13, 2011, at 2:23 PM, David E. Wheeler wrote:
>> Wasn't newsysviews supposed to deal with these sorts of issues? Why were 
>> they rejected?
>
> Unless they recently came up again and got rejected again; the original 
> complaint was that some of their conventions didn't follow information_schema 
> conventions. The community wanted that changed and that never happened.

I think, also, that the idea that newsysviews is going to fix all of
our problems is mostly wishful thinking.  Let's suppose that we had a
system view over pg_class that kept around some variant of relistemp
even though it's gone from pg_class per se.  Well, such a column would
probably be false for both unlogged and permanent tables and true for
temporary tables and David would be happy.

But what happens when and if we add global temporary tables?  Now we
might very well decide to set the faux-relistemp to true for temporary
and global temporary tables (they do have "temporary" in the name,
after all!) and false for unlogged and permanent tables.  Or we might
decide that the faux-relistemp should only be true for the kind of
temporary tables that we've always had, and false for these new global
temporary tables, perhaps on the theory that a global temporary table
is not really temporary at all, though its contents are.  One of these
decisions would probably be right for David (and pgTap) and the other
would be wrong; and the decision that was right for pgTap might be
wrong for some other client.  So instead of breaking pgTap we might
just quietly make it stop working correctly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: a validator for configuration files

2011-07-16 Thread Tom Lane

Robert Haas  writes:
> On Fri, Jul 15, 2011 at 8:02 PM, Tom Lane  wrote:
>> 2. Tentatively apply the new custom_variable_classes setting if any.

> Is there any way that we could get *rid* of custom_variable_classes?
> The idea of using a GUC to define the set of valid GUCs seems
> intrinsically problematic.

Well, we could just drop it and say you can set any dotted-name GUC
you feel like.  The only reason to have it is to have some modicum of
error checking ... but I'm not sure why we should bother if there's no
checking on the second half of the name.  Not sure if that's going too
far in the laissez-faire direction, though.  I can definitely imagine
people complaining "I set plpqsql.variable_conflict in postgresql.conf
and it didn't do anything, how come?"

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Is there a committer in the house?

2011-07-16 Thread Bruce Momjian

Robert Haas wrote:
> On Fri, Jul 15, 2011 at 5:37 PM, Bruce Momjian  wrote:
> > Josh Berkus wrote:
> >> Alvaro,
> >>
> >> > It seems that by mentioning some people but not all, you offended both
> >> > the people you mentioned (at least some of them, because they are
> >> > already actively helping) and those that you didn't (at least some of
> >> > them, because they are already actively helping; those that are not
> >> > completely inactive in the project, that is).
> >>
> >> Yeah, everybody's super-touchy this week. ? Must be the weather.
> >
> > Somehow blaming everyone else doesn't seem like the proper reaction. ?:-(
> 
> I don't think Josh's tone is really the problem we should be worrying
> about here.  He's pointing out a legitimate problem.  If you go back
> and look at the CF app for 9.1, you'll see that Tom, Peter, and I
> committed the overwhelming majority of the patches which were
> submitted to CFs and went on to get committed.  So if we have a
> CommitFest where Tom is on vacation and Peter is devoting his time to
> polishing release N-1 rather than new development on release N, then
> we're either going to need a much larger investment of time by one or
> more other committers, or we're not really going to get through
> everything.  When you lose the efforts of somebody who might commit 10
> or 20 patches in a CF and comment usefully on another 10 or 20, it
> leaves a big hole.

This mostly revoles around the problem of trying to finalize 9.1 while
applying 9.2 patches --- no surprise we don't have enough cycles to do
that.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Mysterious server crashes

2011-07-16 Thread Robert Haas

On Fri, Jul 15, 2011 at 5:37 PM, Žiga Kranjec  wrote:
> Recently we have upgraded our debian system (sid),
> which has since started crashing mysteriously.
> We are still looking into that. It runs on 3ware RAID.
> Postgres package is 8.4.8-2.
>
> The database came back up apparently ok, except
> for indexes. Running reindex produces this error on
> one of the tables:
>
> ERROR:  unexpected chunk number 1 (expected 0) for toast value 17539760 in
> pg_toast_16992
>
> Same with select.
>
> I tried running reindex on toast table didn't help. Running:
>
> select * from pg_toast.pg_toast_16992 where chunk_id = 17539760;
>
> crashed postgres backend (and apparently the whole server).
>
> Is there anything we can/should do to fix the problem, besides
> restoring the whole database from backup?

Well, in theory, an operating system crash shouldn't corrupt your
database.  Maybe you've configured fsync=off, or have some other
problem that is making it not work reliably.  There are some useful
resources here:

http://wiki.postgresql.org/wiki/Reliable_Writes

At this point, it sounds like things are pretty badly messed up.  A
restore from backup seems like a good idea, but first you might want
to try to track down what else is wrong with this machine (bad memory?
corrupted OS?), else you might find yourself back in the same
situation all over again pretty quickly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql: bogus descriptions displayed by \d+

2011-07-16 Thread Robert Haas

On Sat, Jul 16, 2011 at 12:49 PM, Tom Lane  wrote:
> I wrote:
>> Josh Kupershmidt  writes:
>>> What's the Description displayed in that table?
>
>> What it ought to be is the comment (if any) attached to the index's
>> column.  Up through 8.4 this worked as expected, but in 9.0 and up
>> somebody seems to have disallowed comments on index columns.  Not
>> sure how carefully that was thought through.
>
> After a bit of review of the archives, the somebody was me:
> http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=b7d67954456f15762c04e5269b64adc88dcd0860
>
> and this thread was the discussion about it:
> http://archives.postgresql.org/pgsql-hackers/2009-12/msg01982.php
>
> It looks like we thought about pg_dump, but did not think about psql.
>
> I think it might be reasonable to remove the Description column from
> \d+ output for indexes and sequences, on the grounds that (1) it's
> useless against 9.x servers, and (2) for those relkinds we add other
> columns and so the horizontal space is precious.

Yeah, I think that's very reasonable.  We're talking about changing
this in 9.2, which will be the third release since that functionality
was deprecated.  Considering that the functionality is so minor that
there may be no one using it anyway, that seems more than generous.

> We could also consider showing Description only when talking to a
> pre-9.0 server; but that's going to render the code even more
> spaghetti-ish, and the value seems pretty limited.

I don't think we need to do that.  Backward compatibility is good, but
insisting that a 9.2 psql has to produce exactly the same output on an
8.3 server than an 8.3 psql would have done seems like it would be
taking things too far.  We should try to make it work and be useful,
but we shouldn't slavishly replicate obsolete functionality of
doubtful utility.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: a validator for configuration files

2011-07-16 Thread Robert Haas

On Fri, Jul 15, 2011 at 8:02 PM, Tom Lane  wrote:
> 2. Tentatively apply the new custom_variable_classes setting if any.

Is there any way that we could get *rid* of custom_variable_classes?
The idea of using a GUC to define the set of valid GUCs seems
intrinsically problematic.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Is there a committer in the house?

2011-07-16 Thread Robert Haas

On Fri, Jul 15, 2011 at 5:37 PM, Bruce Momjian  wrote:
> Josh Berkus wrote:
>> Alvaro,
>>
>> > It seems that by mentioning some people but not all, you offended both
>> > the people you mentioned (at least some of them, because they are
>> > already actively helping) and those that you didn't (at least some of
>> > them, because they are already actively helping; those that are not
>> > completely inactive in the project, that is).
>>
>> Yeah, everybody's super-touchy this week.   Must be the weather.
>
> Somehow blaming everyone else doesn't seem like the proper reaction.  :-(

I don't think Josh's tone is really the problem we should be worrying
about here.  He's pointing out a legitimate problem.  If you go back
and look at the CF app for 9.1, you'll see that Tom, Peter, and I
committed the overwhelming majority of the patches which were
submitted to CFs and went on to get committed.  So if we have a
CommitFest where Tom is on vacation and Peter is devoting his time to
polishing release N-1 rather than new development on release N, then
we're either going to need a much larger investment of time by one or
more other committers, or we're not really going to get through
everything.  When you lose the efforts of somebody who might commit 10
or 20 patches in a CF and comment usefully on another 10 or 20, it
leaves a big hole.

I don't believe Josh's intent was to disparage your contributions, or
Simon's, or Alvaro's, and it certainly isn't mine.  I appreciate all
the work that has been done on this CommitFest by everyone who has
participated, reviewers as well as committers.  At the same time, part
of the thankless task of being CF manager is asking people to step up
to the plate and do more.  It takes a heck of a lot of work to get 70
patches reviewed and committed, and it is unlikely that we will ever
have enough people spontaneously step up to the plate to make that
happen.  Since we can't call up people's bosses and complain that they
aren't doing enough work on the CommitFest, the CF manager is left
with the options of (1) trying to review (and commit?) all 30 or 40
remaining patches single-handedly or (2) begging.  If we're then going
to complain when they do one or both of those things, well, I think
that's a bit unfair.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: patch review : Add ability to constrain backend temporary file space

2011-07-16 Thread Tatsuo Ishii

>> I modeled the original message on what happens when statement timeout is
>> exceeded, which doesn't state its limit in the error message at all -
>> actually I did wonder if there is was informal standard for *not* stating
>> the value of the limit that is being exceeded! However, I agree with you and
>> think it makes sense to include it here. I wonder if the additional detail
>> you are suggesting above might be better added to a HINT - what do you
>> think? If it is a better idea to just add it in the message as above I can
>> certainly do that.
> 
> Remember that what will happens is probably:
> 
> ERROR:  aborting due to exceeding temp file limit. Current usage 8000kB,
> requested size 8008kB, thus it will exceed temp file limit 8kB.
> 
> because temp file are increased by 8kb at once, rarely more (and by
> rare I mean that it can happens via an extension or in the future, not
> with current core postgresql).

Could you please elaborate why "Current usage 8000kB" can bigger than
"temp file limit 8kB"? I undertstand the point that temp files are
allocated by 8kB at once, but I don't understand why those numbers you
suggested could happen. Actually I tried with the modified patches and
got:

test=# CREATE TEMPORARY TABLE resourcetemp1 AS SELECT generate_series(1,10);
SELECT 10
test=# SET temp_file_limit = 578;
SET
test=# SELECT count(*) FROM (select * FROM resourcetemp1  ORDER BY 1) AS a;
ERROR:  aborting due to exceeding temp file limit, current usage 576kB, 
requested size 8192kB, thus it will exceed temp file limit 578kB

Here temp_file_limit is not specified by 8kB unit, so "current usage"
becomes 576kB, which is 8kB unit. I don't think those numbers
will terribly confuse DBAs..
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: a validator for configuration files

2011-07-16 Thread Florian Pflug

On Jul16, 2011, at 22:55 , Tom Lane wrote:
> Florian Pflug  writes:
>> Btw, if we touch that, I think we should think about providing some way
>> to detect when a backend fails to apply a value.
> 
> Hm, maybe, but keep in mind that there are valid reasons for a backend
> to ignore a postgresql.conf setting --- in particular, it might have a
> local override from some flavor of SET command.  So I don't think we'd
> want the flag to have the semantics of "this backend is actually *using*
> the value";

Yeah, the flag would simply indicate whether a particular backend
encountered an error during config file reload or not.

Actually being able to inspect other backend's GUCs would be nice, but
is way beyond the scope of this of course.

> and yet, if that's not what it means, people could still be
> confused.

Hm, if it's called "cfgfile_valid" or a prettier version thereof I
think the risk is small.

> There might be some implementation gotchas as well.  I'm not
> sure offhand how thoroughly the GUC code checks a value that is being
> overridden.

If it doesn't, then what happens when the overriding scope ends, and
the value reverts (or attempts to revert) to its default?

best regards,
Florian Pflug

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: a validator for configuration files

2011-07-16 Thread Tom Lane

Florian Pflug  writes:
> Btw, if we touch that, I think we should think about providing some way
> to detect when a backend fails to apply a value.

Hm, maybe, but keep in mind that there are valid reasons for a backend
to ignore a postgresql.conf setting --- in particular, it might have a
local override from some flavor of SET command.  So I don't think we'd
want the flag to have the semantics of "this backend is actually *using*
the value"; and yet, if that's not what it means, people could still be
confused.  There might be some implementation gotchas as well.  I'm not
sure offhand how thoroughly the GUC code checks a value that is being
overridden.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: a validator for configuration files

2011-07-16 Thread Florian Pflug

On Jul16, 2011, at 21:23 , Tom Lane wrote:

> Florian Pflug  writes:
>> On the downside, the current behaviour prevents problems if someone changes
>> two interrelated GUCs, but makes a mistake at one of them. For example,
>> someone might drastically lower bgwriter_delay but might botch the matching
>> adjustment of bgwriter_lru_maxpages.
> 
> That's a fair point, but the current behavior only saves you if the
> botch is such that the new value is detectably invalid, as opposed to
> say just a factor of 100 off from what you meant.  Not sure that that's
> all that helpful.

True. And a forgotten zero or wrong unit probably is even more likely than
a totally botched value. So +1 from me.

Btw, if we touch that, I think we should think about providing some way
to detect when a backend fails to apply a value. Showing the precise
option that caused the trouble is probably hard, but could we add a flag to
PGPROC that gets set once a backend fails to apply some setting on SIGUP?
If we showed the state of such a flag in pg_stat_activity, it'd give an
admin a quick way of verifying that all is well after a SIGUP. We might also
want to record the timestamp of the last processed file so that backends
which haven't yet processed the SIHUP can also be detected.

best regards,
Florian Pflug

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: a validator for configuration files

2011-07-16 Thread Tom Lane

Florian Pflug  writes:
> On the downside, the current behaviour prevents problems if someone changes
> two interrelated GUCs, but makes a mistake at one of them. For example,
> someone might drastically lower bgwriter_delay but might botch the matching
> adjustment of bgwriter_lru_maxpages.

That's a fair point, but the current behavior only saves you if the
botch is such that the new value is detectably invalid, as opposed to
say just a factor of 100 off from what you meant.  Not sure that that's
all that helpful.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: a validator for configuration files

2011-07-16 Thread Florian Pflug

On Jul16, 2011, at 20:55 , Tom Lane wrote:
>> The original argument for the current behavior was to avoid applying
>> settings from a thoroughly munged config file, but I think that the
>> checks involved in steps 1-3 would be sufficient to reject files that
>> had major problems.  It's possible that step 1 is really sufficient to
>> cover the issue, in which case we could drop the separate step-3 pass
>> and just treat invalid GUC names as a reason to ignore the particular
>> line rather than the whole file.  That would make things simpler and
>> faster, and maybe less surprising too.
> 
> IOW, I'm now pretty well convinced that so long as the configuration
> file is syntactically valid, we should go ahead and attempt to apply
> each name = value setting individually, without allowing the invalidity
> of any one name or value to prevent others from being applied.

One benefit of this would be that it'd make the logic of ProcessConfigFile
and its interaction with set_config_option() much simpler, and the behaviour
more predictable. Given that it took me a while to work through the
interactions of these two functions, I all for that.

On the downside, the current behaviour prevents problems if someone changes
two interrelated GUCs, but makes a mistake at one of them. For example,
someone might drastically lower bgwriter_delay but might botch the matching
adjustment of bgwriter_lru_maxpages.

Not sure if that out-weights the benefits, but I thought I'd bring it up.

best regards,
Florian Pflug

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI error messages

2011-07-16 Thread Tom Lane

Heikki Linnakangas  writes:
> I think I would prefer something like this:

> ERROR:  could not serialize access due to read/write dependencies among 
> transactions
> DETAIL: Reason code: %s
> HINT:  The transaction might succeed if retried.

> That's my 2c, anyway. I see you committed this already, I don't 
> violently object to that either...

Well, as I mentioned in the commit message, I've thought for some time
that there were use cases for errdetail_internal.  Whether these
particular places in predicate.c use it or not doesn't affect that.

I don't have a strong opinion about whether to do it like you suggest,
other than that the proposed wording doesn't meet the message style
guideline about detail messages being complete sentences.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: a validator for configuration files

2011-07-16 Thread Tom Lane

I wrote:
> I think that it might be sensible to have the following behavior:

> 1. Parse the file, where "parse" means collect all the name = value
> pairs.  Bail out if we find any syntax errors at that level of detail.
> (With this patch, we could report some or all of the syntax errors
> first.)

> 2. Tentatively apply the new custom_variable_classes setting if any.

> 3. Check to see whether all the "name"s are valid.  If not, report
> the ones that aren't and bail out.

> 4. Apply each "value".  If some of them aren't valid, report that,
> but continue, and apply all the ones that are valid.

> We can expect that the postmaster and all backends will agree on the
> results of steps 1 through 3.  They might differ as to the validity
> of individual values in step 4 (as per my example of a setting that
> depends on database_encoding), but we should never end up with a
> situation where a globally correct value is not globally applied.

I thought some more about this, and it occurred to me that it's not that
hard to foresee a situation where different backends might have
different opinions about the results of step 3, ie, different ideas
about the set of valid GUC names.  This could arise as a result of some
of them having a particular extension module loaded and others not.

Right now, whether or not you have say plpgsql loaded will not affect
your ability to do "SET plpgsql.junk = foobar" --- as long as "plpgsql"
is listed in custom_variable_classes, we'll accept the command and
create a placeholder variable for plpgsql.junk.  But it seems perfectly
plausible that we might someday try to tighten that up so that once a
module has done EmitWarningsOnPlaceholders("plpgsql"), we'll no longer
allow creation of new placeholders named plpgsql.something.  If we did
that, we could no longer assume that all backends agree on the set of
legal GUC variable names.

So that seems like an argument --- not terribly strong, but still an
argument --- for doing what I suggested next:

> The original argument for the current behavior was to avoid applying
> settings from a thoroughly munged config file, but I think that the
> checks involved in steps 1-3 would be sufficient to reject files that
> had major problems.  It's possible that step 1 is really sufficient to
> cover the issue, in which case we could drop the separate step-3 pass
> and just treat invalid GUC names as a reason to ignore the particular
> line rather than the whole file.  That would make things simpler and
> faster, and maybe less surprising too.

IOW, I'm now pretty well convinced that so long as the configuration
file is syntactically valid, we should go ahead and attempt to apply
each name = value setting individually, without allowing the invalidity
of any one name or value to prevent others from being applied.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI error messages

2011-07-16 Thread Heikki Linnakangas


On 16.07.2011 03:14, Tom Lane wrote:

"Kevin Grittner"  writes:

OK, after getting distracted by test failures caused by an unrelated
commit, I've confirmed that this passes my usual tests.  I don't
know anything about the tools used for extracting the text for the
translators, so if that needs any corresponding adjustment, someone
will need to point me in the right direction or cover that part
separately.


Well, the point is that this function *isn't* going to be known to the
NLS code, so AFAICS no adjustments should be needed there.  You did miss
some places that ought to be updated (mumble sources.sgml mumble) but
unless I hear objections to the whole idea, I'll fix and apply this
tomorrow.


I find it strange to simply leave those strings untranslated. It's going 
to look wrong, like someone just forgot to translate them. However, I 
agree it's perhaps a bit too much detail to translate all of those 
messages, and the translations would probably sound weird because there 
isn't established terms for these things yet.


I think I would prefer something like this:

ERROR:  could not serialize access due to read/write dependencies among 
transactions

DETAIL: Reason code: %s
HINT:  The transaction might succeed if retried.

Where %s gets the current detail field, untranslated, like:

Canceled on commit attempt with conflict in from prepared pivot.

Or perhaps shorten that to just "conflict in from prepared pivot", as 
the fact that it happened on commit attempt should be clear from the 
context - the error happened at a COMMIT statement.


That would be similar to what we do with OS error messages, with %m. It 
would be more obvious that the untranslated message is some internal 
information that the user is not expect to understand, and that it is 
untranslated on purpose.


That's my 2c, anyway. I see you committed this already, I don't 
violently object to that either...


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FOR KEY LOCK foreign keys

2011-07-16 Thread Noah Misch

On Sat, Jul 16, 2011 at 01:03:31PM -0500, Kevin Grittner wrote:
> Noah Misch  wrote:
>   
> > With this patch in its final form, I have completed 180+ suite runs
> > without a failure.
>   
> The attached patch allows the tests to pass when
> default_transaction_isolation is stricter than 'read committed'. 
> This is a slight change from the previously posted version of the
> files (because of a change in the order of statements, based on the
> timeouts), and in patch form this time.
>   
> Since `make installcheck-world` works at all isolation level
> defaults, as do all previously included isolation tests, it seems
> like a good idea to keep this up.  It will simplify my testing of SSI
> changes, anyway.

This does seem sensible.  Thanks.

-- 
Noah Mischhttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI error messages

2011-07-16 Thread Tom Lane

"Kevin Grittner"  writes:
> Tom Lane  wrote:
>> You did miss some places that ought to be updated (mumble
>> sources.sgml mumble)
 
> Sorry I missed that; sources.sgml covered with the attached.

Oh, I'd already fixed that locally, but thanks.  Patch is committed now.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI error messages

2011-07-16 Thread Kevin Grittner

Tom Lane  wrote:
> "Kevin Grittner"  writes:
>> OK, after getting distracted by test failures caused by an
>> unrelated commit, I've confirmed that this passes my usual tests.
>> I don't know anything about the tools used for extracting the text
>> for the translators, so if that needs any corresponding
>> adjustment, someone will need to point me in the right direction
>> or cover that part separately.
> 
> Well, the point is that this function *isn't* going to be known to
> the NLS code, so AFAICS no adjustments should be needed there.
 
That seemed likely, but not knowing the tools, I wasn't sure.
 
>  You did miss some places that ought to be updated (mumble
> sources.sgml mumble)
 
Sorry I missed that; sources.sgml covered with the attached.
 
-Kevin




errdetail-internal-2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FOR KEY LOCK foreign keys

2011-07-16 Thread Kevin Grittner

Noah Misch  wrote:

> With this patch in its final form, I have completed 180+ suite runs
> without a failure.

The attached patch allows the tests to pass when
default_transaction_isolation is stricter than 'read committed'. 
This is a slight change from the previously posted version of the
files (because of a change in the order of statements, based on the
timeouts), and in patch form this time.

Since `make installcheck-world` works at all isolation level
defaults, as do all previously included isolation tests, it seems
like a good idea to keep this up.  It will simplify my testing of SSI
changes, anyway.

-Kevin

fklocks-tests-strict-isolation.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] isolation test deadlocking on buildfarm member coypu

2011-07-16 Thread Noah Misch

On Sat, Jul 16, 2011 at 05:50:45PM +0200, Rémi Zara wrote:
> Isolation tests seem to deadlock on buildfarm member coypu (NetBSD/powerpc 
> 5.1).

Thanks for the report and detailed analysis.  I believe the patch here will fix
the problem:
http://archives.postgresql.org/message-id/20110716171121.gb2...@tornado.leadboat.com

If you're up for testing it locally, I'd be interested to hear how it works out.

Thanks,
nm

-- 
Noah Mischhttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FOR KEY LOCK foreign keys

2011-07-16 Thread Noah Misch

On Fri, Jul 15, 2011 at 07:01:26PM -0400, Alvaro Herrera wrote:
> Excerpts from Noah Misch's message of mié jul 13 01:34:10 -0400 2011:
> 
> > coypu failed during the run of the test due to a different session being 
> > chosen
> > as the deadlock victim.  We can now vary deadlock_timeout to prevent this; 
> > see
> > attached fklocks-tests-deadlock_timeout.patch.  This also makes the tests 
> > much
> > faster on a default postgresql.conf.
> 
> I applied your patch, thanks.  I couldn't reproduce the failures without
> it, even running only the three new tests in a loop a few dozen times.

It's probably more likely to crop up on a loaded system.  I did not actually
reproduce it myself.  However, if you swap the timeouts, the opposite session
finds the deadlock.  From there, I'm convinced that the right timing
perturbations could yield the symptom coypu exhibited.

> > crake failed when it reported waiting on the first step of an existing 
> > isolation
> > test ("two-ids.spec").  I will need to look into that further.
> 
> Actually, there are four failures in tests other than the two fixed by
> your patch.  These are:
> 
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2011-07-12%2022:32:02
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=nightjar&dt=2011-07-14%2016:27:00
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=pitta&dt=2011-07-15%2015:00:08
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2011-07-15%2018:32:02

Thanks for summarizing.  These all boil down to lock waits not anticipated by
the test specs.  Having pondered this, I've been able to come up with just one
explanation.  If autovacuum runs VACUUM during the test and finds that it can
truncate dead space from the end of a relation, it will acquire an
AccessExclusiveLock.  When I decrease autovacuum_naptime to 1s, I do see
plenty of pg_type and pg_attribute truncations during a test run.

When I sought to reproduce this, what I first saw instead was an indefinite
test suite hang.  That turned out to arise from an unrelated thinko -- I
assumed that backend IDs were stable for the life of the backend, but they're
only stable for the life of a pgstat snapshot.  This fell down when a backend
older than one of the test backends exited during the test:

4199 2011-07-16 03:33:28.733 EDT DEBUG:  forked new backend, pid=23984 socket=8
23984 2011-07-16 03:33:28.737 EDT LOG:  statement: SET client_min_messages = 
warning;
23984 2011-07-16 03:33:28.739 EDT LOG:  statement: SELECT i FROM 
pg_stat_get_backend_idset() t(i) WHERE pg_stat_get_backend_pid(i) = 
pg_backend_pid()
23985 2011-07-16 03:33:28.740 EDT DEBUG:  autovacuum: processing database 
"postgres"
4199 2011-07-16 03:33:28.754 EDT DEBUG:  forked new backend, pid=23986 socket=8
23986 2011-07-16 03:33:28.754 EDT LOG:  statement: SET client_min_messages = 
warning;
4199 2011-07-16 03:33:28.755 EDT DEBUG:  server process (PID 23985) exited with 
exit code 0
23986 2011-07-16 03:33:28.755 EDT LOG:  statement: SELECT i FROM 
pg_stat_get_backend_idset() t(i) WHERE pg_stat_get_backend_pid(i) = 
pg_backend_pid()
4199 2011-07-16 03:33:28.766 EDT DEBUG:  forked new backend, pid=23987 socket=8
23987 2011-07-16 03:33:28.766 EDT LOG:  statement: SET client_min_messages = 
warning;
23987 2011-07-16 03:33:28.767 EDT LOG:  statement: SELECT i FROM 
pg_stat_get_backend_idset() t(i) WHERE pg_stat_get_backend_pid(i) = 
pg_backend_pid()

This led isolationtester to initialize backend_ids = {1,2,2}, making us unable
to detect lock waits correctly.  That's also consistent with the symptoms Rémi
Zara just reported.  With that fixed, I was able to reproduce the failure due
to autovacuum-truncate-induced transient waiting using this recipe:
- autovacuum_naptime = 1s
- src/test/isolation/Makefile changed to pass --use-existing during installcheck
- Run 'make installcheck' in a loop
- A concurrent session running this in a loop:
CREATE TABLE churn (a int, b int, c int, d int, e int, f int, g int, h int);
DROP TABLE churn;

That yields a steady stream of vacuum truncations, and an associated lock wait
generally capsized the suite within 5-10 runs.  Frankly, I have some
difficulty believing that this mechanic alone produced all four failures you
cite above; I suspect I'm still missing some more-frequent cause.  Any other
theories on which system background activities can cause a transient lock
wait?  It would have to produce a "pgstat_report_waiting(true)" call, so I
believe that excludes all LWLock and lighter contention.

In any event, I have attached a patch that fixes the problems I have described
here.  To ignore autovacuum, it only recognizes a wait when one of the
backends under test holds a conflicting lock.  (It occurs to me that perhaps
we should expose a pg_lock_conflicts(lockmode_held text, lockmode_req text)
function to simplify this query -- this is a fairly common monitoring need.)

With that change in place, my setup survived through about fifty suite runs at

Re: [HACKERS] isolation test deadlocking on buildfarm member coypu

2011-07-16 Thread Kevin Grittner

Rémi Zara wrote:

> Isolation tests seem to deadlock on buildfarm member coypu
> (NetBSD/powerpc 5.1).

It looks to me like both the extreme logging of "waiting" messages
and the days-long unrecognized deadlocks are coming from the new "fk"
tests added to the isolation testing schedule.

-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql: bogus descriptions displayed by \d+

2011-07-16 Thread Tom Lane

I wrote:
> Josh Kupershmidt  writes:
>> What's the Description displayed in that table?

> What it ought to be is the comment (if any) attached to the index's
> column.  Up through 8.4 this worked as expected, but in 9.0 and up
> somebody seems to have disallowed comments on index columns.  Not
> sure how carefully that was thought through.

After a bit of review of the archives, the somebody was me:
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=b7d67954456f15762c04e5269b64adc88dcd0860

and this thread was the discussion about it:
http://archives.postgresql.org/pgsql-hackers/2009-12/msg01982.php

It looks like we thought about pg_dump, but did not think about psql.

I think it might be reasonable to remove the Description column from
\d+ output for indexes and sequences, on the grounds that (1) it's
useless against 9.x servers, and (2) for those relkinds we add other
columns and so the horizontal space is precious.

We could also consider showing Description only when talking to a
pre-9.0 server; but that's going to render the code even more
spaghetti-ish, and the value seems pretty limited.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql: bogus descriptions displayed by \d+

2011-07-16 Thread Tom Lane

Josh Kupershmidt  writes:
> So, if you look at \d+ newtbl, the right-most column named Description
> should display any comments attached to newtbl's columns. You should
> see "bcol column comment" as the Description for column bcol. That
> works OK.

Right.

> Now, try this:

> test=# \d+ newtbl_idx_bcol
> Index "public.newtbl_idx_bcol"
>  Column |  Type   | Definition | Storage | Description
> +-++-+-
>  bcol   | integer | bcol   | plain   |

> What's the Description displayed in that table?

What it ought to be is the comment (if any) attached to the index's
column.  Up through 8.4 this worked as expected, but in 9.0 and up
somebody seems to have disallowed comments on index columns.  Not
sure how carefully that was thought through.

> A similar situation exists for sequences and views displayed by \d+.

Again, the ability to stick comments on columns of those relkinds
seems to have been shut off as of 9.0.  In 8.4 all of these description
columns are functional.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] psql: bogus descriptions displayed by \d+

2011-07-16 Thread Josh Kupershmidt

Hi all,

The psql output for \d+ on indexes, sequences, and views is rather
bogus. Examples below from the SQL at bottom.

So, if you look at \d+ newtbl, the right-most column named Description
should display any comments attached to newtbl's columns. You should
see "bcol column comment" as the Description for column bcol. That
works OK.

Now, try this:

test=# \d+ newtbl_idx_bcol
Index "public.newtbl_idx_bcol"
 Column |  Type   | Definition | Storage | Description
+-++-+-
 bcol   | integer | bcol   | plain   |

What's the Description displayed in that table? Well, right now it's
totally broken (not displaying anything). I'm not sure if we should
try to display the comment attached to column bcol in this case: if
so, what would we do for e.g. functional indexes?

A similar situation exists for sequences and views displayed by \d+.
I'd be OK with just ripping out "Description" entirely in these cases;
if you want to see the comment attached to the index or sequence
itself, you can use \di+ or \ds+. Although now might also be a good
time to think about properly displaying sequence or index comments via
\d+, and how they should be displayed.

Thoughts?

Josh

-- Example SQL creating a few objects with comments. --
CREATE TABLE newtbl (acol  serial PRIMARY KEY,
 bcol int NOT NULL,
 ccol text DEFAULT NULL,
 dcol text NOT NULL);

COMMENT ON TABLE newtbl IS 'newtbl table comment';
COMMENT ON COLUMN newtbl.bcol IS 'bcol column comment';
COMMENT ON SEQUENCE newtbl_acol_seq IS 'sequence comment';

CREATE INDEX newtbl_idx_bcol ON newtbl (bcol);
COMMENT ON INDEX newtbl_idx_bcol IS 'single-column index on newtbl';

CREATE VIEW myview AS SELECT * FROM newtbl;
COMMENT ON VIEW myview IS 'view comment';

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Commitfest Status: Sudden Death Overtime

2011-07-16 Thread Florian Pflug

On Jul15, 2011, at 23:05 , Josh Berkus wrote:
> * Bugfix for XPATH() if expression returns a scalar value

Well, Peter Eisentraut seemed to disagree with my approach initially,
and seemed to prefer a separate function for XPATH expressions which
return a scalar value.

  http://archives.postgresql.org/pgsql-hackers/2011-06/msg00401.php

I considered that, but came to the conclusion that it has problems
of it's own, described here:

  http://archives.postgresql.org/pgsql-hackers/2011-06/msg00608.php

Peter stopped responding at that point, so I assumed that my argument
convinced him.

Radoslaw complained about the fact the results of scalar values
come back escaped from XPATH() with the patch applied (without it,
an empty array is returned) and wanted that changed - Basically the
same objection he had to my other patch which made sure text nodes
are properly escaped (The fine print here is that text nodes *aren't*
scalar values, they're nodes. What fun.). He did upgrade that other
patch to "Ready for Committer" despite his objections, though. I
don't know whether he wanted to do the same with this one or not,
and my inquiry was left unanswered

  http://archives.postgresql.org/pgsql-hackers/2011-07/msg00783.php

I also don't know much code review the patch has received. I didn't
receive any complaints, but whether that reflects the quality of
the patch or the quantity of review I leave for someone else to judge.

I dunno what the best way forward is, but I'd hate to see this being
bumped to the next commit-fest.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] isolation test deadlocking on buildfarm member coypu

2011-07-16 Thread Rémi Zara

Hi,

Isolation tests seem to deadlock on buildfarm member coypu (NetBSD/powerpc 5.1).

Here are the process for one deadlock (several days) :

pgbuildfarm  2405  0.0  1.2 26948  7916 ? Is   Wed02AM0:00.25 postgres: 
pgbuildfarm isolationtest [local] INSERT waiting 
pgbuildfarm  6559  0.0  0.4 10288  2684 ? IWed02AM0:00.05 
./pg_isolation_regress --psqldir=/home/pgbuildfarm/workdir/HEAD/inst/bin 
--inputdir=. --schedule=./isolation_schedule (pg_isolation_reg)
pgbuildfarm  8296  0.0  1.8 25924 11880 ? Ss   Wed02AM0:59.79 postgres: 
writer process
pgbuildfarm 10630  0.0  0.4 16212  2916 ? Ss   Wed02AM1:43.16 postgres: 
stats collector process
pgbuildfarm 11814  0.0  0.7  9084  4408 ? IWed01AM0:00.39 
/usr/pkg/bin/perl ./run_branches.pl --run-one 
pgbuildfarm 14785  0.0  0.2  3344  1276 ? IWed02AM0:00.03 gmake 
NO_LOCALE=1 installcheck 
pgbuildfarm 15353  0.0  0.8 26948  4960 ? Ss   Wed02AM1:10.55 postgres: 
autovacuum launcher process
pgbuildfarm 16619  0.0  1.8 15992 11512 ? IWed01AM0:04.90 
/usr/pkg/bin/perl ./run_build.pl --config build-farm.conf HEAD 
pgbuildfarm 17532  0.0  0.5 25924  3052 ? Ss   Wed02AM0:34.49 postgres: 
wal writer process
pgbuildfarm 18073  0.0  1.0 25924  6844 ? IWed02AM0:19.83 
/home/pgbuildfarm/workdir/HEAD/inst/bin/postgres -D data-C 
pgbuildfarm 21310  0.0  1.2 26948  7948 ? Is   Wed02AM0:00.06 postgres: 
pgbuildfarm isolationtest [local] idle in transaction 
pgbuildfarm 24137  0.0  0.2  3456  1064 ? Is   Wed01AM0:00.01 /bin/sh 
-c cd /home/pgbuildfarm/client-code && ./run_branches.pl --run-one 
pgbuildfarm 26932  0.0  0.2  3456  1072 ? IWed02AM0:00.01 sh -c cd 
pgsql.16619/src/test/isolation && gmake NO_LOCALE=1 installcheck 2>&1 
pgbuildfarm 28430  0.0  1.3 26948  8816 ? Ss   Wed02AM  408:43.91 postgres: 
pgbuildfarm isolationtest [local] idle 
pgbuildfarm 28809  0.0  0.4 10476  2844 ? RWed02AM   27:48.24 
./isolationtester dbname=isolationtest


I killed one of the postgres process, and it lead to a failure of the buildfarm 
perl script (probably due to the 2GB+ log file), and persistance of the 
postgres cluster (which I had to clean manually)
I've kept the source and install directories, if useful for investigation.


Here are the processes for a second lock (several hours):

pgbuildfarm  1423  0.0  0.4 16212  2888 ? Ss2:11PM0:04.07 postgres: 
stats collector process
pgbuildfarm  1604  0.0  1.2 26948  7724 ? Is2:13PM0:00.05 postgres: 
pgbuildfarm isolationtest [local] idle in transaction 
pgbuildfarm  2380  0.0  0.4 10288  2624 ? I 2:12PM0:00.05 
./pg_isolation_regress --psqldir=/home/pgbuildfarm/workdir/HEAD/inst/bin 
--inputdir=. --schedule=./isolation_schedule (pg_isolation_reg)
pgbuildfarm  3156  0.0  0.8 26948  4920 ? Ss2:11PM0:02.71 postgres: 
autovacuum launcher process
pgbuildfarm  3253  0.0  0.5 25924  3044 ? Ss2:11PM0:01.65 postgres: 
wal writer process
pgbuildfarm  9889  0.0  1.8 25924 11856 ? Ss2:11PM0:04.87 postgres: 
writer process
pgbuildfarm 11359  0.0  1.4 26948  9032 ? Ss2:13PM   15:16.63 postgres: 
pgbuildfarm isolationtest [local] idle 
pgbuildfarm 13738  0.0  1.2 26948  7860 ? Ss2:13PM0:00.06 postgres: 
pgbuildfarm isolationtest [local] UPDATE waiting 
pgbuildfarm 14031  0.0  1.0 25924  6800 ? S 2:11PM0:01.82 
/home/pgbuildfarm/workdir/HEAD/inst/bin/postgres -D data-C 
pgbuildfarm 16555  0.0  0.4 10476  2764 ? S 2:13PM1:01.77 
./isolationtester dbname=isolationtest 
pgbuildfarm 18038  0.0  0.2  3456  1008 ? I 2:11PM0:00.01 sh -c cd 
pgsql.20707/src/test/isolation && gmake NO_LOCALE=1 installcheck 2>&1 
pgbuildfarm 19057  0.0  0.2  3456  1000 ? Is1:30PM0:00.01 /bin/sh 
-c cd /home/pgbuildfarm/client-code && ./run_branches.pl --run-one 
pgbuildfarm 20707  0.0  1.8 15992 11460 ? I 1:30PM0:04.92 
/usr/pkg/bin/perl ./run_build.pl --config build-farm.conf HEAD 
pgbuildfarm 21749  0.0  0.2  3344  1248 ? I 2:11PM0:00.04 gmake 
NO_LOCALE=1 installcheck 
pgbuildfarm 23054  0.0  0.7  9084  4352 ? I 1:30PM0:00.38 
/usr/pkg/bin/perl ./run_branches.pl --run-one 
pgbuildfarm  2125  0.0  0.0  2652 0 ttyp0 R+5:09PM0:00.00 grep 
pgbuildf (bash)
pgbuildfarm  9930  0.0  0.3  2652  2072 ttyp0 S 5:03PM0:00.09 bash 
pgbuildfarm 29194  0.0  0.1  3552   872 ttyp0 O+5:09PM0:00.01 ps -aux 
pgbuildfarm@cube:workdir$date
Sat Jul 16 17:10:13 CEST 2011

The log fills with:

[4e21807d.2c5f:1060074] LOG:  execute isolationtester_waiting: SELECT 1 WHERE 
pg_stat_get_backend_waiting($1)
[4e21807d.2c5f:1060075] DETAIL:  parameters: $1 = '5'
[4e21807d.2c5f:1060076] LOG:  execute isolationtester_waiting: SELECT 1 WHERE 
pg_stat_get_backend_waiting($1)
[4e21807d.2c5f:1060077] DETAIL:  parameters: $1 = '5'

I killed process 1

Re: [HACKERS] Mysterious server crashes

2011-07-16 Thread k...@rice.edu

On Fri, Jul 15, 2011 at 11:37:54PM +0200, Žiga Kranjec wrote:
> Hello!
> 
> Recently we have upgraded our debian system (sid),
> which has since started crashing mysteriously.
> We are still looking into that. It runs on 3ware RAID.
> Postgres package is 8.4.8-2.
> 
> The database came back up apparently ok, except
> for indexes. Running reindex produces this error on
> one of the tables:
> 
> ERROR:  unexpected chunk number 1 (expected 0) for toast value
> 17539760 in pg_toast_16992
> 
> Same with select.
> 
> I tried running reindex on toast table didn't help. Running:
> 
> select * from pg_toast.pg_toast_16992 where chunk_id = 17539760;
> 
> crashed postgres backend (and apparently the whole server).
> 
> Is there anything we can/should do to fix the problem, besides
> restoring the whole database from backup?
> 
> Thanks!
> 
> Ziga
> 

Hi Ziga,

I do not want to be negative, but it sounds like your server is
having serious problems completely outside of PostgreSQL. Reading a
file should not cause your system to crash. That sounds like a
driver or hardware problem and you need to fix that. I would make
sure you have a good backup for your DB before you do anything
else.

Good luck,
Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache

2011-07-16 Thread Yeb Havinga


On 2011-07-14 21:46, Kohei KaiGai wrote:

Sorry, the syscache part was mixed to contrib/sepgsql part
in my previous post.
Please see the attached revision.

Although its functionality is enough simple (it just reduces
number of system-call invocation), its performance
improvement is obvious.
So, I hope someone to volunteer to review these patches.


I will be able to look at this patch next week on monday and tuesday, 
without wanting to raise any expectations about that time being enough 
for me to say anything useful. On a longer timescale, I believe that 
sepgsql is one of the most important new features of PostgreSQL and that 
I want to commit myself to spend more community work on this subject.


regards,
Yeb

--

Yeb Havinga
http://www.mgrid.net/
Mastering Medical Data


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] How can I change patch status in CommitFest application?

2011-07-16 Thread MauMau


From: "Tom Lane" 

"MauMau"  writes:

I re-submitted a patch and added a comment on the page below. I chose
"patch" from the comment type drop-down box, but the patch status does 
not

change from "waiting on author". I expected the patch status would become
"needs review".


No, adding a comment doesn't change anything else.


What should I do? Where should I refer to handle patch status correctly?


Click on the "Edit Patch" link, then change the status.


Oh, why didn't I notice the "Edit Patch" link!? I successfully changed the 
patch status. Thank you very much.


Regards
MauMau


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

37 matches

Mail list logo