[HACKERS] timezone buglet?
PostgreSQL 9.0.4 has the timezone America/Blanc-Sablon. However, other sources seem to spell this with an underscore instead of a dash: America/Blanc_Sablon. It appears that besides 'America/Blanc_Sablon', other multi-word timezones are spelled with an underscore. For example: 'Australia/Broken_Hill', 'Asia/Ho_Chi_Minh', 'America/Los_Angeles', and so on.

Two questions: Is this correct as is, or is it wrong in 9.0.4? And, should I have reported this somewhere else? Bugs?

Err, three questions: I'm a little unclear on how the tz machinery works. Can I just update the name column in pg_timezone_names to fix it for now?

Thanks

-dg

--
David Gould    da...@sonic.net    510 536 1443 / 510 282 0869
If simplicity worked, the world would be overrun with insects.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Wed, Sep 07, 2011 at 09:02:04PM -0400, Tom Lane wrote:
> daveg writes:
> > On Wed, Sep 07, 2011 at 07:39:15PM -0400, Tom Lane wrote:
> >> BTW ... what were the last versions you were running on which you had
> >> *not* seen the problem? (Just wondering about the possibility that we
> >> back-patched some "fix" that broke things. It would be good to have
> >> a version range before trawling the commit logs.)
>
> > The first version we saw it on was 8.4.7.
>
> Yeah, you said that. I was wondering what you'd last run before 8.4.7.

Sorry, misunderstood. We were previously running 8.4.4, but have been on 8.4.7 since shortly after it was released. Prior to that we have had all the major and most of the minor releases since 7.1.

-dg
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Wed, Sep 07, 2011 at 07:39:15PM -0400, Tom Lane wrote:
> daveg writes:
> > Also, this is very intermittent, we have seen it only in recent months
> > on both 8.4.7 and 9.0.4 after years of no problems. Lately we see it what
> > feels like a few times a month. Possibly some new application behaviour
> > is provoking it, but I have no guesses as to what.
>
> BTW ... what were the last versions you were running on which you had
> *not* seen the problem? (Just wondering about the possibility that we
> back-patched some "fix" that broke things. It would be good to have
> a version range before trawling the commit logs.)

The first version we saw it on was 8.4.7.

-dg
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Wed, Sep 07, 2011 at 06:25:23PM -0400, Tom Lane wrote:
> Robert Haas writes:
> > I thought about an error exit from client authentication, and that's a
> > somewhat appealing explanation, but I can't quite see why we wouldn't
> > clean up there the same as anywhere else. The whole mechanism feels a
> > bit rickety to me - we don't actually release locks; we just abort the
> > transaction and *assume* that will cause locks to get released.
>
> Well, transaction abort will call LockReleaseAll, which is carefully
> coded to clean up the proclock lists regardless of what is in the
> locallocks table, so I'm not sure why you find that any more rickety
> than anything else. But maybe it'd be interesting for Dave to stick a
> LockReleaseAll call into ProcKill() and see if that makes things better.
> (Dave: test that before you put it in production, I'm not totally sure
> it's safe.)

Re safety, what is the worst case here?

Also, this is very intermittent, we have seen it only in recent months on both 8.4.7 and 9.0.4 after years of no problems. Lately we see it what feels like a few times a month. Possibly some new application behaviour is provoking it, but I have no guesses as to what.

-dg
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Wed, Sep 07, 2011 at 06:35:08PM -0400, Tom Lane wrote:
> daveg writes:
> > It does not seem restricted to pg_authid:
> > 2011-08-24 18:35:57.445 24987 c23 apps ERROR: lock AccessShareLock on object 16403/2615/0
> > And I think I've seen it on other tables too.
>
> Hmm. 2615 = pg_namespace, which most likely is the first catalog
> accessed by just about any SQL command that's going to access tables at
> all, so I suspect that this is mostly just a "the first access failed"
> thing and not something peculiar to pg_namespace. But we still don't
> have a clue why the locks are not getting released by the previous
> owner of the PGPROC slot. Have you trawled your logs to see if there's
> any sign of any distress at all, shortly before the problem starts to
> happen?

Will do, but it's a pretty big haystack. Sure wish I knew what the needle looked like. ;-)

-dg
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Wed, Sep 07, 2011 at 04:55:24PM -0400, Tom Lane wrote:
> Robert Haas writes:
> > Tom's right to be skeptical of my theory, because it would require a
> > CHECK_FOR_INTERRUPTS() outside of a transaction block in one of the
> > pathways that use session-level locks, and I can't find one.
>
> More to the point: session-level locks are released on error. The only
> way to get out of a transaction while still holding one is for the
> VACUUM-or-whichever-command code to deliberately commit and exit while
> still holding it. An error exit path would release the lock.
>
> > OTOH, I'm skeptical of the theory that this involves userlocks,
> > because this whole thread started because of a complaint about lock
> > 0/1260/0 already being held. That ain't no userlock.
>
> Yeah, and for that matter it seems to let VACUUM off the hook too.
> If we assume that the reported object ID is non-corrupt (and if it's
> always the same, that seems like the way to bet) then this is a lock
> on pg_authid.
>
> Hmmm ... could the pathway involve an error exit from client
> authentication? We're still finding bugs in the 9.0 rewrite of
> auth-time database access.

It does not seem restricted to pg_authid:

2011-08-24 18:35:57.445 24987 c23 apps ERROR: lock AccessShareLock on object 16403/2615/0

And I think I've seen it on other tables too.

-dg
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Wed, Sep 07, 2011 at 10:20:24AM -0400, Tom Lane wrote:
> Robert Haas writes:
> > After spending some time staring at the code, I do have one idea as to
> > what might be going on here. When a backend is terminated,
> > ShutdownPostgres() calls AbortOutOfAnyTransaction() and then
> > LockReleaseAll(USER_LOCKMETHOD, true). The second call releases all
> > user locks, and the first one (or so we suppose) releases all normal
> > locks as part of aborting the transaction. But what if there's no
> > transaction in progress? In that case, AbortOutOfAnyTransaction()
> > won't do anything - which is fine as far as transaction-level locks
> > go, because we shouldn't be holding any of those anyway if there's no
> > transaction in progress. However, if we hold a session-level lock at
> > that point, then we'd orphan it. We don't make much use of session
> > locks. As far as I can see, they are used by (1) VACUUM, (2) CREATE
> > INDEX CONCURRENTLY, (3) ALTER DATABASE .. SET TABLESPACE, and (4) on
> > standby servers, redo of DROP DATABASE actions. Any chance one of
> > those died or was killed off around the time this happened?
>
> I don't believe this theory at all, because if that were the issue,
> we'd have heard about it long since. The correct theory has to involve
> a very-little-used triggering condition. At the moment I'm wondering
> about advisory (userspace) locks ... Dave, do your apps use any of those?

Yes, we make extensive use of advisory locks. That was my thought too when Robert mentioned session level locks.

I'm happy to add any additional instrumentation, but my client would be happier to actually run it if there was a way to recover from this without an unplanned outage. Is there something I can do when the patch detects the problem to be able to continue without a restart? Is it safe to just reset the proclock queue? They would not mind leaking locks, for instance, with a later planned restart to clear them up, nearly as much as they mind unscheduled downtime.

Thanks

-dg
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Tue, Aug 23, 2011 at 12:15:23PM -0400, Robert Haas wrote:
> On Mon, Aug 22, 2011 at 3:31 AM, daveg wrote:
> > So far I've got:
> >
> > - affects system tables
> > - happens very soon after process startup
> > - in 8.4.7 and 9.0.4
> > - not likely to be hardware or OS related
> > - happens in clusters for period of a few second to many minutes
> >
> > I'll work on printing the LOCK and LOCALLOCK when it happens, but it's
> > hard to get downtime to pick up new builds. Any other ideas on getting to
> > the bottom of this?
>
> I've been thinking this one over, and doing a little testing. I'm
> still stumped, but I have a few thoughts. What that error message is
> really saying is that the LOCALLOCK bookkeeping doesn't match the
> PROCLOCK bookkeeping; it doesn't tell us which one is to blame.
...
> My second thought is that perhaps a process is occasionally managing
> to exit without fully cleaning up the associated PROCLOCK entry. At
> first glance, it appears that this would explain the observed
> symptoms. A new backend gets the PGPROC belonging to the guy who
> didn't clean up after himself, hits the error, and disconnects,
> sticking himself right back on to the head of the SHM_QUEUE where the
> next connection will inherit the same PGPROC and hit the same problem.
> But it's not clear to me what could cause the system to get into this
> state in the first place, or how it would eventually right itself.
>
> It might be worth kludging up your system to add a test to
> InitProcess() to verify that all of the myProcLocks SHM_QUEUEs are
> either NULL or empty, along the lines of the attached patch (which
> assumes that assertions are enabled; otherwise, put in an elog() of
> some sort). Actually, I wonder if we shouldn't move all the
> SHMQueueInit() calls for myProcLocks to InitProcGlobal() rather than
> doing it over again every time someone calls InitProcess(). Besides
> being a waste of cycles, it's probably less robust this way. If
> there somehow are leftovers in one of those queues, the next
> successful call to LockReleaseAll() ought to clean up the mess, but of
> course there's no chance of that working if we've nuked the queue
> pointers.

I did this in the elog flavor as we don't build production images with asserts. It has been running on all hosts for a few days. Today it hit the extra checks in InitProcess:

00:02:32.782 8626 [unknown] [unknown] LOG: connection received: host=bk0 port=42700
00:02:32.783 8627 [unknown] [unknown] LOG: connection received: host=op2 port=45876
00:02:32.783 8627 d61 apps FATAL: Initprocess myProclocks[4] not empty: queue 0x2ae6b4b895f8 (prev 0x2ae6b4a2b558, next 0x2ae6b4a2b558)
00:02:32.783 8626 d35 postgres LOG: connection authorized: user=postgres database=c35
00:02:32.783 21535 LOG: server process (PID 8627) exited with exit code 1
00:02:32.783 21535 LOG: terminating any other active server processes
00:02:32.783 8626 c35 postgres WARNING: terminating connection because of crash of another server process

The patch that produced this is attached. If you can think of anything I can add to this to help I'd be happy to do so. Also, can I clean this up and continue somehow? Maybe clear the queue instead of having to have a restart? Or is there a way to just pause this proc here, maybe mark it not to be used and exit, or just to sleep forever so I can debug later?

Thanks

-dg

--- postgresql-9.0.4/src/backend/storage/lmgr/proc.c	2011-04-14 20:15:53.0 -0700
+++ postgresql-9.0.4.dg/src/backend/storage/lmgr/proc.c	2011-08-23 17:30:03.505176019 -0700
@@ -323,7 +323,15 @@
 	MyProc->waitLock = NULL;
 	MyProc->waitProcLock = NULL;
 	for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
+	{
+		SHM_QUEUE  *queue = &(MyProc->myProcLocks[i]);
+
+		if (!(!queue->prev || queue->prev == queue ||
+			  !queue->next || queue->next == queue))
+			elog(FATAL, "Initprocess myProclocks[%d] not empty: queue %p (prev %p, next %p) ",
+				 i, queue, queue->prev, queue->next);
 		SHMQueueInit(&(MyProc->myProcLocks[i]));
+	}
 	MyProc->recoveryConflictPending = false;
Re: [HACKERS] [GENERAL] pg_upgrade problem
On Mon, Sep 05, 2011 at 08:19:21PM -0400, Bruce Momjian wrote:
> daveg wrote:
> > > Can you tell me what table is showing this error? Does it happen during
> > > vacuum? Can you run a vacuum verbose to see what it is throwing the
> > > error on? Thanks.
> >
> > This was upgrading from 8.4.8 to 9.0.4. I don't have the running cluster
> > anymore, but I do have tar.gz archives of it and could probably find
> > 2TB free somewhere to restore it to if there is something useful to extract.
> >
> > However, I don't think this was toast related. Most of our rows are short and
> > have only int, float, and short text columns. These errors hit over 60
> > different tables mostly during the analyzes we ran immediately after the
> > upgrade. It also hit during select, insert and delete statements. We did not
> > run the db more than a few minutes as the damage was so extensive.
> >
> > As far as I can tell pg_upgrade never copied any pg_clog files from the
> > old cluster to the new cluster. I wish I had detected that before running
> > the remove_old_cluster.sh script.
>
> Wow, no clogs? That would make the system very confused. You can pull
> the clogs out of the old backup and move them over if the files don't
> already exist.

We don't have the old cluster after running delete_old_cluster.sh. We use pg_dump for backup, so no clogs. We ended up restoring 20-odd dbs totalling 2.1TB from the previous day's pg_dumps.

If you review my original report I mentioned that there were only 2 clog files in the new cluster, both with ctime after the start of postgresql after the upgrade. I did the upgrade for three hosts at the same time; the others were fine. They have dozens of clogs dating back days before the upgrade. The failing system had only 2 recent clog files.

-dg

> --
> Bruce Momjian    http://momjian.us
> EnterpriseDB    http://enterprisedb.com
>
> + It's impossible for everything to be true. +
Re: [HACKERS] [GENERAL] pg_upgrade problem
Sorry I missed your reply, catching up now.

On Wed, Aug 31, 2011 at 09:56:59PM -0400, Bruce Momjian wrote:
> daveg wrote:
> > On Mon, Aug 29, 2011 at 07:49:24PM +0200, hubert depesz lubaczewski wrote:
> > > On Mon, Aug 29, 2011 at 06:54:41PM +0200, hubert depesz lubaczewski wrote:
> > > vacuumdb: vacuuming of database "etsy_v2" failed: ERROR: could not
> > > access status of transaction 3429738606
> > > DETAIL: Could not open file "pg_clog/0CC6": No such file or directory.
> > >
> > > Interestingly.
> > >
> > > In old dir there is pg_clog directory with files:
> > > 0AC0 .. 0DAF (including 0CC6, size 262144)
> > > but new pg_clog has only:
> > > 0D2F .. 0DB0
> > >
> > > File content - nearly all files that exist in both places are the same,
> > > with exception of 2 newest ones in new datadir:
> > > 3c5122f3e80851735c19522065a2d12a 0DAF
> > > 8651fc2b9fa3d27cfb5b496165cead68 0DB0
> > >
> > > 0DB0 doesn't exist in old, and 0DAF has different md5sum:
> > > 7d48996c762d6a10f8eda88ae766c5dd ...
> > I had this same thing happen this Saturday just past and my client had to
> > restore the whole 2+ TB instance from the previous day's pg_dumps.
...
> > After running pg_upgrade apparently successfully and analyzing all the

Update: reviewing the logs I see some of the analyzes hit the "could not access status of transaction" error too.

> > tables we restarted the production workload and started getting errors:
> >
> > 2011-08-27 04:18:34.015 12337 c06 postgres ERROR: could not access status of transaction 2923961093
> > 2011-08-27 04:18:34.015 12337 c06 postgres DETAIL: Could not open file "pg_clog/0AE4": No such file or directory.
> > 2011-08-27 04:18:34.015 12337 c06 postgres STATEMENT: analyze public.b_pxx;
> >
> > On examination the pg_clog directory contained only two files, timestamped
> > after the startup of the new cluster with 9.0.4. Other hosts that upgraded
> > successfully had numerous files in pg_clog dating back a few days. So it
> > appears that all the clog files went missing during the upgrade somehow.
> >
> > This happened upgrading from 8.4.7 to 9.0.4, with a brief session in between
> > at 8.4.8. We have upgraded several hosts to 9.0.4 successfully previously.
>
> I have posted this fix to the hackers email list, but I found it only
> affected old 8.3 servers, not old 8.4.X, so I am confused by your bug
> report.
>
> I have tested 8.4.X to 9.0.4 and found pg_upgrade preserves toast
> relfrozenxids properly in that case.
>
> Can you tell me what table is showing this error? Does it happen during
> vacuum? Can you run a vacuum verbose to see what it is throwing the
> error on? Thanks.

This was upgrading from 8.4.8 to 9.0.4. I don't have the running cluster anymore, but I do have tar.gz archives of it and could probably find 2TB free somewhere to restore it to if there is something useful to extract.

However, I don't think this was toast related. Most of our rows are short and have only int, float, and short text columns. These errors hit over 60 different tables mostly during the analyzes we ran immediately after the upgrade. It also hit during select, insert and delete statements. We did not run the db more than a few minutes as the damage was so extensive.

As far as I can tell pg_upgrade never copied any pg_clog files from the old cluster to the new cluster. I wish I had detected that before running the remove_old_cluster.sh script.

-dg
Re: [HACKERS] [GENERAL] pg_upgrade problem
On Mon, Aug 29, 2011 at 07:49:24PM +0200, hubert depesz lubaczewski wrote:
> On Mon, Aug 29, 2011 at 06:54:41PM +0200, hubert depesz lubaczewski wrote:
> vacuumdb: vacuuming of database "etsy_v2" failed: ERROR: could not access status of transaction 3429738606
> DETAIL: Could not open file "pg_clog/0CC6": No such file or directory.
>
> Interestingly.
>
> In old dir there is pg_clog directory with files:
> 0AC0 .. 0DAF (including 0CC6, size 262144)
> but new pg_clog has only:
> 0D2F .. 0DB0
>
> File content - nearly all files that exist in both places are the same, with
> exception of 2 newest ones in new datadir:
> 3c5122f3e80851735c19522065a2d12a 0DAF
> 8651fc2b9fa3d27cfb5b496165cead68 0DB0
>
> 0DB0 doesn't exist in old, and 0DAF has different md5sum:
> 7d48996c762d6a10f8eda88ae766c5dd
>
> one more thing. I did select count(*) from transactions and it worked.
>
> that's about it. I can probably copy over files from old datadir to new (in
> pg_clog/), and will be happy to do it, but I'll wait for your call - retrying
> with copied files might destroy some evidence.

I had this same thing happen this Saturday just past, and my client had to restore the whole 2+ TB instance from the previous day's pg_dumps. I had been thinking that perhaps I did something wrong in setting up or running the upgrade, but had not found it yet. Now that I see Hubert has the same problem it is starting to look like pg_upgrade can eat all your data.

After running pg_upgrade apparently successfully and analyzing all the tables, we restarted the production workload and started getting errors:

2011-08-27 04:18:34.015 12337 c06 postgres ERROR: could not access status of transaction 2923961093
2011-08-27 04:18:34.015 12337 c06 postgres DETAIL: Could not open file "pg_clog/0AE4": No such file or directory.
2011-08-27 04:18:34.015 12337 c06 postgres STATEMENT: analyze public.b_pxx;

On examination the pg_clog directory contained only two files, timestamped after the startup of the new cluster with 9.0.4. Other hosts that upgraded successfully had numerous files in pg_clog dating back a few days. So it appears that all the clog files went missing during the upgrade somehow.

This happened upgrading from 8.4.7 to 9.0.4, with a brief session in between at 8.4.8. We have upgraded several hosts to 9.0.4 successfully previously.

-dg
Re: [HACKERS] OperationalError: FATAL: lock AccessShareLock on object 0/1260/0 is already
On Sun, Aug 14, 2011 at 12:16:39AM -0400, Robert Haas wrote:
> On Fri, Aug 12, 2011 at 7:19 PM, daveg wrote:
> > This seems to be bug month for my client. Now they are seeing periods
> > where all new connections fail immediately with the error:
> >
> > FATAL: lock AccessShareLock on object 0/1260/0 is already held
...
> > What can I do to help track this down?
>
> I've seen that error (though not that exact fact pattern) caused by
> bad RAM. It's unclear to me what else could cause it.
>
> In terms of debugging, it seems like it might be sensible to start by
> injecting some debugging code that dumps out the contents of the LOCK
> and LOCALLOCK structures at the point the error occurs.

I've made up the attached patch to print this; please suggest any additions. I'll deploy this on a couple of the production hosts that have had the issue this evening, but there is no telling when or if it will strike next.

-dg

*** postgresql-9.0.4/src/backend/storage/lmgr/lock.c.orig	2011-08-22 13:46:29.386428943 -0700
--- postgresql-9.0.4/src/backend/storage/lmgr/lock.c	2011-08-23 00:12:43.456422966 -0700
***************
*** 469,474 ****
--- 469,534 ----
  	return LockAcquireExtended(locktag, lockmode, sessionLock, dontWait, true);
  }
  
+ int DG_FORCE_TRAP = 0;
+ 
+ inline static void
+ DG_LOCK_PRINT(const char *where, const LOCK *lock, LOCKMODE type)
+ {
+ 	elog(LOG,
+ 		 "%s: lock(%p) id(%u,%u,%u,%u,%u,%u) grantMask(%x) "
+ 		 "req(%d,%d,%d,%d,%d,%d,%d)=%d "
+ 		 "grant(%d,%d,%d,%d,%d,%d,%d)=%d wait(%d) type(%s)",
+ 		 where, lock,
+ 		 lock->tag.locktag_field1, lock->tag.locktag_field2,
+ 		 lock->tag.locktag_field3, lock->tag.locktag_field4,
+ 		 lock->tag.locktag_type, lock->tag.locktag_lockmethodid,
+ 		 lock->grantMask,
+ 		 lock->requested[1], lock->requested[2], lock->requested[3],
+ 		 lock->requested[4], lock->requested[5], lock->requested[6],
+ 		 lock->requested[7], lock->nRequested,
+ 		 lock->granted[1], lock->granted[2], lock->granted[3],
+ 		 lock->granted[4], lock->granted[5], lock->granted[6],
+ 		 lock->granted[7], lock->nGranted,
+ 		 lock->waitProcs.size,
+ 		 LockMethods[LOCK_LOCKMETHOD(*lock)]->lockModeNames[type]);
+ }
+ 
+ inline static void
+ DG_PROCLOCK_PRINT(const char *where, const PROCLOCK *proclockP)
+ {
+ 	elog(LOG,
+ 		 "%s: proclock(%p) lock(%p) method(%u) proc(%p) "
+ 		 "hold(%x) release(%x) "
+ 		 "links(p,n): lock=(%p,%p), proc=(%p,%p)",
+ 		 where, proclockP, proclockP->tag.myLock,
+ 		 PROCLOCK_LOCKMETHOD(*(proclockP)),
+ 		 proclockP->tag.myProc,
+ 		 (int) proclockP->holdMask, (int) proclockP->releaseMask,
+ 		 proclockP->lockLink.prev, proclockP->lockLink.next,
+ 		 proclockP->procLink.prev, proclockP->procLink.next
+ 		 );
+ }
+ 
+ inline static void
+ DG_LOCALLOCK_PRINT(const char *where, const LOCALLOCK *localP)
+ {
+ 	elog(LOG,
+ 		 "%s: locallock(%p) id(%u,%u,%u,%u,%u,%u mode %x) "
+ 		 "lock(%p), proclock(%p) "
+ 		 "hashcode %x nlocks %ld numLockOwners %d maxLockOwners %d ",
+ 		 where, localP,
+ 		 localP->tag.lock.locktag_field1, localP->tag.lock.locktag_field2,
+ 		 localP->tag.lock.locktag_field3, localP->tag.lock.locktag_field4,
+ 		 localP->tag.lock.locktag_type, localP->tag.lock.locktag_lockmethodid,
+ 		 localP->tag.mode,
+ 		 localP->lock, localP->proclock,
+ 		 localP->hashcode, localP->nLocks,
+ 		 localP->numLockOwners, localP->maxLockOwners
+ 		 /* localP->lockOwners[0].owner, localP->lockOwners.nlocks "%p %d" */
+ 		 );
+ }
+ 
  /*
   * LockAcquireExtended - allows us to specify additional options
   *
***************
*** 500,505 ****
--- 560,568 ----
  	LWLockId	partitionLock;
  	int			status;
  	bool		log_lock = false;
+ 	int			DG_found_local = -1;
+ 	int			DG_found_lock = -1;
+ 	int			DG_found_proc = -1;
  
  	if (lockmethodid <= 0 || lockmethodid >= lengthof(LockMethods))
  		elog(ERROR, "unrecognized lock method: %d", lockmethodid);
***************
*** 5
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Fri, Aug 12, 2011 at 04:19:37PM -0700, daveg wrote:
>
> This seems to be bug month for my client. Now they are seeing periods
> where all new connections fail immediately with the error:
>
> FATAL: lock AccessShareLock on object 0/1260/0 is already held
>
> This happens on postgresql 8.4.7 on a large (512GB, 32 core) system that has
> been up for months. It started happening sporadically a few days ago. It will
> do this for a period of several minutes to an hour and then go back to
> normal for hours or days.
>
> One complete failing session out of several hundred around that time:
> ------------------------------------------------------------------
> 2011-08-09 00:01:04.446 8823 [unknown] [unknown] LOG: connection received: host=op05.xxx port=34067
> 2011-08-09 00:01:04.446 8823 c77 apps LOG: connection authorized: user=apps database=c77
> 2011-08-09 00:01:04.449 8823 c77 apps FATAL: lock AccessShareLock on object 0/1260/0 is already held
> ------------------------------------------------------------------

This is to add additional information to the original report:

For a while this was happening on many different databases in one postgresql 8.4.7 instance on a single large host ('U2', 512GB, 64 cpu) running RH 5. That has been quiet for several days, and the newest batches of errors have happened only on a single database 'c23', in a postgresql 9.0.4 instance on a smaller host ('A', 64GB, 8 cpu) running SuSE 10.2. No memory errors or other misbehaviour have been seen on either of these hosts in recent months.

The original error was:

    lock AccessShareLock on object 0/1260/0 is already held

which is for pg_database. The recent errors are:

    lock AccessShareLock on object 16403/2615/0 is already held

which is for pg_namespace in database c23.

All of the original and most of the recent batches of errors were immediately after connecting to a database and being authorized, that is, before any statements were attempted. However, some of the most recent are on the first "query" statement. That is, after logging in and doing things like "set transaction ..." the first select would hit this error.

It seems to come in clusters, sometimes, which suggests something shared by multiple processes. For example, here are the times for the errors on c23 in the afternoon of August 20:

20 07:14:12.722
20 16:05:07.798
20 16:05:07.808
20 16:05:10.519
20 16:07:07.726
20 16:07:08.722
20 16:07:09.734
20 16:07:10.656
20 16:07:25.436
20 16:22:23.983
20 16:22:24.014
20 16:22:24.335
20 16:22:24.409
20 16:22:24.477
20 16:22:24.499
20 16:22:24.516
20 16:30:58.210
20 16:31:15.261
20 16:31:15.296
20 16:31:15.324
20 16:31:15.348
20 18:06:16.515
20 18:06:49.198
20 18:06:49.204
20 18:06:51.444
20 21:03:05.940

So far I've got:

- affects system tables
- happens very soon after process startup
- in 8.4.7 and 9.0.4
- not likely to be hardware or OS related
- happens in clusters for periods of a few seconds to many minutes

I'll work on printing the LOCK and LOCALLOCK when it happens, but it's hard to get downtime to pick up new builds. Any other ideas on getting to the bottom of this?

Thanks

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
[adding back hackers so the thread shows the resolution]

On Sun, Aug 14, 2011 at 07:02:55PM -0400, Tom Lane wrote:
> Sounds good. Based on my own testing so far, I think that patch will
> probably make things measurably better for you, though it won't resolve
> every corner case.

The most recent catalog vacuums did vacuum full pg_class in 34 databases on that instance with no new failures. So it looks like the patch fixes it. This is not conclusive, but it looks very good so far. I'll send an update if I see any new errors during the week.

Thanks for your help on this. It looks like it has sent you on a merry search through all the catcache related program activities. I'm assuming this patch or some improvement on it will show up in a point release. Meanwhile, if this works as is for a couple more days we will resume upgrading the rest of the hosts to 9.0 using this patch.

-dg
Re: [HACKERS] OperationalError: FATAL: lock AccessShareLock on object 0/1260/0 is already
On Sun, Aug 14, 2011 at 12:16:39AM -0400, Robert Haas wrote:
> On Fri, Aug 12, 2011 at 7:19 PM, daveg wrote:
> > This seems to be bug month for my client. Now they are seeing periods
> > where all new connections fail immediately with the error:
> >
> > FATAL: lock AccessShareLock on object 0/1260/0 is already held
> >
> > This happens on postgresql 8.4.7 on a large (512GB, 32 core) system that has
> > been up for months. It started happening sporadically a few days ago. It will
> > do this for a period of several minutes to an hour and then go back to
> > normal for hours or days.
> >
> > One complete failing session out of several hundred around that time:
> > ------------------------------------------------------------------
> > 2011-08-09 00:01:04.446 8823 [unknown] [unknown] LOG: connection received: host=op05.xxx port=34067
> > 2011-08-09 00:01:04.446 8823 c77 apps LOG: connection authorized: user=apps database=c77
> > 2011-08-09 00:01:04.449 8823 c77 apps FATAL: lock AccessShareLock on object 0/1260/0 is already held
> > ------------------------------------------------------------------
> >
> > What can I do to help track this down?
>
> I've seen that error (though not that exact fact pattern) caused by
> bad RAM. It's unclear to me what else could cause it.

I'll look into that. I think it is only happening on one host, so that might make sense. On the other hand, these are pretty fancy hosts, all ECC and that, so I'd hope they would have squeaked about bad ram.

> In terms of debugging, it seems like it might be sensible to start by
> injecting some debugging code that dumps out the contents of the LOCK
> and LOCALLOCK structures at the point the error occurs.

Hmm, we will update to 9.0 next week on these hosts, so I'll try to hold off on this part at least until then.

-dg
[HACKERS] OperationalError: FATAL: lock AccessShareLock on object 0/1260/0 is already
This seems to be bug month for my client. Now they are seeing periods where all new connections fail immediately with the error:

    FATAL: lock AccessShareLock on object 0/1260/0 is already held

This happens on postgresql 8.4.7 on a large (512GB, 32 core) system that has been up for months. It started happening sporadically a few days ago. It will do this for a period of several minutes to an hour and then go back to normal for hours or days.

One complete failing session out of several hundred around that time:

    2011-08-09 00:01:04.446 8823 [unknown] [unknown] LOG: connection received: host=op05.xxx port=34067
    2011-08-09 00:01:04.446 8823 c77 apps LOG: connection authorized: user=apps database=c77
    2011-08-09 00:01:04.449 8823 c77 apps FATAL: lock AccessShareLock on object 0/1260/0 is already held

What can I do to help track this down?

-dg
Re: [HACKERS] our buffer replacement strategy is kind of lame
On Fri, Aug 12, 2011 at 01:28:49PM +0100, Simon Riggs wrote:
> I think there are reasonable arguments to make
>
> * prefer_cache = off (default) | on a table level storage parameter,
>   =on will disable the use of BufferAccessStrategy
>
> * make cache_spoil_threshold a parameter, with default 0.25
>
> Considering the world of very large RAMs in which we now live, some
> control of the above makes sense.

As long as we are discussing cache settings for tables, I have a client who would like to be able to lock specific tables and indexes into cache as they have strict response time requirements for particular queries. At the moment they are running postgres with a tablespace on ramfs and taking frequent backups, but this is not optimal.

-dg
Re: [HACKERS] VACUUM FULL versus system catalog cache invalidation
On Fri, Aug 12, 2011 at 09:26:02PM +0100, Simon Riggs wrote:
> With HOT, there is very little need to perform a VACUUM FULL on any
> shared catalog table. Look at the indexes...
>
> I would suggest that VACUUM FULL perform only a normal VACUUM on
> shared catalog tables, then perform an actual VACUUM FULL only in dire
> need (some simple heuristic in size and density). This avoids doing a
> VACUUM FULL unless it is actually necessary to do so. That has the
> added advantage of not locking out essential tables, which is always a
> concern.
>
> In the unlikely event we do actually have to VACUUM FULL a shared
> catalog table, nuke any cache entry for the whole shared catalog. That
> way we absolutely and positively will never get any more bugs in this
> area, ever again. Sounds harsh, but these events are only actually
> needed very, very rarely and hygiene is more important than a few
> minor points of performance.

This is a very optimistic view. My client makes heavy use of temp tables. HOT and autovacuum are not sufficient to keep catalog bloat under control. We run a daily script that calculates the density of the catalog and only vacuum fulls those that are severely bloated. Here is a result from a recent bloat check on one db. 'packed' is the number of pages needed for the rows if they were packed; 'bloat' is the multiple of pages in use over the number really needed.

    relation    | tuples | pages | packed | bloat
    ------------+--------+-------+--------+------
    pg_class    |   4292 | 10619 |    114 |  93.2
    pg_depend   |  25666 |  7665 |    217 |  35.4
    pg_attrdef  |   6585 |  7595 |    236 |  32.2
    pg_type     |   4570 |  8177 |    416 |  19.6
    pg_shdepend |  52040 |  7968 |    438 |  18.2

-dg
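The 'packed' and 'bloat' figures in the table above can be approximated from the tuple count and an average row width. This is a rough sketch only; the script's actual formula is not shown in the thread, and the page/tuple overhead constants and the 172-byte example width are assumptions for illustration:

```python
import math

BLOCK_SIZE = 8192        # default Postgres page size
PAGE_OVERHEAD = 24       # page header bytes (assumed for this sketch)
ITEM_OVERHEAD = 4 + 24   # line pointer + heap tuple header (assumed)

def packed_pages(tuples, avg_row_width):
    """Pages needed if all live rows were packed tightly."""
    rows_per_page = (BLOCK_SIZE - PAGE_OVERHEAD) // (avg_row_width + ITEM_OVERHEAD)
    return math.ceil(tuples / rows_per_page)

def bloat(pages, tuples, avg_row_width):
    """Multiple of pages in use over the number really needed."""
    return pages / packed_pages(tuples, avg_row_width)

# With a hypothetical 172-byte average pg_class row, the pg_class line
# above (4292 tuples, 10619 pages) comes out in the same ballpark as
# the reported 93.2x bloat.
print(round(bloat(10619, 4292, 172), 1))
```

A daily job can then compare the bloat multiple against a threshold and choose VACUUM FULL only for the worst offenders, as the script described above does.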
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Fri, Aug 05, 2011 at 12:10:31PM -0400, Tom Lane wrote:
> I wrote:
> > Ahh ... you know what, never mind about stack traces, let's just see if
> > the attached patch doesn't fix it.
>
> On reflection, that patch would only fix the issue for pg_class, and
> that's not the only catalog that gets consulted during relcache reloads.
> I think we'd better do it as attached, instead.
>
> regards, tom lane

Should this be applied in addition to the earlier patch, or to replace it?

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Thu, Aug 04, 2011 at 04:16:08PM -0400, Tom Lane wrote:
> daveg writes:
> > We are seeing 'cannot read' and 'cannot open' errors too that would be
> > consistent with trying to use a vanished file.
>
> Yeah, these all seem consistent with the idea that the failing backend
> somehow missed an update for the relation mapping file. You would get
> the "could not find pg_class tuple" syndrome if the process was holding
> an open file descriptor for the now-deleted file, and otherwise cannot
> open/cannot read type errors. And unless it later received another
> sinval message for the relation mapping file, the errors would persist.
>
> If this theory is correct then all of the file-related errors ought to
> match up to recently-vacuumed mapped catalogs or indexes (those are the
> ones with relfilenode = 0 in pg_class). Do you want to expand your
> logging of the VACUUM FULL actions and see if you can confirm that idea?

At your service, what would you like to see?

> Since the machine is running RHEL, I think we can use glibc's
> backtrace() function to get simple stack traces without too much effort.
> I'll write and test a patch and send it along in a bit.

Great. Any point to try to capture SI events somehow?

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Thu, Aug 04, 2011 at 12:28:31PM -0400, Tom Lane wrote:
> daveg writes:
> > Summary: the failing process reads 0 rows from 0 blocks from the OLD
> > relfilenode.
>
> Hmm. This seems to mean that we're somehow missing a relation mapping
> invalidation message, or perhaps not processing it soon enough during
> some complex set of invalidations. I did some testing with that in mind
> but couldn't reproduce the failure. It'd be awfully nice to get a look
> at the call stack when this happens for you ... what OS are you running?

To recap, a few observations:

- When it happens the victim has recently been waiting on a lock for several seconds.
- We create a lot of temp tables, hundreds of thousands a day.
- There are catalog vacuum fulls and reindexes running on 30 odd other databases at the same time. The script estimates the amount of bloat on each table and index and chooses either reindex on specific indexes or vacuum full as needed.
- This is a 32 core (64 with hyperthreading) 512GB host with several hundred connections.
- We are seeing 'cannot read' and 'cannot open' errors too that would be consistent with trying to use a vanished file.

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Thu, Aug 04, 2011 at 12:28:31PM -0400, Tom Lane wrote:
> daveg writes:
> > Summary: the failing process reads 0 rows from 0 blocks from the OLD
> > relfilenode.
>
> Hmm. This seems to mean that we're somehow missing a relation mapping
> invalidation message, or perhaps not processing it soon enough during
> some complex set of invalidations. I did some testing with that in mind
> but couldn't reproduce the failure. It'd be awfully nice to get a look
> at the call stack when this happens for you ... what OS are you running?

    cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 5.5 (Tikanga)
    Linux version 2.6.18-194.el5

I can use gdb as well if we can get a core or stop the correct process. Perhaps a long sleep when it hits this? Or perhaps we could log invalidate processing for pg_class?

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Wed, Aug 03, 2011 at 11:18:20AM -0400, Tom Lane wrote:
> Evidently not, if it's not logging anything, but now the question is
> why. One possibility is that for some reason RelationGetNumberOfBlocks
> is persistently lying about the file size. (We've seen kernel bugs
> before that resulted in transiently wrong values, so this isn't totally
> beyond the realm of possibility.) Please try the attached patch, which
> extends the previous one to add a summary line including the number of
> blocks physically scanned by the seqscan.

Ok, I have results from the latest patch and have attached a redacted server log with the select relfilenode output added inline. This is the shorter of the logs and shows the sequence pretty clearly. I have additional logs if wanted.

Summary: the failing process reads 0 rows from 0 blocks from the OLD relfilenode.

-dg

    SS.msec pid   db  user     statement
    ------- ----- --- -------- ---------
    03.804  29706 c27 postgres connection authorized: user=postgres database=c27
    ...
    03.824  29706 c27 postgres set statement_timeout=60;
    03.824  29706 c27 postgres 0.364 ms
    03.825  29706 c27 postgres select current_database() as db, relname, oid, pg_relation_filenode(oid) as filenode, clock_timestamp()::time as ts from pg_class where relname like 'pg_class%';
    03.829  29706 c27 postgres 4.173 ms

     db  | relname                    | oid  | filenode   | ts
    -----+----------------------------+------+------------+-----------------
     c27 | pg_class                   | 1259 | 1245833951 | 21:31:03.828293
     c27 | pg_class_oid_index         | 2662 | 1245833955 | 21:31:03.828791
     c27 | pg_class_relname_nsp_index | 2663 | 1259100530 | 21:31:03.828807

    03.829  29706 c27 postgres vacuum full pg_catalog.pg_class;
    03.829  29706 c27 postgres LOCATION: exec_simple_query, postgres.c:900
    ...
    03.845  29707 c27 postgres disconnection: session time: 0:00:00.041 user=postgres database=c27 host=bk-0
    ...
    08.856  29706 c27 postgres process 29706 still waiting for RowExclusiveLock on relation 1214 of database 0 after 5000.483 ms
    08.856  29706 c27 postgres LOCATION: ProcSleep, proc.c:1059
    08.856  29706 c27 postgres STATEMENT: vacuum full pg_catalog.pg_class;
    09.383  29711 LOG: 0: process 29711 still waiting for AccessShareLock on relation 1259 of database 16408 after 5000.331 ms
    09.383  29711 LOCATION: ProcSleep, proc.c:1059
    ...
    11.559  28857 c27 apps SELECT ...
    16.560  28857 c27 apps process 28857 still waiting for AccessShareLock on relation 1259 of database 16408 after 5001.209 ms
    16.560  28857 c27 apps LOCATION: ProcSleep, proc.c:1059
    16.560  28857 c27 apps STATEMENT: SELECT ...
    19.763  29706 c27 postgres process 29706 acquired RowExclusiveLock on relation 1214 of database 0 after 15907.284 ms
    19.763  29706 c27 postgres LOCATION: ProcSleep, proc.c:1063
    19.763  29706 c27 postgres STATEMENT: vacuum full pg_catalog.pg_class;
    ...
    25.735  29711 LOG: 0: process 29711 acquired AccessShareLock on relation 1259 of database 16408 after 21352.393 ms
    25.735  29711 LOCATION: ProcSleep, proc.c:1063
    25.735  28857 c27 apps process 28857 acquired AccessShareLock on relation 1259 of database 16408 after 14176.040 ms
    25.735  28857 c27 apps LOCATION: ProcSleep, proc.c:1063
    25.735  28857 c27 apps STATEMENT: SELECT
    25.736  28857 c27 apps ScanPgRelationDetailed: found 0 tuples with OID 2662 in 0 blocks of filenode 1245833951
    25.736  28857 c27 apps LOCATION: ScanPgRelationDetailed, relcache.c:372
    25.736  28857 c27 apps STATEMENT: SELECT
    25.736  28857 c27 apps ERROR: XX000: could not find pg_class tuple for index 2662
    25.736  28857 c27 apps LOCATION: RelationReloadIndexInfo, relcache.c:1816
    25.736  28857 c27 apps STATEMENT: SELECT
    25.736  29706 c27 postgres 21906.865 ms
    25.737  29706 c27 postgres select current_database() as db, relname, oid, pg_relation_filenode(oid) as filenode, clock_timestamp()::time as ts from pg_class where relname like 'pg_class%';
    25.767  29706 c27 postgres 30.902 ms

     db  | relname                    | oid  | filenode   | ts
    -----+----------------------------+------+------------+-----------------
     c27 | pg_class                   | 1259 | 1279787837 | 21:31:25.76726
     c27 | pg_class_oid_index         | 2662 | 1279788022 | 21:31:25.767764
     c27 | pg_class_relname_nsp_index | 2663 | 1279788023 | 21:31:25.767782

    25.768  29706 c27 postgres vacuum full pg_catalog.pg_attribute;
    25.775  28857 c27 apps disconnection: session time: 0:07:07.758 user=apps database=c27 host=op-01
    25.775  28857 c27 apps LOCATION: log_disconnections, postgres.c:4339
    ...
    30.914  29711 LOG: 0: process 29711 still waiting for AccessShareLock on relation 1249 of database 1
Re: [HACKERS] Further news on Clang - spurious warnings
On Wed, Aug 03, 2011 at 04:03:39PM -0400, Tom Lane wrote:
> The C standard specifies that signed-to-unsigned conversions must work
> like that; and even if the standard didn't, it would surely work like
> that on any machine with two's-complement representation, which is to
> say every computer built in the last forty years or so. So I don't find
> it a questionable assumption.

I had the "pleasure" of working on a Univac 1108 in about 1978 and it was very definitely ones complement. I'm somewhat amazed to find that the Univac 1100 series architecture and instruction set lives on to this day. The last pure 1100 seems to be the Unisys 2200/3800 released in 1997. Even later U1100/Exec 8 descendants appear to still exist and are still actively supported:

http://en.wikipedia.org/wiki/Unisys_OS_2200_operating_system

So there are still ones complement machines out there. However I suggest we pretend otherwise and continue to ignore them.

-dg
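The conversion rule Tom cites (C99 6.3.1.3: a value out of range for an unsigned type is reduced modulo 2^N) can be modeled directly. A small sketch in Python, whose unbounded integers make the modular arithmetic explicit:

```python
def to_unsigned(value, bits=32):
    # C's signed-to-unsigned conversion: repeatedly add or subtract
    # 2**bits until the value is in range, i.e. reduce modulo 2**bits.
    # On two's-complement hardware this is a no-op reinterpretation of
    # the same bit pattern; on a ones complement machine like the 1108
    # the implementation would have to do real work to comply.
    return value % (1 << bits)

print(hex(to_unsigned(-1)))       # 0xffffffff
print(hex(to_unsigned(-2, 16)))   # 0xfffe
```

This is why the standard pins down the result of the conversion while leaving the signed representation itself implementation-defined.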
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Mon, Aug 01, 2011 at 01:23:49PM -0400, Tom Lane wrote:
> daveg writes:
> > On Sun, Jul 31, 2011 at 11:44:39AM -0400, Tom Lane wrote:
> >> I think we need to start adding some instrumentation so we can get a
> >> better handle on what's going on in your database. If I were to send
> >> you a source-code patch for the server that adds some more logging
> >> printout when this happens, would you be willing/able to run a patched
> >> build on your machine?
>
> > Yes we can run an instrumented server so long as the instrumentation does
> > not interfere with normal operation. However, scheduling downtime to switch
> > binaries is difficult, and generally needs to happen on a weekend, but
> > sometimes can be expedited. I'll look into that.
>
> OK, attached is a patch against the 9.0 branch that will re-scan pg_class
> after a failure of this sort occurs, and log what it sees in the tuple
> header fields for each tuple for the target index. This should give us
> some useful information. It might be worthwhile for you to also log the
> results of
>
>     select relname, pg_relation_filenode(oid) from pg_class
>     where relname like 'pg_class%';
>
> in your script that does VACUUM FULL, just before and after each time it
> vacuums pg_class. That will help in interpreting the relfilenodes in
> the log output.

We have installed the patch and have encountered the error as usual. However there is no additional output from the patch. I'm speculating that the pg_class scan in ScanPgRelationDetailed() fails to return tuples somehow. I have also been trying to trace it further by reading the code, but have not got any solid hypothesis yet. In the absence of any debugging output I've been trying to deduce the call tree leading to the original failure.
So far it looks like this:

    RelationReloadIndexInfo(Relation)                      // Relation is 2662 and !rd_isvalid
      pg_class_tuple = ScanPgRelation(2662, indexOK=false) // returns NULL
        pg_class_desc = heap_open(1259, ACC_SHARE)
          r = relation_open(1259, ACC_SHARE)               // locks oid, ensures RelationIsValid(r)
            r = RelationIdGetRelation(1259)
              r = RelationIdCacheLookup(1259)              // assume success
              if !rd_isvalid:
                RelationClearRelation(r, true)
                  RelationInitPhysicalAddr(r)              // r is pg_class relcache

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Sun, Jul 31, 2011 at 11:44:39AM -0400, Tom Lane wrote:
> daveg writes:
> > Here is the update: the problem happens with vacuum full alone, no reindex
> > is needed to trigger it. I updated the script to avoid reindexing after
> > vacuum. Over the past two days there are still many occurrences of this
> > error coincident with the vacuum.
>
> Well, that jives with the assumption that the one case we saw in
> the buildfarm was the same thing, because the regression tests were
> certainly only doing a VACUUM FULL and not a REINDEX of pg_class.
> But it doesn't get us much closer to understanding what's happening.
> In particular, it seems to knock out most ideas associated with race
> conditions, because the VAC FULL should hold exclusive lock on pg_class
> until it's completely done (including index rebuilds).
>
> I think we need to start adding some instrumentation so we can get a
> better handle on what's going on in your database. If I were to send
> you a source-code patch for the server that adds some more logging
> printout when this happens, would you be willing/able to run a patched
> build on your machine?

Yes we can run an instrumented server so long as the instrumentation does not interfere with normal operation. However, scheduling downtime to switch binaries is difficult, and generally needs to happen on a weekend, but sometimes can be expedited. I'll look into that.

> (BTW, just to be perfectly clear ... the "could not find pg_class tuple"
> errors always mention index 2662, right, never any other number?)

Yes, only index 2662, never any other.

I'm attaching a somewhat redacted log for two different databases on the same instance around the time of vacuum full of pg_class in each database. My observations so far are:

- the error occurs at commit of vacuum full of pg_class
- in these cases the error hits autovacuum after it waited for a lock on pg_class
- in these two cases there was a new process startup while the vacuum was running. Don't know if this is relevant.
- while these hit autovacuum, the error does hit other processes (just not in these sessions). Unknown if autovacuum is a required component.

-dg

NOTES:
-- most statements for db c01 in one catalog vacuum session
-- database c01 is oid 16404
-- applications run as user 'app'
-- vacuums run as user postgres
-- somewhat redacted

    MM:SS.ms  pid   db  user     log text
    --------- ----- --- -------- --------
    03:00.350 2716  c01 postgres vacuum full pg_catalog.pg_class;
    03:00.507 2866  LOG: process 2866 acquired ShareUpdateExclusiveLock on relation 2604 of database 16416 after 10143.750 ms
    03:01.222 21685 c01 app 11100.670 ms
    03:01.223 21685 c01 app SELECT DISTINCT
    03:01.224 21685 c01 app 1.524 ms
    03:02.031 3001  c01 app connection authorized: user=app database=c01
    03:04.952 2787  LOG: process 2787 still waiting for AccessShareLock on relation 1259 of database 16412 after 5001.035 ms
    03:05.065 2787  LOG: process 2787 acquired AccessShareLock on relation 1259 of database 16412 after 5114.253 ms
    03:05.564 2977  LOG: process 2977 still waiting for AccessShareLock on relation 1259 of database 16404 after 5000.970 ms
    03:05.640 2731  LOG: process 2731 still waiting for RowExclusiveLock on relation 1259 of database 16404 after 5000.186 ms
    03:06.045 2977  LOG: process 2977 acquired AccessShareLock on relation 1259 of database 16404 after 5482.389 ms
    03:06.045 2731  LOG: process 2731 acquired RowExclusiveLock on relation 1259 of database 16404 after 5405.652 ms
    03:06.046 2731  ERROR: could not find pg_class tuple for index 2662
    03:06.046 2731  CONTEXT: automatic vacuum of table "c01.pg_catalog.pg_index"
    03:06.046 2731  ERROR: could not find pg_class tuple for index 2662
    03:06.046 2716  c01 postgres 5696.537 ms
    03:06.056 3001  c01 app SET SESSION TIME ZONE 'UTC';
    ...
    03:06.057 2716  c01 postgres vacuum full pg_catalog.pg_rewrite;
    03:06.066 3001  c01 app 10.459 ms
    03:06.091 3001  c01 app disconnection: session time: 0:00:04.085 user=app database=c01 host=xxxl01
    03:08.908 3006  LOG: process 3006 still waiting for AccessShareLock on relation 2659 of database 16407 after 5000.778 ms
    03:11.777 21685 c01 app SELECT
    03:11.779 21685 c01 app 2.296 ms
    03:11.779 21685 c01 app SELECT
    03:11.780 21685 c01 app 0.328 ms
    03:11.798 21685 c01 app SELECT
    03:11.799 21685 c01 app 0.348 ms
    03:11.800 21685 c01 app SELECT
    03:11.800 21685 c01 app 0.205 ms
    03:11.804 21685 c01 app SELECT
    03:11.805 21685 c01 app 0.589 ms
    03:11.806 21685 c01 app SELECT DISTINCT
    03:11.809 21685 c01 app 3.552 ms
    03:11.810 21685 c01 app
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Thu, Jul 28, 2011 at 11:31:31PM -0700, daveg wrote:
> On Thu, Jul 28, 2011 at 07:45:01PM -0400, Robert Haas wrote:
> > REINDEX. My guess is that this is happening either right around the
> > time the VACUUM FULL commits or right around the time the REINDEX
> > commits. It'd be helpful to know which, if you can figure it out.
>
> I'll update my vacuum script to skip reindexes after vacuum full for 9.0
> servers and see if that makes the problem go away. Thanks for reminding
> me that they are not needed. However, I suspect it is the vacuum, not the
> reindex causing the problem. I'll update when I know.

Here is the update: the problem happens with vacuum full alone, no reindex is needed to trigger it. I updated the script to avoid reindexing after vacuum. Over the past two days there are still many occurrences of this error coincident with the vacuum.

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Fri, Jul 29, 2011 at 09:55:46AM -0400, Tom Lane wrote:
> The thing that was bizarre about the one instance in the buildfarm was
> that the error was persistent, ie, once a session had failed all its
> subsequent attempts to access pg_class failed too. I gather from Dave's
> description that it's working that way for him too. I can think of ways
> that there might be a transient race condition against a REINDEX, but
> it's very unclear why the failure would persist across multiple
> attempts. The best idea I can come up with is that the session has
> somehow cached a wrong commit status for the reindexing transaction,
> causing it to believe that both old and new copies of the index's
> pg_class row are dead ... but how could that happen?

It is definitely persistent. Once triggered the error occurs for any new statement until the session exits.

> The underlying
> state in the catalog is not wrong, because no concurrent sessions are
> upset (at least not in the buildfarm case ... Dave, do you see more than
> one session doing this at a time?).

It looks like it happens to multiple sessions, so far as one can tell from the timestamps of the errors:

    timestamp    sessionid     error
    ------------ ------------- -----
    03:05:37.434 4e26a861.4a6d could not find pg_class tuple for index 2662
    03:05:37.434 4e26a861.4a6f could not find pg_class tuple for index 2662
    03:06:12.041 4e26a731.438e could not find pg_class tuple for index 2662
    03:06:12.042 4e21b6a3.629b could not find pg_class tuple for index 2662
    03:06:12.042 4e26a723.42ec could not find pg_class tuple for index 2662 at character 13

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Thu, Jul 28, 2011 at 07:45:01PM -0400, Robert Haas wrote:
> On Thu, Jul 28, 2011 at 5:46 PM, daveg wrote:
> > On Thu, Jul 28, 2011 at 09:46:41AM -0400, Robert Haas wrote:
> >> On Wed, Jul 27, 2011 at 8:28 PM, daveg wrote:
> >> > My client has been seeing regular instances of the following sort of
> >> > problem:
> >> On what version of PostgreSQL?
> >
> > 9.0.4.
> >
> > I previously said:
> >> > This occurs on postgresql 9.0.4. on 32 core 512GB Dell boxes. We have
> >> > identical systems still running 8.4.8 that do not have this issue, so I'm
> >> > assuming it is related to the vacuum full work done for 9.0. Oddly, we
> >> > don't see this on the smaller hosts (8 core, 64GB, slower cpus) running
> >> > 9.0.4, so it may be timing related.
>
> Ah, OK, sorry. Well, in 9.0, VACUUM FULL is basically CLUSTER, which
> means that a REINDEX is happening as part of the same operation. In
> 9.0, there's no point in doing VACUUM FULL immediately followed by
> REINDEX. My guess is that this is happening either right around the
> time the VACUUM FULL commits or right around the time the REINDEX
> commits. It'd be helpful to know which, if you can figure it out.

I'll update my vacuum script to skip reindexes after vacuum full for 9.0 servers and see if that makes the problem go away. Thanks for reminding me that they are not needed. However, I suspect it is the vacuum, not the reindex causing the problem. I'll update when I know.

> If there's not a hardware problem causing those read errors, maybe a
> backend is somehow ending up with a stale or invalid relcache entry.
> I'm not sure exactly how that could be happening, though...

It does not appear to be a hardware problem. I also suspect it is a stale relcache.

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Thu, Jul 28, 2011 at 09:46:41AM -0400, Robert Haas wrote:
> On Wed, Jul 27, 2011 at 8:28 PM, daveg wrote:
> > My client has been seeing regular instances of the following sort of
> > problem:
>
> On what version of PostgreSQL?

9.0.4. I previously said:

> > This occurs on postgresql 9.0.4. on 32 core 512GB Dell boxes. We have
> > identical systems still running 8.4.8 that do not have this issue, so I'm
> > assuming it is related to the vacuum full work done for 9.0. Oddly, we don't
> > see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4,
> > so it may be timing related.

-dg
[HACKERS] error: could not find pg_class tuple for index 2662
My client has been seeing regular instances of the following sort of problem:

    ...
    03:06:09.453 exec_simple_query, postgres.c:900
    03:06:12.042 XX000: could not find pg_class tuple for index 2662 at character 13
    03:06:12.042 RelationReloadIndexInfo, relcache.c:1740
    03:06:12.042 INSERT INTO zzz_k(k) SELECT ...
    03:06:12.045 0: statement: ABORT
    03:06:12.045 exec_simple_query, postgres.c:900
    03:06:12.045 0: duration: 0.100 ms
    03:06:12.045 exec_simple_query, postgres.c:1128
    03:06:12.046 0: statement: INSERT INTO temp_807 VALUES (...)
    03:06:12.046 exec_simple_query, postgres.c:900
    03:06:12.046 XX000: could not find pg_class tuple for index 2662 at character 13
    03:06:12.046 RelationReloadIndexInfo, relcache.c:1740
    03:06:12.046 INSERT INTO temp_807 VALUES (...)
    03:06:12.096 08P01: unexpected EOF on client connection
    03:06:12.096 SocketBackend, postgres.c:348
    03:06:12.096 XX000: could not find pg_class tuple for index 2662
    03:06:12.096 RelationReloadIndexInfo, relcache.c:1740
    03:06:12.121 0: disconnection: session time: 0:06:08.537 user=ZZZ database=ZZZ_01
    03:06:12.121 log_disconnections, postgres.c:4339

The above happens regularly (but not completely predictably) corresponding with a daily cronjob that checks the catalogs for bloat and does vacuum full and/or reindex as needed. Since some of the applications make very heavy use of temp tables this will usually mean pg_class and pg_index get vacuum full and reindex.

Sometimes queries will fail due to being unable to open a table's containing file. On investigation the file will be absent in both the catalogs and the filesystem so I don't know what table it refers to:

    20:41:19.063 ERROR: could not open file "pg_tblspc/16401/PG_9.0_201008051/16413/1049145092": No such file or directory
    20:41:19.063 STATEMENT: insert into r_ar__30 select aid, mid, pid, sum(wdata) as wdata, ...
    --
    20:41:19.430 ERROR: could not open file "pg_tblspc/16401/PG_9.0_201008051/16413/1049145092": No such file or directory
    20:41:19.430 STATEMENT: SELECT nextval('j_id_seq')

Finally, I have seen several instances of failure to read data by vacuum full itself:

    03:05:45.699 0: statement: vacuum full pg_catalog.pg_index;
    03:05:45.699 exec_simple_query, postgres.c:900
    03:05:46.142 XX001: could not read block 65 in file "pg_tblspc/16401/PG_9.0_201008051/16416/1049146489": read only 0 of 8192 bytes
    03:05:46.142 mdread, md.c:656
    03:05:46.142 vacuum full pg_catalog.pg_index;

This occurs on postgresql 9.0.4 on 32 core 512GB Dell boxes. We have identical systems still running 8.4.8 that do not have this issue, so I'm assuming it is related to the vacuum full work done for 9.0. Oddly, we don't see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4, so it may be timing related.

This seems possibly related to the issues in:

Bizarre buildfarm failure on baiji: can't find pg_class_oid_index
http://archives.postgresql.org/pgsql-hackers/2010-02/msg02038.php

Broken HOT chains in system catalogs
http://archives.postgresql.org/pgsql-hackers/2011-04/msg00777.php

As far as I can tell from the logs I have, once a session sees one of these errors any subsequent query will hit it again until the session exits. However, it does not seem to harm other sessions or leave any persistent damage (crossing fingers and hoping here). I'm ready to do any testing/investigation/instrumented builds etc. that may be helpful in resolving this.

Regards
-dg
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Tue, Mar 08, 2011 at 10:37:24AM +0200, Heikki Linnakangas wrote:
> On 08.03.2011 10:00, Heikki Linnakangas wrote:
> > Another idea is to give up on the warning when it appears that oldestxmin has moved backwards, and assume that it's actually fine. We could still warn in other cases where the flag appears to be incorrectly set, like if there is a deleted tuple on the page.
>
> This is probably a better idea at least in back-branches. It also handles the case of twiddling vacuum_defer_cleanup_age, which tracking two xmins per transaction would not handle.
>
> Here's a patch. I also changed the warning per Robert's suggestion. Anyone see a hole in this?

It would be helpful to have the dbname and schema in the message in addition to the relname. I added those to the original diagnostic patch, as it was not clear that the messages were all related to the same page/table/db. Also, in your comment you might mention that multiple databases are one way we could see oldestxmin move backwards.

-dg
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Tue, Mar 08, 2011 at 10:00:01AM +0200, Heikki Linnakangas wrote:
> On 08.03.2011 04:07, Greg Stark wrote:
> > Well from that log you definitely have OldestXmin going backwards. And not by a little bit either. At 6:33 it set the all_visible flag and then at 7:01 it was almost 1.3 million transactions earlier. In fact to precisely the same value that was in use for a transaction at 1:38. That seems like a bit of a coincidence, though it's not repeated earlier.
>
> Yep. After staring at GetOldestXmin() again, it finally struck me how OldestXmin can move backwards. You need two databases for it, which probably explains why this has been so elusive. ...
>
> When there are no other transactions active in the same database, GetOldestXmin() returns just latestCompletedXid. When you open a transaction in the same database after that, its xid will be above latestCompletedXid, but its xmin includes transactions from all databases, and there might be a transaction in some other database with an xid that precedes the value that GetOldestXmin() returned earlier.
>
> I'm not sure what to do about that. One idea is to track two xmin values in proc-array: one that includes transactions in all databases, and another that only includes transactions in the same database. GetOldestXmin() (when allDbs is false) would only pay attention to the latter. It would add a few instructions to GetSnapshotData(), though.
>
> Another idea is to give up on the warning when it appears that oldestxmin has moved backwards, and assume that it's actually fine. We could still warn in other cases where the flag appears to be incorrectly set, like if there is a deleted tuple on the page.

I read this to mean that it is safe to ignore this warning, and that these databases are not at risk of data corruption or wrong results so long as the warning is due to oldestxmin. Please correct me if I have misunderstood.
Thanks

-dg
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Fri, Mar 04, 2011 at 05:52:29PM +0200, Heikki Linnakangas wrote:
> Hmm, if these all came from the same database, then it looks like OldestXmin has moved backwards. That would explain the warnings. First one vacuum determines that all the tuples are visible to everyone and sets the flag. Then another vacuum runs with an older OldestXmin, and determines that there's a tuple on the page with an xmin that is not yet visible to everyone, hence it thinks that the flag should not have been set yet.
>
> Looking at the code, I don't see how that situation could arise, though. The value calculated by GetOldestXmin() should never move backwards. And GetOldestXmin() is called in lazy_vacuum_rel(), after it has acquired a lock on the table, which should protect from a race condition where two vacuums could run on the table one after another, in a way where the later vacuum runs with an OldestXmin calculated before the first vacuum.
>
> Hmm, fiddling with vacuum_defer_cleanup_age on the fly could cause that, though. You don't do that, do you?

No. I've updated the patch to collect db and schema, added Merlin's patch as well, and run it for a while. The attached log is all the debug messages for pg_statistic page 333 from one database. I've also attached the two most recent page images for that particular page; the last digits in the filename are the hour and minute of when the page was saved.

What else can I be doing to help figure this out?

Thanks

-dg
2011-03-05 00:50:03.238 PST 13304 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 337186202)
2011-03-05 21:55:00.004 PST 10702 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 341342282)
2011-03-05 23:28:06.368 PST 16660 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 342000646)
2011-03-06 05:24:03.461 PST 12850 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 343420345)
2011-03-06 06:23:40.174 PST 21650 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 343773970)
2011-03-06 07:50:49.921 PST 3847 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 344373260)
2011-03-06 09:11:35.662 PST 10820 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 344660206)
2011-03-07 01:23:57.124 PST 14121 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 347873961)
2011-03-07 01:25:01.129 PST 14266 WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "c57.pg_catalog.pg_statistic" page 333 via OldestXmin (OldestXmin 347186993)
2011-03-07 01:26:01.143 PST 14356 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 347879220)
2011-03-07 01:27:01.158 PST 14456 WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "c57.pg_catalog.pg_statistic" page 333 via OldestXmin (OldestXmin 347186993)
2011-03-07 01:33:05.353 PST 15108 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 347705144)
2011-03-07 01:38:09.298 PST 15869 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 347705144)
2011-03-07 01:51:17.622 PST 18829 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 348042728)
2011-03-07 02:24:41.994 PST 22594 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 348116608)
2011-03-07 06:33:47.460 PST 18384 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 349004767)
2011-03-07 07:01:51.400 PST 23896 WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "c57.pg_catalog.pg_statistic" page 333 via OldestXmin (OldestXmin 347705144)
2011-03-07 10:23:31.416 PST 10654 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 349660381)

pageimage_c57_pg_catalog_pg_statistic_333.0127 Description: Binary data
pageimage_c57_pg_catalog_pg_statistic_333.0701 Description: Binary data
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Thu, Mar 03, 2011 at 09:04:04AM -0600, Merlin Moncure wrote:
> On Thu, Mar 3, 2011 at 2:16 AM, Heikki Linnakangas wrote:
> > On 03.03.2011 09:12, daveg wrote:
> > > Question: what would be the consequence of simply patching out the setting of this flag? Assuming that the incorrect PD_ALL_VISIBLE flag is the only problem (big assumption perhaps), then simply never setting it would at least avoid the possibility of returning wrong answers, presumably at some performance cost. We could possibly live with that until we get a handle on the real cause and fix.
> >
> > Yes. With that assumption.
> >
> > If you really want to do that, I would suggest the attached patch instead. This just disables the optimization in seqscans to trust it, so an incorrectly set flag won't affect correctness of query results, but the flag is still set as usual and you still get the warnings so that we can continue to debug the issue.
>
> This. The mis-set flag is likely a bug/concurrency issue etc, but could also be a symptom of more sinister data corruption. I did various vacuum experiments all day yesterday on my windows workstation and was not able to produce any mis-flags. I trust iscsi more than nfs, but maybe there is a connection here that is hardware based. hm. do you think it would be helpful to know what is causing the all_visible flag to get flipped? If so, the attached patch shows which case is throwing it...

I'll apply your patch and try it. Probably can only do it for a few minutes tomorrow evening though, as the output is huge and we have only limited down time availability.

-dg
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Thu, Mar 03, 2011 at 10:16:29AM +0200, Heikki Linnakangas wrote:
> On 03.03.2011 09:12, daveg wrote:
> > Question: what would be the consequence of simply patching out the setting of this flag? Assuming that the incorrect PD_ALL_VISIBLE flag is the only problem (big assumption perhaps), then simply never setting it would at least avoid the possibility of returning wrong answers, presumably at some performance cost. We could possibly live with that until we get a handle on the real cause and fix.
>
> Yes. With that assumption.
>
> If you really want to do that, I would suggest the attached patch instead. This just disables the optimization in seqscans to trust it, so an incorrectly set flag won't affect correctness of query results, but the flag is still set as usual and you still get the warnings so that we can continue to debug the issue.

Thanks. I'll be applying this tomorrow and will send you some page images to look at, assuming it still does it. I had a look at how this gets set and cleared and did not see anything obvious, so I'm pretty mystified.

Also, we have been seeing thousands of these daily for at least a month on 4 large hosts and no-one has noticed any other issues, which surprises me. Very strange.

-dg
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Tue, Mar 01, 2011 at 08:40:37AM -0500, Robert Haas wrote:
> On Mon, Feb 28, 2011 at 10:32 PM, Greg Stark wrote:
> > On Tue, Mar 1, 2011 at 1:43 AM, David Christensen wrote:
> > > Was this cluster upgraded to 8.4.4 from 8.4.0? It sounds to me like a known bug in 8.4.0 which was fixed by this commit:
> >
> > The reproduction script described was running vacuum repeatedly. A single vacuum run ought to be sufficient to clean up the problem if it was left-over.
> >
> > I wonder if it would help to write a regression test that runs 100 or so vacuums and see if the build farm turns up any examples of this behaviour.
>
> One other thing to keep in mind here is that the warning message we've chosen can be a bit misleading. The warning is:
>
> WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "test" page 1
>
> ...which implies that the state of the tuples is correct, and that the page-level bit is wrong in comparison. But I recently saw a case where the infomask got clobbered, resulting in this warning. The page-level bit was correct, at least relative to the intended page contents; it was a tuple on the page that was screwed up. It might have been better to pick a more neutral phrasing, like "page is marked all-visible but some tuples are not visible". Yeesh.

Yikes. I hope that this is not the case, as we are seeing thousands of these daily on each of 4 large production hosts. Mostly on catalogs, especially pg_statistic. However it does occur on some high delete/insert traffic user tables too.

Question: what would be the consequence of simply patching out the setting of this flag? Assuming that the incorrect PD_ALL_VISIBLE flag is the only problem (big assumption perhaps), then simply never setting it would at least avoid the possibility of returning wrong answers, presumably at some performance cost. We could possibly live with that until we get a handle on the real cause and fix.

I had a look and don't really see anything except lazy vacuum that sets it, so it seems simple to disable. Or have I understood this incorrectly?

Anything else I can be doing to try to track this down?

-dg
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Wed, Mar 02, 2011 at 04:20:24PM -0800, bricklen wrote:
> On Wed, Mar 2, 2011 at 3:53 PM, daveg wrote:
> > > > Postgresql version is 8.4.4.
> > >
> > > I don't see how this could be related, but since you're running on NFS, maybe it is, somehow: http://archives.postgresql.org/message-id/4d40ddb7.1010...@credativ.com (for example what if the visibility map fork's last page is overwritten?)
> >
> > Running on ISCSI, not nfs. But it is still a Netapp, so who knows. I'll look. Also, we are not seeing any of the "unexpected data beyond EOF" errors, just thousands per day of the PD_ALL_VISIBLE error.
> >
> > -dg
>
> FWIW, we had a couple occurrences of that message about a month ago on 9.0.2:
>
> http://archives.postgresql.org/pgsql-general/2011-01/msg00887.php
>
> Haven't seen it since we ran a cluster-wide vacuum.

We did a shutdown and restart to clear the buffer cache (but did not reboot the host) and a vacuum on all dbs in the cluster last night. That cleared it up for a couple hours, but we are still getting lots of these messages. Most of them are on pg_statistic, and we create and drop hundreds of thousands of temp tables daily, so there is a good chance there is a concurrency issue.

-dg
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Wed, Mar 02, 2011 at 06:45:13PM -0300, Alvaro Herrera wrote:
> Excerpts from daveg's message of mié mar 02 18:30:34 -0300 2011:
> > After a restart and vacuum of all dbs with no other activity, things were quiet for a couple hours and then we started seeing these PD_ALL_VISIBLE messages again.
> >
> > Going back through the logs, we have been getting these since at least before mid January. Oddly, this only happens on four systems which are all new Dell 32 core Nehalem 512GB machines using iscsi partitions served off a Netapp. Our older 8 core 64GB hosts have never logged any of these errors. I'm not saying it is related to the hw, as these hosts are doing a lot more work than the old hosts, so it may be a concurrency problem that just never came up at lower levels before.
> >
> > Postgresql version is 8.4.4.
>
> I don't see how this could be related, but since you're running on NFS, maybe it is, somehow: http://archives.postgresql.org/message-id/4d40ddb7.1010...@credativ.com (for example what if the visibility map fork's last page is overwritten?)

Running on ISCSI, not nfs. But it is still a Netapp, so who knows. I'll look. Also, we are not seeing any of the "unexpected data beyond EOF" errors, just thousands per day of the PD_ALL_VISIBLE error.

-dg
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Tue, Mar 01, 2011 at 01:20:43PM -0800, daveg wrote:
> On Tue, Mar 01, 2011 at 12:00:54AM +0200, Heikki Linnakangas wrote:
> > On 28.02.2011 23:28, daveg wrote:
> > > On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote:
> > > > We'll likely need to go back and forth a few times with various debugging patches until we get to the heart of this..
> > >
> > > Anything new on this? I'm seeing it on one of my client's production boxes.
> >
> > I haven't heard anything from the OP since.
> >
> > > Also, what is the significance, ie what is the risk or damage potential if this flag is set incorrectly?
> >
> > Sequential scans will honor the flag, so you might see some dead rows incorrectly returned by a sequential scan. That's the only "damage", but an incorrectly set flag could be a sign of something more sinister, like corrupt tuple headers. The flag should never be set incorrectly, so if you see that message you have hit a bug in PostgreSQL, or you have bad hardware.
> >
> > This flag is quite new, so a bug in PostgreSQL is quite possible. If you still have a backup that contains those incorrectly set flags, I'd like to see what the page looks like.
>
> I ran vacuums on all the affected tables last night. I plan to take a downtime to clear the buffer cache and then to run vacuums on all the dbs in the cluster.
>
> Most but not all of the tables involved are catalogs.
>
> However, I could probably pick up your old patch sometime next week if it recurs and send you page images.

After a restart and vacuum of all dbs with no other activity, things were quiet for a couple hours and then we started seeing these PD_ALL_VISIBLE messages again.

Going back through the logs, we have been getting these since at least before mid January. Oddly, this only happens on four systems which are all new Dell 32 core Nehalem 512GB machines using iscsi partitions served off a Netapp. Our older 8 core 64GB hosts have never logged any of these errors. I'm not saying it is related to the hw, as these hosts are doing a lot more work than the old hosts, so it may be a concurrency problem that just never came up at lower levels before.

Postgresql version is 8.4.4.

I'll pick up Heikki's page logging patch and run it for a bit to get some damaged page images. What else could I be doing to track this down?

-dg
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Tue, Mar 01, 2011 at 12:00:54AM +0200, Heikki Linnakangas wrote:
> On 28.02.2011 23:28, daveg wrote:
> > On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote:
> > > We'll likely need to go back and forth a few times with various debugging patches until we get to the heart of this..
> >
> > Anything new on this? I'm seeing it on one of my client's production boxes.
>
> I haven't heard anything from the OP since.
>
> > Also, what is the significance, ie what is the risk or damage potential if this flag is set incorrectly?
>
> Sequential scans will honor the flag, so you might see some dead rows incorrectly returned by a sequential scan. That's the only "damage", but an incorrectly set flag could be a sign of something more sinister, like corrupt tuple headers. The flag should never be set incorrectly, so if you see that message you have hit a bug in PostgreSQL, or you have bad hardware.
>
> This flag is quite new, so a bug in PostgreSQL is quite possible. If you still have a backup that contains those incorrectly set flags, I'd like to see what the page looks like.

I ran vacuums on all the affected tables last night. I plan to take a downtime to clear the buffer cache and then to run vacuums on all the dbs in the cluster.

Most but not all of the tables involved are catalogs.

However, I could probably pick up your old patch sometime next week if it recurs and send you page images.

-dg
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Mon, Feb 28, 2011 at 07:43:39PM -0600, David Christensen wrote:
> On Feb 28, 2011, at 3:28 PM, daveg wrote:
> > Anything new on this? I'm seeing it on one of my client's production boxes. Also, what is the significance, ie what is the risk or damage potential if this flag is set incorrectly?
>
> Was this cluster upgraded to 8.4.4 from 8.4.0? It sounds to me like a known bug in 8.4.0 which was fixed by this commit:
>
> commit 7fc7a7c4d082bfbd579f49e92b046dd51f1faf5f
> Author: Tom Lane
> Date: Mon Aug 24 02:18:32 2009 +
>
>     Fix a violation of WAL coding rules in the recent patch to include an "all tuples visible" flag in heap page headers. The flag update *must* be applied before calling XLogInsert, but heap_update and the tuple moving routines in VACUUM FULL were ignoring this rule. A crash and replay could therefore leave the flag incorrectly set, causing rows to appear visible in seqscans when they should not be. This might explain recent reports of data corruption from Jeff Ross and others.
>
>     In passing, do a bit of editorialization on comments in visibilitymap.c.
>
> oy:postgresql machack$ git describe --tag 7fc7a7c4d082bfbd579f49e92b046dd51f1faf5f
> REL8_4_0-190-g7fc7a7c
>
> If the flag got twiddled while running as 8.4.0, the incorrect PD_ALL_VISIBLE flag would (obviously) not be fixed by the upgrade to 8.4.4. (Is this a separate issue?)

This cluster was installed with 8.4.4, so it is still an existing problem. Also, to my recollection, this cluster has never crashed.

-dg
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote:
> On 12.01.2011 06:21, Fujii Masao wrote:
> > On Sat, Dec 25, 2010 at 2:09 PM, Maxim Boguk wrote:
> > > While trying to create a reproducible test case for BUG #5798 I encountered a very strange effect on two of my servers (both servers have the same hardware platform/OS (freebsd 7.2) and PostgreSQL 8.4.4).
> > >
> > > Very simple test table created as:
> > >
> > > CREATE TABLE test (id integer);
> > > INSERT INTO test select generate_series(0,1);
> > >
> > > And I try repeatable vacuum of that table with the script:
> > >
> > > perl -e "foreach (1..10) {system \"psql -d test -h -c 'vacuum test'\";}"
> > >
> > > And about once per minute (really random intervals; it can be 5 minutes without problems, or 3 vacuums in a row showing the same error) I get the next errors:
> > >
> > > WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "test" page 1
> > > ...
> > > WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "test" page 30
> > >
> > > for all pages of the relation.
>
> Oh, interesting. This is the first time anyone can reliably reproduce that. I can't reproduce it on my laptop with that script, though, so I'm going to need your help to debug this.
>
> Can you compile PostgreSQL with the attached patch, and rerun the test? It will dump the pages with incorrectly set flags to files in /tmp/, and adds a bit more detail in the WARNING. Please run the test until you get those warnings, then tar up the created "/tmp/pageimage*" files and post them along with the warning generated.
>
> We'll likely need to go back and forth a few times with various debugging patches until we get to the heart of this..

Anything new on this? I'm seeing it on one of my client's production boxes. Also, what is the significance, ie what is the risk or damage potential if this flag is set incorrectly?

Thanks

-dg
Re: [HACKERS] PostgreSQL and HugePage
On Thu, Oct 21, 2010 at 08:16:27AM -0700, Mark Wong wrote:
> On Tue, Oct 19, 2010 at 8:30 PM, daveg wrote:
> > On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
> > > On 20/10/10 16:05, Mark Kirkwood wrote:
> > > > shmget and friends are hugetlbpage aware, so it seems it should 'just work'.
> > >
> > > Heh - provided you specify SHM_HUGETLB in the relevant call, that is :-)
> >
> > I had a patch for this against 8.3 that I could update if there is any interest. I suspect it is helpful.
>
> Oh, probably better than me digging up my broken one. Send it out as is if you don't want to update it. :)

I'll update it and see if I can get a largish machine to test on, at least with pgbench. But not today, alas.

-dg
Re: [HACKERS] PostgreSQL and HugePage
On Wed, Oct 20, 2010 at 12:28:25PM -0700, Greg Stark wrote:
> On Wed, Oct 20, 2010 at 12:17 PM, Greg Stark wrote:
> > I don't think it's a big cost once all the processes have been forked if you're reusing them, beyond perhaps slightly more efficient cache usage.
>
> Hm, this site claims to get a 13% win just from the reduced tlb misses using a preload hack with Pg 8.2. That would be pretty substantial.
>
> http://oss.linbit.com/hugetlb/

That was my motivation in trying a patch. TLB misses can be a substantial overhead. I'm not current on the state of play, but working in Sun's benchmark lab on a DB TPC-B benchmark for the first generation of MP systems, something like 30% of all bus traffic was TLB misses. The next iteration of the hardware had a much larger TLB.

I have a client with 512GB memory systems, currently with 128GB configured as postgresql buffer cache. That is 32M TLB entries trying to fit in the few dozen cpu TLB slots. I suspect there may be some contention. I'll benchmark of course.

-dg
Re: [HACKERS] PostgreSQL and HugePage
On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
> On 20/10/10 16:05, Mark Kirkwood wrote:
> > shmget and friends are hugetlbpage aware, so it seems it should 'just work'.
>
> Heh - provided you specify SHM_HUGETLB in the relevant call, that is :-)

I had a patch for this against 8.3 that I could update if there is any interest. I suspect it is helpful.

-dg
Re: [HACKERS] a faster compression algorithm for pg_dump
On Tue, Apr 13, 2010 at 03:03:58PM -0400, Tom Lane wrote:
> Joachim Wieland writes:
> > If we still cannot do this, then what I am asking is: What does the project need to be able to at least link against such a compression algorithm?
>
> Well, what we *really* need is a convincing argument that it's worth taking some risk for. I find that not obvious. You can pipe the output of pg_dump into your-choice-of-compressor, for example, and that gets you the ability to spread the work across multiple CPUs in addition to eliminating legal risk to the PG project. And in any case the general impression seems to be that the main dump-speed bottleneck is on the backend side, not in pg_dump's compression.

My client uses pg_dump -Fc and produces about 700GB of compressed postgresql dump nightly from multiple hosts. They also depend on being able to read and filter the dump catalog. A faster compression algorithm would be a huge benefit for dealing with this volume.

-dg
Re: [HACKERS] TCP keepalive support for libpq
On Tue, Feb 09, 2010 at 09:34:10AM -0500, Andrew Chernow wrote:
> Tollef Fog Heen wrote:
> > (please Cc me on replies, I am not subscribed)
> >
> > Hi,
> >
> > libpq currently does not use TCP keepalives. This is a problem in our case where we have some clients waiting for notifies and then the connection is dropped on the server side. The client never gets the FIN and thinks the connection is up. The attached patch unconditionally adds keepalives. I chose unconditionally as this is what the server does. We didn't need the ability to tune the timeouts, but that could be added with reasonable ease.
>
> ISTM that the default behavior should be keep alives disabled, as it is now, and those wanting it can just set it in their apps:
>
> setsockopt(PQsocket(conn), SOL_SOCKET, SO_KEEPALIVE, ...)

I disagree. I have clients who have problems with leftover client connections due to server host failures. They do not write apps in C. For a non-default change to be effective we would need to have all the client drivers, eg JDBC, psycopg, DBD-DBI, and apps like psql make changes to turn it on. Adding this option as a non-default will not really help.

-dg
Re: [HACKERS] pg_dump enhancement proposal
On Thu, Nov 12, 2009 at 04:31:37PM -0500, Tom Lane wrote: > Mark Hammonds writes: > > 2. Custom Query Exports > > > In my use of mysqldump, I found one feature very useful: the ability > > to execute a custom SELECT. . .WHERE statement and then dump only the > > results. This feature currently provides MySQL users with the ability > > to quickly and easily export very granular data subsets, and I see no > > reason why PostgreSQL users wouldn't benefit from the same capability. > > While it is true that this functionality can already be achieved in > > PostgreSQL using Copy, it seems to me that it would logically fit well > > as an extension to pg_dump, especially since many beginning and even > > some intermediate PostgreSQL users aren't aware of the alternatives. > > As you say, we already have this using COPY, and I don't agree that > it would be a good idea to plaster it into pg_dump as well. pg_dump > is intended for dumping and restoring data, not for ETL-type tasks. > Furthermore, pg_dump is a overly complex beast already --- much more > so than one could wish, for a tool that is absolutely fundamental to > database reliability. Putting requirements on it that are well outside > its charter seems like a short route to maintenance disaster. > > There has been some occasional chatter about developing one or more > tools focused on ETL rather than dump/restore, and my thought is that > this idea would fit better there. An ETL tool would not have the > kind of requirements pg_dump has for coping with multiple server > versions and knowing everything there is to know about database > contents, so it seems like it could address new areas of functionality > without a complexity explosion. > > You might want to check the archives for previous discussions --- > I think the last go-round started with someone wanting to add an > arbitrary WHERE filter to pg_dump dumps. Sorry I missed this thread. 
Not only has there been previous discussion, there have been at least two, and I seem to recall three, patches implementing this. None of the patches was very large, and none of them impacted the basic "make a backup" paths in pg_dump. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
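For readers of the archives: the COPY route Tom mentions already accepts an arbitrary query, so a filtered export needs no pg_dump support at all. A small sketch that just composes such a statement (the table name and filter are invented examples; the caller must trust its inputs):

```python
# COPY can wrap an arbitrary SELECT, which is the existing way to export a
# granular data subset. "orders" and the WHERE clause are made-up examples.
def filtered_copy(table: str, where: str) -> str:
    # Caller is responsible for ensuring table/where are trusted SQL.
    return f"COPY (SELECT * FROM {table} WHERE {where}) TO STDOUT WITH CSV HEADER"

stmt = filtered_copy("orders", "created_at >= '2009-01-01'")
print(stmt)
```

Run through psql, e.g. `psql -d mydb -c "$stmt" > subset.csv`.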
Re: [HACKERS] EOL for 7.4?
On Fri, Nov 13, 2009 at 02:47:56AM +, Greg Stark wrote: > On Fri, Nov 13, 2009 at 2:35 AM, daveg wrote: > > I suggest we announce now that both 7.4 and 8.0 will EOL when 8.5 is > > expected > > to ship, or to comfort those who never use .0 versions when 8.5.1 ships. > > What would this mean? How would it be different than the status quo? I suppose it would mean posting periodic prominent notices, moving the sources to the OLD directory, that sort of thing. I thought that was the topic of this thread? -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EOL for 7.4?
On Fri, Nov 13, 2009 at 02:22:01AM +, Greg Stark wrote: > > Really I think you guys are on the wrong track trying to map Postgres > releases to commercial support terms. None of the Postgres releases > are "supported" in the sense that there's no warranty and no promises, > it's all best effort. If you want a promise of anything then pay > someone for that service. > > As with any open source software if you're running 7-year-old versions > of the software you can't seriously expect the developers to take any > interest in bugs you discover which don't affect current releases. > Other projects don't release back branches at all. The most the > developers are likely to do if your bugs require serious engineering > is declare that the version you're using is too old. Claiming to support versions that are "too old" is giving users a false sense of comfort. Encouraging users to use these versions is actually harming them as when this happens they will be stuck with either living with the bug or doing an immediate unplanned upgrade. I suggest we announce now that both 7.4 and 8.0 will EOL when 8.5 is expected to ship, or to comfort those who never use .0 versions when 8.5.1 ships. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EOL for 7.4?
On Tue, Nov 03, 2009 at 10:32:17AM -0800, Josh Berkus wrote: > So I'm going to make a case in favor of EOL'ing 7.4. In fact, I'd be in > favor of doing so in, say, February after an announcement this month. > > The main reason I'm in favor of this is that we have a lot of users > using 7.4 out of inertia, and they need a message that 7.4 is "not > supported" to get them to upgrade. I can think of several here in SF > who have been "working on upgrade plans" for the past 3 years. An EOL > is what's needed to give them a kick in the pants. > > The same goes for other OSS projects. There's quite a few random OSS > apps which were created on PG 7.4 and have never offered their users an > upgrade path (Gnuworld comes to mind). They need an EOL announcement to > get them motivated to upgrade. +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres server goes in recovery mode repeteadly
On Fri, Oct 02, 2009 at 07:57:13PM -0700, daveg wrote: > On Fri, Oct 02, 2009 at 10:41:07AM -0400, Alvaro Herrera wrote: > > daveg escribió: > > > > > I work with Kunal and have been looking into this. It appears to be the > > > same > > > as the bug described in: > > > > > > http://archives.postgresql.org/pgsql-bugs/2009-09/msg00355.php > > > > > > as I have localized it to a NULL pointer deference in > > > RelationCacheInitializePhase2() as well. Tom speculates in: > > > > > > http://archives.postgresql.org/pgsql-bugs/2009-09/msg00372.php > > > > > > that large numbers of table drops might trigger this. The system in > > > question > > > creates and drops temp tables at a high rate which tends to confirm this. > > > > Did you test the patch posted by Tom? > > We are testing it since last night in our test environment. If it does not > break anything (unlikely) we will deploy it next week. However, since the > problem is only occasional, only happens every few days on one of 50+ hosts, > it will take some extended time without further segfaults to say anything > confident about the patches effectiveness. We have had this deployed in our test and production environments for a couple weeks now. We have not seen any further instance of the problem. Without the patch, we would have expected to see at least a few by now. So the patch appears to be effective. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Application name patch - v2
On Tue, Oct 20, 2009 at 12:16:42PM -0400, Tom Lane wrote: > Magnus Hagander writes: > > Also, how many platforms can't we do this on? If we have BSD and > > Windows covered already. on linux, I believe you can easily read it > > out of /proc/self/cmdline, no? > > Writing a pile of platform-specific code for this is simply insane from > a support point of view. The feature is NOT worth it. Especially not > since the typical result will be something quite uninformative like > "psql" or "java". The cases that are actually useful are the ones where > the application sets it. I don't think we should have a default at all > --- you don't set it, you don't get a name. -1 I'd like a default, especially for psql, to help identify interactive sessions. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Application name patch - v2
On Mon, Oct 19, 2009 at 01:00:28PM +0100, Dave Page wrote: > On Mon, Oct 19, 2009 at 12:57 PM, Pavel Stehule > wrote: > > It is not practical. I'll log errors. Usually SQL injection generates > > lot of errors. Loging all statements has not sense. What is difference > > bad and good SQL statement.? Maybe multistatements are good candidates > > for log as possible attackers statements. On highly load databases > > loging all statements significantly increase load :( > > Ahh, I see. > > >> My point is, that the query to change the app name is logged using the > >> *original* app name, thus it will not be discarded by the log analysis > >> tools in your scenario. > >> > > > > I thing, so change of original name should generate warning. > > Well, if other people think that's necessary, it's certainly possible. I have clients working around the lack of this feature by simply prepending a single line comment to their sql in the application to supply the app name. eg: -- monthly_report monthly_process.py:524 select wev from foo; This feature would be very handy, but not if it requires special permission to use it. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
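The comment-prepend workaround described above is trivial to wrap in a helper (the helper name is invented):

```python
# Tag each statement with an application/source marker, as in the workaround
# above, so log analysis tools can attribute queries to their origin.
def tag_query(sql: str, app: str, source: str) -> str:
    return f"-- {app} {source}\n{sql}"

q = tag_query("select wev from foo;", "monthly_report", "monthly_process.py:524")
print(q)
```

The tag survives in the server log alongside the statement, which is the whole point; a real application_name setting would carry the same information without polluting the SQL text.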
Re: [HACKERS] Deprecation
On Sat, Oct 17, 2009 at 03:01:27PM -0400, Tom Lane wrote: > Bruce Momjian writes: > > Sounds like a good reason to remove add_missing_from in 8.5. > > Seems like the general consensus is that it's okay to do that. > I will go make it happen unless somebody squawks pretty soon... > > regards, tom lane +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres server goes in recovery mode repeteadly
On Fri, Oct 02, 2009 at 10:41:07AM -0400, Alvaro Herrera wrote: > daveg escribió: > > > I work with Kunal and have been looking into this. It appears to be the same > > as the bug described in: > > > > http://archives.postgresql.org/pgsql-bugs/2009-09/msg00355.php > > > > as I have localized it to a NULL pointer dereference in > > RelationCacheInitializePhase2() as well. Tom speculates in: > > > > http://archives.postgresql.org/pgsql-bugs/2009-09/msg00372.php > > > > that large numbers of table drops might trigger this. The system in question > > creates and drops temp tables at a high rate, which tends to confirm this. > > Did you test the patch posted by Tom? We have been testing it since last night in our test environment. If it does not break anything (unlikely) we will deploy it next week. However, since the problem is only occasional, happening only every few days on one of 50+ hosts, it will take some extended time without further segfaults to say anything confident about the patch's effectiveness. -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects.
Re: [HACKERS] Postgres server goes in recovery mode repeteadly
On Tue, Sep 29, 2009 at 09:52:06PM +0530, kunal sharma wrote: > Hi , > We are using Postgres 8.4 and its been found going into recovery > mode couple of times. The server process seems to fork another child process > which is another postgres server running under same data directory and after > some time it goes away while the old server is still running. There were few > load issues on the server but the load didnt went above "32". > >We are running opensuse 10.2 x86_64 with 32Gb of physical memory. > Checking the logs I found that theres a segmentation fault , > > > Sep 26 05:39:54 pace kernel: postgres[28694]: segfault at 0030 > rip 0066ba8c rsp 7fffd364da30 error 4 > > gdb dump shows this > > Reading symbols from /lib64/libdl.so.2...done. > Loaded symbols for /lib64/libdl.so.2 > Reading symbols from /lib64/libm.so.6...done. > Loaded symbols for /lib64/libm.so.6 > Reading symbols from /lib64/libc.so.6...done. > Loaded symbols for /lib64/libc.so.6 > Reading symbols from /lib64/ld-linux-x86-64.so.2...done. > Loaded symbols for /lib64/ld-linux-x86-64.so.2 > Reading symbols from /lib64/libnss_files.so.2...done. > Loaded symbols for /lib64/libnss_files.so.2 > 0x2ad6d7b8c2b3 in __select_nocancel () from /lib64/libc.so.6 > (gdb) > > Any suggestions what is causing this segmentation fault? I work with Kunal and have been looking into this. It appears to be the same as the bug described in: http://archives.postgresql.org/pgsql-bugs/2009-09/msg00355.php as I have localized it to a NULL pointer dereference in RelationCacheInitializePhase2() as well. Tom speculates in: http://archives.postgresql.org/pgsql-bugs/2009-09/msg00372.php that large numbers of table drops might trigger this. The system in question creates and drops temp tables at a high rate, which tends to confirm this. -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects.
Re: [HACKERS] Limit allocated memory per session
On Thu, Oct 01, 2009 at 11:47:43AM -0400, Tom Lane wrote: > Euler Taveira de Oliveira writes: > > Tom Lane escreveu: > >> daveg writes: > >>> I'd like to propose adding a new GUC to limit the amount of memory a > >>> backend > >>> can allocate for its own use. > >> > >> Use ulimit. > Seriously, the proposed patch introduces overhead into a place that is > already a known hot spot, in return for not much of anything. It will The overhead is simply an integer addition and compare with values that are likely already in processor caches. And this only occurs when we actually call malloc() to get a new block, not on every palloc. So I suspect it will not be noticeable. However, I welcome any suggestion on how to test this and actually measure the overhead, if any. pgbench? Something else? > *not* bound backend memory use very accurately, because there is no way > to track raw malloc() calls. And I think that 99% of users will > not find it useful. The use case that motivated this is a client that runs many PostgreSQL instances with a mostly batch/large-query workload. Some of the queries are code generated by an application and can be very complex. A few times a month one of these will run through 64GB of memory and OOM the host. So it seriously hurts production. Setting work_mem low enough to prevent this results in poor query performance. This client does not use any outside libraries that call malloc() directly. -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects.
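The accounting in the attached patch boils down to: add the block size on each malloc-backed allocation, subtract it on free, and error out when a configured ceiling (in kB, 0 meaning no limit, as with the proposed GUC) is crossed. A toy restatement of that logic:

```python
# Toy model of the patch's AllocBlock accounting: a running byte total,
# bumped on block allocation, reduced on free, checked against a kB limit.
class MemoryLimitExceeded(Exception):
    pass

class BlockAccounting:
    def __init__(self, limit_kb: int = 0):  # 0 means "no limit", as in the GUC
        self.limit_kb = limit_kb
        self.used = 0

    def alloc(self, nbytes: int) -> None:
        self.used += nbytes
        if self.limit_kb and self.used // 1024 > self.limit_kb:
            raise MemoryLimitExceeded(f"memory limit exceeded at {self.used} bytes")

    def free(self, nbytes: int) -> None:
        self.used -= nbytes

acct = BlockAccounting(limit_kb=64)
acct.alloc(32 * 1024)
acct.free(16 * 1024)
try:
    acct.alloc(64 * 1024)  # pushes the running total past 64 kB
except MemoryLimitExceeded as e:
    print(e)
```

As noted above, the check only fires on block allocation (malloc), not on every palloc, which is why the expected overhead is an addition and a compare on an already-cached value.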
Re: [HACKERS] Limit allocated memory per session
On Thu, Oct 01, 2009 at 10:35:55AM -0400, Tom Lane wrote: > daveg writes: > > I'd like to propose adding a new GUC to limit the amount of memory a backend > > can allocate for its own use. > > Use ulimit. That was my initial thought too. However, ulimit() is documented as superseded by setrlimit(), which has the option RLIMIT_DATA to limit the size of the data segment. Perfect! Except RLIMIT_DATA does not appear to work on Linux. The call succeeds and the new value can even be read back with getrlimit(), but it does not seem to do anything to actually limit the memory allocated. I tested this on SuSE 11 (kernel 2.6.25) and Ubuntu Intrepid (kernel 2.6.28). Setting RLIMIT_AS to limit the total address space for a process works as expected. However, this seems undesirable for PostgreSQL as it can also cause stack expansion to fail, which would then force a general restart. Also, this limit would interact with the buffer cache size setting, as it includes the shared address space as well. -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects.
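The set-and-read-back behavior described above is easy to observe from Python's binding of the same syscalls. The sketch only re-applies the current limits (actually lowering RLIMIT_DATA in-process could break the running interpreter), so it shows the call succeeding and reading back, not whether the kernel enforces it:

```python
# setrlimit/getrlimit round trip for RLIMIT_DATA. On the kernels tested in
# the mail above, the value read back fine but allocations were not limited.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
# Re-apply the current limits: the call succeeds and reads back unchanged.
resource.setrlimit(resource.RLIMIT_DATA, (soft, hard))
assert resource.getrlimit(resource.RLIMIT_DATA) == (soft, hard)
print("RLIMIT_DATA soft/hard:", soft, hard)
```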
[HACKERS] Limit allocated memory per session
I'd like to propose adding a new GUC to limit the amount of memory a backend can allocate for its own use. The problem this addresses is that sometimes one needs to set work_mem fairly high to get good query plans for large joins. However, some complex queries will then use huge amounts of memory so that one or a few of them will consume all the memory on the host and run it deep into swap or trigger the oom killer or worse. I've attached a patch based on 8.4.1. It works by keeping a track of the total memory allocated via malloc to AllocBlocks (aset.c). If this is not shot down/up too badly I will rebase it on CVS and submit it for the next commit fest. I would also like to propose a similar limit on temp space use. It is quite easy for an unintended cartesion product to use hundreds of gigabytes of scratch space and cause other processes to fail due to lack of disk space. If this is not objectionable, I'll work on it too. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. *** ./src/include/utils/memutils.h.orig 2009-09-30 01:54:36.0 -0700 --- ./src/include/utils/memutils.h 2009-09-30 03:33:44.0 -0700 *** *** 114,119 --- 114,122 */ /* aset.c */ + + extern int max_allocated_mem; + extern MemoryContext AllocSetContextCreate(MemoryContext parent, const char *name, Size minContextSize, *** ./src/backend/utils/mmgr/aset.c.orig2009-09-29 16:14:23.0 -0700 --- ./src/backend/utils/mmgr/aset.c 2009-10-01 03:07:34.0 -0700 *** *** 168,173 --- 168,187 } AllocBlockData; /* + * AllocBlock accounting maintains total allocated memory to enforce the memory use limit. 
+ */ + int max_allocated_mem = 0; + Size AllocBlockAccountingMemUsed = 0; + + #define AllocBlockAccountingFree(block) \ + (AllocBlockAccountingMemUsed -= block->endptr - (char *) (block)) + #define AllocBlockAccountingAlloc(block) \ +(AllocBlockAccountingMemUsed += block->endptr - (char *) (block)) + #define AllocBlockAccountingOverLimit() \ + (max_allocated_mem != 0 \ +&& AllocBlockAccountingMemUsed / 1024 > max_allocated_mem) + + /* * AllocChunk *The prefix of each piece of memory in an AllocBlock * *** *** 393,398 --- 407,423 context->blocks = block; /* Mark block as not to be released at reset time */ context->keeper = block; + + AllocBlockAccountingAlloc(block); + if (AllocBlockAccountingOverLimit()) + { + MemoryContextStats(TopMemoryContext); + ereport(ERROR, + (errcode(ERRCODE_OUT_OF_MEMORY), +errmsg("memory limit exceeded"), +errdetail("Failed while creating memory context \"%s\".", + name))); + } } context->isReset = true; *** *** 476,481 --- 501,507 else { /* Normal case, release the block */ + AllocBlockAccountingFree(block); #ifdef CLOBBER_FREED_MEMORY /* Wipe freed memory for debugging purposes */ memset(block, 0x7F, block->freeptr - ((char *) block)); *** *** 521,526 --- 547,553 { AllocBlock next = block->next; + AllocBlockAccountingFree(block); #ifdef CLOBBER_FREED_MEMORY /* Wipe freed memory for debugging purposes */ memset(block, 0x7F, block->freeptr - ((char *) block)); *** *** 597,602 --- 624,640 set->blocks = block; } + AllocBlockAccountingAlloc(block); + if (AllocBlockAccountingOverLimit()) + { + MemoryContextStats(TopMemoryContext); + ereport(ERROR, + (errcode(ERRCODE_OUT_OF_MEMORY), +errmsg("memory limit exceeded"), +errdetail("Failed on request of size %lu.", + (unsigned long) size))); + } + set->isReset = false; AllocAllocInfo(set, chunk); *** *** 767,772 --- 805,821 block->next = set->blocks; set->blocks = block; + + AllocBlockAccountingAlloc(block); + i
Re: [HACKERS] "BEGIN TRANSACTION" and "START TRANSACTION": different error handling
On Thu, Sep 24, 2009 at 12:16:43PM +0300, Hannu Krosing wrote: > "I expect the transaction is aborted and rollback is executed > automatically." - this is not how postgreSQL behaves. PostgreSQL needs > an explicit end of transaction from client, either COMMIT; or ROLLBACK; > > when run from psql, they both act the same, except the string returned > > hannu=# begin transaction; > BEGIN > hannu=# select 1/0; > ERROR: division by zero > hannu=# select 1/0; > ERROR: current transaction is aborted, commands ignored until end of > transaction block > hannu=# abort; > ROLLBACK > hannu=# start transaction; > START TRANSACTION > hannu=# select 1/0; > ERROR: division by zero > hannu=# select 1/0; > ERROR: current transaction is aborted, commands ignored until end of > transaction block > hannu=# abort; > ROLLBACK > > I suspect, that psqlodbc is the one doing the automatic rollback and it > seems to rely on reply "BEGIN" to establish an in-transaction state. > > so when "start transaction;" returns "START TRANSACTION" instead of > "BEGIN", psqlodbc does not realise that it is in transaction and does > not initiate the automatic rollback. Well, I'd always thought BEGIN and START TRANSACTION were syntactic Aspartame and had the same underlying implementation. So this is a surprise. Why do they return a different status? -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects.
Re: [HACKERS] Adding \ev view editor?
On Mon, Sep 21, 2009 at 02:26:05PM -0400, Andrew Dunstan wrote: >andrew=# select pg_get_viewdef('foo',true); >pg_get_viewdef >-- > SELECT 'a'::text AS b, > ( SELECT 1 >FROM dual) AS x, > random() AS y, > CASE > WHEN true THEN 1 > ELSE 0 > END AS c, > 1 AS d >FROM dual; >(1 row) +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] numeric_to_number() function skipping some digits
On Tue, Sep 22, 2009 at 10:27:19AM +0530, Jeevan Chalke wrote: > > It seems that Oracle reads formatting string from right-to-left. Here are > few results: > ('number','format') ==> Oracle | PG > ('34,50','999,99') ==> 3450 | 340 > ('34,50','99,99') ==> 3450 | 3450 > ('34,50','99,999') ==> Invalid Number | 3450 > ('34,50','999,999') ==> Invalid Number | 340 It seems worse to give a wrong answer silently than to throw an error. What we do now seems sort of MySqlish. -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects.
Re: [HACKERS] happy birthday Tom Lane ...
On Fri, Sep 18, 2009 at 01:04:23PM +0200, Hans-Juergen Schoenig -- PostgreSQL wrote: > Tom, > > On behalf of the entire PostgreSQL team here in Austria I want to wish > you a happy birthday. > We hope that you fill be a vital part of PostgreSQL for many years to come. > > Best regards, > > Hans-Jürgen Schönig + team +1 from me too. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RfD: more powerful "any" types
On Tue, Sep 15, 2009 at 07:38:18AM +0200, Pavel Stehule wrote: > it isn't fair :) why you use $$ without single quote? And still this > case should be vulnerable on SQL injection. Maybe you or me knows, > what SQL injection means, but beginners knows nothing and this people > use following bad code: > > sql := $$SELECT * FROM '${table_name}'$$} and are happy. But this code > is wrong! I have an idea you will like less: have multiple interpolation codes that automagically do the right quoting. Perhaps as extra printf like type codes. The above then becomes: sql := pgprintf($$SELECT * FROM %I;$$, table_name ) Where %I evaluates as if it were quote_ident(%s). This would maybe even encourage users to do the quoting they should by making it easy. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
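A toy model of the proposed pgprintf, to make the idea concrete. The name and the %I code come from the mail above; the implementation is invented, handles only %I and %s, and ignores edge cases such as %%:

```python
# %I quotes its argument as an SQL identifier (doubling embedded quotes,
# in the manner of quote_ident); %s substitutes the value as-is.
def quote_ident(name: str) -> str:
    return '"' + name.replace('"', '""') + '"'

def pgprintf(fmt: str, *args) -> str:
    parts = fmt.split('%')
    out, values = [parts[0]], iter(args)
    for part in parts[1:]:
        code, rest = part[0], part[1:]
        if code == 'I':
            out.append(quote_ident(next(values)))
        elif code == 's':
            out.append(str(next(values)))
        else:
            raise ValueError(f"unknown format code %{code}")
        out.append(rest)
    return ''.join(out)

print(pgprintf('SELECT * FROM %I;', 'my "odd" table'))
```

Later PostgreSQL releases in fact added a format() function with %s/%I/%L codes along exactly these lines, so the "make correct quoting the easy path" argument eventually carried.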
Re: [HACKERS] Ragged CSV import
On Fri, Sep 11, 2009 at 10:27:06AM +0200, Dimitri Fontaine wrote: > Maybe instead of opening FROM for COPY, having it accepted in WITH would > be better, the same way (from the user point of view) that DML returning > are worked on. ... > WITH csv AS ( >COPY t FROM stdin CSV > ) > INSERT INTO foo(x, y, z) > SELECT t[3], t[2], mytimestamptz([5], [6], [7]) >FROM csv; > > Now the text[] has a strange feeling, without it it'd be: > > WITH csv AS ( >COPY t(a, b, c, d, e, f, g) >FROM stdin > CSV IGNORING EXTRA COLUMNS -- random nice syntax > MISSING COLUMNS DEFAULTS NULL -- that needs some reality check > ) > INSERT INTO foo(x, y, z) > SELECT c, b, mytimestamptz(e, f, g) >FROM csv; > > The function mytimestamptz(date text, time text, timezone text) will > accept input that PostgreSQL input types would have errored out on... so > you can process in one go strange formats from other products. +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
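The two ragged-row rules proposed above (ignore extra columns, default missing columns to NULL) are easy to state precisely. A sketch with Python's csv module, using invented sample input:

```python
# "IGNORING EXTRA COLUMNS" drops trailing fields; "MISSING COLUMNS
# DEFAULTS NULL" pads short rows with None, the Python stand-in for NULL.
import csv
import io

raw = io.StringIO("a,b,c\n1,2\n1,2,3,4\n")  # one short row, one long row
ncols = 3
rows = []
for row in csv.reader(raw):
    row = row[:ncols] + [None] * (ncols - len(row))
    rows.append(row)

print(rows)
```

Every row comes out exactly ncols wide, which is the normalization the proposed COPY options would perform before the data reaches the SELECT.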
Re: [HACKERS] RfD: more powerful "any" types
On Fri, Sep 11, 2009 at 11:43:32AM -0400, Merlin Moncure wrote: > > If you are going to use printf format codes, which is good and useful > being something of a standard, I'd call routine printf (not format) > and actually wrap vsnprintf. The format codes in printf have a very > specific meaning: converting native C types to arrays of characters. > I think that a postgresql implementation should do exactly that: > attempt to convert the passed in datum to the c type in question if > possible (erroring if no cast exists) and then pass it down. The idea > is we are not adding new formatting routines but using a very high > quality existing one...why reinvent the wheel? > > so if you did: select printf('%s %3.1f', foo::box, bar::circle); > the box to char* cast would work (using the text cast) but the second > cast would fail unless the user added a cast to float. The code in > question is easy to imagine...parse the format string, and loop the > varargs using the appropriate looked up cast one by one... +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove flatfiles.c
On Thu, Sep 03, 2009 at 07:57:25PM -0400, Andrew Dunstan wrote: > daveg wrote: > >On Tue, Sep 01, 2009 at 07:42:56PM -0400, Tom Lane wrote: > >>I'm having a hard time believing that VACUUM FULL really has any > >>interesting use-case anymore. > > > >I have a client who uses temp tables heavily, hundreds of thousands of > >creates > >and drops per day. They also have long running queries. The only thing that > >keeps catalog bloat somewhat in check is vacuum full on bloated catalogs > >a few times a day. Without that pg_class, pg_attribute etc quickly balloon > >to thousands of pages. > > That's a rate of more than one create and drop per second. How does your > client handle the fact that VACUUM FULL will exclusively lock those > catalog tables? Without knowing more, it looks like a bit of a design issue. I'd say it is several per second. They wait for the catalog locks sometimes. This is not an interactive application so that is somewhat acceptable. It also occasionally causes deadlocks which is less agreeable. There are various reasons for the heavy use of temps, mainly having to do with loading external feeds or reusing intermediate query results in a series of queries. It would be great if there was a way to have temp tables that did not get cataloged, eg local cache only. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove flatfiles.c
On Tue, Sep 01, 2009 at 07:42:56PM -0400, Tom Lane wrote: > Greg Stark writes: > > On Wed, Sep 2, 2009 at 12:01 AM, Alvaro > > Herrera wrote: > >>> The use cases where VACUUM FULL wins currently are where storing two > >>> copies of the table and its indexes concurrently just isn't practical. > >> > >> Yeah, but then do you really need to use VACUUM FULL? If that's really > >> a problem then there ain't that many dead tuples around. > > > That's what I want to believe. But picture if you have, say a > > 1-terabyte table which is 50% dead tuples and you don't have a spare > > 1-terabytes to rewrite the whole table. > > But trying to VACUUM FULL that table is going to be horridly painful > too, and you'll still have bloated indexes afterwards. You might as > well just live with the 50% waste, especially since if you did a > full-table update once you'll probably do it again sometime. > > I'm having a hard time believing that VACUUM FULL really has any > interesting use-case anymore. I have a client who uses temp tables heavily, hundreds of thousands of creates and drops per day. They also have long running queries. The only thing that keeps catalog bloat somewhat in check is vacuum full on bloated catalogs a few times a day. Without that pg_class, pg_attribute etc quickly balloon to thousands of pages. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add YAML option to explain
On Mon, Aug 31, 2009 at 02:15:08PM -, Greg Sabino Mullane wrote: > > Greg, can we see a few examples of the YAML output > > compared to both json and text? ... > greg=# explain (format json, analyze on) select * from pg_class where relname > ~ 'x' order by 1,2,3; > QUERY PLAN > - --- An interesting property of json, it is almost exactly the same as python data structure syntax. If I paste the following into python: plan = [ { "Plan": { "Node Type": "Sort", "Startup Cost": 12.82, "Total Cost": 13.10, "Plan Rows": 111, "Plan Width": 185, "Actual Startup Time": 1.152, "Actual Total Time": 1.373, "Actual Rows": 105, "Actual Loops": 1, "Sort Key": ["relname", "relnamespace", "reltype"], "Sort Method": "quicksort", "Sort Space Used": 44, "Sort Space Type": "Memory", "Plans": [ { "Node Type": "Seq Scan", "Parent Relationship": "Outer", "Relation Name": "pg_class", "Alias": "pg_class", "Startup Cost": 0.00, "Total Cost": 9.05, "Plan Rows": 111, "Plan Width": 185, "Actual Startup Time": 0.067, "Actual Total Time": 0.817, "Actual Rows": 105, "Actual Loops": 1, "Filter": "(relname ~ 'x'::text)" } ] }, "Triggers": [ ], "Total Runtime": 1.649 } ] I get a python data structure. 
Which can be manipulated directly, or pretty printed: >>> import pprint >>> pprint.pprint(plan) [{'Plan': {'Actual Loops': 1, 'Actual Rows': 105, 'Actual Startup Time': 1.1519, 'Actual Total Time': 1.373, 'Node Type': 'Sort', 'Plan Rows': 111, 'Plan Width': 185, 'Plans': [{'Actual Loops': 1, 'Actual Rows': 105, 'Actual Startup Time': 0.067004, 'Actual Total Time': 0.81695, 'Alias': 'pg_class', 'Filter': "(relname ~ 'x'::text)", 'Node Type': 'Seq Scan', 'Parent Relationship': 'Outer', 'Plan Rows': 111, 'Plan Width': 185, 'Relation Name': 'pg_class', 'Startup Cost': 0.0, 'Total Cost': 9.0507}], 'Sort Key': ['relname', 'relnamespace', 'reltype'], 'Sort Method': 'quicksort', 'Sort Space Type': 'Memory', 'Sort Space Used': 44, 'Startup Cost': 12.82, 'Total Cost': 13.1}, 'Total Runtime': 1.649, 'Triggers': []}] I'm not sure if all json can be read this way, but the python and json notations are very similar. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
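On the "I'm not sure if all json can be read this way" point: pasting JSON directly as Python source only works while the document avoids the literals the two notations disagree on (JSON's true/false/null versus Python's True/False/None). The stdlib json module handles every case, so it is the robust route. A minimal sketch, using a plan abbreviated from the output above:

```python
import json

# EXPLAIN (FORMAT JSON) output, abbreviated from the plan shown above.
plan_json = '''
[ { "Plan": { "Node Type": "Sort",
              "Startup Cost": 12.82,
              "Sort Key": ["relname", "relnamespace", "reltype"],
              "Plans": [ { "Node Type": "Seq Scan",
                           "Relation Name": "pg_class",
                           "Filter": "(relname ~ 'x'::text)" } ] },
    "Triggers": [ ],
    "Total Runtime": 1.649 } ]
'''

plan = json.loads(plan_json)
print(plan[0]["Plan"]["Plans"][0]["Relation Name"])  # -> pg_class

# The literals where the two notations diverge: evaluating these as
# Python source would raise NameError, but json.loads parses them fine.
print(json.loads('[true, false, null]'))  # -> [True, False, None]
```

So the paste-into-python trick happens to work for EXPLAIN output only because the plans contain no bare true/false/null at the points where it would matter; json.loads works unconditionally.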
Re: [HACKERS] Add YAML option to explain
On Fri, Aug 28, 2009 at 04:37:41PM -0700, David E. Wheeler wrote: > On Aug 28, 2009, at 3:45 PM, Stephen Frost wrote: > > >+1 from me. I've read the other comments and just plain don't agree > >with them. It's a small patch, adds a useful format for EXPLAIN, and > >would be used. > > > >One of the best things about PG is the flexibility and usability. > > I agree, I tend to prefer YAML output where it's parseable (and I > expect it the EXPLAIN YAML output won't be doing anything tricky). > > That said, maybe there should be a way to create modules add formats, > instead of adding them to core? +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 8.5 release timetable, again
On Thu, Aug 27, 2009 at 08:02:03PM -0700, Ron Mayer wrote: > Andrew Dunstan wrote: > > I don't know of anyone who is likely to want to try out alphas in their > > normal development environments. The client I approached was > > specifically prepared to test beta releases that way. > > Perhaps end-users won't, but I think companies who develop software that > works on top of postgres will. Perhaps to make sure their existing software > continues to work; or perhaps to get a head start working with new features. > I test against CVS-head occasionally. I've been trying to help a client take up new versions of postgresql more quickly as the performance or feature content is often very valuable to them. Accordingly, I have encouraged them to run periodic samples of the nightly snapshots on at least one development instance, and to run the betas in the test environment. The goal is to be confident on the day of the postgresql release that we have tested enough and fixed any incompatibilities so that, if they so choose, they could migrate production to the new release immediately. This way they get several months extra benefit from improvements to postgresql. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 8.5 release timetable, again
On Thu, Aug 27, 2009 at 09:38:15PM +0200, Dimitri Fontaine wrote: > Exactly, and I think that what we're missing here is a simple tool for > our users to check a new PostgreSQL release against their existing > application. > > We already know how to either log all queries and analyze the log files > (CSV makes it easier, pgfouine parses them too) or to have a fe/be > protocol proxy to record application SQL traffic (tsung recorder does > that). > > What we miss is a tool to run the captured queries through both versions > of PG and report any resultset mismatch, of course with a way to account > for ordering issues (but we've seen people rely on the ordering when > they don't give an order by clause, then bug the lists about it if a new > release changes it). This would be very useful. I often am asked "how much better will the new release run our apps" as part of convincing a client to upgrade to a more current postgresql release. Being able to replay a days workload in a somewhat realistic manner would be a great help. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GRANT ON ALL IN schema
On Sun, Aug 16, 2009 at 02:59:53PM +0200, Pavel Stehule wrote: > 2009/8/16 Peter Eisentraut : > > On sön, 2009-08-16 at 00:04 -0400, Andrew Dunstan wrote: > >> SQL is not Lisp. Simple is good. I didn't think Peter was really very > >> serious. > > > > Well, I don't know if we really need to call it "lambda", but I fully > > expect to be able to use these "ad hoc functions" as part of other > > expressions. So making DO or whatever a top-level command that does not > > integrate with anything else would not really satisfy me. > > +1 +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GRANT ON ALL IN schema
On Sat, Aug 15, 2009 at 11:34:04PM +0200, Dimitri Fontaine wrote: > Nitpicking dept, I think I prefer: > > DO [ [LANGUAGE] language] $$ ... $$; > DO plperl $$ ... $$; > DO language plpython $$ ... $$; > > language is optional and defaults to plpgsql. +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Alpha releases: How to tag
On Fri, Aug 07, 2009 at 06:28:34PM -0400, Tom Lane wrote: > David Fetter writes: > > I am not suggesting that this change be immediate, and it's not ivory > > tower. It's just how everybody else does it. > > You keep saying that, and it's completely meaningless. What do you know > about the development practices of Oracle, or DB2, or even Mysql? When I was at Sybase, any change to the on-disk structure was required to come with code to do the migration. Nonetheless, at release time, the migrate process was almost always discovered to be broken, sometimes even before it was shipped to customers. Of course, Sybase implemented its own complete filesystem layer on top of raw partitions, so there was more scope to go wrong, especially since it was possible to corrupt the on-disk structure in subtle ways that would not be discovered in normal operation but that would cause migration to corrupt it still further. In fairness, this is a very difficult problem to solve well and I expect to rely on dump/load migrations for quite some time. -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Revise parallel pg_restore's scheduling heuristic
On Mon, Aug 03, 2009 at 11:21:43AM -0400, Tom Lane wrote: > "Kevin Grittner" writes: > > Over the weekend I ran 40 restores of Milwaukee County's production > > data using Friday's snapshot with and without the patch. I alternated > > between patched and unpatched. It appears that this latest version is > > slightly slower for our production database on the same machine and > > configuration where the previous patch appeared to be 1% to 2% faster > > than unpatched (although I had fewer samples of that). > > I think we can conclude that for this particular test case, the effects > of the patch are pretty much masked by noise. I definitely see no way > that the latest version of the patch could really be slower than the > original; it has the same job-scheduling behavior and strictly less > list-munging overhead. Now the patch could be slower than unpatched > as a result of different job-scheduling behavior ... but there's no > evidence here of a consistently measurable benefit or loss from that. > > IIRC daveg was volunteering to do some tests with his own data; maybe > we should wait for those results. I have run extensive tests with three trials of each configuration on two hosts with a variety of db sizes from 3GB to 142GB. These just finished, and I will send a more detailed summary later, but at the moment I don't see any significant difference between the patched and vanilla pg_restore. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Revise parallel pg_restore's scheduling heuristic
On Thu, Jul 30, 2009 at 12:29:34PM -0500, Kevin Grittner wrote: > Tom Lane wrote: > > > I think we've pretty much established that it doesn't make things > > *worse*, so I'm sort of inclined to go ahead and apply it. The > > theoretical advantage of eliminating O(N^2) search behavior seems > > like reason enough, even if it takes a ridiculous number of tables > > for that to become significant. > > Agreed, although I'm having some concerns about whether this should > proceed based exclusively on my benchmarks. On a thread on the > performance list, people are talking about restores which go several > times faster with parallel restore (compared to a single job). On my > hardware, I haven't even gotten it to run twice as fast. This means > that parallel restore is not a good fit for servers like we have, at > least with databases like we have, which means it's probably a poor > environment to get benchmarks for this patch. :-( > > Can we get someone who has benchmarks showing parallel restore to be > eight times the speed of a single job to benchmark with this patch, > just for confirmation? I have a couple "spare" 32GB 4 core and 64GB 8 core servers with 15 scsi/sas drives and dumps of production dbs in the 100GB to 500 GB range. These have several hundred tables most with an index or few and an fkey or too many. It will take a couple days to run a variety of tests I suppose, and I will be away starting mid next week, but maybe I could get some done before I go. Will the patch apply to a vanilla 8.4.0? -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] display previous query string of idle-in-transaction
On Thu, Jun 04, 2009 at 10:22:41PM -0400, Robert Haas wrote: > The only thing I don't like about this is that I think it's kind of a > hack to shove the in transaction designation and the query > string into the same database column. I've never liked having to > write: > > select sum(1) from pg_stat_activity where current_query = ' in > transaction'; > > ...and I like it even less if I now have to modify that query to use > "like". We should find some way to represent this as structured > data... maybe make a separate column called "idle" that's a boolean, > or something, and let the query column contain the most recent query > (whether or not it's still executing). I like this idea a lot. Possibly it would be useful to have the end time of the last query too, then one could find idle sessions that were old and truly idle rather than just waiting for a busy client to send the next query. select ... from pg_stat_activity where idle and last_statement_endtime < now() - interval '1 minute'; -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is a plan for lmza commpression in pg_dump
On Sat, Feb 07, 2009 at 08:49:29PM -0500, Robert Haas wrote: > "Proprietary compression algorithms, even with Postgresql-specific > license exceptions"? To be fair, lzo is GPL, which is a stretch to consider proprietary. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is a plan for lmza commpression in pg_dump
On Sat, Feb 07, 2009 at 02:47:05PM -0500, Bruce Momjian wrote: > daveg wrote: > > On Wed, Feb 04, 2009 at 10:23:17PM -0500, Andrew Chernow wrote: > > > Dann Corbit wrote: > > > > > > > >The LZMA SDK is granted to the public domain: > > > >http://www.7-zip.org/sdk.html > > > > > > > > > > I played with this but found the SDK extremely confusing and flat out > > > horrible. One personal dislike was the unnecessary use of C++; although > > > it > > > was the horrible API that turned me off. I'm not even sure if I ever > > > got a > > > test program working. > > > > > > LZO (http://www.oberhumer.com/opensource/lzo/) is a great algorithm, easy > > > API with many variants; my fav is LZO1X-1(15). Its known for its > > > compresison and decompresison speeds ... its blazing fast. zlib > > > typically > > > gets 5-8% more compression. > > > > LZO rocks. I wonder if the lzo developer would consider a license exception > > so that postgresql could use it? What would we need? > > The chance of us using anything but one zlib is near zero so please do > not persue this; this discussion comes up much too often. That this comes up "much too often" suggests that there is more than near zero interest. Why can only one compression library be considered? We use multiple readline implementations, for better or worse. I think the context here is for pg_dump only, and in that context a faster compression library makes a lot of sense. I'd be happy to prepare a patch if the license issue can be accommodated. Hence my question: what sort of license accommodation would we need to be able to use this library? -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
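For what it's worth, the speed-versus-ratio tradeoff under discussion is easy to demonstrate even without LZO bindings (which are not in the Python stdlib): zlib's own compression levels show the same dial that a pg_dump user would be turning. The payload below is a made-up stand-in for a dump data stream; repetitive table data like this is the common, highly compressible case.

```python
import time
import zlib

# Hypothetical payload standing in for a pg_dump data stream.
payload = b"INSERT INTO t VALUES (42, 'abcdefgh');\n" * 50_000

for level in (1, 6, 9):  # 1 ~ fastest, 6 = zlib default, 9 = best ratio
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    ratio = len(compressed) / len(payload)
    print(f"level {level}: {ratio:.1%} of original in {elapsed * 1000:.1f} ms")
```

LZO sits at an even faster, slightly-worse-ratio point than zlib level 1, which is exactly the trade that matters when the dump is CPU-bound on compression rather than I/O-bound.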
Re: [HACKERS] Is a plan for lmza commpression in pg_dump
On Wed, Feb 04, 2009 at 10:23:17PM -0500, Andrew Chernow wrote: > Dann Corbit wrote: > > > >The LZMA SDK is granted to the public domain: > >http://www.7-zip.org/sdk.html > > > > I played with this but found the SDK extremely confusing and flat out > horrible. One personal dislike was the unnecessary use of C++; although it > was the horrible API that turned me off. I'm not even sure if I ever got a > test program working. > > LZO (http://www.oberhumer.com/opensource/lzo/) is a great algorithm, easy > API with many variants; my fav is LZO1X-1(15). Its known for its > compresison and decompresison speeds ... its blazing fast. zlib typically > gets 5-8% more compression. LZO rocks. I wonder if the lzo developer would consider a license exception so that postgresql could use it? What would we need? -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keeping creation time of objects
On Tue, Sep 09, 2008 at 11:03:56PM +0300, Hannu Krosing wrote: > On Tue, 2008-09-09 at 12:40 -0700, daveg wrote: > > > I'd be very interested in seeing a last schema modification time for > > pg_class > > objects. I don't care about it being preserved over dump and restore as my > > use case is more to find out when a table was created with a view to finding > > out if it is still needed. > > Isn't it easier to find out if it is still needed by looking if it is > still used, say from pg_stat_user_tables ? Except that pg_dump will access it and make it look used. Also, this does not work for functions, views etc. It seems to me to be pretty simple to put an abstime or timestamp column on the major catalog tables and update it when the row is updated. A mod time is more useful probably than a create time. -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keeping creation time of objects
On Tue, Sep 09, 2008 at 03:36:19PM -0400, Tom Lane wrote: > Devrim =?ISO-8859-1?Q?G=DCND=DCZ?= <[EMAIL PROTECTED]> writes: > > AFAICS, PostgreSQL is not keeping info about when a table, database, > > sequence, etc was created. We cannot get that info even from OS, since > > CLUSTER or VACUUM FULL may change the metadata of corresponding > > relfilenode. > > > Does anyone think that adding a timestamp column to pg_class would bring > > an overhead? > > There isn't sufficient support for such a "feature". In any case, why > would creation time (as opposed to any other time, eg last schema > modification, last data modification, yadda yadda) be especially > significant? Would you expect it to be preserved over dump/restore? > How about every other object type in the system? I'd be very interested in seeing a last schema modification time for pg_class objects. I don't care about it being preserved over dump and restore as my use case is more to find out when a table was created with a view to finding out if it is still needed. So the question I'm looking to answer is "when did that get here?" -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IN vs EXISTS equivalence
On Thu, Aug 14, 2008 at 06:50:09PM +0100, Simon Riggs wrote: > > On Fri, 2008-08-08 at 16:23 -0400, Tom Lane wrote: > > > NOT IN is a lot trickier, > > condition: you must also assume that the comparison operator involved > > never yields NULL for non-null inputs. That might be okay for btree > > comparison functions but it's not a very comfy assumption in general; > > we certainly haven't got any explicit knowledge that any functions are > > guaranteed to act that way. So this case might be worth doing later ... > Just found this comment, after reading what you said on other thread > about NOT IN. > > NOT IN is a serious performance issue for most people. We simply can't > say to people "you were told not to". > > If we can fix it easily for the majority of cases, we should. We can't > let the "it won't work in certain cases" reason prevent various A suggestion: what about adding an attribute to functions to declare that they never return null? declare foo(int, int) returns int immutable not null as ... -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal sql: labeled function params
On Sat, Aug 23, 2008 at 05:08:25PM +0100, Gregory Stark wrote: > "Pavel Stehule" <[EMAIL PROTECTED]> writes: > > > Hello > > > > 2008/8/23 Peter Eisentraut <[EMAIL PROTECTED]>: > >> On Friday 22 August 2008 07:41:30 Decibel! wrote: > >>> If we're really worried about it we can have a GUC for a few versions > >>> that turns off named parameter assignment. But I don't think we > >>> should compromise the design on the theory that some folks might be > >>> using that as an operator *and* can't change their application to > >>> wrap it's use in (). > >> > >> Even if that were a reasonable strategy, you can't use GUC parameters to > >> alter > >> parser behavior. > > > > I thing, so it's possible - in this case. We should transform named > > params to expr after syntax analyze. > > So for a bit of useless syntactic sugar we should introduce conflicts with > named parameters, conflicts with operators, introduce an un-sqlish syntax and > remove a feature users have already made use of and introduce backwards > compatibility issues for those users? > > At any point in this discussion has anyone explained why these labels would > actually be a good idea? I was missing that too. What is this for that makes it so compelling? -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A smaller default postgresql.conf
On Tue, Aug 19, 2008 at 09:39:39PM +0300, Peter Eisentraut wrote: > On Tuesday 19 August 2008 19:12:16 Tom Lane wrote: > > Well, why not just make a one-eighty and say that the default > > postgresql.conf is *empty* (except for whatever initdb puts into it)? > > Well, my original implementation of GUC had an empty default configuration > file, which was later craptaculated to its current form based on seemingly > popular demand. I am very happy to work back toward the empty state, and > there appears to be growing support for that. I like the almost empty file idea. I often use the include facility to set up postgresql.conf files that look roughly like this: include 'postgresql_site_policy.conf' include 'postgresql_.conf' port = where the included files set things like the overall site's logging preferences, and shared_buffers and work_mem based on installed memory and disk system (ex: postgresql_32GB.conf) etc. -dg -- David Gould [EMAIL PROTECTED] 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Adjusting debug_print_plan to be more useful by default
On Tue, Aug 19, 2008 at 06:33:33PM +0100, Simon Riggs wrote: > > On Tue, 2008-08-19 at 12:40 -0400, Tom Lane wrote: > > Back in April we changed EXPLAIN VERBOSE to not dump the internal plan > > tree anymore, on the grounds that non-hackers didn't want that info and > > hackers could get it with debug_print_plan and related variables. > > Well, now that I've tried to do some planner development work relying on > > debug_print_plan instead of EXPLAIN VERBOSE, I find it a mite annoying. > > It's not sufficient to set debug_print_plan = true, because the output > > comes out at priority DEBUG1, which is to say it doesn't come out at all > > in a default configuration. If you boost up client_min_messages or > > log_min_messages so you can see it, you get lots of extraneous debugging > > messages too. > > > > I'd like to propose that the messages emitted by debug_print_plan > > and friends be given priority LOG rather than DEBUG1. If you've gone > > to the trouble of turning on the variable, then you presumably want the > > results, so it seems dumb to print them at a priority that isn't logged > > by default. (Note that this is biased to the assumption that you want > > the messages in the postmaster log, not on the console. Which is > > usually what I want, but maybe someone wants to argue for NOTICE?) > > > > I'd also like to propose making debug_pretty_print default to ON. > > At least for me, the other formatting is 100% unreadable. > > +1 +1 -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Visibility Groups
On Thu, Aug 07, 2008 at 01:30:27PM +0100, Gregory Stark wrote: > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > > Currently, we calculate a single OldestXmin across all snapshots on the > > assumption that any transaction might access any table. > > > > I propose creating "Visibility Groups" that *explicitly* limit the > > ability of a transaction to access data outside its visibility group(s). > > By default, visibility_groups would be NULL, implying potential access > > to all tables. > > > > Once set, any attempt to lock an object outside of a transactions > > defined visibility_groups will result in an error: > > ERROR attempt to lock table outside of visibility group(s): foo > > HINT you need to set a different value for visibility_groups > > A transaction can only ever reduce or restrict its visibility_groups, it > > cannot reset or add visibility groups. > > Hm, so backing up a bit from the specific proposed interface, the key here is > being able to explicitly mark which tables your transaction will need in the > future? > > Is it always just a handful of heavily updated tables that you want to > protect? In that case we could have a lock type which means "I'll never need > to lock this object". Then a session could issue "LOCK TABLE foo IN > INACCESSIBLE MODE" or something like that. That requires people to hack up > their pg_dump or replication script though which might be awkward. > > Perhaps the way to do that would be to preemptively take locks on all the > objects that you'll need, then have a command to indicate you won't need any > further objects beyond those. +1 -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Mini improvement: statement_cost_limit
On Mon, Aug 04, 2008 at 05:19:50PM -0400, Robert Treat wrote: > See, this is what we ended up talking about before. Someone will say "I'd > like > to prevent my devs from accidentally doing queries with cartesian products" > and they will use this to do it... but that will only work in some cases, so > it becomes a poor tool to solve a different problem. > > BTW, what I really love about statement costs, is that they aren't even > reliable on the same machine with the same data. I have seen query plans > which run on the same data on the same machine where the resultant query > runtime can vary from 2 hours to 5 hours, depending on how much other > concurrent traffic is on the machine. Awesome eh? Sure, I don't think anyone believes that costs are precise. But the case that is interesting is 2 hours versus years and years. > The footgun in my mind is that people will think this solves a number of > problems even though it doesnt solve them well. However, the footgun for yo I suspect that a good solution to this problem is impossible as it is more or less the halting problem. So I'm willing to accept a poor solution based on costs and then hope we improve the cost model. -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Automatic Client Failover
On Mon, Aug 04, 2008 at 05:17:59PM -0400, Jonah H. Harris wrote: > On Mon, Aug 4, 2008 at 5:08 PM, Simon Riggs <[EMAIL PROTECTED]> wrote: > > When primary server fails, it would be good if the clients connected to > > the primary knew to reconnect to the standby servers automatically. > > This would be a nice feature which many people I've talked to have > asked for. In Oracle-land, it's called Transparent Application > Failover (TAF) and it gives you a lot of options, including the > ability to write your own callbacks when a failover is detected. This might be better done as part of a proxy server, eg pgbouncer, pgpool than as part of postgresql or libpq. I like the concept, but the logic to determine when a failover has occurred is complex and a client will often not have access to enough information to make this determination accurately. postgresql could have hooks to support this though, ie to determine when a standby thinks it has become the master. -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Mini improvement: statement_cost_limit
On Mon, Aug 04, 2008 at 11:59:03AM -0700, Josh Berkus wrote: > Greg, > > >Well that's going to depend on the application But I suppose there's > >nothing wrong with having options which aren't always a good idea to use. > >The > >real question I guess is whether there's ever a situation where it would > >be a > >good idea to use this. I'm not 100% sure. > > I can think of *lots*. Primarily, simple web applications, where > queries are never supposed to take more than 50ms. If a query turns up > with an estimated cost of 100, then you know something's wrong; > in the statistics if not in the query. In either case, that query has a > good chance of dragging down the whole system. > > In such a production application, it is better to have false positives > and reject otherwise-OK queries becuase their costing is wrong, than to > let a single cartesian join bog down an application serving 5000 > simultaneous users. Further, with a SQL error, this would allow the > query rejection to be handled in a user-friendly way from the UI > ("Search too complex. Try changing search terms.") rather than timing > out, which is very difficult to handle well. > > The usefulness of this feature for interactive sessions is > limited-to-nonexistant. It's for production applications. Ok, that is a different use case where an error seems very useful. What about slightly extending the proposal to have the severity of exceeding the limit configurable too. Something like: costestimate_limit = 10 # default 0 to ignore limit costestimate_limit_severity = error # debug, notice, warning, error -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Mini improvement: statement_cost_limit
On Mon, Aug 04, 2008 at 02:35:07PM -0400, Robert Treat wrote: > On Monday 04 August 2008 03:50:40 daveg wrote: > > That's great for you, I am talking in the scope of a general solution. (Note > I'd also bet that even given the same hardware, different production loads > can produce different relative mappings of cost vs. performance, but > whatever) Even on different hardware it would still likely warn of mistakes like cartesian products due to missing join conditions, etc. > > > I still think it is worth revisiting what problems people are trying to > > > solve, and see if there are better tools they can be given to solve them. > > > Barring that, I suppose a crude solution is better than nothing, though > > > I fear people might point at the crude solution as a good enough solution > > > to justify not working on better solutions. > > > > Alerting developers and QA to potentially costly queries would help solve > > some of the probems we are trying to solve. Better tools are welcome, an > > argument that the good is the enemy of the best so we should be content > > with nothing is not. > > And you'll note, I specifically said that a crude tool is better than > nothing. I realized somewhat after I sent the above that it might have sounded a bit snippy. I hope I have not offended. > But your completely ignoring that a crude tool can often end-up as a foot-gun > once relased into the wild. I'm suggesting a warning, or even just a notice in the logs; I don't see the footgun. What am I missing? Regards -dg -- David Gould [EMAIL PROTECTED] 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Mini improvement: statement_cost_limit
On Mon, Aug 04, 2008 at 03:09:34PM -0400, Greg Smith wrote:
> On Mon, 4 Aug 2008, daveg wrote:
> > We load the production dumps into our dev environment, which are the
> > same hardware spec, so the costs should be identical.
>
> Not identical, just close. ANALYZE samples data from your table
> randomly. The statistics used to compute the costs will therefore be
> slightly different on the two servers even if the data is the same. The
> problem of discovering one plan on production and another on development
> is not quite that easy to remove. Ultimately, if your developers aren't
> thorough enough to do things like look at EXPLAIN plans enough to
> discover things that are just bad, I just chuckle at your thinking that
> putting a single limiter on their bad behavior will somehow magically
> make that better.

Not all developers can be persuaded to run explain on every change.
However, many will investigate a new message. I'm only hoping to focus
their attention toward possible problem queries.

> Anyway, if your production server is small enough that you can afford to
> have another one just like it for the developers to work on, that's
> great. Robert's point is that many installs don't work like that. The
> development teams in lots of places only get a subset of the production
> data because it's too large to deploy on anything but a big server,
> which often is hard to cost justify buying just for development
> purposes.

Not to get into a size war ;-), but the production environment I'd like
this feature for is over 40 quad-core boxes, each with 32GB and 16 SCSI
drives. These are dedicated to postgresql and run one or just a few
databases. There are also a bunch of client boxes that we will not speak
of. The staging and test environments are similar hardware but have only
a subset of the databases copied to them. There are probably more than a
dozen DB hosts for that.
> I like the concept of a cost limit, but I'm a bit horrified by the
> thought of it being exposed simply through the internal cost numbers
> because they are so arbitrary. One of the endless projects I think about
> but never start coding is to write something that measures the things
> the planner cost constants estimate on a particular machine, so that all
> those numbers actually can be tied to some real-world time measure. If
> you did that, you'd actually have a shot at accomplishing the real goal
> here, making statement_cost_limit cut off statements expected to take
> longer than statement_timeout before they even get started.

That is a nice idea. Possibly it could be a utility like the fsync
tester. But planner estimates are never going to be all that accurate
even with solid cost numbers, because for some classes of queries,
particularly those with many joins, the stats can be good at each level
but the error accumulates exponentially. Which is why I think a warning
is appropriate instead of an error. Even a notice in the logs would be
useful.

-dg
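The compounding described above can be illustrated with a toy calculation (an illustrative sketch only, not planner code): even a modest misestimate at each join level multiplies through a deep join tree.

```python
def compounded_error(per_level_factor: float, num_joins: int) -> float:
    """Relative misestimate after chaining per-join estimation errors.

    If each join's row-count estimate is off by `per_level_factor`
    (e.g. 1.5 means 50% off) and the errors compound, the total error
    after n joins is roughly the product of the per-level factors.
    """
    return per_level_factor ** num_joins

# A 50% per-level error grows past an order of magnitude by six joins.
for joins in (1, 3, 6, 10):
    print(joins, round(compounded_error(1.5, joins), 1))
```

This is why a cost estimate that is trustworthy for a two-table query can be wildly off for a ten-way join, and why a warning is a safer response to a big number than a hard error.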
Re: [HACKERS] Mini improvement: statement_cost_limit
On Sun, Aug 03, 2008 at 10:57:55PM -0400, Robert Treat wrote:
>
> ISTR that what ended up killing the enthusiasm for this was that most
> people realized that this GUC was just a poor tool to take a stab at
> solving other problems (ie. rate limiting cpu for queries).

I'm not concerned with that, I want developers to have feedback on costs
in a way that is obvious.

> > > I think a variation on this could be very useful in development and
> > > test environments. Suppose it raised a warning or notice if the cost
> > > was over the limit. Then one could set a limit of a few million on
> > > the development and test servers and developers would at least have
> > > a clue that they needed to look at explain for that query. As it is
> > > now, one can exhort them to run explain, but it has no effect.
> > > Instead we later see queries killed by a 24 hour timeout with
> > > estimated costs ranging from "until they unplug the machine and dump
> > > it" to "until the sun turns into a red giant".
> >
> > Great argument. So that's 4 in favour at least.
>
> Not such a great argument. Cost models on development servers can and
> often are quite different from those on production, so you might be
> putting an artificial limit on top of your developers.

We load the production dumps into our dev environment, which are the same
hardware spec, so the costs should be identical.

> I still think it is worth revisiting what problems people are trying to
> solve, and see if there are better tools they can be given to solve
> them. Barring that, I suppose a crude solution is better than nothing,
> though I fear people might point at the crude solution as a good enough
> solution to justify not working on better solutions.

Alerting developers and QA to potentially costly queries would help solve
some of the problems we are trying to solve. Better tools are welcome; an
argument that the good is the enemy of the best so we should be content
with nothing is not.
-dg
Re: [HACKERS] Mini improvement: statement_cost_limit
On Sat, Aug 02, 2008 at 09:30:08PM +0200, Hans-Jürgen Schönig wrote:
> On Aug 2, 2008, at 8:38 PM, Tom Lane wrote:
>
> > Andrew Dunstan <[EMAIL PROTECTED]> writes:
> > > Hans-Jürgen Schönig wrote:
> > > > i introduced a GUC called statement_cost_limit which can be used
> > > > to error out if a statement is expected to be too expensive.
> >
> > > You clearly have far more faith in the cost estimates than I do.
> >
> > Wasn't this exact proposal discussed and rejected awhile back?
> >
> > regards, tom lane
>
> i don't remember precisely.
> i have seen it on simon's wiki page and it is something which would
> have been useful in some cases in the past.

I think a variation on this could be very useful in development and test
environments. Suppose it raised a warning or notice if the cost was over
the limit. Then one could set a limit of a few million on the development
and test servers and developers would at least have a clue that they
needed to look at explain for that query. As it is now, one can exhort
them to run explain, but it has no effect. Instead we later see queries
killed by a 24 hour timeout with estimated costs ranging from "until they
unplug the machine and dump it" to "until the sun turns into a red
giant".

-dg
Re: [HACKERS] Copy storage parameters on CREATE TABLE LIKE/INHERITS
On Wed, Jul 30, 2008 at 04:45:47PM +0900, ITAGAKI Takahiro wrote:
> Here is a patch to copy column storage parameters and reloptions on
> CREATE TABLE LIKE, which I proposed at:
> [HACKERS] Uncopied parameters on CREATE TABLE LIKE
> http://archives.postgresql.org/message-id/[EMAIL PROTECTED]
>
> Column storage parameters (by ALTER COLUMN SET STORAGE) and table
> storage parameters (by ALTER TABLE SET (...)) are copied from template
> tables on LIKE or parent tables on INHERITS (1. and 2. at above e-mail).
> The patch does not include copying of comments (3.) for now.
> It also includes new regression tests and rough documentation.
>
> When template or parent tables have non-default settings,
> they are copied into a new table automatically on CREATE TABLE.
> If the CREATE TABLE statement has its own storage parameter definitions,
> they overwrite the inherited settings.

I'd like to have the ability to copy these parameters, but not to have it
be automatic. Some of my clients' applications commonly use CREATE TABLE
LIKE to get empty work tables to populate and update before adding the
data to the main table (which may have non-default settings). The
automatic behaviour may be undesirable for this use.

-dg
Re: [HACKERS] [PATCHES] pg_dump additional options for performance
On Sun, Jul 27, 2008 at 10:37:34AM +0100, Simon Riggs wrote:
>
> On Sat, 2008-07-26 at 11:03 -0700, Joshua D. Drake wrote:
>
> > 2. We have no concurrency which means, anyone with any database over
> > 50G has unacceptable restore times.
>
> Agreed.
>
> Also the core reason for wanting -w
>
> > 3. We have to continue to develop hacks to define custom utilization.
> > Why am I passing pre-data anything? It should be automatic. For
> > example:

[adding hackers for discussion]

On Sat, Jul 26, 2008 at 01:56:14PM -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > I want to dump tables separately for performance reasons. There are
> > documented tests showing 100% gains using this method. There is no
> > gain adding this to pg_restore. There is a gain to be had -
> > parallelising index creation, but this patch doesn't provide
> > parallelisation.
>
> Right, but the parallelization is going to happen sometime, and it is
> going to happen in the context of pg_restore. So I think it's pretty
> silly to argue that no one will ever want this feature to work in
> pg_restore.
>
> To extend the example I just gave to Stephen, I think a fairly probable
> scenario is where you only need to tweak some "before" object
> definitions, and then you could do
>
> pg_restore --schema-before-data whole.dump >before.sql
> edit before.sql
> psql -f before.sql target_db
> pg_restore --data-only --schema-after-data -d target_db whole.dump
>
> which (given a parallelizing pg_restore) would do all the time-consuming
> steps in a fully parallelized fashion.

A few thoughts about pg_restore performance:

To take advantage of non-logged copy, the table create and data load
should be in the same transaction.

To take advantage of file and buffer cache, it would be better to create
indexes immediately after loading table data. Many tables will be small
enough to fit in cache and this will avoid re-reading them for index
builds. This is more advantageous with more indexes on one table.
There may also be some filesystem placement benefits to building the
indexes for a table immediately after loading the data.

Creating constraints immediately after loading data would also benefit
from warm buffer and file caches. Doing this is complicated by the need
for indexes and data in the referenced tables to exist first.

It seems that a high performance restore will want to proceed in a
different order than the current sort order or that proposed by the
before/data/after patch.

- The simplest unit of work for parallelism may be the table and its
  "decorations", e.g. indexes and relational constraints.

- Sort tables by foreign key dependency so that referenced tables are
  loaded before referencing tables.

- Do table creation and data load together in one transaction to use
  non-logged copy. Index builds and constraint creation should follow
  immediately, either as part of the same transaction, or possibly
  parallelized themselves.

Table creation, data load, index builds, and constraint creation could be
packaged up as the unit of work to be done in a subprocess which either
completes or fails as a unit. The worker process would be called with
connection info, a file pointer to the data, and the DDL for the table.

pg_restore would keep a work queue of tables to be restored in FK
dependency order and also do the other schema operations such as
functions and types.

-dg
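The ordering sketched above (referenced tables before referencing tables, each table's create/load/index/constraint steps packaged as one unit of work) amounts to a topological sort over the foreign-key graph. A minimal illustration in Python follows; the table names and dependency map are invented for the example, and this is a sketch of the scheduling idea, not pg_restore code.

```python
from collections import deque

def restore_order(fk_deps):
    """Kahn's algorithm: order tables so that every referenced table
    precedes the tables whose foreign keys point at it.

    fk_deps maps each table to the set of tables it references.
    """
    indegree = {t: len(deps) for t, deps in fk_deps.items()}
    dependents = {t: [] for t in fk_deps}
    for t, deps in fk_deps.items():
        for d in deps:
            dependents[d].append(t)
    # Tables with no FK dependencies can be dispatched to workers at once.
    queue = deque(sorted(t for t, n in indegree.items() if n == 0))
    order = []
    while queue:
        t = queue.popleft()
        order.append(t)  # this table's create+copy+index unit is now runnable
        for child in sorted(dependents[t]):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(fk_deps):
        raise ValueError("circular FK dependency; constraints must be added later")
    return order

# Invented example schema: orders references customers and products.
deps = {"customers": set(), "products": set(),
        "orders": {"customers", "products"},
        "order_items": {"orders", "products"}}
print(restore_order(deps))  # ['customers', 'products', 'orders', 'order_items']
```

In a parallel restore, everything currently in the queue is independent work: `customers` and `products` could be handed to two workers at the same time, with `orders` dispatched as soon as both finish.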
Re: [HACKERS] Adding WHERE clause to pg_dump
On Fri, Jul 25, 2008 at 11:17:20PM +0100, Gregory Stark wrote:
> "Joshua D. Drake" <[EMAIL PROTECTED]> writes:
>
> > How do we deal with this?
> >
> > pg_dump -w "last_update_timestamp < ..." -t 'table*'
> >
> > What I see is a recipe for inconsistent, un-restorable backups without
> > a user realizing what they have done. The only way to deal with the
> > above is:
> >
> > 1. Wildcards aren't allowed if you have -w
> > 2. You dump everything, if the WHERE clause isn't relevant you just
> > dump the whole table
>
> There's always
>
> 3. Apply the WHERE clause to all tables and if there's a table missing
> columns referenced in the where clause then fail with the appropriate
> error.
>
> Which seems like the right option to me. The tricky bit would be how to
> deal with cases where you want a different where clause for different
> tables. But even if it doesn't handle all cases that doesn't mean a
> partial solution is unreasonable.

Actually, Davy's patch does deal with the case "where you want a
different where clause for different tables".

-dg
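The per-table idea under discussion can be sketched as generating one COPY statement per table, each with its own optional predicate. The Python below is an illustration of the concept only; the table names, predicates, and output format are invented and do not reflect Davy's patch or pg_dump's actual output.

```python
def dump_statements(tables, where_clauses):
    """Build a COPY ... TO STDOUT statement per table, applying a
    per-table WHERE clause when one is given (a sketch of the idea,
    not pg_dump's real implementation).
    """
    stmts = []
    for table in tables:
        clause = where_clauses.get(table)
        if clause:
            # Filtered dump: wrap the predicate in a SELECT for COPY.
            stmts.append(f"COPY (SELECT * FROM {table} WHERE {clause}) TO STDOUT;")
        else:
            # No predicate for this table: dump it whole.
            stmts.append(f"COPY {table} TO STDOUT;")
    return stmts

# Invented example: only `events` is filtered; `users` is dumped whole.
for s in dump_statements(["users", "events"],
                         {"events": "last_update > now() - interval '1 day'"}):
    print(s)
```

This also shows why a single global `-w` predicate is fragile: a map from table to predicate sidesteps the "missing column" failure mode entirely, since unlisted tables simply take the unfiltered branch.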