[HACKERS] timezone buglet?
PostgreSQL 9.0.4 has the timezone America/Blanc-Sablon. However, other sources seem to spell this with an underscore instead of a dash: America/Blanc_Sablon. It appears that besides 'America/Blanc_Sablon', other multi-word timezones are spelled with an underscore. For example: 'Australia/Broken_Hill', 'Asia/Ho_Chi_Minh', 'America/Los_Angeles', and so on.

Two questions: Is this correct as is, or is it wrong in 9.0.4? And, should I have reported this somewhere else? Bugs?

Err, three questions: I'm a little unclear on how the tz machinery works. Can I just update the name column in pg_timezone_names to fix it for now?

Thanks

-dg

--
David Gould    da...@sonic.net    510 536 1443 / 510 282 0869
If simplicity worked, the world would be overrun with insects.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Wed, Sep 07, 2011 at 09:02:04PM -0400, Tom Lane wrote:
> daveg writes:
> > On Wed, Sep 07, 2011 at 07:39:15PM -0400, Tom Lane wrote:
> >> BTW ... what were the last versions you were running on which you had
> >> *not* seen the problem? (Just wondering about the possibility that we
> >> back-patched some "fix" that broke things. It would be good to have
> >> a version range before trawling the commit logs.)
>
> > The first version we saw it on was 8.4.7.
>
> Yeah, you said that. I was wondering what you'd last run before 8.4.7.

Sorry, misunderstood. We were previously running 8.4.4, but have been on 8.4.7 since shortly after it was released. Prior to that we have had all the major and most of the minor releases since 7.1.

-dg
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Wed, Sep 07, 2011 at 07:39:15PM -0400, Tom Lane wrote:
> daveg writes:
> > Also, this is very intermittent, we have seen it only in recent months
> > on both 8.4.7 and 9.0.4 after years of no problems. Lately we see it what
> > feels like a few times a month. Possibly some new application behaviour
> > is provoking it, but I have no guesses as to what.
>
> BTW ... what were the last versions you were running on which you had
> *not* seen the problem? (Just wondering about the possibility that we
> back-patched some "fix" that broke things. It would be good to have
> a version range before trawling the commit logs.)

The first version we saw it on was 8.4.7.

-dg
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Wed, Sep 07, 2011 at 06:25:23PM -0400, Tom Lane wrote:
> Robert Haas writes:
> > I thought about an error exit from client authentication, and that's a
> > somewhat appealing explanation, but I can't quite see why we wouldn't
> > clean up there the same as anywhere else. The whole mechanism feels a
> > bit rickety to me - we don't actually release locks; we just abort the
> > transaction and *assume* that will cause locks to get released.
>
> Well, transaction abort will call LockReleaseAll, which is carefully
> coded to clean up the proclock lists regardless of what is in the
> locallocks table, so I'm not sure why you find that any more rickety
> than anything else. But maybe it'd be interesting for Dave to stick a
> LockReleaseAll call into ProcKill() and see if that makes things better.
> (Dave: test that before you put it in production, I'm not totally sure
> it's safe.)

Re safety, what is the worst case here?

Also, this is very intermittent, we have seen it only in recent months on both 8.4.7 and 9.0.4 after years of no problems. Lately we see it what feels like a few times a month. Possibly some new application behaviour is provoking it, but I have no guesses as to what.

-dg
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Wed, Sep 07, 2011 at 06:35:08PM -0400, Tom Lane wrote:
> daveg writes:
> > It does not seem restricted to pg_authid:
> > 2011-08-24 18:35:57.445 24987 c23 apps ERROR: lock AccessShareLock on object 16403/2615/0
> > And I think I've seen it on other tables too.
>
> Hmm. 2615 = pg_namespace, which most likely is the first catalog
> accessed by just about any SQL command that's going to access tables at
> all, so I suspect that this is mostly just a "the first access failed"
> thing and not something peculiar to pg_namespace. But we still don't
> have a clue why the locks are not getting released by the previous
> owner of the PGPROC slot. Have you trawled your logs to see if there's
> any sign of any distress at all, shortly before the problem starts to
> happen?

Will do, but it's a pretty big haystack. Sure wish I knew what the needle looked like. ;-)

-dg
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Wed, Sep 07, 2011 at 04:55:24PM -0400, Tom Lane wrote:
> Robert Haas writes:
> > Tom's right to be skeptical of my theory, because it would require a
> > CHECK_FOR_INTERRUPTS() outside of a transaction block in one of the
> > pathways that use session-level locks, and I can't find one.
>
> More to the point: session-level locks are released on error. The only
> way to get out of a transaction while still holding one is for the
> VACUUM-or-whichever-command code to deliberately commit and exit while
> still holding it. An error exit path would release the lock.
>
> > OTOH, I'm skeptical of the theory that this involves userlocks,
> > because this whole thread started because of a complaint about lock
> > 0/1260/0 already being held. That ain't no userlock.
>
> Yeah, and for that matter it seems to let VACUUM off the hook too.
> If we assume that the reported object ID is non-corrupt (and if it's
> always the same, that seems like the way to bet) then this is a lock
> on pg_authid.
>
> Hmmm ... could the pathway involve an error exit from client
> authentication? We're still finding bugs in the 9.0 rewrite of
> auth-time database access.

It does not seem restricted to pg_authid:

2011-08-24 18:35:57.445 24987 c23 apps ERROR: lock AccessShareLock on object 16403/2615/0

And I think I've seen it on other tables too.

-dg
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Wed, Sep 07, 2011 at 10:20:24AM -0400, Tom Lane wrote:
> Robert Haas writes:
> > After spending some time staring at the code, I do have one idea as to
> > what might be going on here. When a backend is terminated,
> > ShutdownPostgres() calls AbortOutOfAnyTransaction() and then
> > LockReleaseAll(USER_LOCKMETHOD, true). The second call releases all
> > user locks, and the first one (or so we suppose) releases all normal
> > locks as part of aborting the transaction. But what if there's no
> > transaction in progress? In that case, AbortOutOfAnyTransaction()
> > won't do anything - which is fine as far as transaction-level locks
> > go, because we shouldn't be holding any of those anyway if there's no
> > transaction in progress. However, if we hold a session-level lock at
> > that point, then we'd orphan it. We don't make much use of session
> > locks. As far as I can see, they are used by (1) VACUUM, (2) CREATE
> > INDEX CONCURRENTLY, (3) ALTER DATABASE .. SET TABLESPACE, and (4) on
> > standby servers, redo of DROP DATABASE actions. Any chance one of
> > those died or was killed off around the time this happened?
>
> I don't believe this theory at all, because if that were the issue,
> we'd have heard about it long since. The correct theory has to involve
> a very-little-used triggering condition. At the moment I'm wondering
> about advisory (userspace) locks ... Dave, do your apps use any of those?

Yes, we make extensive use of advisory locks. That was my thought too when Robert mentioned session level locks.

I'm happy to add any additional instrumentation, but my client would be happier to actually run it if there was a way to recover from this without an unplanned outage. Is there something I can do when the patch detects the problem to be able to continue without a restart? Is it safe to just reset the proclock queue? They would not mind leaking locks, for instance, with a later planned restart to clear them up, nearly as much as they mind unscheduled downtime.

Thanks

-dg
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Tue, Aug 23, 2011 at 12:15:23PM -0400, Robert Haas wrote:
> On Mon, Aug 22, 2011 at 3:31 AM, daveg wrote:
> > So far I've got:
> >
> > - affects system tables
> > - happens very soon after process startup
> > - in 8.4.7 and 9.0.4
> > - not likely to be hardware or OS related
> > - happens in clusters for period of a few second to many minutes
> >
> > I'll work on printing the LOCK and LOCALLOCK when it happens, but it's
> > hard to get downtime to pick up new builds. Any other ideas on getting to
> > the bottom of this?
>
> I've been thinking this one over, and doing a little testing. I'm
> still stumped, but I have a few thoughts. What that error message is
> really saying is that the LOCALLOCK bookkeeping doesn't match the
> PROCLOCK bookkeeping; it doesn't tell us which one is to blame.
...
> My second thought is that perhaps a process is occasionally managing
> to exit without fully cleaning up the associated PROCLOCK entry. At
> first glance, it appears that this would explain the observed
> symptoms. A new backend gets the PGPROC belonging to the guy who
> didn't clean up after himself, hits the error, and disconnects,
> sticking himself right back on to the head of the SHM_QUEUE where the
> next connection will inherit the same PGPROC and hit the same problem.
> But it's not clear to me what could cause the system to get into this
> state in the first place, or how it would eventually right itself.
>
> It might be worth kludging up your system to add a test to
> InitProcess() to verify that all of the myProcLocks SHM_QUEUEs are
> either NULL or empty, along the lines of the attached patch (which
> assumes that assertions are enabled; otherwise, put in an elog() of
> some sort). Actually, I wonder if we shouldn't move all the
> SHMQueueInit() calls for myProcLocks to InitProcGlobal() rather than
> doing it over again every time someone calls InitProcess(). Besides
> being a waste of cycles, it's probably less robust this way. If
> there somehow are leftovers in one of those queues, the next
> successful call to LockReleaseAll() ought to clean up the mess, but of
> course there's no chance of that working if we've nuked the queue
> pointers.

I did this in the elog flavor as we don't build production images with asserts. It has been running on all hosts for a few days. Today it hit the extra checks in InitProcess:

00:02:32.782 8626 [unknown] [unknown] LOG: connection received: host=bk0 port=42700
00:02:32.783 8627 [unknown] [unknown] LOG: connection received: host=op2 port=45876
00:02:32.783 8627 d61 apps FATAL: Initprocess myProclocks[4] not empty: queue 0x2ae6b4b895f8 (prev 0x2ae6b4a2b558, next 0x2ae6b4a2b558)
00:02:32.783 8626 d35 postgres LOG: connection authorized: user=postgres database=c35
00:02:32.783 21535 LOG: server process (PID 8627) exited with exit code 1
00:02:32.783 21535 LOG: terminating any other active server processes
00:02:32.783 8626 c35 postgres WARNING: terminating connection because of crash of another server process

The patch that produced this is attached. If you can think of anything I can add to this to help I'd be happy to do so. Also, can I clean this up and continue somehow? Maybe clear the queue instead of having to have a restart? Or is there a way to just pause this proc here, maybe mark it not to be used and exit, or just to sleep forever so I can debug later?

Thanks

-dg

--- postgresql-9.0.4/src/backend/storage/lmgr/proc.c	2011-04-14 20:15:53.0 -0700
+++ postgresql-9.0.4.dg/src/backend/storage/lmgr/proc.c	2011-08-23 17:30:03.505176019 -0700
@@ -323,7 +323,15 @@
 	MyProc->waitLock = NULL;
 	MyProc->waitProcLock = NULL;
 	for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
+	{
+		SHM_QUEUE  *queue = &(MyProc->myProcLocks[i]);
+
+		if (!(!queue->prev || queue->prev == queue ||
+			  !queue->next || queue->next == queue))
+			elog(FATAL, "Initprocess myProclocks[%d] not empty: queue %p (prev %p, next %p) ",
+				 i, queue, queue->prev, queue->next);
 		SHMQueueInit(&(MyProc->myProcLocks[i]));
+	}
 	MyProc->recoveryConflictPending = false;
Re: [HACKERS] [GENERAL] pg_upgrade problem
On Mon, Sep 05, 2011 at 08:19:21PM -0400, Bruce Momjian wrote:
> daveg wrote:
> > > Can you tell me what table is showing this error? Does it happen during
> > > vacuum? Can you run a vacuum verbose to see what it is throwing the
> > > error on? Thanks.
> >
> > This was upgrading from 8.4.8 to 9.0.4. I don't have the running cluster
> > anymore, but I do have tar.gz archives of it and could probably find
> > 2TB free somewhere to restore it to if there is something useful to extract.
> >
> > However, I don't think this was toast related. Most of our rows are short and
> > have only int, float, and short text columns. These errors hit over 60
> > different tables mostly during the analyzes we ran immediately after the
> > upgrade. It also hit during select, insert and delete statements. We did not
> > run the db more than a few minutes as the damage was so extensive.
> >
> > As far as I can tell pg_upgrade never copied any pg_clog files from the
> > old cluster to the new cluster. I wish I had detected that before running
> > the remove_old_cluster.sh script.
>
> Wow, no clogs? That would make the system very confused. You can pull
> the clogs out of the old backup and move them over if the files don't
> already exist.

We don't have the old cluster after running delete_old_cluster.sh. We use pg_dump for backup, so no clogs. We ended up restoring 20-odd dbs totalling 2.1TB from the previous day's pg_dumps.

If you review my original report I mentioned that there were only 2 clog files in the new cluster, both with ctime after the start of postgresql after the upgrade. I did the upgrade for three hosts at the same time; the others were fine. They have dozens of clogs dating back days before the upgrade. The failing system had only 2 recent clog files.

-dg

> --
> Bruce Momjian    http://momjian.us
> EnterpriseDB    http://enterprisedb.com
>
> + It's impossible for everything to be true. +
Re: [HACKERS] [GENERAL] pg_upgrade problem
Sorry I missed your reply, catching up now.

On Wed, Aug 31, 2011 at 09:56:59PM -0400, Bruce Momjian wrote:
> daveg wrote:
> > On Mon, Aug 29, 2011 at 07:49:24PM +0200, hubert depesz lubaczewski wrote:
> > > On Mon, Aug 29, 2011 at 06:54:41PM +0200, hubert depesz lubaczewski wrote:
> > > vacuumdb: vacuuming of database "etsy_v2" failed: ERROR: could not
> > > access status of transaction 3429738606
> > > DETAIL: Could not open file "pg_clog/0CC6": No such file or directory.
> > >
> > > Interestingly.
> > >
> > > In old dir there is pg_clog directory with files:
> > > 0AC0 .. 0DAF (including 0CC6, size 262144)
> > > but new pg_clog has only:
> > > 0D2F .. 0DB0
> > >
> > > File content - nearly all files that exist in both places are the same,
> > > with exception of 2 newest ones in new datadir:
> > > 3c5122f3e80851735c19522065a2d12a 0DAF
> > > 8651fc2b9fa3d27cfb5b496165cead68 0DB0
> > >
> > > 0DB0 doesn't exist in old, and 0DAF has different md5sum:
> > > 7d48996c762d6a10f8eda88ae766c5dd ...
> > I had this same thing happen this Saturday just past and my client had to
> > restore the whole 2+ TB instance from the previous day's pg_dumps.
...
> > After running pg_upgrade apparently successfully and analyzing all the

Update: reviewing the logs I see some of the analyzes hit the "could not access status of transaction" error too.

> > tables we restarted the production workload and started getting errors:
> >
> > 2011-08-27 04:18:34.015 12337 c06 postgres ERROR: could not access status of transaction 2923961093
> > 2011-08-27 04:18:34.015 12337 c06 postgres DETAIL: Could not open file "pg_clog/0AE4": No such file or directory.
> > 2011-08-27 04:18:34.015 12337 c06 postgres STATEMENT: analyze public.b_pxx;
> >
> > On examination the pg_clog directory contained only two files, timestamped
> > after the startup of the new cluster with 9.0.4. Other hosts that upgraded
> > successfully had numerous files in pg_clog dating back a few days. So it
> > appears that all the clog files went missing during the upgrade somehow.
> >
> > This happened upgrading from 8.4.7 to 9.0.4, with a brief session in between
> > at 8.4.8. We have upgraded several hosts to 9.0.4 successfully previously.
>
> I have posted this fix to the hackers email list, but I found it only
> affected old 8.3 servers, not old 8.4.X, so I am confused by your bug
> report.
>
> I have tested 8.4.X to 9.0.4 and found pg_upgrade preserves toast
> relfrozenxids properly in that case.
>
> Can you tell me what table is showing this error? Does it happen during
> vacuum? Can you run a vacuum verbose to see what it is throwing the
> error on? Thanks.

This was upgrading from 8.4.8 to 9.0.4. I don't have the running cluster anymore, but I do have tar.gz archives of it and could probably find 2TB free somewhere to restore it to if there is something useful to extract.

However, I don't think this was toast related. Most of our rows are short and have only int, float, and short text columns. These errors hit over 60 different tables mostly during the analyzes we ran immediately after the upgrade. It also hit during select, insert and delete statements. We did not run the db more than a few minutes as the damage was so extensive.

As far as I can tell pg_upgrade never copied any pg_clog files from the old cluster to the new cluster. I wish I had detected that before running the remove_old_cluster.sh script.

-dg
Re: [HACKERS] [GENERAL] pg_upgrade problem
On Mon, Aug 29, 2011 at 07:49:24PM +0200, hubert depesz lubaczewski wrote:
> On Mon, Aug 29, 2011 at 06:54:41PM +0200, hubert depesz lubaczewski wrote:
> vacuumdb: vacuuming of database "etsy_v2" failed: ERROR: could not access status of transaction 3429738606
> DETAIL: Could not open file "pg_clog/0CC6": No such file or directory.
>
> Interestingly.
>
> In old dir there is pg_clog directory with files:
> 0AC0 .. 0DAF (including 0CC6, size 262144)
> but new pg_clog has only:
> 0D2F .. 0DB0
>
> File content - nearly all files that exist in both places are the same, with
> exception of 2 newest ones in new datadir:
> 3c5122f3e80851735c19522065a2d12a 0DAF
> 8651fc2b9fa3d27cfb5b496165cead68 0DB0
>
> 0DB0 doesn't exist in old, and 0DAF has different md5sum:
> 7d48996c762d6a10f8eda88ae766c5dd
>
> one more thing. I did select count(*) from transactions and it worked.
>
> that's about it. I can probably copy over files from old datadir to new (in
> pg_clog/), and will be happy to do it, but I'll wait for your call - retrying
> with copied files might destroy some evidence.

I had this same thing happen this Saturday just past, and my client had to restore the whole 2+ TB instance from the previous day's pg_dumps. I had been thinking that perhaps I did something wrong in setting up or running the upgrade, but had not found it yet. Now that I see Hubert has the same problem it is starting to look like pg_upgrade can eat all your data.

After running pg_upgrade apparently successfully and analyzing all the tables, we restarted the production workload and started getting errors:

2011-08-27 04:18:34.015 12337 c06 postgres ERROR: could not access status of transaction 2923961093
2011-08-27 04:18:34.015 12337 c06 postgres DETAIL: Could not open file "pg_clog/0AE4": No such file or directory.
2011-08-27 04:18:34.015 12337 c06 postgres STATEMENT: analyze public.b_pxx;

On examination the pg_clog directory contained only two files, timestamped after the startup of the new cluster with 9.0.4. Other hosts that upgraded successfully had numerous files in pg_clog dating back a few days. So it appears that all the clog files went missing during the upgrade somehow.

This happened upgrading from 8.4.7 to 9.0.4, with a brief session in between at 8.4.8. We have upgraded several hosts to 9.0.4 successfully previously.

-dg
Re: [HACKERS] OperationalError: FATAL: lock AccessShareLock on object 0/1260/0 is already
On Sun, Aug 14, 2011 at 12:16:39AM -0400, Robert Haas wrote:
> On Fri, Aug 12, 2011 at 7:19 PM, daveg wrote:
> > This seems to be bug month for my client. Now they are seeing periods
> > where all new connections fail immediately with the error:
> >
> > FATAL: lock AccessShareLock on object 0/1260/0 is already held
...
> > What can I do to help track this down?
>
> I've seen that error (though not that exact fact pattern) caused by
> bad RAM. It's unclear to me what else could cause it.
>
> In terms of debugging, it seems like it might be sensible to start by
> injecting some debugging code that dumps out the contents of the LOCK
> and LOCALLOCK structures at the point the error occurs.

I've made up the attached patch to print this; please suggest any additions. I'll deploy this on a couple of the production hosts that have had the issue this evening, but there is no telling when or if it will strike next.

-dg

*** postgresql-9.0.4/src/backend/storage/lmgr/lock.c.orig	2011-08-22 13:46:29.386428943 -0700
--- postgresql-9.0.4/src/backend/storage/lmgr/lock.c	2011-08-23 00:12:43.456422966 -0700
***************
*** 469,474 ****
--- 469,534 ----
  	return LockAcquireExtended(locktag, lockmode, sessionLock, dontWait, true);
  }
  
+ int DG_FORCE_TRAP = 0;
+ 
+ inline static void
+ DG_LOCK_PRINT(const char *where, const LOCK *lock, LOCKMODE type)
+ {
+ 	elog(LOG,
+ 		 "%s: lock(%p) id(%u,%u,%u,%u,%u,%u) grantMask(%x) "
+ 		 "req(%d,%d,%d,%d,%d,%d,%d)=%d "
+ 		 "grant(%d,%d,%d,%d,%d,%d,%d)=%d wait(%d) type(%s)",
+ 		 where, lock,
+ 		 lock->tag.locktag_field1, lock->tag.locktag_field2,
+ 		 lock->tag.locktag_field3, lock->tag.locktag_field4,
+ 		 lock->tag.locktag_type, lock->tag.locktag_lockmethodid,
+ 		 lock->grantMask,
+ 		 lock->requested[1], lock->requested[2], lock->requested[3],
+ 		 lock->requested[4], lock->requested[5], lock->requested[6],
+ 		 lock->requested[7], lock->nRequested,
+ 		 lock->granted[1], lock->granted[2], lock->granted[3],
+ 		 lock->granted[4], lock->granted[5], lock->granted[6],
+ 		 lock->granted[7], lock->nGranted,
+ 		 lock->waitProcs.size,
+ 		 LockMethods[LOCK_LOCKMETHOD(*lock)]->lockModeNames[type]);
+ }
+ 
+ inline static void
+ DG_PROCLOCK_PRINT(const char *where, const PROCLOCK *proclockP)
+ {
+ 	elog(LOG,
+ 		 "%s: proclock(%p) lock(%p) method(%u) proc(%p) "
+ 		 "hold(%x) release(%x) "
+ 		 "links(p,n): lock=(%p,%p), proc=(%p,%p)",
+ 		 where, proclockP, proclockP->tag.myLock,
+ 		 PROCLOCK_LOCKMETHOD(*(proclockP)),
+ 		 proclockP->tag.myProc,
+ 		 (int) proclockP->holdMask, (int) proclockP->releaseMask,
+ 		 proclockP->lockLink.prev, proclockP->lockLink.next,
+ 		 proclockP->procLink.prev, proclockP->procLink.next
+ 		 );
+ }
+ 
+ inline static void
+ DG_LOCALLOCK_PRINT(const char *where, const LOCALLOCK *localP)
+ {
+ 	elog(LOG,
+ 		 "%s: locallock(%p) id(%u,%u,%u,%u,%u,%u mode %x) "
+ 		 "lock(%p), proclock(%p) "
+ 		 "hashcode %x nlocks %ld numLockOwners %d maxLockOwners %d ",
+ 		 where, localP,
+ 		 localP->tag.lock.locktag_field1, localP->tag.lock.locktag_field2,
+ 		 localP->tag.lock.locktag_field3, localP->tag.lock.locktag_field4,
+ 		 localP->tag.lock.locktag_type, localP->tag.lock.locktag_lockmethodid,
+ 		 localP->tag.mode,
+ 		 localP->lock, localP->proclock,
+ 		 localP->hashcode, localP->nLocks,
+ 		 localP->numLockOwners, localP->maxLockOwners
+ 		 /* localP->lockOwners[0].owner, localP->lockOwners.nlocks "%p %d" */
+ 		 );
+ }
+ 
  /*
   * LockAcquireExtended - allows us to specify additional options
   *
***************
*** 500,505 ****
--- 560,568 ----
  	LWLockId	partitionLock;
  	int			status;
  	bool		log_lock = false;
+ 	int			DG_found_local = -1;
+ 	int			DG_found_lock = -1;
+ 	int			DG_found_proc = -1;
  
  	if (lockmethodid <= 0 || lockmethodid >= lengthof(LockMethods))
  		elog(ERROR, "unrecognized lock method: %d", lockmethodid);
***************
*** 5
Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held
On Fri, Aug 12, 2011 at 04:19:37PM -0700, daveg wrote:
>
> This seems to be bug month for my client. Now they are seeing periods
> where all new connections fail immediately with the error:
>
> FATAL: lock AccessShareLock on object 0/1260/0 is already held
>
> This happens on postgresql 8.4.7 on a large (512GB, 32 core) system that has
> been up for months. It started happening sporadically a few days ago. It will
> do this for a period of several minutes to an hour and then go back to
> normal for hours or days.
>
> One complete failing session out of several hundred around that time:
> ------------------------------------------------------------------
> 2011-08-09 00:01:04.446 8823 [unknown] [unknown] LOG: connection received: host=op05.xxx port=34067
> 2011-08-09 00:01:04.446 8823 c77 apps LOG: connection authorized: user=apps database=c77
> 2011-08-09 00:01:04.449 8823 c77 apps FATAL: lock AccessShareLock on object 0/1260/0 is already held
> ------------------------------------------------------------------

This is to add additional information to the original report:

For a while this was happening on many different databases in one postgresql 8.4.7 instance on a single large host ('U2', 512GB, 64 cpu) running RH 5. That has been quiet for several days, and the newest batches of errors have happened only on a single database 'c23', in a postgresql 9.0.4 instance on a smaller host ('A', 64GB, 8 cpu) running SuSE 10.2. No memory errors or other misbehaviour have been seen on either of these hosts in recent months.

The original error was:

    lock AccessShareLock on object 0/1260/0 is already held

which is for pg_database. The recent errors are:

    lock AccessShareLock on object 16403/2615/0 is already held

which is for pg_namespace in database c23.

All of the original and most of the recent batches of errors were immediately after connecting to a database and being authorized, that is, before any statements were attempted. However, some of the most recent are on the first "query" statement. That is, after logging in and doing things like "set transaction ..." the first select would hit this error.

It seems to come in clusters, sometimes, which suggests something shared by multiple processes. For example, here are the times for the errors on c23 in the afternoon of August 20:

20 07:14:12.722
20 16:05:07.798
20 16:05:07.808
20 16:05:10.519
20 16:07:07.726
20 16:07:08.722
20 16:07:09.734
20 16:07:10.656
20 16:07:25.436
20 16:22:23.983
20 16:22:24.014
20 16:22:24.335
20 16:22:24.409
20 16:22:24.477
20 16:22:24.499
20 16:22:24.516
20 16:30:58.210
20 16:31:15.261
20 16:31:15.296
20 16:31:15.324
20 16:31:15.348
20 18:06:16.515
20 18:06:49.198
20 18:06:49.204
20 18:06:51.444
20 21:03:05.940

So far I've got:

- affects system tables
- happens very soon after process startup
- in 8.4.7 and 9.0.4
- not likely to be hardware or OS related
- happens in clusters for periods of a few seconds to many minutes

I'll work on printing the LOCK and LOCALLOCK when it happens, but it's hard to get downtime to pick up new builds. Any other ideas on getting to the bottom of this?

Thanks

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
[adding back hackers so the thread shows the resolution]

On Sun, Aug 14, 2011 at 07:02:55PM -0400, Tom Lane wrote:
> Sounds good. Based on my own testing so far, I think that patch will
> probably make things measurably better for you, though it won't resolve
> every corner case.

The most recent catalog vacuums did vacuum full pg_class in 34 databases on that instance with no new failures. So it looks like the patch fixes it. This is not conclusive, but it looks very good so far. I'll send an update if I see any new errors during the week.

Thanks for your help on this. It looks like it has sent you on a merry search through all the catcache related program activities. I'm assuming this patch or some improvement on it will show up in a point release. Meanwhile, if this works as is for a couple more days we will resume upgrading the rest of the hosts to 9.0 using this patch.

-dg
Re: [HACKERS] OperationalError: FATAL: lock AccessShareLock on object 0/1260/0 is already
On Sun, Aug 14, 2011 at 12:16:39AM -0400, Robert Haas wrote:
> On Fri, Aug 12, 2011 at 7:19 PM, daveg wrote:
> > This seems to be bug month for my client. Now they are seeing periods
> > where all new connections fail immediately with the error:
> >
> > FATAL: lock AccessShareLock on object 0/1260/0 is already held
> >
> > This happens on postgresql 8.4.7 on a large (512GB, 32 core) system that has
> > been up for months. It started happening sporadically a few days ago. It will
> > do this for a period of several minutes to an hour and then go back to
> > normal for hours or days.
> >
> > One complete failing session out of several hundred around that time:
> > ------------------------------------------------------------------
> > 2011-08-09 00:01:04.446 8823 [unknown] [unknown] LOG: connection received: host=op05.xxx port=34067
> > 2011-08-09 00:01:04.446 8823 c77 apps LOG: connection authorized: user=apps database=c77
> > 2011-08-09 00:01:04.449 8823 c77 apps FATAL: lock AccessShareLock on object 0/1260/0 is already held
> > ------------------------------------------------------------------
> >
> > What can I do to help track this down?
>
> I've seen that error (though not that exact fact pattern) caused by
> bad RAM. It's unclear to me what else could cause it.

I'll look into that. I think it is only happening on one host, so that might make sense. On the other hand, these are pretty fancy hosts, all ECC and that, so I'd hope they would have squeaked about bad ram.

> In terms of debugging, it seems like it might be sensible to start by
> injecting some debugging code that dumps out the contents of the LOCK
> and LOCALLOCK structures at the point the error occurs.

Hmm, we will update to 9.0 next week on these hosts, so I'll try to hold off on this part at least until then.

-dg
[HACKERS] OperationalError: FATAL: lock AccessShareLock on object 0/1260/0 is already
This seems to be bug month for my client. Now they are seeing periods where all new connections fail immediately with the error:

    FATAL: lock AccessShareLock on object 0/1260/0 is already held

This happens on postgresql 8.4.7 on a large (512GB, 32 core) system that has been up for months. It started happening sporadically a few days ago. It will do this for a period of several minutes to an hour and then go back to normal for hours or days.

One complete failing session out of several hundred around that time:

    2011-08-09 00:01:04.446 8823 [unknown] [unknown] LOG: connection received: host=op05.xxx port=34067
    2011-08-09 00:01:04.446 8823 c77 apps LOG: connection authorized: user=apps database=c77
    2011-08-09 00:01:04.449 8823 c77 apps FATAL: lock AccessShareLock on object 0/1260/0 is already held

What can I do to help track this down?

-dg
Re: [HACKERS] our buffer replacement strategy is kind of lame
On Fri, Aug 12, 2011 at 01:28:49PM +0100, Simon Riggs wrote:
> I think there are reasonable arguments to make
>
> * prefer_cache = off (default) | on a table level storage parameter,
>   =on will disable the use of BufferAccessStrategy
>
> * make cache_spoil_threshold a parameter, with default 0.25
>
> Considering the world of very large RAMs in which we now live, some
> control of the above makes sense.

As long as we are discussing cache settings for tables, I have a client who would like to be able to lock specific tables and indexes into cache as they have strict response time requirements for particular queries. At the moment they are running postgres with a tablespace on ramfs and taking frequent backups, but this is not optimal.

-dg
Re: [HACKERS] VACUUM FULL versus system catalog cache invalidation
On Fri, Aug 12, 2011 at 09:26:02PM +0100, Simon Riggs wrote:
> With HOT, there is very little need to perform a VACUUM FULL on any
> shared catalog table. Look at the indexes...
>
> I would suggest that VACUUM FULL perform only a normal VACUUM on
> shared catalog tables, then perform an actual VACUUM FULL only in dire
> need (some simple heuristic in size and density). This avoids doing a
> VACUUM FULL unless it is actually necessary to do so. That has the
> added advantage of not locking out essential tables, which is always a
> concern.
>
> In the unlikely event we do actually have to VACUUM FULL a shared
> catalog table, nuke any cache entry for the whole shared catalog. That
> way we absolutely and positively will never get any more bugs in this
> area, ever again. Sounds harsh, but these events are only actually
> needed very, very rarely and hygiene is more important than a few
> minor points of performance.

This is a very optimistic view. My client makes heavy use of temp tables. HOT and autovacuum are not sufficient to keep catalog bloat under control. We run a daily script that calculates the density of the catalog and only vacuum fulls those that are severely bloated. Here is a result from a recent bloat check on one db. 'packed' is the number of pages needed for the rows if they were packed; 'bloat' is the multiple of pages in use over the number really needed.

    relation    | tuples | pages | packed | bloat
    ------------+--------+-------+--------+------
    pg_class    |   4292 | 10619 |    114 |  93.2
    pg_depend   |  25666 |  7665 |    217 |  35.4
    pg_attrdef  |   6585 |  7595 |    236 |  32.2
    pg_type     |   4570 |  8177 |    416 |  19.6
    pg_shdepend |  52040 |  7968 |    438 |  18.2

-dg
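The 'packed' and 'bloat' figures in the table above can be approximated from the tuple count and an average row width. This is a rough sketch only; the script's actual formula is not shown in the thread, and the page/tuple overhead constants and the 172-byte example width are assumptions for illustration:

```python
import math

BLOCK_SIZE = 8192        # default Postgres page size
PAGE_OVERHEAD = 24       # page header bytes (assumed for this sketch)
ITEM_OVERHEAD = 4 + 24   # line pointer + heap tuple header (assumed)

def packed_pages(tuples, avg_row_width):
    """Pages needed if all live rows were packed tightly."""
    rows_per_page = (BLOCK_SIZE - PAGE_OVERHEAD) // (avg_row_width + ITEM_OVERHEAD)
    return math.ceil(tuples / rows_per_page)

def bloat(pages, tuples, avg_row_width):
    """Multiple of pages in use over the number really needed."""
    return pages / packed_pages(tuples, avg_row_width)

# With a hypothetical 172-byte average pg_class row, the pg_class line
# above (4292 tuples, 10619 pages) comes out in the same ballpark as
# the reported 93.2x bloat.
print(round(bloat(10619, 4292, 172), 1))
```

A daily job can then compare the bloat multiple against a threshold and choose VACUUM FULL only for the worst offenders, as the script described above does.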
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Fri, Aug 05, 2011 at 12:10:31PM -0400, Tom Lane wrote:
> I wrote:
> > Ahh ... you know what, never mind about stack traces, let's just see if
> > the attached patch doesn't fix it.
>
> On reflection, that patch would only fix the issue for pg_class, and
> that's not the only catalog that gets consulted during relcache reloads.
> I think we'd better do it as attached, instead.
>
> regards, tom lane

Should this be applied in addition to the earlier patch, or to replace it?

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Thu, Aug 04, 2011 at 04:16:08PM -0400, Tom Lane wrote:
> daveg writes:
> > We are seeing 'cannot read' and 'cannot open' errors too that would be
> > consistent with trying to use a vanished file.
>
> Yeah, these all seem consistent with the idea that the failing backend
> somehow missed an update for the relation mapping file. You would get
> the "could not find pg_class tuple" syndrome if the process was holding
> an open file descriptor for the now-deleted file, and otherwise cannot
> open/cannot read type errors. And unless it later received another
> sinval message for the relation mapping file, the errors would persist.
>
> If this theory is correct then all of the file-related errors ought to
> match up to recently-vacuumed mapped catalogs or indexes (those are the
> ones with relfilenode = 0 in pg_class). Do you want to expand your
> logging of the VACUUM FULL actions and see if you can confirm that idea?

At your service, what would you like to see?

> Since the machine is running RHEL, I think we can use glibc's
> backtrace() function to get simple stack traces without too much effort.
> I'll write and test a patch and send it along in a bit.

Great. Any point to try to capture SI events somehow?

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Thu, Aug 04, 2011 at 12:28:31PM -0400, Tom Lane wrote:
> daveg writes:
> > Summary: the failing process reads 0 rows from 0 blocks from the OLD
> > relfilenode.
>
> Hmm. This seems to mean that we're somehow missing a relation mapping
> invalidation message, or perhaps not processing it soon enough during
> some complex set of invalidations. I did some testing with that in mind
> but couldn't reproduce the failure. It'd be awfully nice to get a look
> at the call stack when this happens for you ... what OS are you running?

To recap, a few observations:

- When it happens the victim has recently been waiting on a lock for several seconds.
- We create a lot of temp tables, hundreds of thousands a day.
- There are catalog vacuum fulls and reindexes running on 30 odd other databases at the same time. The script estimates the amount of bloat on each table and index and chooses either reindex on specific indexes or vacuum full as needed.
- This is a 32 core (64 with hyperthreading) 512GB host with several hundred connections.
- We are seeing 'cannot read' and 'cannot open' errors too that would be consistent with trying to use a vanished file.

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Thu, Aug 04, 2011 at 12:28:31PM -0400, Tom Lane wrote:
> daveg writes:
> > Summary: the failing process reads 0 rows from 0 blocks from the OLD
> > relfilenode.
>
> Hmm. This seems to mean that we're somehow missing a relation mapping
> invalidation message, or perhaps not processing it soon enough during
> some complex set of invalidations. I did some testing with that in mind
> but couldn't reproduce the failure. It'd be awfully nice to get a look
> at the call stack when this happens for you ... what OS are you running?

    cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 5.5 (Tikanga)
    Linux version 2.6.18-194.el5

I can use gdb as well if we can get a core or stop the correct process. Perhaps a long sleep when it hits this? Or perhaps we could log invalidate processing for pg_class?

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Wed, Aug 03, 2011 at 11:18:20AM -0400, Tom Lane wrote:
> Evidently not, if it's not logging anything, but now the question is
> why. One possibility is that for some reason RelationGetNumberOfBlocks
> is persistently lying about the file size. (We've seen kernel bugs
> before that resulted in transiently wrong values, so this isn't totally
> beyond the realm of possibility.) Please try the attached patch, which
> extends the previous one to add a summary line including the number of
> blocks physically scanned by the seqscan.

Ok, I have results from the latest patch and have attached a redacted server log with the select relfilenode output added inline. This is the shorter of the logs and shows the sequence pretty clearly. I have additional logs if wanted.

Summary: the failing process reads 0 rows from 0 blocks from the OLD relfilenode.

-dg

    SS.msec pid   db  user     statement
    ------- ----- --- -------- ---------
    03.804  29706 c27 postgres connection authorized: user=postgres database=c27
    ...
    03.824  29706 c27 postgres set statement_timeout=60;
    03.824  29706 c27 postgres 0.364 ms
    03.825  29706 c27 postgres select current_database() as db, relname, oid, pg_relation_filenode(oid) as filenode, clock_timestamp()::time as ts from pg_class where relname like 'pg_class%';
    03.829  29706 c27 postgres 4.173 ms

     db  | relname                    | oid  | filenode   | ts
    -----+----------------------------+------+------------+-----------------
     c27 | pg_class                   | 1259 | 1245833951 | 21:31:03.828293
     c27 | pg_class_oid_index         | 2662 | 1245833955 | 21:31:03.828791
     c27 | pg_class_relname_nsp_index | 2663 | 1259100530 | 21:31:03.828807

    03.829  29706 c27 postgres vacuum full pg_catalog.pg_class;
    03.829  29706 c27 postgres LOCATION: exec_simple_query, postgres.c:900
    ...
    03.845  29707 c27 postgres disconnection: session time: 0:00:00.041 user=postgres database=c27 host=bk-0
    ...
    08.856  29706 c27 postgres process 29706 still waiting for RowExclusiveLock on relation 1214 of database 0 after 5000.483 ms
    08.856  29706 c27 postgres LOCATION: ProcSleep, proc.c:1059
    08.856  29706 c27 postgres STATEMENT: vacuum full pg_catalog.pg_class;
    09.383  29711 LOG: 0: process 29711 still waiting for AccessShareLock on relation 1259 of database 16408 after 5000.331 ms
    09.383  29711 LOCATION: ProcSleep, proc.c:1059
    ...
    11.559  28857 c27 apps SELECT ...
    16.560  28857 c27 apps process 28857 still waiting for AccessShareLock on relation 1259 of database 16408 after 5001.209 ms
    16.560  28857 c27 apps LOCATION: ProcSleep, proc.c:1059
    16.560  28857 c27 apps STATEMENT: SELECT ...
    19.763  29706 c27 postgres process 29706 acquired RowExclusiveLock on relation 1214 of database 0 after 15907.284 ms
    19.763  29706 c27 postgres LOCATION: ProcSleep, proc.c:1063
    19.763  29706 c27 postgres STATEMENT: vacuum full pg_catalog.pg_class;
    ...
    25.735  29711 LOG: 0: process 29711 acquired AccessShareLock on relation 1259 of database 16408 after 21352.393 ms
    25.735  29711 LOCATION: ProcSleep, proc.c:1063
    25.735  28857 c27 apps process 28857 acquired AccessShareLock on relation 1259 of database 16408 after 14176.040 ms
    25.735  28857 c27 apps LOCATION: ProcSleep, proc.c:1063
    25.735  28857 c27 apps STATEMENT: SELECT
    25.736  28857 c27 apps ScanPgRelationDetailed: found 0 tuples with OID 2662 in 0 blocks of filenode 1245833951
    25.736  28857 c27 apps LOCATION: ScanPgRelationDetailed, relcache.c:372
    25.736  28857 c27 apps STATEMENT: SELECT
    25.736  28857 c27 apps ERROR: XX000: could not find pg_class tuple for index 2662
    25.736  28857 c27 apps LOCATION: RelationReloadIndexInfo, relcache.c:1816
    25.736  28857 c27 apps STATEMENT: SELECT
    25.736  29706 c27 postgres 21906.865 ms
    25.737  29706 c27 postgres select current_database() as db, relname, oid, pg_relation_filenode(oid) as filenode, clock_timestamp()::time as ts from pg_class where relname like 'pg_class%';
    25.767  29706 c27 postgres 30.902 ms

     db  | relname                    | oid  | filenode   | ts
    -----+----------------------------+------+------------+-----------------
     c27 | pg_class                   | 1259 | 1279787837 | 21:31:25.76726
     c27 | pg_class_oid_index         | 2662 | 1279788022 | 21:31:25.767764
     c27 | pg_class_relname_nsp_index | 2663 | 1279788023 | 21:31:25.767782

    25.768  29706 c27 postgres vacuum full pg_catalog.pg_attribute;
    25.775  28857 c27 apps disconnection: session time: 0:07:07.758 user=apps database=c27 host=op-01
    25.775  28857 c27 apps LOCATION: log_disconnections, postgres.c:4339
    ...
    30.914  29711 LOG: 0: process 29711 still waiting for AccessShareLock on relation 1249 of database 1
Re: [HACKERS] Further news on Clang - spurious warnings
On Wed, Aug 03, 2011 at 04:03:39PM -0400, Tom Lane wrote:
> The C standard specifies that signed-to-unsigned conversions must work
> like that; and even if the standard didn't, it would surely work like
> that on any machine with two's-complement representation, which is to
> say every computer built in the last forty years or so. So I don't find
> it a questionable assumption.

I had the "pleasure" of working on a Univac 1108 in about 1978 and it was very definitely ones complement. I'm somewhat amazed to find that the Univac 1100 series architecture and instruction set lives on to this day. The last pure 1100 seems to be the Unisys 2200/3800 released in 1997. Even later U1100/Exec 8 descendants appear to still exist and are still actively supported:

http://en.wikipedia.org/wiki/Unisys_OS_2200_operating_system

So there are still ones complement machines out there. However I suggest we pretend otherwise and continue to ignore them.

-dg
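The conversion rule Tom cites (C99 6.3.1.3: a value out of range for an unsigned type is reduced modulo 2^N) can be modeled directly. A small sketch in Python, whose unbounded integers make the modular arithmetic explicit:

```python
def to_unsigned(value, bits=32):
    # C's signed-to-unsigned conversion: repeatedly add or subtract
    # 2**bits until the value is in range, i.e. reduce modulo 2**bits.
    # On two's-complement hardware this is a no-op reinterpretation of
    # the same bit pattern; on a ones complement machine like the 1108
    # the implementation would have to do real work to comply.
    return value % (1 << bits)

print(hex(to_unsigned(-1)))       # 0xffffffff
print(hex(to_unsigned(-2, 16)))   # 0xfffe
```

This is why the standard pins down the result of the conversion while leaving the signed representation itself implementation-defined.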
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Mon, Aug 01, 2011 at 01:23:49PM -0400, Tom Lane wrote:
> daveg writes:
> > On Sun, Jul 31, 2011 at 11:44:39AM -0400, Tom Lane wrote:
> >> I think we need to start adding some instrumentation so we can get a
> >> better handle on what's going on in your database. If I were to send
> >> you a source-code patch for the server that adds some more logging
> >> printout when this happens, would you be willing/able to run a patched
> >> build on your machine?
>
> > Yes we can run an instrumented server so long as the instrumentation does
> > not interfere with normal operation. However, scheduling downtime to switch
> > binaries is difficult, and generally needs to happen on a weekend, but
> > sometimes can be expedited. I'll look into that.
>
> OK, attached is a patch against the 9.0 branch that will re-scan pg_class
> after a failure of this sort occurs, and log what it sees in the tuple
> header fields for each tuple for the target index. This should give us
> some useful information. It might be worthwhile for you to also log the
> results of
>
>     select relname, pg_relation_filenode(oid) from pg_class
>     where relname like 'pg_class%';
>
> in your script that does VACUUM FULL, just before and after each time it
> vacuums pg_class. That will help in interpreting the relfilenodes in
> the log output.

We have installed the patch and have encountered the error as usual. However there is no additional output from the patch. I'm speculating that the pg_class scan in ScanPgRelationDetailed() fails to return tuples somehow. I have also been trying to trace it further by reading the code, but have not got any solid hypothesis yet. In the absence of any debugging output I've been trying to deduce the call tree leading to the original failure.
So far it looks like this:

    RelationReloadIndexInfo(Relation)                      // Relation is 2662 and !rd_isvalid
      pg_class_tuple = ScanPgRelation(2662, indexOK=false) // returns NULL
        pg_class_desc = heap_open(1259, ACC_SHARE)
          r = relation_open(1259, ACC_SHARE)               // locks oid, ensures RelationIsValid(r)
            r = RelationIdGetRelation(1259)
              r = RelationIdCacheLookup(1259)              // assume success
              if !rd_isvalid:
                RelationClearRelation(r, true)
                  RelationInitPhysicalAddr(r)              // r is pg_class relcache

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Sun, Jul 31, 2011 at 11:44:39AM -0400, Tom Lane wrote:
> daveg writes:
> > Here is the update: the problem happens with vacuum full alone, no reindex
> > is needed to trigger it. I updated the script to avoid reindexing after
> > vacuum. Over the past two days there are still many occurrences of this
> > error coincident with the vacuum.
>
> Well, that jives with the assumption that the one case we saw in
> the buildfarm was the same thing, because the regression tests were
> certainly only doing a VACUUM FULL and not a REINDEX of pg_class.
> But it doesn't get us much closer to understanding what's happening.
> In particular, it seems to knock out most ideas associated with race
> conditions, because the VAC FULL should hold exclusive lock on pg_class
> until it's completely done (including index rebuilds).
>
> I think we need to start adding some instrumentation so we can get a
> better handle on what's going on in your database. If I were to send
> you a source-code patch for the server that adds some more logging
> printout when this happens, would you be willing/able to run a patched
> build on your machine?

Yes we can run an instrumented server so long as the instrumentation does not interfere with normal operation. However, scheduling downtime to switch binaries is difficult, and generally needs to happen on a weekend, but sometimes can be expedited. I'll look into that.

> (BTW, just to be perfectly clear ... the "could not find pg_class tuple"
> errors always mention index 2662, right, never any other number?)

Yes, only index 2662, never any other.

I'm attaching a somewhat redacted log for two different databases on the same instance around the time of vacuum full of pg_class in each database. My observations so far are:

- the error occurs at commit of vacuum full of pg_class
- in these cases the error hits autovacuum after it waited for a lock on pg_class
- in these two cases there was a new process startup while the vacuum was running. Don't know if this is relevant.
- while these hit autovacuum, the error does hit other processes (just not in these sessions). Unknown if autovacuum is a required component.

-dg

NOTES:
-- most statements for db c01 in one catalog vacuum session
-- database c01 is oid 16404
-- applications run as user 'app'
-- vacuums run as user postgres
-- somewhat redacted

    MM:SS.ms  pid   db  user     log text
    --------- ----- --- -------- --------
    03:00.350 2716  c01 postgres vacuum full pg_catalog.pg_class;
    03:00.507 2866  LOG: process 2866 acquired ShareUpdateExclusiveLock on relation 2604 of database 16416 after 10143.750 ms
    03:01.222 21685 c01 app 11100.670 ms
    03:01.223 21685 c01 app SELECT DISTINCT
    03:01.224 21685 c01 app 1.524 ms
    03:02.031 3001  c01 app connection authorized: user=app database=c01
    03:04.952 2787  LOG: process 2787 still waiting for AccessShareLock on relation 1259 of database 16412 after 5001.035 ms
    03:05.065 2787  LOG: process 2787 acquired AccessShareLock on relation 1259 of database 16412 after 5114.253 ms
    03:05.564 2977  LOG: process 2977 still waiting for AccessShareLock on relation 1259 of database 16404 after 5000.970 ms
    03:05.640 2731  LOG: process 2731 still waiting for RowExclusiveLock on relation 1259 of database 16404 after 5000.186 ms
    03:06.045 2977  LOG: process 2977 acquired AccessShareLock on relation 1259 of database 16404 after 5482.389 ms
    03:06.045 2731  LOG: process 2731 acquired RowExclusiveLock on relation 1259 of database 16404 after 5405.652 ms
    03:06.046 2731  ERROR: could not find pg_class tuple for index 2662
    03:06.046 2731  CONTEXT: automatic vacuum of table "c01.pg_catalog.pg_index"
    03:06.046 2731  ERROR: could not find pg_class tuple for index 2662
    03:06.046 2716  c01 postgres 5696.537 ms
    03:06.056 3001  c01 app SET SESSION TIME ZONE 'UTC';
    ...
    03:06.057 2716  c01 postgres vacuum full pg_catalog.pg_rewrite;
    03:06.066 3001  c01 app 10.459 ms
    03:06.091 3001  c01 app disconnection: session time: 0:00:04.085 user=app database=c01 host=xxxl01
    03:08.908 3006  LOG: process 3006 still waiting for AccessShareLock on relation 2659 of database 16407 after 5000.778 ms
    03:11.777 21685 c01 app SELECT
    03:11.779 21685 c01 app 2.296 ms
    03:11.779 21685 c01 app SELECT
    03:11.780 21685 c01 app 0.328 ms
    03:11.798 21685 c01 app SELECT
    03:11.799 21685 c01 app 0.348 ms
    03:11.800 21685 c01 app SELECT
    03:11.800 21685 c01 app 0.205 ms
    03:11.804 21685 c01 app SELECT
    03:11.805 21685 c01 app 0.589 ms
    03:11.806 21685 c01 app SELECT DISTINCT
    03:11.809 21685 c01 app 3.552 ms
    03:11.810 21685 c01 app
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Thu, Jul 28, 2011 at 11:31:31PM -0700, daveg wrote:
> On Thu, Jul 28, 2011 at 07:45:01PM -0400, Robert Haas wrote:
> > REINDEX. My guess is that this is happening either right around the
> > time the VACUUM FULL commits or right around the time the REINDEX
> > commits. It'd be helpful to know which, if you can figure it out.
>
> I'll update my vacuum script to skip reindexes after vacuum full for 9.0
> servers and see if that makes the problem go away. Thanks for reminding
> me that they are not needed. However, I suspect it is the vacuum, not the
> reindex causing the problem. I'll update when I know.

Here is the update: the problem happens with vacuum full alone, no reindex is needed to trigger it. I updated the script to avoid reindexing after vacuum. Over the past two days there are still many occurrences of this error coincident with the vacuum.

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Fri, Jul 29, 2011 at 09:55:46AM -0400, Tom Lane wrote:
> The thing that was bizarre about the one instance in the buildfarm was
> that the error was persistent, ie, once a session had failed all its
> subsequent attempts to access pg_class failed too. I gather from Dave's
> description that it's working that way for him too. I can think of ways
> that there might be a transient race condition against a REINDEX, but
> it's very unclear why the failure would persist across multiple
> attempts. The best idea I can come up with is that the session has
> somehow cached a wrong commit status for the reindexing transaction,
> causing it to believe that both old and new copies of the index's
> pg_class row are dead ... but how could that happen?

It is definitely persistent. Once triggered the error occurs for any new statement until the session exits.

> The underlying
> state in the catalog is not wrong, because no concurrent sessions are
> upset (at least not in the buildfarm case ... Dave, do you see more than
> one session doing this at a time?).

It looks like it happens to multiple sessions, so far as one can tell from the timestamps of the errors:

    timestamp    sessionid     error
    ------------ ------------- -----
    03:05:37.434 4e26a861.4a6d could not find pg_class tuple for index 2662
    03:05:37.434 4e26a861.4a6f could not find pg_class tuple for index 2662
    03:06:12.041 4e26a731.438e could not find pg_class tuple for index 2662
    03:06:12.042 4e21b6a3.629b could not find pg_class tuple for index 2662
    03:06:12.042 4e26a723.42ec could not find pg_class tuple for index 2662 at character 13

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Thu, Jul 28, 2011 at 07:45:01PM -0400, Robert Haas wrote:
> On Thu, Jul 28, 2011 at 5:46 PM, daveg wrote:
> > On Thu, Jul 28, 2011 at 09:46:41AM -0400, Robert Haas wrote:
> >> On Wed, Jul 27, 2011 at 8:28 PM, daveg wrote:
> >> > My client has been seeing regular instances of the following sort of
> >> > problem:
> >> On what version of PostgreSQL?
> >
> > 9.0.4.
> >
> > I previously said:
> >> > This occurs on postgresql 9.0.4. on 32 core 512GB Dell boxes. We have
> >> > identical systems still running 8.4.8 that do not have this issue, so I'm
> >> > assuming it is related to the vacuum full work done for 9.0. Oddly, we
> >> > don't see this on the smaller hosts (8 core, 64GB, slower cpus) running
> >> > 9.0.4, so it may be timing related.
>
> Ah, OK, sorry. Well, in 9.0, VACUUM FULL is basically CLUSTER, which
> means that a REINDEX is happening as part of the same operation. In
> 9.0, there's no point in doing VACUUM FULL immediately followed by
> REINDEX. My guess is that this is happening either right around the
> time the VACUUM FULL commits or right around the time the REINDEX
> commits. It'd be helpful to know which, if you can figure it out.

I'll update my vacuum script to skip reindexes after vacuum full for 9.0 servers and see if that makes the problem go away. Thanks for reminding me that they are not needed. However, I suspect it is the vacuum, not the reindex causing the problem. I'll update when I know.

> If there's not a hardware problem causing those read errors, maybe a
> backend is somehow ending up with a stale or invalid relcache entry.
> I'm not sure exactly how that could be happening, though...

It does not appear to be a hardware problem. I also suspect it is a stale relcache.

-dg
Re: [HACKERS] error: could not find pg_class tuple for index 2662
On Thu, Jul 28, 2011 at 09:46:41AM -0400, Robert Haas wrote:
> On Wed, Jul 27, 2011 at 8:28 PM, daveg wrote:
> > My client has been seeing regular instances of the following sort of
> > problem:
>
> On what version of PostgreSQL?

9.0.4. I previously said:

> > This occurs on postgresql 9.0.4. on 32 core 512GB Dell boxes. We have
> > identical systems still running 8.4.8 that do not have this issue, so I'm
> > assuming it is related to the vacuum full work done for 9.0. Oddly, we don't
> > see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4,
> > so it may be timing related.

-dg
[HACKERS] error: could not find pg_class tuple for index 2662
My client has been seeing regular instances of the following sort of problem:

    ...
    03:06:09.453 exec_simple_query, postgres.c:900
    03:06:12.042 XX000: could not find pg_class tuple for index 2662 at character 13
    03:06:12.042 RelationReloadIndexInfo, relcache.c:1740
    03:06:12.042 INSERT INTO zzz_k(k) SELECT ...
    03:06:12.045 0: statement: ABORT
    03:06:12.045 exec_simple_query, postgres.c:900
    03:06:12.045 0: duration: 0.100 ms
    03:06:12.045 exec_simple_query, postgres.c:1128
    03:06:12.046 0: statement: INSERT INTO temp_807 VALUES (...)
    03:06:12.046 exec_simple_query, postgres.c:900
    03:06:12.046 XX000: could not find pg_class tuple for index 2662 at character 13
    03:06:12.046 RelationReloadIndexInfo, relcache.c:1740
    03:06:12.046 INSERT INTO temp_807 VALUES (...)
    03:06:12.096 08P01: unexpected EOF on client connection
    03:06:12.096 SocketBackend, postgres.c:348
    03:06:12.096 XX000: could not find pg_class tuple for index 2662
    03:06:12.096 RelationReloadIndexInfo, relcache.c:1740
    03:06:12.121 0: disconnection: session time: 0:06:08.537 user=ZZZ database=ZZZ_01
    03:06:12.121 log_disconnections, postgres.c:4339

The above happens regularly (but not completely predictably) corresponding with a daily cronjob that checks the catalogs for bloat and does vacuum full and/or reindex as needed. Since some of the applications make very heavy use of temp tables this will usually mean pg_class and pg_index get vacuum full and reindex.

Sometimes queries will fail due to being unable to open a table's containing file. On investigation the file will be absent in both the catalogs and the filesystem so I don't know what table it refers to:

    20:41:19.063 ERROR: could not open file "pg_tblspc/16401/PG_9.0_201008051/16413/1049145092": No such file or directory
    20:41:19.063 STATEMENT: insert into r_ar__30 select aid, mid, pid, sum(wdata) as wdata, ...
    --
    20:41:19.430 ERROR: could not open file "pg_tblspc/16401/PG_9.0_201008051/16413/1049145092": No such file or directory
    20:41:19.430 STATEMENT: SELECT nextval('j_id_seq')

Finally, I have seen several instances of failure to read data by vacuum full itself:

    03:05:45.699 0: statement: vacuum full pg_catalog.pg_index;
    03:05:45.699 exec_simple_query, postgres.c:900
    03:05:46.142 XX001: could not read block 65 in file "pg_tblspc/16401/PG_9.0_201008051/16416/1049146489": read only 0 of 8192 bytes
    03:05:46.142 mdread, md.c:656
    03:05:46.142 vacuum full pg_catalog.pg_index;

This occurs on postgresql 9.0.4 on 32 core 512GB Dell boxes. We have identical systems still running 8.4.8 that do not have this issue, so I'm assuming it is related to the vacuum full work done for 9.0. Oddly, we don't see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4, so it may be timing related.

This seems possibly related to the issues in:

Bizarre buildfarm failure on baiji: can't find pg_class_oid_index
http://archives.postgresql.org/pgsql-hackers/2010-02/msg02038.php

Broken HOT chains in system catalogs
http://archives.postgresql.org/pgsql-hackers/2011-04/msg00777.php

As far as I can tell from the logs I have, once a session sees one of these errors any subsequent query will hit it again until the session exits. However, it does not seem to harm other sessions or leave any persistent damage (crossing fingers and hoping here). I'm ready to do any testing/investigation/instrumented builds etc. that may be helpful in resolving this.

Regards
-dg
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Tue, Mar 08, 2011 at 10:37:24AM +0200, Heikki Linnakangas wrote:
> On 08.03.2011 10:00, Heikki Linnakangas wrote:
> > Another idea is to give up on the warning when it appears that oldestxmin has moved backwards, and assume that it's actually fine. We could still warn in other cases where the flag appears to be incorrectly set, like if there is a deleted tuple on the page.
>
> This is probably a better idea at least in back-branches. It also handles the case of twiddling vacuum_defer_cleanup_age, which tracking two xmins per transaction would not handle.
>
> Here's a patch. I also changed the warning per Robert's suggestion. Anyone see a hole in this?

It would be helpful to have the dbname and schema in the message in addition to the relname. I added those to the original diagnostic patch, as it was not clear that the messages were all related to the same page/table/db. Also, in your comment you might mention that multiple databases are one way we could see oldestxmin move backwards.

-dg
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Tue, Mar 08, 2011 at 10:00:01AM +0200, Heikki Linnakangas wrote:
> On 08.03.2011 04:07, Greg Stark wrote:
> > Well from that log you definitely have OldestXmin going backwards. And not by a little bit either. At 6:33 it set the all_visible flag and then at 7:01 it was almost 1.3 million transactions earlier. In fact to precisely the same value that was in use for a transaction at 1:38. That seems like a bit of a coincidence, though it's not repeated earlier.
>
> Yep. After staring at GetOldestXmin() again, it finally struck me how OldestXmin can move backwards. You need two databases for it, which probably explains why this has been so elusive. ...
>
> When there are no other transactions active in the same database, GetOldestXmin() returns just latestCompletedXid. When you open a transaction in the same database after that, its xid will be above latestCompletedXid, but its xmin includes transactions from all databases, and there might be a transaction in some other database with an xid that precedes the value that GetOldestXmin() returned earlier.
>
> I'm not sure what to do about that. One idea is to track two xmin values in proc-array: one that includes transactions in all databases, and another that only includes transactions in the same database. GetOldestXmin() (when allDbs is false) would only pay attention to the latter. It would add a few instructions to GetSnapshotData(), though.
>
> Another idea is to give up on the warning when it appears that oldestxmin has moved backwards, and assume that it's actually fine. We could still warn in other cases where the flag appears to be incorrectly set, like if there is a deleted tuple on the page.

I read this to mean that it is safe to ignore this warning, and that these databases are not at risk of data corruption or wrong results so long as the warning is due to oldestxmin. Please correct me if I have misunderstood.
Thanks

-dg
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Fri, Mar 04, 2011 at 05:52:29PM +0200, Heikki Linnakangas wrote:
> Hmm, if these all came from the same database, then it looks like OldestXmin has moved backwards. That would explain the warnings. First one vacuum determines that all the tuples are visible to everyone and sets the flag. Then another vacuum runs with an older OldestXmin, and determines that there's a tuple on the page with an xmin that is not yet visible to everyone, hence it thinks that the flag should not have been set yet.
>
> Looking at the code, I don't see how that situation could arise, though. The value calculated by GetOldestXmin() should never move backwards. And GetOldestXmin() is called in lazy_vacuum_rel(), after it has acquired a lock on the table, which should protect from a race condition where two vacuums could run on the table one after another, in a way where the later vacuum runs with an OldestXmin calculated before the first vacuum.
>
> Hmm, fiddling with vacuum_defer_cleanup_age on the fly could cause that, though. You don't do that, do you?

No. I've updated the patch to collect db and schema, added Merlin's patch as well, and run it for a while. The attached log is all the debug messages for pg_statistic page 333 from one database. I've also attached the two most recent page images for that particular page; the last digits in the filename are the hour and minute of when the page was saved.

What else can I be doing to help figure this out?

Thanks

-dg
2011-03-05 00:50:03.238 PST 13304 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 337186202)
2011-03-05 21:55:00.004 PST 10702 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 341342282)
2011-03-05 23:28:06.368 PST 16660 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 342000646)
2011-03-06 05:24:03.461 PST 12850 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 343420345)
2011-03-06 06:23:40.174 PST 21650 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 343773970)
2011-03-06 07:50:49.921 PST 3847 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 344373260)
2011-03-06 09:11:35.662 PST 10820 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 344660206)
2011-03-07 01:23:57.124 PST 14121 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 347873961)
2011-03-07 01:25:01.129 PST 14266 WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "c57.pg_catalog.pg_statistic" page 333 via OldestXmin (OldestXmin 347186993)
2011-03-07 01:26:01.143 PST 14356 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 347879220)
2011-03-07 01:27:01.158 PST 14456 WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "c57.pg_catalog.pg_statistic" page 333 via OldestXmin (OldestXmin 347186993)
2011-03-07 01:33:05.353 PST 15108 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 347705144)
2011-03-07 01:38:09.298 PST 15869 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 347705144)
2011-03-07 01:51:17.622 PST 18829 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 348042728)
2011-03-07 02:24:41.994 PST 22594 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 348116608)
2011-03-07 06:33:47.460 PST 18384 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 349004767)
2011-03-07 07:01:51.400 PST 23896 WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "c57.pg_catalog.pg_statistic" page 333 via OldestXmin (OldestXmin 347705144)
2011-03-07 10:23:31.416 PST 10654 WARNING: debugging: setting PD_ALL_VISIBLE in relation "c57.pg_catalog.pg_statistic" on page 333 (OldestXmin 349660381)

pageimage_c57_pg_catalog_pg_statistic_333.0127 Description: Binary data
pageimage_c57_pg_catalog_pg_statistic_333.0701 Description: Binary data
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Thu, Mar 03, 2011 at 09:04:04AM -0600, Merlin Moncure wrote:
> On Thu, Mar 3, 2011 at 2:16 AM, Heikki Linnakangas wrote:
> > On 03.03.2011 09:12, daveg wrote:
> > > Question: what would be the consequence of simply patching out the setting of this flag? Assuming that the incorrect PD_ALL_VISIBLE flag is the only problem (big assumption perhaps), then simply never setting it would at least avoid the possibility of returning wrong answers, presumably at some performance cost. We could possibly live with that until we get a handle on the real cause and fix.
> >
> > Yes. With that assumption.
> >
> > If you really want to do that, I would suggest the attached patch instead. This just disables the optimization in seqscans to trust it, so an incorrectly set flag won't affect correctness of query results, but the flag is still set as usual and you still get the warnings so that we can continue to debug the issue.
>
> This. The mis-set flag is likely a bug/concurrency issue etc, but could also be a symptom of more sinister data corruption. I did various vacuum experiments all day yesterday on my windows workstation and was not able to produce any mis-flags. I trust iscsi more than nfs, but maybe there is a connection here that is hardware based. hm. do you think it would be helpful to know what is causing the all_visible flag to get flipped? If so, the attached patch shows which case is throwing it...

I'll apply your patch and try it. Probably can only do it for a few minutes tomorrow evening though, as the output is huge and we have only limited down time availability.

-dg
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Thu, Mar 03, 2011 at 10:16:29AM +0200, Heikki Linnakangas wrote:
> On 03.03.2011 09:12, daveg wrote:
> > Question: what would be the consequence of simply patching out the setting of this flag? Assuming that the incorrect PD_ALL_VISIBLE flag is the only problem (big assumption perhaps), then simply never setting it would at least avoid the possibility of returning wrong answers, presumably at some performance cost. We could possibly live with that until we get a handle on the real cause and fix.
>
> Yes. With that assumption.
>
> If you really want to do that, I would suggest the attached patch instead. This just disables the optimization in seqscans to trust it, so an incorrectly set flag won't affect correctness of query results, but the flag is still set as usual and you still get the warnings so that we can continue to debug the issue.

Thanks. I'll be applying this tomorrow and will send you some page images to look at, assuming it still does it. I had a look at how this gets set and cleared and did not see anything obvious, so I'm pretty mystified.

Also, we have been seeing thousands of these daily for at least a month on 4 large hosts and no-one has noticed any other issues, which surprises me. Very strange.

-dg
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Tue, Mar 01, 2011 at 08:40:37AM -0500, Robert Haas wrote:
> On Mon, Feb 28, 2011 at 10:32 PM, Greg Stark wrote:
> > On Tue, Mar 1, 2011 at 1:43 AM, David Christensen wrote:
> > > Was this cluster upgraded to 8.4.4 from 8.4.0? It sounds to me like a known bug in 8.4.0 which was fixed by this commit:
> >
> > The reproduction script described was running vacuum repeatedly. A single vacuum run ought to be sufficient to clean up the problem if it was left-over.
> >
> > I wonder if it would help to write a regression test that runs 100 or so vacuums and see if the build farm turns up any examples of this behaviour.
>
> One other thing to keep in mind here is that the warning message we've chosen can be a bit misleading. The warning is:
>
> WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "test" page 1
>
> ...which implies that the state of the tuples is correct, and that the page-level bit is wrong in comparison. But I recently saw a case where the infomask got clobbered, resulting in this warning. The page-level bit was correct, at least relative to the intended page contents; it was a tuple on the page that was screwed up. It might have been better to pick a more neutral phrasing, like "page is marked all-visible but some tuples are not visible". Yeesh.

Yikes. I hope that this is not the case, as we are seeing thousands of these daily on each of 4 large production hosts. Mostly on catalogs, especially pg_statistic. However it does occur on some high delete/insert traffic user tables too.

Question: what would be the consequence of simply patching out the setting of this flag? Assuming that the incorrect PD_ALL_VISIBLE flag is the only problem (big assumption perhaps), then simply never setting it would at least avoid the possibility of returning wrong answers, presumably at some performance cost. We could possibly live with that until we get a handle on the real cause and fix.

I had a look and don't really see anything except lazy vacuum that sets it, so it seems simple to disable. Or have I understood this incorrectly?

Anything else I can be doing to try to track this down?

-dg
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Wed, Mar 02, 2011 at 04:20:24PM -0800, bricklen wrote:
> On Wed, Mar 2, 2011 at 3:53 PM, daveg wrote:
> > > > Postgresql version is 8.4.4.
> > >
> > > I don't see how this could be related, but since you're running on NFS, maybe it is, somehow: http://archives.postgresql.org/message-id/4d40ddb7.1010...@credativ.com (for example what if the visibility map fork's last page is overwritten?)
> >
> > Running on ISCSI, not nfs. But it is still a Netapp, so who knows. I'll look. Also, we are not seeing any of the "unexpected data beyond EOF" errors, just thousands per day of the PD_ALL_VISIBLE error.
> >
> > -dg
>
> FWIW, we had a couple occurrences of that message about a month ago on 9.0.2:
>
> http://archives.postgresql.org/pgsql-general/2011-01/msg00887.php
>
> Haven't seen it since we ran a cluster-wide vacuum.

We did a shutdown and restart to clear the buffer cache (but did not reboot the host) and a vacuum on all dbs in the cluster last night. That cleared it up for a couple hours, but we are still getting lots of these messages. Most of them are on pg_statistic, and we create and drop hundreds of thousands of temp tables daily, so there is a good chance there is a concurrency issue.

-dg
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Wed, Mar 02, 2011 at 06:45:13PM -0300, Alvaro Herrera wrote:
> Excerpts from daveg's message of mié mar 02 18:30:34 -0300 2011:
> > After a restart and vacuum of all dbs with no other activity, things were quiet for a couple hours and then we started seeing these PD_ALL_VISIBLE messages again.
> >
> > Going back through the logs, we have been getting these since at least before mid January. Oddly, this only happens on four systems which are all new Dell 32 core Nehalem 512GB machines using iscsi partitions served off a Netapp. Our older 8 core 64GB hosts have never logged any of these errors. I'm not saying it is related to the hw, as these hosts are doing a lot more work than the old hosts, so it may be a concurrency problem that just never came up at lower levels before.
> >
> > Postgresql version is 8.4.4.
>
> I don't see how this could be related, but since you're running on NFS, maybe it is, somehow: http://archives.postgresql.org/message-id/4d40ddb7.1010...@credativ.com (for example what if the visibility map fork's last page is overwritten?)

Running on ISCSI, not nfs. But it is still a Netapp, so who knows. I'll look. Also, we are not seeing any of the "unexpected data beyond EOF" errors, just thousands per day of the PD_ALL_VISIBLE error.

-dg
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Tue, Mar 01, 2011 at 01:20:43PM -0800, daveg wrote:
> On Tue, Mar 01, 2011 at 12:00:54AM +0200, Heikki Linnakangas wrote:
> > On 28.02.2011 23:28, daveg wrote:
> > > On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote:
> > > > We'll likely need to go back and forth a few times with various debugging patches until we get to the heart of this..
> > >
> > > Anything new on this? I'm seeing it on one of my client's production boxes.
> >
> > I haven't heard anything from the OP since.
> >
> > > Also, what is the significance, ie what is the risk or damage potential if this flag is set incorrectly?
> >
> > Sequential scans will honor the flag, so you might see some dead rows incorrectly returned by a sequential scan. That's the only "damage", but an incorrectly set flag could be a sign of something more sinister, like corrupt tuple headers. The flag should never be set incorrectly, so if you see that message you have hit a bug in PostgreSQL, or you have bad hardware.
> >
> > This flag is quite new, so a bug in PostgreSQL is quite possible. If you still have a backup that contains those incorrectly set flags, I'd like to see what the page looks like.
>
> I ran vacuums on all the affected tables last night. I plan to take a downtime to clear the buffer cache and then to run vacuums on all the dbs in the cluster.
>
> Most but not all of the tables involved are catalogs.
>
> However, I could probably pick up your old patch sometime next week if it recurs and send you page images.

After a restart and vacuum of all dbs with no other activity, things were quiet for a couple hours and then we started seeing these PD_ALL_VISIBLE messages again.

Going back through the logs, we have been getting these since at least before mid January. Oddly, this only happens on four systems which are all new Dell 32 core Nehalem 512GB machines using iscsi partitions served off a Netapp. Our older 8 core 64GB hosts have never logged any of these errors. I'm not saying it is related to the hw, as these hosts are doing a lot more work than the old hosts, so it may be a concurrency problem that just never came up at lower levels before.

Postgresql version is 8.4.4.

I'll pick up Heikki's page logging patch and run it for a bit to get some damaged page images. What else could I be doing to track this down?

-dg
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Tue, Mar 01, 2011 at 12:00:54AM +0200, Heikki Linnakangas wrote:
> On 28.02.2011 23:28, daveg wrote:
> > On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote:
> > > We'll likely need to go back and forth a few times with various debugging patches until we get to the heart of this..
> >
> > Anything new on this? I'm seeing it on one of my client's production boxes.
>
> I haven't heard anything from the OP since.
>
> > Also, what is the significance, ie what is the risk or damage potential if this flag is set incorrectly?
>
> Sequential scans will honor the flag, so you might see some dead rows incorrectly returned by a sequential scan. That's the only "damage", but an incorrectly set flag could be a sign of something more sinister, like corrupt tuple headers. The flag should never be set incorrectly, so if you see that message you have hit a bug in PostgreSQL, or you have bad hardware.
>
> This flag is quite new, so a bug in PostgreSQL is quite possible. If you still have a backup that contains those incorrectly set flags, I'd like to see what the page looks like.

I ran vacuums on all the affected tables last night. I plan to take a downtime to clear the buffer cache and then to run vacuums on all the dbs in the cluster.

Most but not all of the tables involved are catalogs.

However, I could probably pick up your old patch sometime next week if it recurs and send you page images.

-dg
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Mon, Feb 28, 2011 at 07:43:39PM -0600, David Christensen wrote:
> On Feb 28, 2011, at 3:28 PM, daveg wrote:
> > Anything new on this? I'm seeing it on one of my client's production boxes. Also, what is the significance, ie what is the risk or damage potential if this flag is set incorrectly?
>
> Was this cluster upgraded to 8.4.4 from 8.4.0? It sounds to me like a known bug in 8.4.0 which was fixed by this commit:
>
> commit 7fc7a7c4d082bfbd579f49e92b046dd51f1faf5f
> Author: Tom Lane
> Date: Mon Aug 24 02:18:32 2009 +
>
>     Fix a violation of WAL coding rules in the recent patch to include an "all tuples visible" flag in heap page headers. The flag update *must* be applied before calling XLogInsert, but heap_update and the tuple moving routines in VACUUM FULL were ignoring this rule. A crash and replay could therefore leave the flag incorrectly set, causing rows to appear visible in seqscans when they should not be. This might explain recent reports of data corruption from Jeff Ross and others.
>
>     In passing, do a bit of editorialization on comments in visibilitymap.c.
>
> oy:postgresql machack$ git describe --tag 7fc7a7c4d082bfbd579f49e92b046dd51f1faf5f
> REL8_4_0-190-g7fc7a7c
>
> If the flag got twiddled while running as 8.4.0, the incorrect PD_ALL_VISIBLE flag would (obviously) not be fixed by the upgrade to 8.4.4. (Is this a separate issue?)

This cluster was installed with 8.4.4, so it is still an existing problem. Also, to my recollection, this cluster has never crashed.

-dg
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote:
> On 12.01.2011 06:21, Fujii Masao wrote:
> > On Sat, Dec 25, 2010 at 2:09 PM, Maxim Boguk wrote:
> > > While trying to create a reproducible test case for BUG #5798 I encountered a very strange effect on two of my servers (both servers have the same hardware platform/OS (freebsd 7.2) and PostgreSQL 8.4.4).
> > >
> > > Very simple test table created as:
> > >
> > > CREATE TABLE test (id integer);
> > > INSERT INTO test select generate_series(0,1);
> > >
> > > And I try repeatable vacuum of that table with the script:
> > >
> > > perl -e "foreach (1..10) {system \"psql -d test -h -c 'vacuum test'\";}"
> > >
> > > And about once per minute (really random intervals; it can be 5 minutes without problems, or 3 vacuums in a row showing the same error) I get the next errors:
> > >
> > > WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "test" page 1
> > > ...
> > > WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "test" page 30
> > >
> > > for all pages of the relation.
>
> Oh, interesting. This is the first time anyone can reliably reproduce that. I can't reproduce it on my laptop with that script, though, so I'm going to need your help to debug this.
>
> Can you compile PostgreSQL with the attached patch, and rerun the test? It will dump the pages with incorrectly set flags to files in /tmp/, and adds a bit more detail in the WARNING. Please run the test until you get those warnings, then tar up the created "/tmp/pageimage*" files and post them along with the warning generated.
>
> We'll likely need to go back and forth a few times with various debugging patches until we get to the heart of this..

Anything new on this? I'm seeing it on one of my client's production boxes. Also, what is the significance, ie what is the risk or damage potential if this flag is set incorrectly?

Thanks

-dg
Re: [HACKERS] PostgreSQL and HugePage
On Thu, Oct 21, 2010 at 08:16:27AM -0700, Mark Wong wrote:
> On Tue, Oct 19, 2010 at 8:30 PM, daveg wrote:
> > On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
> > > On 20/10/10 16:05, Mark Kirkwood wrote:
> > > > shmget and friends are hugetlbpage aware, so it seems it should 'just work'.
> > >
> > > Heh - provided you specify SHM_HUGETLB in the relevant call, that is :-)
> >
> > I had a patch for this against 8.3 that I could update if there is any interest. I suspect it is helpful.
>
> Oh, probably better than me digging up my broken one. Send it out as is if you don't want to update it. :)

I'll update it and see if I can get a largish machine to test on, at least with pgbench. But not today, alas.

-dg
Re: [HACKERS] PostgreSQL and HugePage
On Wed, Oct 20, 2010 at 12:28:25PM -0700, Greg Stark wrote:
> On Wed, Oct 20, 2010 at 12:17 PM, Greg Stark wrote:
> > I don't think it's a big cost once all the processes have been forked if you're reusing them, beyond perhaps slightly more efficient cache usage.
>
> Hm, this site claims to get a 13% win just from the reduced tlb misses using a preload hack with Pg 8.2. That would be pretty substantial.
>
> http://oss.linbit.com/hugetlb/

That was my motivation in trying a patch. TLB misses can be a substantial overhead. I'm not current on the state of play, but working in Sun's benchmark lab on a DB TPC-B benchmark for the first generation of MP systems, something like 30% of all bus traffic was TLB misses. The next iteration of the hardware had a much larger TLB.

I have a client with 512GB memory systems, currently with 128GB configured as postgresql buffer cache. That is 32M TLB entries trying to fit in the few dozen cpu TLB slots. I suspect there may be some contention. I'll benchmark of course.

-dg
Re: [HACKERS] PostgreSQL and HugePage
On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
> On 20/10/10 16:05, Mark Kirkwood wrote:
> > shmget and friends are hugetlbpage aware, so it seems it should 'just work'.
>
> Heh - provided you specify SHM_HUGETLB in the relevant call, that is :-)

I had a patch for this against 8.3 that I could update if there is any interest. I suspect it is helpful.

-dg
Re: [HACKERS] a faster compression algorithm for pg_dump
On Tue, Apr 13, 2010 at 03:03:58PM -0400, Tom Lane wrote:
> Joachim Wieland writes:
> > If we still cannot do this, then what I am asking is: What does the project need to be able to at least link against such a compression algorithm?
>
> Well, what we *really* need is a convincing argument that it's worth taking some risk for. I find that not obvious. You can pipe the output of pg_dump into your-choice-of-compressor, for example, and that gets you the ability to spread the work across multiple CPUs in addition to eliminating legal risk to the PG project. And in any case the general impression seems to be that the main dump-speed bottleneck is on the backend side, not in pg_dump's compression.

My client uses pg_dump -Fc and produces about 700GB of compressed postgresql dump nightly from multiple hosts. They also depend on being able to read and filter the dump catalog. A faster compression algorithm would be a huge benefit for dealing with this volume.

-dg
Re: [HACKERS] TCP keepalive support for libpq
On Tue, Feb 09, 2010 at 09:34:10AM -0500, Andrew Chernow wrote:
> Tollef Fog Heen wrote:
> > (please Cc me on replies, I am not subscribed)
> >
> > Hi,
> >
> > libpq currently does not use TCP keepalives. This is a problem in our case where we have some clients waiting for notifies and then the connection is dropped on the server side. The client never gets the FIN and thinks the connection is up. The attached patch unconditionally adds keepalives. I chose unconditionally as this is what the server does. We didn't need the ability to tune the timeouts, but that could be added with reasonable ease.
>
> ISTM that the default behavior should be keep alives disabled, as it is now, and those wanting it can just set it in their apps:
>
> setsockopt(PQsocket(conn), SOL_SOCKET, SO_KEEPALIVE, ...)

I disagree. I have clients who have problems with leftover client connections due to server host failures. They do not write apps in C. For a non-default change to be effective we would need to have all the client drivers, eg JDBC, psycopg, DBD-DBI, and apps like psql make changes to turn it on. Adding this option as a non-default will not really help.

-dg
Re: [HACKERS] pg_dump enhancement proposal
On Thu, Nov 12, 2009 at 04:31:37PM -0500, Tom Lane wrote: > Mark Hammonds writes: > > 2. Custom Query Exports > > > In my use of mysqldump, I found one feature very useful: the ability > > to execute a custom SELECT. . .WHERE statement and then dump only the > > results. This feature currently provides MySQL users with the ability > > to quickly and easily export very granular data subsets, and I see no > > reason why PostgreSQL users wouldn't benefit from the same capability. > > While it is true that this functionality can already be achieved in > > PostgreSQL using Copy, it seems to me that it would logically fit well > > as an extension to pg_dump, especially since many beginning and even > > some intermediate PostgreSQL users aren't aware of the alternatives. > > As you say, we already have this using COPY, and I don't agree that > it would be a good idea to plaster it into pg_dump as well. pg_dump > is intended for dumping and restoring data, not for ETL-type tasks. > Furthermore, pg_dump is a overly complex beast already --- much more > so than one could wish, for a tool that is absolutely fundamental to > database reliability. Putting requirements on it that are well outside > its charter seems like a short route to maintenance disaster. > > There has been some occasional chatter about developing one or more > tools focused on ETL rather than dump/restore, and my thought is that > this idea would fit better there. An ETL tool would not have the > kind of requirements pg_dump has for coping with multiple server > versions and knowing everything there is to know about database > contents, so it seems like it could address new areas of functionality > without a complexity explosion. > > You might want to check the archives for previous discussions --- > I think the last go-round started with someone wanting to add an > arbitrary WHERE filter to pg_dump dumps. Sorry I missed this thread. 
Not only has there been previous discussion, there have been at least two, and I seem to recall three, patches implementing this. None of the patches was very large, and none of them impacted the basic "make a backup" paths in pg_dump. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
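For readers of the archives: the COPY route Tom mentions already accepts an arbitrary query, so a filtered export needs no pg_dump support at all. A small sketch that just composes such a statement (the table name and filter are invented examples; the caller must trust its inputs):

```python
# COPY can wrap an arbitrary SELECT, which is the existing way to export a
# granular data subset. "orders" and the WHERE clause are made-up examples.
def filtered_copy(table: str, where: str) -> str:
    # Caller is responsible for ensuring table/where are trusted SQL.
    return f"COPY (SELECT * FROM {table} WHERE {where}) TO STDOUT WITH CSV HEADER"

stmt = filtered_copy("orders", "created_at >= '2009-01-01'")
print(stmt)
```

Run through psql, e.g. `psql -d mydb -c "$stmt" > subset.csv`.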
Re: [HACKERS] EOL for 7.4?
On Fri, Nov 13, 2009 at 02:47:56AM +, Greg Stark wrote: > On Fri, Nov 13, 2009 at 2:35 AM, daveg wrote: > > I suggest we announce now that both 7.4 and 8.0 will EOL when 8.5 is > > expected > > to ship, or to comfort those who never use .0 versions when 8.5.1 ships. > > What would this mean? How would it be different than the status quo? I suppose it would mean posting periodic prominent notices, moving the sources to the OLD directory, that sort of thing. I thought that was the topic of this thread? -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EOL for 7.4?
On Fri, Nov 13, 2009 at 02:22:01AM +, Greg Stark wrote: > > Really I think you guys are on the wrong track trying to map Postgres > releases to commercial support terms. None of the Postgres releases > are "supported" in the sense that there's no warranty and no promises, > it's all best effort. If you want a promise of anything then pay > someone for that service. > > As with any open source software if you're running 7-year-old versions > of the software you can't seriously expect the developers to take any > interest in bugs you discover which don't affect current releases. > Other projects don't release back branches at all. The most the > developers are likely to do if your bugs require serious engineering > is declare that the version you're using is too old. Claiming to support versions that are "too old" is giving users a false sense of comfort. Encouraging users to use these versions is actually harming them as when this happens they will be stuck with either living with the bug or doing an immediate unplanned upgrade. I suggest we announce now that both 7.4 and 8.0 will EOL when 8.5 is expected to ship, or to comfort those who never use .0 versions when 8.5.1 ships. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EOL for 7.4?
On Tue, Nov 03, 2009 at 10:32:17AM -0800, Josh Berkus wrote: > So I'm going to make a case in favor of EOL'ing 7.4. In fact, I'd be in > favor of doing so in, say, February after an announcement this month. > > The main reason I'm in favor of this is that we have a lot of users > using 7.4 out of inertia, and they need a message that 7.4 is "not > supported" to get them to upgrade. I can think of several here in SF > who have been "working on upgrade plans" for the past 3 years. An EOL > is what's needed to give them a kick in the pants. > > The same goes for other OSS projects. There's quite a few random OSS > apps which were created on PG 7.4 and have never offered their users an > upgrade path (Gnuworld comes to mind). They need an EOL announcement to > get them motivated to upgrade. +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres server goes in recovery mode repeteadly
On Fri, Oct 02, 2009 at 07:57:13PM -0700, daveg wrote: > On Fri, Oct 02, 2009 at 10:41:07AM -0400, Alvaro Herrera wrote: > > daveg escribió: > > > > > I work with Kunal and have been looking into this. It appears to be the > > > same > > > as the bug described in: > > > > > > http://archives.postgresql.org/pgsql-bugs/2009-09/msg00355.php > > > > > > as I have localized it to a NULL pointer deference in > > > RelationCacheInitializePhase2() as well. Tom speculates in: > > > > > > http://archives.postgresql.org/pgsql-bugs/2009-09/msg00372.php > > > > > > that large numbers of table drops might trigger this. The system in > > > question > > > creates and drops temp tables at a high rate which tends to confirm this. > > > > Did you test the patch posted by Tom? > > We are testing it since last night in our test environment. If it does not > break anything (unlikely) we will deploy it next week. However, since the > problem is only occasional, only happens every few days on one of 50+ hosts, > it will take some extended time without further segfaults to say anything > confident about the patches effectiveness. We have had this deployed in our test and production environments for a couple weeks now. We have not seen any further instance of the problem. Without the patch, we would have expected to see at least a few by now. So the patch appears to be effective. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Application name patch - v2
On Tue, Oct 20, 2009 at 12:16:42PM -0400, Tom Lane wrote: > Magnus Hagander writes: > > Also, how many platforms can't we do this on? If we have BSD and > > Windows covered already. on linux, I believe you can easily read it > > out of /proc/self/cmdline, no? > > Writing a pile of platform-specific code for this is simply insane from > a support point of view. The feature is NOT worth it. Especially not > since the typical result will be something quite uninformative like > "psql" or "java". The cases that are actually useful are the ones where > the application sets it. I don't think we should have a default at all > --- you don't set it, you don't get a name. -1 I'd like a default, especially for psql, to help identify interactive sessions. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Application name patch - v2
On Mon, Oct 19, 2009 at 01:00:28PM +0100, Dave Page wrote: > On Mon, Oct 19, 2009 at 12:57 PM, Pavel Stehule > wrote: > > It is not practical. I'll log errors. Usually SQL injection generates > > lot of errors. Loging all statements has not sense. What is difference > > bad and good SQL statement.? Maybe multistatements are good candidates > > for log as possible attackers statements. On highly load databases > > loging all statements significantly increase load :( > > Ahh, I see. > > >> My point is, that the query to change the app name is logged using the > >> *original* app name, thus it will not be discarded by the log analysis > >> tools in your scenario. > >> > > > > I thing, so change of original name should generate warning. > > Well, if other people think that's necessary, it's certainly possible. I have clients working around the lack of this feature by simply prepending a single line comment to their sql in the application to supply the app name. eg: -- monthly_report monthly_process.py:524 select wev from foo; This feature would be very handy, but not if it requires special permission to use it. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
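The comment-prepend workaround described above is trivial to wrap in a helper (the helper name is invented):

```python
# Tag each statement with an application/source marker, as in the workaround
# above, so log analysis tools can attribute queries to their origin.
def tag_query(sql: str, app: str, source: str) -> str:
    return f"-- {app} {source}\n{sql}"

q = tag_query("select wev from foo;", "monthly_report", "monthly_process.py:524")
print(q)
```

The tag survives in the server log alongside the statement, which is the whole point; a real application_name setting would carry the same information without polluting the SQL text.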
Re: [HACKERS] Deprecation
On Sat, Oct 17, 2009 at 03:01:27PM -0400, Tom Lane wrote: > Bruce Momjian writes: > > Sounds like a good reason to remove add_missing_from in 8.5. > > Seems like the general consensus is that it's okay to do that. > I will go make it happen unless somebody squawks pretty soon... > > regards, tom lane +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres server goes in recovery mode repeteadly
On Fri, Oct 02, 2009 at 10:41:07AM -0400, Alvaro Herrera wrote: > daveg escribió: > > > I work with Kunal and have been looking into this. It appears to be the same > > as the bug described in: > > > > http://archives.postgresql.org/pgsql-bugs/2009-09/msg00355.php > > > > as I have localized it to a NULL pointer dereference in > > RelationCacheInitializePhase2() as well. Tom speculates in: > > > > http://archives.postgresql.org/pgsql-bugs/2009-09/msg00372.php > > > > that large numbers of table drops might trigger this. The system in question > > creates and drops temp tables at a high rate, which tends to confirm this. > > Did you test the patch posted by Tom? We have been testing it since last night in our test environment. If it does not break anything (unlikely) we will deploy it next week. However, since the problem is only occasional, happening only every few days on one of 50+ hosts, it will take some extended time without further segfaults to say anything confident about the patch's effectiveness. -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects.
Re: [HACKERS] Postgres server goes in recovery mode repeteadly
On Tue, Sep 29, 2009 at 09:52:06PM +0530, kunal sharma wrote: > Hi , > We are using Postgres 8.4 and its been found going into recovery > mode couple of times. The server process seems to fork another child process > which is another postgres server running under same data directory and after > some time it goes away while the old server is still running. There were few > load issues on the server but the load didnt went above "32". > >We are running opensuse 10.2 x86_64 with 32Gb of physical memory. > Checking the logs I found that theres a segmentation fault , > > > Sep 26 05:39:54 pace kernel: postgres[28694]: segfault at 0030 > rip 0066ba8c rsp 7fffd364da30 error 4 > > gdb dump shows this > > Reading symbols from /lib64/libdl.so.2...done. > Loaded symbols for /lib64/libdl.so.2 > Reading symbols from /lib64/libm.so.6...done. > Loaded symbols for /lib64/libm.so.6 > Reading symbols from /lib64/libc.so.6...done. > Loaded symbols for /lib64/libc.so.6 > Reading symbols from /lib64/ld-linux-x86-64.so.2...done. > Loaded symbols for /lib64/ld-linux-x86-64.so.2 > Reading symbols from /lib64/libnss_files.so.2...done. > Loaded symbols for /lib64/libnss_files.so.2 > 0x2ad6d7b8c2b3 in __select_nocancel () from /lib64/libc.so.6 > (gdb) > > Any suggestions what is causing this segmentation fault? I work with Kunal and have been looking into this. It appears to be the same as the bug described in: http://archives.postgresql.org/pgsql-bugs/2009-09/msg00355.php as I have localized it to a NULL pointer dereference in RelationCacheInitializePhase2() as well. Tom speculates in: http://archives.postgresql.org/pgsql-bugs/2009-09/msg00372.php that large numbers of table drops might trigger this. The system in question creates and drops temp tables at a high rate, which tends to confirm this. -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects.
Re: [HACKERS] Limit allocated memory per session
On Thu, Oct 01, 2009 at 11:47:43AM -0400, Tom Lane wrote: > Euler Taveira de Oliveira writes: > > Tom Lane escreveu: > >> daveg writes: > >>> I'd like to propose adding a new GUC to limit the amount of memory a > >>> backend > >>> can allocate for its own use. > >> > >> Use ulimit. > Seriously, the proposed patch introduces overhead into a place that is > already a known hot spot, in return for not much of anything. It will The overhead is simply an integer addition and compare with values that are likely already in processor caches. And this only occurs when we actually call malloc() to get a new block, not on every palloc. So I suspect it will not be noticeable. However, I welcome any suggestion on how to test this and actually measure the overhead, if any. pgbench? Something else? > *not* bound backend memory use very accurately, because there is no way > to track raw malloc() calls. And I think that 99% of users will > not find it useful. The use case that motivated this is a client that runs many PostgreSQL instances with a mostly batch/large-query workload. Some of the queries are code generated by an application and can be very complex. A few times a month one of these will run through 64GB of memory and OOM the host. So it seriously hurts production. Setting work_mem low enough to prevent this results in poor query performance. This client does not use any outside libraries that call malloc() directly. -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects.
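The accounting in the attached patch boils down to: add the block size on each malloc-backed allocation, subtract it on free, and error out when a configured ceiling (in kB, 0 meaning no limit, as with the proposed GUC) is crossed. A toy restatement of that logic:

```python
# Toy model of the patch's AllocBlock accounting: a running byte total,
# bumped on block allocation, reduced on free, checked against a kB limit.
class MemoryLimitExceeded(Exception):
    pass

class BlockAccounting:
    def __init__(self, limit_kb: int = 0):  # 0 means "no limit", as in the GUC
        self.limit_kb = limit_kb
        self.used = 0

    def alloc(self, nbytes: int) -> None:
        self.used += nbytes
        if self.limit_kb and self.used // 1024 > self.limit_kb:
            raise MemoryLimitExceeded(f"memory limit exceeded at {self.used} bytes")

    def free(self, nbytes: int) -> None:
        self.used -= nbytes

acct = BlockAccounting(limit_kb=64)
acct.alloc(32 * 1024)
acct.free(16 * 1024)
try:
    acct.alloc(64 * 1024)  # pushes the running total past 64 kB
except MemoryLimitExceeded as e:
    print(e)
```

As noted above, the check only fires on block allocation (malloc), not on every palloc, which is why the expected overhead is an addition and a compare on an already-cached value.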
Re: [HACKERS] Limit allocated memory per session
On Thu, Oct 01, 2009 at 10:35:55AM -0400, Tom Lane wrote: > daveg writes: > > I'd like to propose adding a new GUC to limit the amount of memory a backend > > can allocate for its own use. > > Use ulimit. That was my initial thought too. However, ulimit() is documented as superseded by setrlimit(), which has the option RLIMIT_DATA to limit the size of the data segment. Perfect! Except RLIMIT_DATA does not appear to work on Linux. The call succeeds and the new value can even be read back with getrlimit(), but it does not seem to do anything to actually limit the memory allocated. I tested this on SuSE 11 (kernel 2.6.25) and Ubuntu Intrepid (kernel 2.6.28). Setting RLIMIT_AS to limit the total address space for a process works as expected. However, this seems undesirable for PostgreSQL as it can also cause stack expansion to fail, which would then force a general restart. Also, this limit would interact with the buffer cache size setting, as it includes the shared address space as well. -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects.
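The set-and-read-back behavior described above is easy to observe from Python's binding of the same syscalls. The sketch only re-applies the current limits (actually lowering RLIMIT_DATA in-process could break the running interpreter), so it shows the call succeeding and reading back, not whether the kernel enforces it:

```python
# setrlimit/getrlimit round trip for RLIMIT_DATA. On the kernels tested in
# the mail above, the value read back fine but allocations were not limited.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
# Re-apply the current limits: the call succeeds and reads back unchanged.
resource.setrlimit(resource.RLIMIT_DATA, (soft, hard))
assert resource.getrlimit(resource.RLIMIT_DATA) == (soft, hard)
print("RLIMIT_DATA soft/hard:", soft, hard)
```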
[HACKERS] Limit allocated memory per session
I'd like to propose adding a new GUC to limit the amount of memory a backend can allocate for its own use. The problem this addresses is that sometimes one needs to set work_mem fairly high to get good query plans for large joins. However, some complex queries will then use huge amounts of memory so that one or a few of them will consume all the memory on the host and run it deep into swap or trigger the oom killer or worse. I've attached a patch based on 8.4.1. It works by keeping a track of the total memory allocated via malloc to AllocBlocks (aset.c). If this is not shot down/up too badly I will rebase it on CVS and submit it for the next commit fest. I would also like to propose a similar limit on temp space use. It is quite easy for an unintended cartesion product to use hundreds of gigabytes of scratch space and cause other processes to fail due to lack of disk space. If this is not objectionable, I'll work on it too. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. *** ./src/include/utils/memutils.h.orig 2009-09-30 01:54:36.0 -0700 --- ./src/include/utils/memutils.h 2009-09-30 03:33:44.0 -0700 *** *** 114,119 --- 114,122 */ /* aset.c */ + + extern int max_allocated_mem; + extern MemoryContext AllocSetContextCreate(MemoryContext parent, const char *name, Size minContextSize, *** ./src/backend/utils/mmgr/aset.c.orig2009-09-29 16:14:23.0 -0700 --- ./src/backend/utils/mmgr/aset.c 2009-10-01 03:07:34.0 -0700 *** *** 168,173 --- 168,187 } AllocBlockData; /* + * AllocBlock accounting maintains total allocated memory to enforce the memory use limit. 
+ */ + int max_allocated_mem = 0; + Size AllocBlockAccountingMemUsed = 0; + + #define AllocBlockAccountingFree(block) \ + (AllocBlockAccountingMemUsed -= block->endptr - (char *) (block)) + #define AllocBlockAccountingAlloc(block) \ +(AllocBlockAccountingMemUsed += block->endptr - (char *) (block)) + #define AllocBlockAccountingOverLimit() \ + (max_allocated_mem != 0 \ +&& AllocBlockAccountingMemUsed / 1024 > max_allocated_mem) + + /* * AllocChunk *The prefix of each piece of memory in an AllocBlock * *** *** 393,398 --- 407,423 context->blocks = block; /* Mark block as not to be released at reset time */ context->keeper = block; + + AllocBlockAccountingAlloc(block); + if (AllocBlockAccountingOverLimit()) + { + MemoryContextStats(TopMemoryContext); + ereport(ERROR, + (errcode(ERRCODE_OUT_OF_MEMORY), +errmsg("memory limit exceeded"), +errdetail("Failed while creating memory context \"%s\".", + name))); + } } context->isReset = true; *** *** 476,481 --- 501,507 else { /* Normal case, release the block */ + AllocBlockAccountingFree(block); #ifdef CLOBBER_FREED_MEMORY /* Wipe freed memory for debugging purposes */ memset(block, 0x7F, block->freeptr - ((char *) block)); *** *** 521,526 --- 547,553 { AllocBlock next = block->next; + AllocBlockAccountingFree(block); #ifdef CLOBBER_FREED_MEMORY /* Wipe freed memory for debugging purposes */ memset(block, 0x7F, block->freeptr - ((char *) block)); *** *** 597,602 --- 624,640 set->blocks = block; } + AllocBlockAccountingAlloc(block); + if (AllocBlockAccountingOverLimit()) + { + MemoryContextStats(TopMemoryContext); + ereport(ERROR, + (errcode(ERRCODE_OUT_OF_MEMORY), +errmsg("memory limit exceeded"), +errdetail("Failed on request of size %lu.", + (unsigned long) size))); + } + set->isReset = false; AllocAllocInfo(set, chunk); *** *** 767,772 --- 805,821 block->next = set->blocks; set->blocks = block; + + AllocBlockAccountingAlloc(block); + i
Re: [HACKERS] "BEGIN TRANSACTION" and "START TRANSACTION": different error handling
On Thu, Sep 24, 2009 at 12:16:43PM +0300, Hannu Krosing wrote: > "I expect the transaction is aborted and rollback is executed > automatically." - this is not how postgreSQL behaves. PostgreSQL needs > an explicit end of transaction from client, either COMMIT; or ROLLBACK; > > when run from psql, they both act the same, except the string returned > > hannu=# begin transaction; > BEGIN > hannu=# select 1/0; > ERROR: division by zero > hannu=# select 1/0; > ERROR: current transaction is aborted, commands ignored until end of > transaction block > hannu=# abort; > ROLLBACK > hannu=# start transaction; > START TRANSACTION > hannu=# select 1/0; > ERROR: division by zero > hannu=# select 1/0; > ERROR: current transaction is aborted, commands ignored until end of > transaction block > hannu=# abort; > ROLLBACK > > I suspect, that psqlodbc is the one doing the automatic rollback and it > seems to rely on reply "BEGIN" to establish an in-transaction state. > > so when "start transaction;" returns "START TRANSACTION" instead of > "BEGIN", psqlodbc does not realise that it is in transaction and does > not initiate the automatic rollback. Well, I'd always thought BEGIN and START TRANSACTION were syntactic Aspartame and had the same underlying implementation. So this is a surprise. Why do they return a different status? -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects.
Re: [HACKERS] Adding \ev view editor?
On Mon, Sep 21, 2009 at 02:26:05PM -0400, Andrew Dunstan wrote: >andrew=# select pg_get_viewdef('foo',true); >pg_get_viewdef >-- > SELECT 'a'::text AS b, > ( SELECT 1 >FROM dual) AS x, > random() AS y, > CASE > WHEN true THEN 1 > ELSE 0 > END AS c, > 1 AS d >FROM dual; >(1 row) +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] numeric_to_number() function skipping some digits
On Tue, Sep 22, 2009 at 10:27:19AM +0530, Jeevan Chalke wrote: > > It seems that Oracle reads formatting string from right-to-left. Here are > few results: > ('number','format') ==> Oracle | PG > ('34,50','999,99') ==> 3450 | 340 > ('34,50','99,99') ==> 3450 | 3450 > ('34,50','99,999') ==> Invalid Number | 3450 > ('34,50','999,999') ==> Invalid Number | 340 It seems worse to give a wrong answer silently than to throw an error. What we do now seems sort of MySqlish. -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects.
Re: [HACKERS] happy birthday Tom Lane ...
On Fri, Sep 18, 2009 at 01:04:23PM +0200, Hans-Juergen Schoenig -- PostgreSQL wrote: > Tom, > > On behalf of the entire PostgreSQL team here in Austria I want to wish > you a happy birthday. > We hope that you fill be a vital part of PostgreSQL for many years to come. > > Best regards, > > Hans-Jürgen Schönig + team +1 from me too. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RfD: more powerful "any" types
On Tue, Sep 15, 2009 at 07:38:18AM +0200, Pavel Stehule wrote: > it isn't fair :) why you use $$ without single quote? And still this > case should be vulnerable on SQL injection. Maybe you or me knows, > what SQL injection means, but beginners knows nothing and this people > use following bad code: > > sql := $$SELECT * FROM '${table_name}'$$} and are happy. But this code > is wrong! I have an idea you will like less: have multiple interpolation codes that automagically do the right quoting. Perhaps as extra printf like type codes. The above then becomes: sql := pgprintf($$SELECT * FROM %I;$$, table_name ) Where %I evaluates as if it were quote_ident(%s). This would maybe even encourage users to do the quoting they should by making it easy. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
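A toy model of the proposed pgprintf, to make the idea concrete. The name and the %I code come from the mail above; the implementation is invented, handles only %I and %s, and ignores edge cases such as %%:

```python
# %I quotes its argument as an SQL identifier (doubling embedded quotes,
# in the manner of quote_ident); %s substitutes the value as-is.
def quote_ident(name: str) -> str:
    return '"' + name.replace('"', '""') + '"'

def pgprintf(fmt: str, *args) -> str:
    parts = fmt.split('%')
    out, values = [parts[0]], iter(args)
    for part in parts[1:]:
        code, rest = part[0], part[1:]
        if code == 'I':
            out.append(quote_ident(next(values)))
        elif code == 's':
            out.append(str(next(values)))
        else:
            raise ValueError(f"unknown format code %{code}")
        out.append(rest)
    return ''.join(out)

print(pgprintf('SELECT * FROM %I;', 'my "odd" table'))
```

Later PostgreSQL releases in fact added a format() function with %s/%I/%L codes along exactly these lines, so the "make correct quoting the easy path" argument eventually carried.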
Re: [HACKERS] Ragged CSV import
On Fri, Sep 11, 2009 at 10:27:06AM +0200, Dimitri Fontaine wrote: > Maybe instead of opening FROM for COPY, having it accepted in WITH would > be better, the same way (from the user point of view) that DML returning > are worked on. ... > WITH csv AS ( >COPY t FROM stdin CSV > ) > INSERT INTO foo(x, y, z) > SELECT t[3], t[2], mytimestamptz([5], [6], [7]) >FROM csv; > > Now the text[] has a strange feeling, without it it'd be: > > WITH csv AS ( >COPY t(a, b, c, d, e, f, g) >FROM stdin > CSV IGNORING EXTRA COLUMNS -- random nice syntax > MISSING COLUMNS DEFAULTS NULL -- that needs some reality check > ) > INSERT INTO foo(x, y, z) > SELECT c, b, mytimestamptz(e, f, g) >FROM csv; > > The function mytimestamptz(date text, time text, timezone text) will > accept input that PostgreSQL input types would have errored out on... so > you can process in one go strange formats from other products. +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
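The two ragged-row rules proposed above (ignore extra columns, default missing columns to NULL) are easy to state precisely. A sketch with Python's csv module, using invented sample input:

```python
# "IGNORING EXTRA COLUMNS" drops trailing fields; "MISSING COLUMNS
# DEFAULTS NULL" pads short rows with None, the Python stand-in for NULL.
import csv
import io

raw = io.StringIO("a,b,c\n1,2\n1,2,3,4\n")  # one short row, one long row
ncols = 3
rows = []
for row in csv.reader(raw):
    row = row[:ncols] + [None] * (ncols - len(row))
    rows.append(row)

print(rows)
```

Every row comes out exactly ncols wide, which is the normalization the proposed COPY options would perform before the data reaches the SELECT.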
Re: [HACKERS] RfD: more powerful "any" types
On Fri, Sep 11, 2009 at 11:43:32AM -0400, Merlin Moncure wrote: > > If you are going to use printf format codes, which is good and useful > being something of a standard, I'd call routine printf (not format) > and actually wrap vsnprintf. The format codes in printf have a very > specific meaning: converting native C types to arrays of characters. > I think that a postgresql implementation should do exactly that: > attempt to convert the passed in datum to the c type in question if > possible (erroring if no cast exists) and then pass it down. The idea > is we are not adding new formatting routines but using a very high > quality existing one...why reinvent the wheel? > > so if you did: select printf('%s %3.1f', foo::box, bar::circle); > the box to char* cast would work (using the text cast) but the second > cast would fail unless the user added a cast to float. The code in > question is easy to imagine...parse the format string, and loop the > varargs using the appropriate looked up cast one by one... +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove flatfiles.c
On Thu, Sep 03, 2009 at 07:57:25PM -0400, Andrew Dunstan wrote: > daveg wrote: > >On Tue, Sep 01, 2009 at 07:42:56PM -0400, Tom Lane wrote: > >>I'm having a hard time believing that VACUUM FULL really has any > >>interesting use-case anymore. > > > >I have a client who uses temp tables heavily, hundreds of thousands of > >creates > >and drops per day. They also have long running queries. The only thing that > >keeps catalog bloat somewhat in check is vacuum full on bloated catalogs > >a few times a day. Without that pg_class, pg_attribute etc quickly balloon > >to thousands of pages. > > That's a rate of more than one create and drop per second. How does your > client handle the fact that VACUUM FULL will exclusively lock those > catalog tables? Without knowing more, it looks like a bit of a design issue. I'd say it is several per second. They wait for the catalog locks sometimes. This is not an interactive application so that is somewhat acceptable. It also occasionally causes deadlocks which is less agreeable. There are various reasons for the heavy use of temps, mainly having to do with loading external feeds or reusing intermediate query results in a series of queries. It would be great if there was a way to have temp tables that did not get cataloged, eg local cache only. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove flatfiles.c
On Tue, Sep 01, 2009 at 07:42:56PM -0400, Tom Lane wrote: > Greg Stark writes: > > On Wed, Sep 2, 2009 at 12:01 AM, Alvaro > > Herrera wrote: > >>> The use cases where VACUUM FULL wins currently are where storing two > >>> copies of the table and its indexes concurrently just isn't practical. > >> > >> Yeah, but then do you really need to use VACUUM FULL? If that's really > >> a problem then there ain't that many dead tuples around. > > > That's what I want to believe. But picture if you have, say a > > 1-terabyte table which is 50% dead tuples and you don't have a spare > > 1-terabytes to rewrite the whole table. > > But trying to VACUUM FULL that table is going to be horridly painful > too, and you'll still have bloated indexes afterwards. You might as > well just live with the 50% waste, especially since if you did a > full-table update once you'll probably do it again sometime. > > I'm having a hard time believing that VACUUM FULL really has any > interesting use-case anymore. I have a client who uses temp tables heavily, hundreds of thousands of creates and drops per day. They also have long running queries. The only thing that keeps catalog bloat somewhat in check is vacuum full on bloated catalogs a few times a day. Without that pg_class, pg_attribute etc quickly balloon to thousands of pages. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add YAML option to explain
On Mon, Aug 31, 2009 at 02:15:08PM -, Greg Sabino Mullane wrote: > > Greg, can we see a few examples of the YAML output > > compared to both json and text? ... > greg=# explain (format json, analyze on) select * from pg_class where relname > ~ 'x' order by 1,2,3; > QUERY PLAN > - --- An interesting property of json, it is almost exactly the same as python data structure syntax. If I paste the following into python: plan = [ { "Plan": { "Node Type": "Sort", "Startup Cost": 12.82, "Total Cost": 13.10, "Plan Rows": 111, "Plan Width": 185, "Actual Startup Time": 1.152, "Actual Total Time": 1.373, "Actual Rows": 105, "Actual Loops": 1, "Sort Key": ["relname", "relnamespace", "reltype"], "Sort Method": "quicksort", "Sort Space Used": 44, "Sort Space Type": "Memory", "Plans": [ { "Node Type": "Seq Scan", "Parent Relationship": "Outer", "Relation Name": "pg_class", "Alias": "pg_class", "Startup Cost": 0.00, "Total Cost": 9.05, "Plan Rows": 111, "Plan Width": 185, "Actual Startup Time": 0.067, "Actual Total Time": 0.817, "Actual Rows": 105, "Actual Loops": 1, "Filter": "(relname ~ 'x'::text)" } ] }, "Triggers": [ ], "Total Runtime": 1.649 } ] I get a python data structure. 
Which can be manipulated directly, or pretty printed: >>> import pprint >>> pprint.pprint(plan) [{'Plan': {'Actual Loops': 1, 'Actual Rows': 105, 'Actual Startup Time': 1.1519, 'Actual Total Time': 1.373, 'Node Type': 'Sort', 'Plan Rows': 111, 'Plan Width': 185, 'Plans': [{'Actual Loops': 1, 'Actual Rows': 105, 'Actual Startup Time': 0.067004, 'Actual Total Time': 0.81695, 'Alias': 'pg_class', 'Filter': "(relname ~ 'x'::text)", 'Node Type': 'Seq Scan', 'Parent Relationship': 'Outer', 'Plan Rows': 111, 'Plan Width': 185, 'Relation Name': 'pg_class', 'Startup Cost': 0.0, 'Total Cost': 9.0507}], 'Sort Key': ['relname', 'relnamespace', 'reltype'], 'Sort Method': 'quicksort', 'Sort Space Type': 'Memory', 'Sort Space Used': 44, 'Startup Cost': 12.82, 'Total Cost': 13.1}, 'Total Runtime': 1.649, 'Triggers': []}] I'm not sure if all json can be read this way, but the python and json notations are very similar. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
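On the "I'm not sure if all json can be read this way" point: pasting JSON directly as Python source only works while the document avoids the literals the two notations disagree on (JSON's true/false/null versus Python's True/False/None). The stdlib json module handles every case, so it is the robust route. A minimal sketch, using a plan abbreviated from the output above:

```python
import json

# EXPLAIN (FORMAT JSON) output, abbreviated from the plan shown above.
plan_json = '''
[ { "Plan": { "Node Type": "Sort",
              "Startup Cost": 12.82,
              "Sort Key": ["relname", "relnamespace", "reltype"],
              "Plans": [ { "Node Type": "Seq Scan",
                           "Relation Name": "pg_class",
                           "Filter": "(relname ~ 'x'::text)" } ] },
    "Triggers": [ ],
    "Total Runtime": 1.649 } ]
'''

plan = json.loads(plan_json)
print(plan[0]["Plan"]["Plans"][0]["Relation Name"])  # -> pg_class

# The literals where the two notations diverge: evaluating these as
# Python source would raise NameError, but json.loads parses them fine.
print(json.loads('[true, false, null]'))  # -> [True, False, None]
```

So the paste-into-python trick happens to work for EXPLAIN output only because the plans contain no bare true/false/null at the points where it would matter; json.loads works unconditionally.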
Re: [HACKERS] Add YAML option to explain
On Fri, Aug 28, 2009 at 04:37:41PM -0700, David E. Wheeler wrote: > On Aug 28, 2009, at 3:45 PM, Stephen Frost wrote: > > >+1 from me. I've read the other comments and just plain don't agree > >with them. It's a small patch, adds a useful format for EXPLAIN, and > >would be used. > > > >One of the best things about PG is the flexibility and usability. > > I agree, I tend to prefer YAML output where it's parseable (and I > expect it the EXPLAIN YAML output won't be doing anything tricky). > > That said, maybe there should be a way to create modules add formats, > instead of adding them to core? +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 8.5 release timetable, again
On Thu, Aug 27, 2009 at 08:02:03PM -0700, Ron Mayer wrote: > Andrew Dunstan wrote: > > I don't know of anyone who is likely to want to try out alphas in their > > normal development environments. The client I approached was > > specifically prepared to test beta releases that way. > > Perhaps end-users won't, but I think companies who develop software that > works on top of postgres will. Perhaps to make sure their existing software > continues to work; or perhaps to get a head start working with new features. > I test against CVS-head occasionally. I've been trying to help a client take up new versions of postgresql more quickly as the performance or feature content is often very valuable to them. Accordingly, I have encouraged them to run periodic samples of the nightly snapshots on at least one development instance, and to run the betas in the test environment. The goal is to be confident on the day of the postgresql release that we have tested enough and fixed any incompatibilities so that, if they so choose, they could migrate production to the new release immediately. This way they get several months extra benefit from improvements to postgresql. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 8.5 release timetable, again
On Thu, Aug 27, 2009 at 09:38:15PM +0200, Dimitri Fontaine wrote: > Exactly, and I think that what we're missing here is a simple tool for > our users to check a new PostgreSQL release against their existing > application. > > We already know how to either log all queries and analyze the log files > (CSV makes it easier, pgfouine parses them too) or to have a fe/be > protocol proxy to record application SQL traffic (tsung recorder does > that). > > What we miss is a tool to run the captured queries through both versions > of PG and report any resultset mismatch, of course with a way to account > for ordering issues (but we've seen people rely on the ordering when > they don't give an order by clause, then bug the lists about it if a new > release changes it). This would be very useful. I often am asked "how much better will the new release run our apps" as part of convincing a client to upgrade to a more current postgresql release. Being able to replay a days workload in a somewhat realistic manner would be a great help. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GRANT ON ALL IN schema
On Sun, Aug 16, 2009 at 02:59:53PM +0200, Pavel Stehule wrote: > 2009/8/16 Peter Eisentraut : > > On sön, 2009-08-16 at 00:04 -0400, Andrew Dunstan wrote: > >> SQL is not Lisp. Simple is good. I didn't think Peter was really very > >> serious. > > > > Well, I don't know if we really need to call it "lambda", but I fully > > expect to be able to use these "ad hoc functions" as part of other > > expressions. So making DO or whatever a top-level command that does not > > integrate with anything else would not really satisfy me. > > +1 +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GRANT ON ALL IN schema
On Sat, Aug 15, 2009 at 11:34:04PM +0200, Dimitri Fontaine wrote: > Nitpicking dept, I think I prefer: > > DO [ [LANGUAGE] language] $$ ... $$; > DO plperl $$ ... $$; > DO language plpython $$ ... $$; > > language is optional and defaults to plpgsql. +1 -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Alpha releases: How to tag
On Fri, Aug 07, 2009 at 06:28:34PM -0400, Tom Lane wrote: > David Fetter writes: > > I am not suggesting that this change be immediate, and it's not ivory > > tower. It's just how everybody else does it. > > You keep saying that, and it's completely meaningless. What do you know > about the development practices of Oracle, or DB2, or even Mysql? When I was at Sybase, any change to the on-disk structure was required to come with code to do the migration. Nonetheless, at release time, the migrate process was almost always discovered to be broken, sometimes even before it was shipped to customers. Of course, Sybase implemented its own complete filesystem layer on top of raw partitions, so there was more scope to go wrong, especially since it was possible to corrupt the on-disk structure in subtle ways that would not be discovered in normal operation but that would cause migration to corrupt it still further. In fairness, this is a very difficult problem to solve well and I expect to rely on dump/load migrations for quite some time. -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Revise parallel pg_restore's scheduling heuristic
On Mon, Aug 03, 2009 at 11:21:43AM -0400, Tom Lane wrote: > "Kevin Grittner" writes: > > Over the weekend I ran 40 restores of Milwaukee County's production > > data using Friday's snapshot with and without the patch. I alternated > > between patched and unpatched. It appears that this latest version is > > slightly slower for our production database on the same machine and > > configuration where the previous patch appeared to be 1% to 2% faster > > than unpatched (although I had fewer samples of that). > > I think we can conclude that for this particular test case, the effects > of the patch are pretty much masked by noise. I definitely see no way > that the latest version of the patch could really be slower than the > original; it has the same job-scheduling behavior and strictly less > list-munging overhead. Now the patch could be slower than unpatched > as a result of different job-scheduling behavior ... but there's no > evidence here of a consistently measurable benefit or loss from that. > > IIRC daveg was volunteering to do some tests with his own data; maybe > we should wait for those results. I have run extensive tests with three trials of each configuration on two hosts with a variety of db sizes from 3GB to 142GB. These just finished, and I will send a more detailed summary later, but at the moment I don't see any significant difference between the patched and vanilla pg_restore. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Revise parallel pg_restore's scheduling heuristic
On Thu, Jul 30, 2009 at 12:29:34PM -0500, Kevin Grittner wrote: > Tom Lane wrote: > > > I think we've pretty much established that it doesn't make things > > *worse*, so I'm sort of inclined to go ahead and apply it. The > > theoretical advantage of eliminating O(N^2) search behavior seems > > like reason enough, even if it takes a ridiculous number of tables > > for that to become significant. > > Agreed, although I'm having some concerns about whether this should > proceed based exclusively on my benchmarks. On a thread on the > performance list, people are talking about restores which go several > times faster with parallel restore (compared to a single job). On my > hardware, I haven't even gotten it to run twice as fast. This means > that parallel restore is not a good fit for servers like we have, at > least with databases like we have, which means it's probably a poor > environment to get benchmarks for this patch. :-( > > Can we get someone who has benchmarks showing parallel restore to be > eight times the speed of a single job to benchmark with this patch, > just for confirmation? I have a couple "spare" 32GB 4 core and 64GB 8 core servers with 15 scsi/sas drives and dumps of production dbs in the 100GB to 500 GB range. These have several hundred tables most with an index or few and an fkey or too many. It will take a couple days to run a variety of tests I suppose, and I will be away starting mid next week, but maybe I could get some done before I go. Will the patch apply to a vanilla 8.4.0? -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] display previous query string of idle-in-transaction
On Thu, Jun 04, 2009 at 10:22:41PM -0400, Robert Haas wrote: > The only thing I don't like about this is that I think it's kind of a > hack to shove the in transaction designation and the query > string into the same database column. I've never liked having to > write: > > select sum(1) from pg_stat_activity where current_query = ' in > transaction'; > > ...and I like it even less if I now have to modify that query to use > "like". We should find some way to represent this as structured > data... maybe make a separate column called "idle" that's a boolean, > or something, and let the query column contain the most recent query > (whether or not it's still executing). I like this idea a lot. Possibly it would be useful to have the end time of the last query too, then one could find idle sessions that were old and truly idle rather than just waiting for a busy client to send the next query. select ... from pg_stat_activity where idle and last_statement_endtime < now() - interval '1 minute'; -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is a plan for lmza commpression in pg_dump
On Sat, Feb 07, 2009 at 08:49:29PM -0500, Robert Haas wrote: > "Proprietary compression algorithms, even with Postgresql-specific > license exceptions"? To be fair, lzo is GPL, which is a stretch to consider proprietary. -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is a plan for lmza commpression in pg_dump
On Sat, Feb 07, 2009 at 02:47:05PM -0500, Bruce Momjian wrote: > daveg wrote: > > On Wed, Feb 04, 2009 at 10:23:17PM -0500, Andrew Chernow wrote: > > > Dann Corbit wrote: > > > > > > > >The LZMA SDK is granted to the public domain: > > > >http://www.7-zip.org/sdk.html > > > > > > > > > > I played with this but found the SDK extremely confusing and flat out > > > horrible. One personal dislike was the unnecessary use of C++; although > > > it > > > was the horrible API that turned me off. I'm not even sure if I ever > > > got a > > > test program working. > > > > > > LZO (http://www.oberhumer.com/opensource/lzo/) is a great algorithm, easy > > > API with many variants; my fav is LZO1X-1(15). Its known for its > > > compresison and decompresison speeds ... its blazing fast. zlib > > > typically > > > gets 5-8% more compression. > > > > LZO rocks. I wonder if the lzo developer would consider a license exception > > so that postgresql could use it? What would we need? > > The chance of us using anything but one zlib is near zero so please do > not persue this; this discussion comes up much too often. That this comes up "much too often" suggests that there is more than near zero interest. Why can only one compression library be considered? We use multiple readline implementations, for better or worse. I think the context here is for pg_dump only, and in that context a faster compression library makes a lot of sense. I'd be happy to prepare a patch if the license issue can be accommodated. Hence my question: what sort of license accommodation would we need to be able to use this library? -dg -- David Gould da...@sonic.net 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
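For what it's worth, the speed-versus-ratio tradeoff under discussion is easy to demonstrate even without LZO bindings (which are not in the Python stdlib): zlib's own compression levels show the same dial that a pg_dump user would be turning. The payload below is a made-up stand-in for a dump data stream; repetitive table data like this is the common, highly compressible case.

```python
import time
import zlib

# Hypothetical payload standing in for a pg_dump data stream.
payload = b"INSERT INTO t VALUES (42, 'abcdefgh');\n" * 50_000

for level in (1, 6, 9):  # 1 ~ fastest, 6 = zlib default, 9 = best ratio
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    ratio = len(compressed) / len(payload)
    print(f"level {level}: {ratio:.1%} of original in {elapsed * 1000:.1f} ms")
```

LZO sits at an even faster, slightly-worse-ratio point than zlib level 1, which is exactly the trade that matters when the dump is CPU-bound on compression rather than I/O-bound.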
Re: [HACKERS] Is a plan for lmza commpression in pg_dump
On Wed, Feb 04, 2009 at 10:23:17PM -0500, Andrew Chernow wrote: > Dann Corbit wrote: > > > >The LZMA SDK is granted to the public domain: > >http://www.7-zip.org/sdk.html > > > > I played with this but found the SDK extremely confusing and flat out > horrible. One personal dislike was the unnecessary use of C++; although it > was the horrible API that turned me off. I'm not even sure if I ever got a > test program working. > > LZO (http://www.oberhumer.com/opensource/lzo/) is a great algorithm, easy > API with many variants; my fav is LZO1X-1(15). Its known for its > compresison and decompresison speeds ... its blazing fast. zlib typically > gets 5-8% more compression. LZO rocks. I wonder if the lzo developer would consider a license exception so that postgresql could use it? What would we need? -dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keeping creation time of objects
On Tue, Sep 09, 2008 at 11:03:56PM +0300, Hannu Krosing wrote: > On Tue, 2008-09-09 at 12:40 -0700, daveg wrote: > > > I'd be very interested in seeing a last schema modification time for > > pg_class > > objects. I don't care about it being preserved over dump and restore as my > > use case is more to find out when a table was created with a view to finding > > out if it is still needed. > > Isn't it easier to find out if it is still needed by looking if it is > still used, say from pg_stat_user_tables ? Except that pg_dump will access it and make it look used. Also, this does not work for functions, views etc. It seems to me to be pretty simple to put an abstime or timestamp column on the major catalog tables and update it when the row is updated. A mod time is more useful probably than a create time. -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keeping creation time of objects
On Tue, Sep 09, 2008 at 03:36:19PM -0400, Tom Lane wrote: > Devrim =?ISO-8859-1?Q?G=DCND=DCZ?= <[EMAIL PROTECTED]> writes: > > AFAICS, PostgreSQL is not keeping info about when a table, database, > > sequence, etc was created. We cannot get that info even from OS, since > > CLUSTER or VACUUM FULL may change the metadata of corresponding > > relfilenode. > > > Does anyone think that adding a timestamp column to pg_class would bring > > an overhead? > > There isn't sufficient support for such a "feature". In any case, why > would creation time (as opposed to any other time, eg last schema > modification, last data modification, yadda yadda) be especially > significant? Would you expect it to be preserved over dump/restore? > How about every other object type in the system? I'd be very interested in seeing a last schema modification time for pg_class objects. I don't care about it being preserved over dump and restore as my use case is more to find out when a table was created with a view to finding out if it is still needed. So the question I'm looking to answer is "when did that get here?" -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IN vs EXISTS equivalence
On Thu, Aug 14, 2008 at 06:50:09PM +0100, Simon Riggs wrote: > > On Fri, 2008-08-08 at 16:23 -0400, Tom Lane wrote: > > > NOT IN is a lot trickier, > > condition: you must also assume that the comparison operator involved > > never yields NULL for non-null inputs. That might be okay for btree > > comparison functions but it's not a very comfy assumption in general; > > we certainly haven't got any explicit knowledge that any functions are > > guaranteed to act that way. So this case might be worth doing later ... > Just found this comment, after reading what you said on other thread > about NOT IN. > > NOT IN is a serious performance issue for most people. We simply can't > say to people "you were told not to". > > If we can fix it easily for the majority of cases, we should. We can't > let the "it won't work in certain cases" reason prevent various A suggestion: what about adding an attribute to functions to declare that they never return null? declare foo(int, int) returns int immutable not null as ... -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal sql: labeled function params
On Sat, Aug 23, 2008 at 05:08:25PM +0100, Gregory Stark wrote: > "Pavel Stehule" <[EMAIL PROTECTED]> writes: > > > Hello > > > > 2008/8/23 Peter Eisentraut <[EMAIL PROTECTED]>: > >> On Friday 22 August 2008 07:41:30 Decibel! wrote: > >>> If we're really worried about it we can have a GUC for a few versions > >>> that turns off named parameter assignment. But I don't think we > >>> should compromise the design on the theory that some folks might be > >>> using that as an operator *and* can't change their application to > >>> wrap it's use in (). > >> > >> Even if that were a reasonable strategy, you can't use GUC parameters to > >> alter > >> parser behavior. > > > > I thing, so it's possible - in this case. We should transform named > > params to expr after syntax analyze. > > So for a bit of useless syntactic sugar we should introduce conflicts with > named parameters, conflicts with operators, introduce an un-sqlish syntax and > remove a feature users have already made use of and introduce backwards > compatibility issues for those users? > > At any point in this discussion has anyone explained why these labels would > actually be a good idea? I was missing that too. What is this for that makes it so compelling? -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A smaller default postgresql.conf
On Tue, Aug 19, 2008 at 09:39:39PM +0300, Peter Eisentraut wrote: > On Tuesday 19 August 2008 19:12:16 Tom Lane wrote: > > Well, why not just make a one-eighty and say that the default > > postgresql.conf is *empty* (except for whatever initdb puts into it)? > > Well, my original implementation of GUC had an empty default configuration > file, which was later craptaculated to its current form based on seemingly > popular demand. I am very happy to work back toward the empty state, and > there appears to be growing support for that. I like the almost empty file idea. I often use the include facility to set up postgresql.conf files that look roughly like this: include 'postgresql_site_policy.conf' include 'postgresql_.conf' port = where the included files set things like the overall site's logging preferences, and shared_buffers and work_mem based on installed memory and disk system (ex: postgresql_32GB.conf) etc. -dg -- David Gould [EMAIL PROTECTED] 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Adjusting debug_print_plan to be more useful by default
On Tue, Aug 19, 2008 at 06:33:33PM +0100, Simon Riggs wrote: > > On Tue, 2008-08-19 at 12:40 -0400, Tom Lane wrote: > > Back in April we changed EXPLAIN VERBOSE to not dump the internal plan > > tree anymore, on the grounds that non-hackers didn't want that info and > > hackers could get it with debug_print_plan and related variables. > > Well, now that I've tried to do some planner development work relying on > > debug_print_plan instead of EXPLAIN VERBOSE, I find it a mite annoying. > > It's not sufficient to set debug_print_plan = true, because the output > > comes out at priority DEBUG1, which is to say it doesn't come out at all > > in a default configuration. If you boost up client_min_messages or > > log_min_messages so you can see it, you get lots of extraneous debugging > > messages too. > > > > I'd like to propose that the messages emitted by debug_print_plan > > and friends be given priority LOG rather than DEBUG1. If you've gone > > to the trouble of turning on the variable, then you presumably want the > > results, so it seems dumb to print them at a priority that isn't logged > > by default. (Note that this is biased to the assumption that you want > > the messages in the postmaster log, not on the console. Which is > > usually what I want, but maybe someone wants to argue for NOTICE?) > > > > I'd also like to propose making debug_pretty_print default to ON. > > At least for me, the other formatting is 100% unreadable. > > +1 +1 -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Visibility Groups
On Thu, Aug 07, 2008 at 01:30:27PM +0100, Gregory Stark wrote: > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > > Currently, we calculate a single OldestXmin across all snapshots on the > > assumption that any transaction might access any table. > > > > I propose creating "Visibility Groups" that *explicitly* limit the > > ability of a transaction to access data outside its visibility group(s). > > By default, visibility_groups would be NULL, implying potential access > > to all tables. > > > > Once set, any attempt to lock an object outside of a transactions > > defined visibility_groups will result in an error: > > ERROR attempt to lock table outside of visibility group(s): foo > > HINT you need to set a different value for visibility_groups > > A transaction can only ever reduce or restrict its visibility_groups, it > > cannot reset or add visibility groups. > > Hm, so backing up a bit from the specific proposed interface, the key here is > being able to explicitly mark which tables your transaction will need in the > future? > > Is it always just a handful of heavily updated tables that you want to > protect? In that case we could have a lock type which means "I'll never need > to lock this object". Then a session could issue "LOCK TABLE foo IN > INACCESSIBLE MODE" or something like that. That requires people to hack up > their pg_dump or replication script though which might be awkward. > > Perhaps the way to do that would be to preemptively take locks on all the > objects that you'll need, then have a command to indicate you won't need any > further objects beyond those. +1 -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Mini improvement: statement_cost_limit
On Mon, Aug 04, 2008 at 05:19:50PM -0400, Robert Treat wrote: > See, this is what we ended up talking about before. Someone will say "I'd > like > to prevent my devs from accidentally doing queries with cartesian products" > and they will use this to do it... but that will only work in some cases, so > it becomes a poor tool to solve a different problem. > > BTW, what I really love about statement costs, is that they aren't even > reliable on the same machine with the same data. I have seen query plans > which run on the same data on the same machine where the resultant query > runtime can vary from 2 hours to 5 hours, depending on how much other > concurrent traffic is on the machine. Awesome eh? Sure, I don't think anyone believes that costs are precise. But the case that is interesting is 2 hours versus years and years. > The footgun in my mind is that people will think this solves a number of > problems even though it doesnt solve them well. However, the footgun for yo I suspect that a good solution to this problem is impossible as it is more or less the halting problem. So I'm willing to accept a poor solution based on costs and then hope we improve the cost model. -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Automatic Client Failover
On Mon, Aug 04, 2008 at 05:17:59PM -0400, Jonah H. Harris wrote: > On Mon, Aug 4, 2008 at 5:08 PM, Simon Riggs <[EMAIL PROTECTED]> wrote: > > When primary server fails, it would be good if the clients connected to > > the primary knew to reconnect to the standby servers automatically. > > This would be a nice feature which many people I've talked to have > asked for. In Oracle-land, it's called Transparent Application > Failover (TAF) and it gives you a lot of options, including the > ability to write your own callbacks when a failover is detected. This might be better done as part of a proxy server, eg pgbouncer, pgpool than as part of postgresql or libpq. I like the concept, but the logic to determine when a failover has occurred is complex and a client will often not have access to enough information to make this determination accurately. postgresql could have hooks to support this though, ie to determine when a standby thinks it has become the master. -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Mini improvement: statement_cost_limit
On Mon, Aug 04, 2008 at 11:59:03AM -0700, Josh Berkus wrote: > Greg, > > >Well that's going to depend on the application But I suppose there's > >nothing wrong with having options which aren't always a good idea to use. > >The > >real question I guess is whether there's ever a situation where it would > >be a > >good idea to use this. I'm not 100% sure. > > I can think of *lots*. Primarily, simple web applications, where > queries are never supposed to take more than 50ms. If a query turns up > with an estimated cost of 100, then you know something's wrong; > in the statistics if not in the query. In either case, that query has a > good chance of dragging down the whole system. > > In such a production application, it is better to have false positives > and reject otherwise-OK queries becuase their costing is wrong, than to > let a single cartesian join bog down an application serving 5000 > simultaneous users. Further, with a SQL error, this would allow the > query rejection to be handled in a user-friendly way from the UI > ("Search too complex. Try changing search terms.") rather than timing > out, which is very difficult to handle well. > > The usefulness of this feature for interactive sessions is > limited-to-nonexistant. It's for production applications. Ok, that is a different use case where an error seems very useful. What about slightly extending the proposal to have the severity of exceeding the limit configurable too. Something like: costestimate_limit = 10 # default 0 to ignore limit costestimate_limit_severity = error # debug, notice, warning, error -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Mini improvement: statement_cost_limit
On Mon, Aug 04, 2008 at 02:35:07PM -0400, Robert Treat wrote: > On Monday 04 August 2008 03:50:40 daveg wrote: > > That's great for you, I am talking in the scope of a general solution. (Note > I'd also bet that even given the same hardware, different production loads > can produce different relative mappings of cost vs. performance, but > whatever) Even on different hardware it would still likely warn of mistakes like cartesian products due to missing join conditions, etc. > > > I still think it is worth revisiting what problems people are trying to > > > solve, and see if there are better tools they can be given to solve them. > > > Barring that, I suppose a crude solution is better than nothing, though > > > I fear people might point at the crude solution as a good enough solution > > > to justify not working on better solutions. > > > > Alerting developers and QA to potentially costly queries would help solve > > some of the probems we are trying to solve. Better tools are welcome, an > > argument that the good is the enemy of the best so we should be content > > with nothing is not. > > And you'll note, I specifically said that a crude tool is better than > nothing. I realized somewhat after I sent the above that it might have sounded a bit snippy. I hope I have not offended. > But your completely ignoring that a crude tool can often end-up as a foot-gun > once relased into the wild. I'm suggesting a warning, or even just a notice in the logs; I don't see the footgun. What am I missing? Regards -dg -- David Gould [EMAIL PROTECTED] 510 536 1443 / 510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Mini improvement: statement_cost_limit
On Mon, Aug 04, 2008 at 03:09:34PM -0400, Greg Smith wrote:
> On Mon, 4 Aug 2008, daveg wrote:
> > We load the production dumps into our dev environment, which are the
> > same hardware spec, so the costs should be identical.
>
> Not identical, just close. ANALYZE samples data from your table
> randomly. The statistics used to compute the costs will therefore be
> slightly different on the two servers even if the data is the same. The
> problem of discovering one plan on production and another on development
> is not quite that easy to remove. Ultimately, if your developers aren't
> thorough enough to do things like look at EXPLAIN plans enough to
> discover things that are just bad, I just chuckle at your thinking that
> putting a single limiter on their bad behavior will somehow magically
> make that better.

Not all developers can be persuaded to run explain on every change.
However, many will investigate a new message. I'm only hoping to focus
their attention toward possible problem queries.

> Anyway, if your production server is small enough that you can afford to
> have another one just like it for the developers to work on, that's
> great. Robert's point is that many installs don't work like that. The
> development teams in lots of places only get a subset of the production
> data because it's too large to deploy on anything but a big server,
> which often is hard to cost justify buying just for development
> purposes.

Not to get into a size war ;-), but the production environment I'd like
this feature for is over 40 quad-core boxes, each with 32GB and 16 SCSI
drives. These are dedicated to postgresql and run one or just a few
databases. There are also a bunch of client boxes that we will not speak
of. The staging and test environments are similar hardware but have only
a subset of the databases copied to them. There are probably more than a
dozen DB hosts for that.
> I like the concept of a cost limit, but I'm a bit horrified by the
> thought of it being exposed simply through the internal cost numbers
> because they are so arbitrary. One of the endless projects I think about
> but never start coding is to write something that measures the things
> the planner cost constants estimate on a particular machine, so that all
> those numbers actually can be tied to some real-world time measure. If
> you did that, you'd actually have a shot at accomplishing the real goal
> here, making statement_cost_limit cut off statements expected to take
> longer than statement_timeout before they even get started.

That is a nice idea. Possibly it could be a utility like the fsync
tester. But planner estimates are never going to be all that accurate
even with solid cost numbers, because for some classes of queries,
particularly those with many joins, the stats can be good at each level
but the error accumulates exponentially. Which is why I think a warning
is appropriate instead of an error. Even a notice in the logs would be
useful.

-dg
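The compounding described above can be illustrated with a toy calculation (an illustrative sketch only, not planner code): even a modest misestimate at each join level multiplies through a deep join tree.

```python
def compounded_error(per_level_factor: float, num_joins: int) -> float:
    """Relative misestimate after chaining per-join estimation errors.

    If each join's row-count estimate is off by `per_level_factor`
    (e.g. 1.5 means 50% off) and the errors compound, the total error
    after n joins is roughly the product of the per-level factors.
    """
    return per_level_factor ** num_joins

# A 50% per-level error grows past an order of magnitude by six joins.
for joins in (1, 3, 6, 10):
    print(joins, round(compounded_error(1.5, joins), 1))
```

This is why a cost estimate that is trustworthy for a two-table query can be wildly off for a ten-way join, and why a warning is a safer response to a big number than a hard error.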
Re: [HACKERS] Mini improvement: statement_cost_limit
On Sun, Aug 03, 2008 at 10:57:55PM -0400, Robert Treat wrote:
>
> ISTR that what ended up killing the enthusiasm for this was that most
> people realized that this GUC was just a poor tool to take a stab at
> solving other problems (ie. rate limiting cpu for queries).

I'm not concerned with that, I want developers to have feedback on costs
in a way that is obvious.

> > > I think a variation on this could be very useful in development and
> > > test environments. Suppose it raised a warning or notice if the cost
> > > was over the limit. Then one could set a limit of a few million on
> > > the development and test servers and developers would at least have
> > > a clue that they needed to look at explain for that query. As it is
> > > now, one can exhort them to run explain, but it has no effect.
> > > Instead we later see queries killed by a 24 hour timeout with
> > > estimated costs ranging from "until they unplug the machine and dump
> > > it" to "until the sun turns into a red giant".
> >
> > Great argument. So that's 4 in favour at least.
>
> Not such a great argument. Cost models on development servers can and
> often are quite different from those on production, so you might be
> putting an artificial limit on top of your developers.

We load the production dumps into our dev environment, which are the same
hardware spec, so the costs should be identical.

> I still think it is worth revisiting what problems people are trying to
> solve, and see if there are better tools they can be given to solve
> them. Barring that, I suppose a crude solution is better than nothing,
> though I fear people might point at the crude solution as a good enough
> solution to justify not working on better solutions.

Alerting developers and QA to potentially costly queries would help solve
some of the problems we are trying to solve. Better tools are welcome; an
argument that the good is the enemy of the best so we should be content
with nothing is not.
-dg
Re: [HACKERS] Mini improvement: statement_cost_limit
On Sat, Aug 02, 2008 at 09:30:08PM +0200, Hans-Jürgen Schönig wrote:
> On Aug 2, 2008, at 8:38 PM, Tom Lane wrote:
>
> > Andrew Dunstan <[EMAIL PROTECTED]> writes:
> > > Hans-Jürgen Schönig wrote:
> > > > i introduced a GUC called statement_cost_limit which can be used
> > > > to error out if a statement is expected to be too expensive.
> >
> > > You clearly have far more faith in the cost estimates than I do.
> >
> > Wasn't this exact proposal discussed and rejected awhile back?
> >
> > regards, tom lane
>
> i don't remember precisely.
> i have seen it on simon's wiki page and it is something which would
> have been useful in some cases in the past.

I think a variation on this could be very useful in development and test
environments. Suppose it raised a warning or notice if the cost was over
the limit. Then one could set a limit of a few million on the development
and test servers and developers would at least have a clue that they
needed to look at explain for that query. As it is now, one can exhort
them to run explain, but it has no effect. Instead we later see queries
killed by a 24 hour timeout with estimated costs ranging from "until they
unplug the machine and dump it" to "until the sun turns into a red
giant".

-dg
Re: [HACKERS] Copy storage parameters on CREATE TABLE LIKE/INHERITS
On Wed, Jul 30, 2008 at 04:45:47PM +0900, ITAGAKI Takahiro wrote:
> Here is a patch to copy column storage parameters and reloptions on
> CREATE TABLE LIKE, which I proposed at:
> [HACKERS] Uncopied parameters on CREATE TABLE LIKE
> http://archives.postgresql.org/message-id/[EMAIL PROTECTED]
>
> Column storage parameters (by ALTER COLUMN SET STORAGE) and table
> storage parameters (by ALTER TABLE SET (...)) are copied from template
> tables on LIKE or parent tables on INHERITS (1. and 2. at above e-mail).
> The patch does not include copying of comments (3.) for now.
> It also includes new regression tests and rough documentation.
>
> When template or parent tables have non-default settings,
> they are copied into a new table automatically on CREATE TABLE.
> If the CREATE TABLE statement has its own storage parameter definitions,
> they overwrite the inherited settings.

I'd like to have the ability to copy these parameters, but not to have it
be automatic. Some of my clients' applications commonly use CREATE TABLE
LIKE to get empty work tables to populate and update before adding the
data to the main table (which may have non-default settings). The
automatic behaviour may be undesirable for this use.

-dg
Re: [HACKERS] [PATCHES] pg_dump additional options for performance
On Sun, Jul 27, 2008 at 10:37:34AM +0100, Simon Riggs wrote:
>
> On Sat, 2008-07-26 at 11:03 -0700, Joshua D. Drake wrote:
>
> > 2. We have no concurrency which means, anyone with any database over
> > 50G has unacceptable restore times.
>
> Agreed.
>
> Also the core reason for wanting -w
>
> > 3. We have to continue to develop hacks to define custom utilization.
> > Why am I passing pre-data anything? It should be automatic. For
> > example:

[adding hackers for discussion]

On Sat, Jul 26, 2008 at 01:56:14PM -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > I want to dump tables separately for performance reasons. There are
> > documented tests showing 100% gains using this method. There is no
> > gain adding this to pg_restore. There is a gain to be had -
> > parallelising index creation, but this patch doesn't provide
> > parallelisation.
>
> Right, but the parallelization is going to happen sometime, and it is
> going to happen in the context of pg_restore. So I think it's pretty
> silly to argue that no one will ever want this feature to work in
> pg_restore.
>
> To extend the example I just gave to Stephen, I think a fairly probable
> scenario is where you only need to tweak some "before" object
> definitions, and then you could do
>
> pg_restore --schema-before-data whole.dump >before.sql
> edit before.sql
> psql -f before.sql target_db
> pg_restore --data-only --schema-after-data -d target_db whole.dump
>
> which (given a parallelizing pg_restore) would do all the time-consuming
> steps in a fully parallelized fashion.

A few thoughts about pg_restore performance:

To take advantage of non-logged copy, the table create and data load
should be in the same transaction.

To take advantage of file and buffer cache, it would be better to create
indexes immediately after loading table data. Many tables will be small
enough to fit in cache and this will avoid re-reading them for index
builds. This is more advantageous with more indexes on one table.
There may also be some filesystem placement benefits to building the
indexes for a table immediately after loading the data.

Creating constraints immediately after loading data would also benefit
from warm buffer and file caches. Doing this is complicated by the need
for indexes and data in the referenced tables to exist first.

It seems that a high performance restore will want to proceed in a
different order than the current sort order or that proposed by the
before/data/after patch.

- The simplest unit of work for parallelism may be the table and its
  "decorations", e.g. indexes and relational constraints.

- Sort tables by foreign key dependency so that referenced tables are
  loaded before referencing tables.

- Do table creation and data load together in one transaction to use
  non-logged copy. Index builds and constraint creation should follow
  immediately, either as part of the same transaction, or possibly
  parallelized themselves.

Table creation, data load, index builds, and constraint creation could be
packaged up as the unit of work to be done in a subprocess which either
completes or fails as a unit. The worker process would be called with
connection info, a file pointer to the data, and the DDL for the table.

pg_restore would keep a work queue of tables to be restored in FK
dependency order and also do the other schema operations such as
functions and types.

-dg
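The ordering sketched above (referenced tables before referencing tables, each table's create/load/index/constraint steps packaged as one unit of work) amounts to a topological sort over the foreign-key graph. A minimal illustration in Python follows; the table names and dependency map are invented for the example, and this is a sketch of the scheduling idea, not pg_restore code.

```python
from collections import deque

def restore_order(fk_deps):
    """Kahn's algorithm: order tables so that every referenced table
    precedes the tables whose foreign keys point at it.

    fk_deps maps each table to the set of tables it references.
    """
    indegree = {t: len(deps) for t, deps in fk_deps.items()}
    dependents = {t: [] for t in fk_deps}
    for t, deps in fk_deps.items():
        for d in deps:
            dependents[d].append(t)
    # Tables with no FK dependencies can be dispatched to workers at once.
    queue = deque(sorted(t for t, n in indegree.items() if n == 0))
    order = []
    while queue:
        t = queue.popleft()
        order.append(t)  # this table's create+copy+index unit is now runnable
        for child in sorted(dependents[t]):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(fk_deps):
        raise ValueError("circular FK dependency; constraints must be added later")
    return order

# Invented example schema: orders references customers and products.
deps = {"customers": set(), "products": set(),
        "orders": {"customers", "products"},
        "order_items": {"orders", "products"}}
print(restore_order(deps))  # ['customers', 'products', 'orders', 'order_items']
```

In a parallel restore, everything currently in the queue is independent work: `customers` and `products` could be handed to two workers at the same time, with `orders` dispatched as soon as both finish.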
Re: [HACKERS] Adding WHERE clause to pg_dump
On Fri, Jul 25, 2008 at 11:17:20PM +0100, Gregory Stark wrote:
> "Joshua D. Drake" <[EMAIL PROTECTED]> writes:
>
> > How do we deal with this?
> >
> > pg_dump -w "last_update_timestamp < ..." -t 'table*'
> >
> > What I see is a recipe for inconsistent, un-restorable backups without
> > a user realizing what they have done. The only way to deal with the
> > above is:
> >
> > 1. Wildcards aren't allowed if you have -w
> > 2. You dump everything, if the WHERE clause isn't relevant you just
> > dump the whole table
>
> There's always
>
> 3. Apply the WHERE clause to all tables and if there's a table missing
> columns referenced in the where clause then fail with the appropriate
> error.
>
> Which seems like the right option to me. The tricky bit would be how to
> deal with cases where you want a different where clause for different
> tables. But even if it doesn't handle all cases that doesn't mean a
> partial solution is unreasonable.

Actually, Davy's patch does deal with the case "where you want a
different where clause for different tables".

-dg
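The per-table idea under discussion can be sketched as generating one COPY statement per table, each with its own optional predicate. The Python below is an illustration of the concept only; the table names, predicates, and output format are invented and do not reflect Davy's patch or pg_dump's actual output.

```python
def dump_statements(tables, where_clauses):
    """Build a COPY ... TO STDOUT statement per table, applying a
    per-table WHERE clause when one is given (a sketch of the idea,
    not pg_dump's real implementation).
    """
    stmts = []
    for table in tables:
        clause = where_clauses.get(table)
        if clause:
            # Filtered dump: wrap the predicate in a SELECT for COPY.
            stmts.append(f"COPY (SELECT * FROM {table} WHERE {clause}) TO STDOUT;")
        else:
            # No predicate for this table: dump it whole.
            stmts.append(f"COPY {table} TO STDOUT;")
    return stmts

# Invented example: only `events` is filtered; `users` is dumped whole.
for s in dump_statements(["users", "events"],
                         {"events": "last_update > now() - interval '1 day'"}):
    print(s)
```

This also shows why a single global `-w` predicate is fragile: a map from table to predicate sidesteps the "missing column" failure mode entirely, since unlisted tables simply take the unfiltered branch.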