Re: [HACKERS] Streaming base backups
On Tue, Jan 11, 2011 at 01:28, Cédric Villemain cedric.villemain.deb...@gmail.com wrote:
> 2011/1/10 Magnus Hagander mag...@hagander.net:
>> On Sun, Jan 9, 2011 at 23:33, Cédric Villemain cedric.villemain.deb...@gmail.com wrote:
>> I've committed the backend side of this, without that. Still working on the client, and on cleaning up Heikki's patch for grammar/parser support.
>
> attached is a small patch fixing -d basedir when it's called with an absolute path.
>
> maybe we can use pg_mkdir_p() instead of mkdir ?

Heh, that was actually a hack to be able to run pg_basebackup on the same machine as the database with the tablespaces. It will be removed before commit :-) (It was also in the wrong place to work; I realize I managed to break it in a refactor.) I've put in a big ugly comment to make sure it gets removed :-)

And yes, using pg_mkdir_p() is good. I used to do that; I think I removed it by mistake when it was supposed to be removed elsewhere. I've put it back.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Set process title to indicate base backup is running
On 10.01.2011 22:54, Magnus Hagander wrote:
> Set process title to indicate base backup is running

It would be good to reset it after the base backup is finished..

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] [COMMITTERS] pgsql: Set process title to indicate base backup is running
On Tue, Jan 11, 2011 at 09:50, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
> On 10.01.2011 22:54, Magnus Hagander wrote:
>> Set process title to indicate base backup is running
>
> It would be good to reset it after the base backup is finished..

Good point, fixed.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
[HACKERS] WIP: RangeTypes
Ok, I have made some progress. This is still a proof-of-concept patch, but the important pieces are working together. Synopsis:

  CREATE TYPE numrange AS RANGE (SUBTYPE=numeric, SUBTYPE_CMP=numeric_cmp);
  SELECT range_eq('[1,2.2)'::numrange, '[1,2.2]');
  SELECT range_lbound('(3.7,9]'::numrange);
  SELECT range(6.7);
  SELECT '-'::numrange;          -- empty
  SELECT '[1, NULL]'::numrange;  -- ] will become )
  SELECT '(INF, 3)'::numrange;

I haven't completed many of the other generic functions, because I'd like to make sure I'm on the right track first. The important thing about the functions above is that they show ANYRANGE working in conjunction with ANYELEMENT in various combinations, which was a significant part of this patch.

Here are the open items:

1. Generic functions -- most of which are fairly obvious. However, I want to make sure I'm on the right track first.

2. GiST -- I'll need a mechanism to implement the penalty function, and perhaps I'll also need additional support for the picksplit function. For the penalty function, I think I'll need to require a function to convert the subtype into a float, and I can use that to find a distance (which can be the penalty). That should also satisfy anything that picksplit might need.

3. Typmod -- There is still one annoyance about typmod remaining. I need to treat it like an array in find_typmod_coercion_function(), and then create a coercion expression. Is it worth it? Would typmod on a range be confusing, or should I just finish this item up?

4. Docs

5. Tests

6. pg_dump -- should be pretty easy; I just want to settle some of the other stuff first.

7. Right now the parse function is quite dumb. Is there some example code I should follow to make sure I get this right?

8. In order to properly support the various combinations of ANYRANGE and ANYELEMENT in a function definition (which are all important), we need to be able to determine the range type given a subtype. That means that each subtype can only have one associated range type, which sounds somewhat limiting, but it can be worked around by using domains. I don't think this is a major limitation. Comments?

9. Representation -- right now I store the OID of the range type in the range itself, much like arrays, in order to find the functions that operate on the subtype. Robert has some justifiable concerns about that 4-byte overhead. Possible ideas:

  * Forget about ANYRANGE altogether, and generate new catalog entries for the generic functions for each new range type defined. I don't particularly like this approach because it makes it very difficult to define new generic functions.
  * Somehow fix the type system so that we know the specific types of arguments in all situations. I don't know if this is feasible.
  * Store an 8- or 16-bit unique number in pg_range, and store that number in the representation. That would be pretty ugly, and would limit the total number of range types defined at once, but it saves a couple of bytes per value.
  * Try to somehow mimic what records do. Records use a global array and use the typmod as an index into that array. It looks like a hack to me, but might be worth borrowing anyway.

Also related to representation:

  * Right now I always align the subtypes within the range according to typalign. I could avoid that by packing the bytes tightly, and then copying them around later. Suggestions? And what should the overall alignment of the range type be?
  * If it's a fixed-length type, we can save the varlena header byte on the overall range; but we lose the ability to save space when one of the boundaries of the range is missing (NULL or INF), and it would complicate the code a little. Thoughts?

Regards,
Jeff Davis

rangetypes-20110110.patch.gz
Description: GNU Zip compressed data
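As a concrete illustration of the domain workaround in item 8, something like the following should let two range types share numeric as the effective subtype (a hedged sketch based on the proposed CREATE TYPE ... AS RANGE syntax above; the pricerange, qtyrange, and discrete_qty names are invented for illustration):

```sql
-- First range type directly over numeric:
CREATE TYPE pricerange AS RANGE (SUBTYPE = numeric, SUBTYPE_CMP = numeric_cmp);

-- A domain over numeric is a distinct subtype, so it can carry its own
-- associated range type without conflicting with pricerange:
CREATE DOMAIN discrete_qty AS numeric;
CREATE TYPE qtyrange AS RANGE (SUBTYPE = discrete_qty, SUBTYPE_CMP = numeric_cmp);
```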
Re: [HACKERS] SQL/MED - file_fdw
On Mon, 10 Jan 2011 19:26:11 -0500, Tom Lane t...@sss.pgh.pa.us wrote:
> Shigeru HANADA han...@metrosystems.co.jp writes:
>> For the purpose of file_fdw, an additional ResetCopyFrom() would be necessary. I'm planning to include such changes in the file_fdw patch. Please find attached a partial patch for ResetCopyFrom(). Is there anything else which should be done at reset?
>
> Seems like it would be smarter to close and re-open the copy operation. Adding a reset function is just creating an additional maintenance burden and point of failure, for what seems likely to be a negligible performance benefit.

Agreed. fileReScan can be implemented with close/re-open, storing some additional information in the FDW private area. I withdraw the proposal.

> If you think it's not negligible, please show some proof of that before asking us to support such code.

Anyway, I've measured the overhead of re-open by executing a query with an inner join between foreign tables copied from the pgbench schema. I used the SELECT statement below:

EXPLAIN (ANALYZE) SELECT count(*) FROM csv_accounts a JOIN csv_branches b ON (b.bid = a.bid);

On average, taking (Nested Loop - (Foreign Scan * 2)), the overhead of re-open is around 0.048 ms per tuple (average of 3 measurements). After the implementation of file_fdw, I'm going to measure again. If ResetCopyFrom significantly improves the performance of ReScan, I'll propose it as a separate patch.

=====
The results of EXPLAIN ANALYZE are:

[using ResetCopyFrom]
                                  QUERY PLAN
 Aggregate  (cost=11717.02..11717.03 rows=1 width=0) (actual time=73357.655..73357.657 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..11717.01 rows=1 width=0) (actual time=0.209..71424.059 rows=100 loops=1)
         ->  Foreign Scan on public.csv_accounts a  (cost=0.00..11717.00 rows=1 width=4) (actual time=0.144..6998.497 rows=100 loops=1)
         ->  Foreign Scan on public.csv_branches b  (cost=0.00..0.00 rows=1 width=4) (actual time=0.008..0.037 rows=10 loops=100)
 Total runtime: 73358.135 ms
(11 rows)

[using EndCopyFrom + BeginCopyFrom]
                                  QUERY PLAN
 Aggregate  (cost=11717.02..11717.03 rows=1 width=0) (actual time=120724.138..120724.140 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..11717.01 rows=1 width=0) (actual time=0.321..118583.681 rows=100 loops=1)
         ->  Foreign Scan on public.csv_accounts a  (cost=0.00..11717.00 rows=1 width=4) (actual time=0.156..7208.968 rows=100 loops=1)
         ->  Foreign Scan on public.csv_branches b  (cost=0.00..0.00 rows=1 width=4) (actual time=0.016..0.046 rows=10 loops=100)
 Total runtime: 121118.792 ms
(11 rows)

Time: 121122.205 ms
=====

Regards,
--
Shigeru Hanada
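As a rough cross-check of the quoted per-tuple figure, the difference between the two total runtimes works out to about the same number, assuming the outer Foreign Scan returned roughly one million tuples (the rows= figures above look truncated by the list archive, so the tuple count is an assumption here, not something the plans confirm):

```sql
-- 121118.792 ms vs. 73358.135 ms, spread over an assumed ~1,000,000
-- rescans of the inner table:
SELECT (121118.792 - 73358.135) / 1000000.0 AS ms_per_tuple;  -- roughly 0.048
```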
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Sun, 2011-01-09 at 16:59 -0500, Noah Misch wrote:
> This begins the patch series for the design I recently proposed[1] for avoiding some table rewrites in ALTER TABLE ... ALTER COLUMN ... TYPE. I'm posting these patches today:

These sound very good.

I have a concern that, by making ALTER TABLE more complex, we might not be able to easily tell if a rewrite happens or not. Perhaps we should add a WITHOUT REWRITE clause? That would allow a user to specify that they do not wish a rewrite, so if the ALTER TABLE requires one it would then fail.

You might point out that I didn't do anything like that for my ALTER TABLE patch, and I would agree with you. WITHOUT ACCESS EXCLUSIVE LOCK might be an option here also.

--
Simon Riggs           http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services
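A minimal sketch of how the proposed clause might read (hypothetical syntax; no such clause exists in the patch under discussion, and the table/column names are invented):

```sql
-- Expected to succeed: varchar -> text needs no rewrite.
ALTER TABLE t ALTER COLUMN c TYPE text WITHOUT REWRITE;

-- Expected to fail with an error instead of silently rewriting the table,
-- since integer -> bigint changes the on-disk representation.
ALTER TABLE t ALTER COLUMN c2 TYPE bigint WITHOUT REWRITE;
```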
Re: [HACKERS] LOCK for non-tables
> On Fri, Jan 7, 2011 at 6:28 PM, Florian Pflug f...@phlo.org wrote:
> I forgot about sequences earlier. If we dump while someone deletes all rows and resets the sequence, the dump might contain rows and still reset the sequence. This could cause duplicate key errors on restore. I haven't checked if this is really possible though - I guess it would depend on the exact order of these events...
>
> To fix this, you'd need a lock that allowed getting values from the sequence but prevented resetting it, and...
>
> I wonder how such locks would work. Would such locks prevent accessing these objects? Or just modifications? For example, if I locked a function, could someone else execute it while I held the lock?
>
> ...in fact we do very little locking of objects other than tables. DROP takes an AccessExclusiveLock on whatever it's dropping, and COMMENT and SECURITY LABEL take ShareUpdateExclusiveLocks to avoid orphaning pg_{sh,}description or pg_seclabel entries in the face of a concurrent drop, but most operations on non-table objects don't AFAIK take any lock at all.
>
> We probably don't want to make too many changes in this area, because there are already workloads where the heavyweight lock manager can become a bottleneck, and one can easily imagine that locking operators or namespaces could make that problem much worse.

For query-based replication tools like pgpool-II (I don't know whether other tools, for example Postgres XC, fall in this category or not...), we need to be able to lock sequences. Fortunately it is allowed to:

SELECT 1 FROM foo_sequence FOR UPDATE;

but LOCK foo_sequence looks like more appropriate syntax to me.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
Re: [HACKERS] LOCK for non-tables
At present, XC does not seem to need locks to maintain cluster-wide integrity, because that is maintained centrally in GTM. If an application needs to do this, for example to synchronize between different clusters, XC obviously needs this capability.

--
Koichi Suzuki

2011/1/11 Tatsuo Ishii is...@postgresql.org:
> [...]
>
> For query-based replication tools like pgpool-II (I don't know whether other tools, for example Postgres XC, fall in this category or not...), we need to be able to lock sequences. Fortunately it is allowed to:
>
> SELECT 1 FROM foo_sequence FOR UPDATE;
>
> but LOCK foo_sequence looks like more appropriate syntax to me.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
Re: [HACKERS] system views for walsender activity
On Tue, Jan 11, 2011 at 02:24, Robert Haas robertmh...@gmail.com wrote:
> On Mon, Jan 10, 2011 at 10:41 AM, Simon Riggs si...@2ndquadrant.com wrote:
> I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3 phases of replication.
>
> That seems reasonable. But if we keep BACKUP in there, should we really have it called pg_stat_replication? (yeah, I know, I'm not giving up :P) (You'd need a 4th mode for WAITING or so, to indicate it's waiting for a command)
>
> That's something different. The 3 phases are more concrete. BACKUP -- CATCHUP --- STREAM. When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode you never issue a BACKUP. Once we have caught up we move to STREAM. That has nothing to do with idle/active.
>
> So how does a walsender that's waiting for a command from the client show up? Surely it's not in catchup mode yet?
>
> There is a trivial state between connect and first command. If you think that is worth publishing, feel free. STARTING?
>
> I think it's worth publishing. STARTING would be OK, or maybe STARTUP to parallel the other two -UP states.

Here's a patch for this. I chose IDLE, because that's what we call other backends that are waiting for commands... Does this seem correct?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

*** a/doc/src/sgml/monitoring.sgml
--- b/doc/src/sgml/monitoring.sgml
***************
*** 298,305 **** postgres: replaceableuser/ replaceabledatabase/ replaceablehost/ re
  entrystructnamepg_stat_replication/indextermprimarypg_stat_replication/primary/indexterm/entry
  entryOne row per WAL sender process, showing process acronymID/,
  user OID, user name, application name, client's address and port number,
! time at which the server process began execution, and transaction log
! location.
  /entry
  /row
--- 298,305 ----
  entrystructnamepg_stat_replication/indextermprimarypg_stat_replication/primary/indexterm/entry
  entryOne row per WAL sender process, showing process acronymID/,
  user OID, user name, application name, client's address and port number,
! time at which the server process began execution, current WAL sender
! state and transaction log location.
  /entry
  /row
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
***************
*** 501,506 **** CREATE VIEW pg_stat_replication AS
--- 501,507 ----
          S.client_addr,
          S.client_port,
          S.backend_start,
+         W.state,
          W.sent_location
      FROM pg_stat_get_activity(NULL) AS S, pg_authid U,
          pg_stat_get_wal_senders() AS W
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 24,29 ****
--- 24,30 ----
  #include "libpq/pqformat.h"
  #include "nodes/pg_list.h"
  #include "replication/basebackup.h"
+ #include "replication/walsender.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
***************
*** 83,88 **** SendBaseBackup(const char *options)
--- 84,91 ----
                            ALLOCSET_DEFAULT_MAXSIZE);
      old_context = MemoryContextSwitchTo(backup_context);

+     WalSndSetState(WalSndState_BACKUP);
+
      if (backup_label == NULL)
          ereport(FATAL,
                  (errcode(ERRCODE_PROTOCOL_VIOLATION),
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
***************
*** 179,184 **** WalSndHandshake(void)
--- 179,185 ----
  {
      int     firstchar;

+     WalSndSetState(WalSndState_IDLE);
      set_ps_display("idle", false);

      /* Wait for a command to arrive */
***************
*** 482,487 **** WalSndLoop(void)
--- 483,491 ----
          if (!XLogSend(output_message, &caughtup))
              break;
      }
+
+     /* Update our state to indicate if we're behind or not */
+     WalSndSetState(caughtup ? WalSndState_STREAMING : WalSndState_CATCHUP);
  }
***************
*** 533,538 **** InitWalSnd(void)
--- 537,543 ----
       */
      walsnd->pid = MyProcPid;
      MemSet(&walsnd->sentPtr, 0, sizeof(XLogRecPtr));
+     walsnd->state = WalSndState_IDLE;
      SpinLockRelease(&walsnd->mutex);
      /* don't need the lock anymore */
      OwnLatch((Latch *) &walsnd->latch);
***************
*** 960,965 **** WalSndWakeup(void)
--- 965,999 ----
          SetLatch(&WalSndCtl->walsnds[i].latch);
  }

+ /* Set state for current walsender (only called in walsender) */
+ void
+ WalSndSetState(WalSndState state)
+ {
+     /* use volatile pointer to prevent code rearrangement */
+     volatile WalSnd *walsnd = MyWalSnd;
+
+     SpinLockAcquire(&walsnd->mutex);
+     walsnd->state = state;
+     SpinLockRelease(&walsnd->mutex);
+ }
+
+ /*
+  * Return a string constant representing the state. This is used
+  * in system views, and should *not* be translated.
+  */
+ static const char *
+ WalSndGetStateString(WalSndState state)
+ {
+     switch (state)
+     {
+         case WalSndState_IDLE:
+             return "IDLE";
+
Re: [HACKERS] LOCK for non-tables
On Tue, Jan 11, 2011 at 4:46 AM, Tatsuo Ishii is...@postgresql.org wrote:
> [...]
>
> For query-based replication tools like pgpool-II (I don't know whether other tools, for example Postgres XC, fall in this category or not...), we need to be able to lock sequences. Fortunately it is allowed to:
>
> SELECT 1 FROM foo_sequence FOR UPDATE;
>
> but LOCK foo_sequence looks like more appropriate syntax to me.

Those aren't doing the same thing. The first is locking the one and only tuple that is contained within the sequence, while the second is locking the sequence object itself.

At this point, I'm inclined to think that the pg_dump comment is just wrong, and we ought to fix it to say that we don't really want to be able to lock other relations after all, and call it good.

As a side note, locking a sequence for replication seems like it could have pretty dire effects on performance in certain workloads. Why do you need to do that, anyway?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
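To make the distinction concrete, the two statements under discussion would look like this (the second form on a sequence is what the thread is proposing, not something the server currently accepts; foo_sequence is an invented name):

```sql
SELECT 1 FROM foo_sequence FOR UPDATE;  -- row-level lock on the sequence's single tuple
LOCK TABLE foo_sequence;                -- proposed: heavyweight lock on the sequence relation itself
```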
Re: [HACKERS] system views for walsender activity
On Tue, Jan 11, 2011 at 5:28 AM, Magnus Hagander mag...@hagander.net wrote:
> Does this seem correct?

It looks reasonable, except that the way you've chosen to capitalize the WAL sender states makes me want to shoot myself.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Compatibility GUC for serializable
On Jan 10, 2011, at 23:56, Kevin Grittner wrote:
> The proposed GUC would suppress the monitoring in SERIALIZABLE mode and avoid the new serialization failures, thereby providing legacy behavior -- anomalies and all.
>
> After posting that I realized that there's no technical reason that such a GUC couldn't be set within each session as desired, as long as we disallowed changes after the first snapshot of a transaction was acquired. The IsolationIsSerializable() macro could be modified to use that along with XactIsoLevel.

From a security point of view, it seems dangerous to allow such a GUC to be set by non-superusers. It might allow users to e.g. circumvent some access control scheme by exploiting a race condition that only exists without true serializability. The risk of confusion is also much higher if such a thing can be set per-session.

So, if we need such a GUC at all, which I'm not sure we do, I believe it should be settable only from postgresql.conf and the command line.

best regards,
Florian Pflug
Re: [HACKERS] pl/python custom exceptions for SPI
On 11/01/11 01:27, Tom Lane wrote:
> Hannu Krosing ha...@2ndquadrant.com writes:
>> On 10.1.2011 17:20, Jan Urbański wrote:
>>> I changed that patch to use Perl instead of sed to generate the exceptions, which should be more portable.
>>
>> Why not python ?
>
> Because we're not adding even more different tool requirements to the build process. Perl is what we've chosen to depend on, and there is no reason to use a different tool here.

Yep, exactly. While looking at it again I fixed a problem with spiexceptions.h not being distributed (and IIRC we don't require Perl for tarball builds) and added generating it to the MSVC build system. Updated patch attached.

Cheers,
Jan

diff --git a/src/pl/plpython/Makefile b/src/pl/plpython/Makefile
index 33dddc6..1fe386b 100644
*** a/src/pl/plpython/Makefile
--- b/src/pl/plpython/Makefile
*************** rpathdir = $(python_libdir)
*** 38,44 ****
  NAME = plpython$(python_majorversion)
  OBJS = plpython.o
!
  # Python on win32 ships with import libraries only for Microsoft Visual C++,
  # which are not compatible with mingw gcc. Therefore we need to build a
--- 38,44 ----
  NAME = plpython$(python_majorversion)
  OBJS = plpython.o
! SPIEXCEPTIONS = spiexceptions.h
  # Python on win32 ships with import libraries only for Microsoft Visual C++,
  # which are not compatible with mingw gcc. Therefore we need to build a
*************** PSQLDIR = $(bindir)
*** 86,93 ****
  include $(top_srcdir)/src/Makefile.shlib

! all: all-lib

  install: all installdirs install-lib
  ifeq ($(python_majorversion),2)
--- 86,102 ----
  include $(top_srcdir)/src/Makefile.shlib

+ .PHONY: gen-spiexceptions
! # Generate spiexceptions.h from utils/errcodes.h
! spiexceptions.h: $(top_srcdir)/src/include/utils/errcodes.h
! 	$(PERL) $(srcdir)/generate-spiexceptions.pl $^ $(SPIEXCEPTIONS)
!
! gen-spiexceptions: $(SPIEXCEPTIONS)
!
! all: gen-spiexceptions all-lib
!
! distprep: gen-spiexceptions

  install: all installdirs install-lib
  ifeq ($(python_majorversion),2)
*************** endif
*** 134,143 ****
  submake:
  	$(MAKE) -C $(top_builddir)/src/test/regress pg_regress$(X)

! clean distclean maintainer-clean: clean-lib
  	rm -f $(OBJS)
  	rm -rf results
  	rm -f regression.diffs regression.out
  ifeq ($(PORTNAME), win32)
  	rm -f python${pytverstr}.def
  endif
--- 143,157 ----
  submake:
  	$(MAKE) -C $(top_builddir)/src/test/regress pg_regress$(X)

! clean distclean: clean-lib
  	rm -f $(OBJS)
  	rm -rf results
  	rm -f regression.diffs regression.out
+
+ # since we distribute spiexceptions.h, only remove it in maintainer-clean
+ maintainer-clean: clean distclean
+ 	rm -f $(SPIEXCEPTIONS)
+
  ifeq ($(PORTNAME), win32)
  	rm -f python${pytverstr}.def
  endif
diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out
index 7fc8337..718ebce 100644
*** a/src/pl/plpython/expected/plpython_error.out
--- b/src/pl/plpython/expected/plpython_error.out
*************** CREATE FUNCTION sql_syntax_error() RETUR
*** 8,14 ****
  'plpy.execute("syntax error")'
  LANGUAGE plpythonu;
  SELECT sql_syntax_error();
! ERROR:  plpy.SPIError: syntax error at or near "syntax"
  CONTEXT:  PL/Python function "sql_syntax_error"
  /* check the handling of uncaught python exceptions
--- 8,14 ----
  'plpy.execute("syntax error")'
  LANGUAGE plpythonu;
  SELECT sql_syntax_error();
! ERROR:  spiexceptions.SyntaxError: syntax error at or near "syntax"
  CONTEXT:  PL/Python function "sql_syntax_error"
  /* check the handling of uncaught python exceptions
*************** CREATE FUNCTION exception_index_invalid_
*** 27,33 ****
  return rv[0]'
  LANGUAGE plpythonu;
  SELECT exception_index_invalid_nested();
! ERROR:  plpy.SPIError: function test5(unknown) does not exist
  CONTEXT:  PL/Python function "exception_index_invalid_nested"
  /* a typo */
--- 27,33 ----
  return rv[0]'
  LANGUAGE plpythonu;
  SELECT exception_index_invalid_nested();
! ERROR:  spiexceptions.UndefinedFunction: function test5(unknown) does not exist
  CONTEXT:  PL/Python function "exception_index_invalid_nested"
  /* a typo */
*************** return None
*** 43,49 ****
  '
  LANGUAGE plpythonu;
  SELECT invalid_type_uncaught('rick');
! ERROR:  plpy.SPIError: type "test" does not exist
  CONTEXT:  PL/Python function "invalid_type_uncaught"
  /* for what it's worth catch the exception generated by
   * the typo, and return None
--- 43,49 ----
  '
  LANGUAGE plpythonu;
  SELECT invalid_type_uncaught('rick');
! ERROR:  spiexceptions.UndefinedObject: type "test" does not exist
  CONTEXT:  PL/Python function "invalid_type_uncaught"
  /* for what it's worth catch the exception generated by
   * the typo, and return None
*************** SELECT valid_type('rick');
*** 109,111 ****
--- 109,149 ----
  (1 row)

+ /* Check catching specific types of exceptions
+  */
+ CREATE TABLE specific (
+     i integer PRIMARY KEY
+ );
+ NOTICE:  CREATE TABLE / PRIMARY KEY will
Re: [HACKERS] Bug in pg_describe_object
On Mon, Jan 10, 2011 at 8:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Mon, Jan 10, 2011 at 7:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> My point is that this isn't a bug fix, it's more like moving the goalposts on what getObjectDescription is supposed to do.
>>
>> I think that adding the types to the description string is a pretty sensible thing to do.
>
> Not really. AFAIR, there are two cases that exist in practice, depending on which AM you're talking about:
>
> 1. The recorded types match the input types of the operator/function (btree & hash).
>
> 2. The recorded types are always the same as the opclass's input type (gist & gin).
>
> In neither case does printing those types really add much information. That's why it's not there now.

I don't get it. If two different items that exist in the system out of the box have the same description, it seems clear that a relevant piece of disambiguating information exists nowhere in the description string.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Error code for terminating connection due to conflict with recovery
On Tue, Jan 11, 2011 at 1:30 AM, Tatsuo Ishii is...@postgresql.org wrote:
> On Sat, Jan 8, 2011 at 9:52 AM, Tatsuo Ishii is...@postgresql.org wrote:
>> While looking at the backend code, I realized that the error code for "terminating connection due to conflict with recovery" is ERRCODE_ADMIN_SHUTDOWN. I thought that error code was for a human interruption, such as a shutdown command issued by pg_ctl. Is the usage of the error code appropriate?
>
> That doesn't sound right to me. I'd have thought something in class 40.
>
> What about:
>
> 40004    CONFLICT WITH RECOVERY    conflict_with_recovery

We should respect the following convention, from errcodes.h:

 * The convention is that new error codes defined by PostgreSQL in a
 * class defined by the standard have a subclass value that begins
 * with 'P'. In addition, error codes defined by PostgreSQL clients
 * (such as ecpg) have a class value that begins with 'Y'.

And don't forget there are three places where the new error code would need to be added. Otherwise, +1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] system views for walsender activity
On Tue, 2011-01-11 at 11:28 +0100, Magnus Hagander wrote:
> [...]
>
> Here's a patch for this. I chose IDLE, because that's what we call other backends that are waiting for commands... Does this seem correct?

No. You can be idle yet in STREAMING mode. What mode we are in has nothing to do with idle/active. Either STARTING, STARTUP, or NULL, but not IDLE.

If you want that as well, then we need a second column, but personally it's too much information, and it's too hard to say what it actually means. For example, with sync rep, the WALSender might be idle, yet there might still be backends waiting for a reply.

--
Simon Riggs           http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] LOCK for non-tables
On Tue, Jan 11, 2011 at 4:46 AM, Tatsuo Ishii is...@postgresql.org wrote: On Fri, Jan 7, 2011 at 6:28 PM, Florian Pflug f...@phlo.org wrote: I forgot about sequences earlier. If we dump while someone deletes all rows and resets the sequence the dump might contain rows and still reset the sequence. This could cause duplicate key errors on restore. I haven't checked if this is really possible though - I guess it would depend on the exact order of these events... To fix this, you'd need a lock that allowed getting values from the sequence but prevented resetting it, and... I wonder how such locks would work. Would such locks prevent accessing these objects? Or just modifications? For example, if I locked a function, could someone else execute it while I held the lock? ...in fact we do very little locking of objects other than tables. DROP takes an AccessExclusiveLock on whatever it's dropping, and COMMENT and SECURITY LABEL take ShareUpdateExclusiveLocks to avoid orphaning pg_{sh,}description or pg_seclabel entries in the face of a concurrent drop, but most operations on non-table objects don't AFAIK take any lock at all. We probably don't want to make too many changes in this area, because there are already workloads where the heavyweight lock manager can become a bottleneck, and one can easily imagine that locking operators or namespaces could make that problem much worse. For query based replication tools like pgpool-II (I don't know any other tools, for example Postgres XC falls in this category or not...), we need to be able to lock sequences. Fortunately it is allowed to: SELECT 1 FROM foo_sequece FOR UPDATE; but LOCK foo_sequence looks more appropreate syntax for me. Those aren't doing the same thing. The first is locking the one and only tuple that is contained within the sequence, while the second is locking the sequence object itself. But a sequence relation contains only 1 tuple and there's no difference among them, no? 
At this point, I'm inclined to think that the pg_dump comment is just wrong, and we ought to fix it to say that we don't really want to be able to lock other relations after all, and call it good. As a side note, locking a sequence for replication seems like it could have pretty dire effects on performance in certain workloads. Why do you need to do that, anyway? Pgpool not only needs to replicate sequences but also replicates tuples updated by DMLs which use sequence values (I am talking about SERIAL data types). For this purpose, pgpool issues nextval() to the master DB server first, then uses the value for subsequent INSERT/UPDATE. This will guarantee that inserted/updated values using sequences are identical between master and slave DB servers. Problem is, if this process happens in concurrent sessions, inserted/updated tuples might not have identical values among DB servers. So I need a sequence lock here. This is the price statement-based replication tools have to pay for:- -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Sun, Jan 9, 2011 at 4:59 PM, Noah Misch n...@leadboat.com wrote: This begins the patch series for the design I recently proposed[1] for avoiding some table rewrites in ALTER TABLE ... ALTER COLUMN ... TYPE. I'm posting these patches today: 0 - new test cases This doesn't look right. You might be building it, but you sure aren't rebuilding it. +CREATE TABLE parent (keycol numeric PRIMARY KEY); +NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index parent_pkey for table parent +DEBUG: Rebuilding index parent_pkey In general, I think this is six kinds of overkill. I don't think we really need 2000 lines of new regression tests for this feature. I'd like to see that chopped down by at least 10x. I don't like this bit: + ereport(IsToastRelation(indexRelation) ? DEBUG2 : DEBUG1, I see no reason to set the verbosity differently depending on whether or not something's a toast relation; that seems more likely to be confusing than helpful. I guess my vote would be to make all of these messages DEBUG2, period. A quick test suggests that doesn't produce too much noise executing DDL commands. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LOCK for non-tables
On Tue, Jan 11, 2011 at 6:31 AM, Tatsuo Ishii is...@postgresql.org wrote: For query-based replication tools like pgpool-II (I don't know whether other tools, for example Postgres XC, fall in this category...), we need to be able to lock sequences. Fortunately it is allowed to: SELECT 1 FROM foo_sequence FOR UPDATE; but LOCK foo_sequence looks like more appropriate syntax to me. Those aren't doing the same thing. The first is locking the one and only tuple that is contained within the sequence, while the second is locking the sequence object itself. But a sequence relation contains only 1 tuple and there's no difference among them, no? No, not really. It's still a different object. As a side note, locking a sequence for replication seems like it could have pretty dire effects on performance in certain workloads. Why do you need to do that, anyway? Pgpool not only needs to replicate sequences but also replicates tuples updated by DMLs which use sequence values (I am talking about SERIAL data types). For this purpose, pgpool issues nextval() to the master DB server first, then uses the value for subsequent INSERT/UPDATE. This will guarantee that inserted/updated values using sequences are identical between master and slave DB servers. Problem is, if this process happens in concurrent sessions, inserted/updated tuples might not have identical values among DB servers. So I need a sequence lock here. This is the price statement-based replication tools have to pay for:- Ouch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
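The master-first nextval() scheme Tatsuo describes can be sketched as a toy simulation. Everything here (the Server and Pool classes, their methods) is hypothetical illustration, not pgpool-II's actual code: the middleware takes a sequence lock, fetches the next value from the master, and applies that same value on every replica, so SERIAL values stay identical across servers even under concurrency.

```python
import threading

class Server:
    """A stand-in for one PostgreSQL backend holding a sequence."""
    def __init__(self):
        self.seq = 0    # current sequence value
        self.rows = []  # (id, payload) tuples inserted via the pool

    def nextval(self):
        self.seq += 1
        return self.seq

    def setval(self, v):
        self.seq = v

class Pool:
    """Statement-based middleware: the master decides sequence values."""
    def __init__(self, master, replicas):
        self.master = master
        self.replicas = replicas
        self.seq_lock = threading.Lock()  # plays the role of the sequence lock

    def insert(self, payload):
        # Serialize sequence access so concurrent sessions cannot
        # interleave nextval() differently on master and replicas.
        with self.seq_lock:
            v = self.master.nextval()
            for r in self.replicas:
                r.setval(v)
            for s in [self.master] + self.replicas:
                s.rows.append((v, payload))
        return v

master, replica = Server(), Server()
pool = Pool(master, [replica])

threads = [threading.Thread(target=pool.insert, args=("row%d" % i,))
           for i in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Without the lock, two sessions could call nextval() in different orders
# on different servers; with it, every server sees identical rows.
assert sorted(master.rows) == sorted(replica.rows)
print("master and replica agree on all SERIAL values")
```

Dropping the `with self.seq_lock:` line reintroduces exactly the divergence described in the thread, which is why pgpool wants a first-class way to lock a sequence rather than `SELECT 1 FROM seq FOR UPDATE`.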
Re: [HACKERS] system views for walsender activity
On Tue, Jan 11, 2011 at 12:17, Robert Haas robertmh...@gmail.com wrote: On Tue, Jan 11, 2011 at 5:28 AM, Magnus Hagander mag...@hagander.net wrote: Does this seem correct? It looks reasonable, except that the way you've chosen to capitalize the wal sender states makes me want to shoot myself. Hah, I was waiting for that. I can certainly change that :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system views for walsender activity
On Tue, Jan 11, 2011 at 12:23, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2011-01-11 at 11:28 +0100, Magnus Hagander wrote: (You'd need a 4th mode for WAITING or so, to indicate it's waiting for a command) That's something different. The 3 phases are more concrete. BACKUP --> CATCHUP --> STREAM When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode you never issue a BACKUP. Once we have caught up we move to STREAM. That has nothing to do with idle/active. So how does a walsender that's waiting for a command from the client show up? Surely it's not in catchup mode yet? There is a trivial state between connect and first command. If you think that is worth publishing, feel free. STARTING? I think it's worth publishing. STARTING would be OK, or maybe STARTUP to parallel the other two -UP states. Here's a patch for this. I chose IDLE, because that's what we call other backends that are waiting for commands... Does this seem correct? No. You can be idle yet in STREAMING mode. What mode we are in has nothing to do with idle/active. Either STARTING/STARTUP/NULL but not IDLE. If you want that as well, then we need a second column, but personally it's too much information and it's too hard to say what it actually means. For example, with sync rep, the WALSender might be idle, yet there might still be backends waiting for a reply. That's a good point. Just to be clear, you're objecting to the *name* of the state, right, not how/where it's set? In particular, how the catchup/streaming things are set? I agree that using a second column for it is unnecessary. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compatibility GUC for serializable
On Jan 10, 2011, at 20:29, Josh Berkus wrote: The only reason I'm ambivalent about this is I'm unsure that there are more than a handful of people using SERIALIZABLE in production applications, precisely because it's been so unintuitive in the past. I've used it quite extensively in the past. Usually either to run two consecutive queries with the same snapshot, or to run an UPDATE .. FROM (since that can be quite a foot-gun in READ COMMITTED mode). In retrospect, I should have used REPEATABLE READ instead of SERIALIZABLE, of course. But I didn't, and part of the reason for that is our very own documentation. The way it is written gives the impression that SERIALIZABLE is the real name of the isolation level while REPEATABLE READ is a compatibility synonym, and it also leads one to believe that true serializability isn't something that will ever be implemented. The two sections of 13.2 Transaction Isolation dealing with the isolation levels READ COMMITTED and REPEATABLE READ/SERIALIZABLE are, for example, 13.2.1 Read Committed Isolation Level and 13.2.2 Serializable Isolation Level. And, at the end of 13.2, in the section about SERIALIZABLE vs. true serializability, we say "To guarantee true mathematical serializability, it is necessary for a database system to enforce predicate locking", "Such a locking system is complex to implement and extremely expensive in execution", and "And this large expense is mostly wasted, since in practice most applications do not do the sorts of things that could result in problems. ... For these reasons, PostgreSQL does not implement predicate locking." I'd be very surprised if nearly all of our users used only the READ COMMITTED isolation level. And given the wording of our documentation, I'm quite certain I'm not the only one who used to spell that other isolation level SERIALIZABLE, not REPEATABLE READ. 
The question thus (again) comes down to whether we believe that for virtually all of these users, true serializability is either an improvement or at least no regression. I cannot come up with a case where that wouldn't be the case *in theory*. In practice, however, Kevin's patch includes some performance vs. false-positives trade-offs, and these *have* the potential of biting people. So, to summarize, I believe we need to look at the trade-offs - for example the way conflict information is summarized to prevent memory overflow - to judge whether there are any realistic workloads where those might cause problems. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system views for walsender activity
On Tue, 2011-01-11 at 12:41 +0100, Magnus Hagander wrote: Just to be clear, you're objecting to the *name* of the state, right, not how/where it's set? Yes In particular, how the catchup/streaming things are set? You've set it in the right places. I would personally constrain the state transitions, so that we can't ever make illegal changes, such as CATCHUP -> BACKUP. I would also check the state locally before grabbing the spinlock, as is typical in other xlog code. Continually resetting shared state to exactly what it is already seems strange to me. If we make the rule that the state can only be changed by the WALSender itself, it won't need to grab the spinlock. If you don't like that, a local variable works. Also, mixing upper camel case and uppercase for the STATe NamES looKS weIRD. Uppercase makes them look more clearly like enum/states as used elsewhere in similar code. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
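Simon's suggestion above - reject illegal transitions such as CATCHUP going back to BACKUP, the way an Assert documents what is possible by design - can be sketched as follows. This is an illustrative Python simulation with hypothetical names, not the actual walsender.c code; the legal-transition table is my reading of the thread (STARTUP may go to BACKUP or CATCHUP, CATCHUP and STREAMING may alternate, BACKUP ends with disconnect).

```python
# Legal walsender state transitions, as described in the thread.
LEGAL = {
    "STARTUP":   {"BACKUP", "CATCHUP"},
    "BACKUP":    set(),                  # a backup connection just disconnects
    "CATCHUP":   {"STREAMING"},
    "STREAMING": {"CATCHUP"},            # a standby may fall behind again
}

class WalSender:
    def __init__(self):
        self.state = "STARTUP"

    def set_state(self, new):
        if new == self.state:
            return                       # no-op, nothing to publish
        if new not in LEGAL[self.state]:
            raise AssertionError("illegal transition %s -> %s" % (self.state, new))
        self.state = new

w = WalSender()
w.set_state("CATCHUP")
w.set_state("STREAMING")
try:
    w.set_state("BACKUP")    # once in CATCHUP/STREAMING, never BACKUP again
except AssertionError as e:
    print(e)                 # the check fires instead of corrupting the state
```

The point of encoding the table explicitly is exactly what Simon says: an accidental call from the wrong place fails loudly rather than silently publishing a nonsensical state.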
Re: [HACKERS] system views for walsender activity
On Tue, Jan 11, 2011 at 12:58, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2011-01-11 at 12:41 +0100, Magnus Hagander wrote: Just to be clear, you're objecting to the *name* of the state, right, not how/where it's set? Yes In particular, how the catchup/streaming things are set? You've set it in the right places. I would personally constrain the state transitions, so that we can't ever make illegal changes, such as CATCHUP -> BACKUP. Well, once we enter the walsender loop we can never get out of it, so it simply cannot happen... I would also check the state locally before grabbing the spinlock, as is typical in other xlog code. Continually resetting shared state to exactly what it is already seems strange to me. If we make the rule that the state can only be changed by the WALSender itself, it won't need to grab the spinlock. If you don't like that, a local variable works. Hmm. I don't see why anybody other than the walsender should change it, so yeah. You mean I can just drop the spinlock calls completely, and then do an if (walsnd->state != state) walsnd->state = state; ? Do I need to keep the volatile pointer, or should I drop that as well? Also, mixing upper camel case and uppercase for the STATe NamES looKS weIRD. Uppercase makes them look more clearly like enum/states as used elsewhere in similar code. Yeah, I'll change that. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Tue, Jan 11, 2011 at 09:24:46AM +0000, Simon Riggs wrote: On Sun, 2011-01-09 at 16:59 -0500, Noah Misch wrote: This begins the patch series for the design I recently proposed[1] for avoiding some table rewrites in ALTER TABLE ... ALTER COLUMN ... TYPE. I'm posting these patches today: These sound very good. Thanks. I have a concern that by making the ALTER TABLE more complex that we might not be able to easily tell if a rewrite happens, or not. Perhaps we should add a WITHOUT REWRITE clause? That would allow a user to specify that they do not wish a rewrite, so if the AT requires them to have one it would then fail. These changes do make it harder to guess how much work the ALTER TABLE will do. Indeed, about 1/4 of my own guesses prior to writing were wrong. Something like WITHOUT REWRITE might be the way to go, though there are more questions: if it does not rewrite, does it scan the table? Which indexes, if any, does it rebuild? Which foreign key constraints, if any, does it recheck? With patch 0, you can answer all these questions by enabling DEBUG1 messages and trying the command on your test system. For this reason, I did consider adding a VERBOSE clause to show those messages at DETAIL, rather than unconditionally showing them at DEBUG1. In any case, if a WITHOUT REWRITE like you describe covers the important question, it's certainly easy enough to implement. You might point out I didn't do anything like that for my ALTER TABLE patch, and I would agree with you. WITHOUT ACCESS EXCLUSIVE LOCK might be an option here also. True. At least we could completely document the lock choices on the ALTER TABLE reference page. The no-rewrite cases are defined at arm's length from ALTER TABLE, and users can define their own, so it's a tougher fit. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system views for walsender activity
On Tue, 2011-01-11 at 13:04 +0100, Magnus Hagander wrote: On Tue, Jan 11, 2011 at 12:58, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2011-01-11 at 12:41 +0100, Magnus Hagander wrote: Just to be clear, you're objecting to the *name* of the state, right, not how/where it's set? Yes In particular, how the catchup/streaming things are set? You've set it in the right places. I would personally constrain the state transitions, so that we can't ever make illegal changes, such as CATCHUP -> BACKUP. Well, once we enter the walsender loop we can never get out of it, so it simply cannot happen... Accidentally it can, so I would like to protect against that. Putting those checks in is like Asserts(); they help document what is and is not possible by design. I would also check the state locally before grabbing the spinlock, as is typical in other xlog code. Continually resetting shared state to exactly what it is already seems strange to me. If we make the rule that the state can only be changed by the WALSender itself, it won't need to grab the spinlock. If you don't like that, a local variable works. Hmm. I don't see why anybody other than the walsender should change it, so yeah. You mean I can just drop the spinlock calls completely, and then do an if (walsnd->state != state) walsnd->state = state; ? Do I need to keep the volatile pointer, or should I drop that as well? No, do this at top: if (walsnd->state == state) return; Keep spinlocks when actually setting it. Also, mixing upper camel case and uppercase for the STATe NamES looKS weIRD. Uppercase makes them look more clearly like enum/states as used elsewhere in similar code. Yeah, I'll change that. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
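The check-before-locking pattern Simon recommends - read the state unlocked (safe, since only the owning walsender writes it) and take the spinlock only when actually publishing a change - can be illustrated with a counting lock. This is a Python sketch with made-up names, standing in for the C spinlock code under discussion:

```python
import threading

class CountingLock:
    """A lock that records how often it was taken (stand-in for a spinlock)."""
    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1

    def __exit__(self, *exc):
        self._lock.release()

class WalSndSlot:
    """One shared-memory walsender slot."""
    def __init__(self):
        self.mutex = CountingLock()
        self.state = "STARTUP"

def wal_snd_set_state(slot, state):
    # Only the owning walsender ever writes this field, so an unlocked
    # read is safe: bail out early when the state is unchanged.
    if slot.state == state:
        return
    with slot.mutex:        # take the "spinlock" only to publish a change
        slot.state = state

slot = WalSndSlot()
for _ in range(1000):       # e.g. once per iteration of the send loop
    wal_snd_set_state(slot, "STREAMING")

print(slot.state, slot.mutex.acquisitions)  # lock taken once, not 1000 times
```

Without the early return, a walsender that stays caught up would hammer the spinlock on every loop iteration just to rewrite the same value, which is exactly the "continually resetting shared state" Simon objects to.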
Re: [HACKERS] system views for walsender activity
On Tue, Jan 11, 2011 at 13:18, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2011-01-11 at 13:04 +0100, Magnus Hagander wrote: On Tue, Jan 11, 2011 at 12:58, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2011-01-11 at 12:41 +0100, Magnus Hagander wrote: Just to be clear, you're objecting to the *name* of the state, right, not how/where it's set? Yes In particular, how the catchup/streaming things are set? You've set it in the right places. I would personally constrain the state transitions, so that we can't ever make illegal changes, such as CATCHUP -> BACKUP. Well, once we enter the walsender loop we can never get out of it, so it simply cannot happen... Accidentally it can, so I would like to protect against that. Putting those checks in is like Asserts(); they help document what is and is not possible by design. I would also check the state locally before grabbing the spinlock, as is typical in other xlog code. Continually resetting shared state to exactly what it is already seems strange to me. If we make the rule that the state can only be changed by the WALSender itself, it won't need to grab the spinlock. If you don't like that, a local variable works. Hmm. I don't see why anybody other than the walsender should change it, so yeah. You mean I can just drop the spinlock calls completely, and then do an if (walsnd->state != state) walsnd->state = state; ? Do I need to keep the volatile pointer, or should I drop that as well? No, do this at top: if (walsnd->state == state) return; Keep spinlocks when actually setting it. Aha. Thanks for the pointers, pfa a new version. 
-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/

*** a/doc/src/sgml/monitoring.sgml
--- b/doc/src/sgml/monitoring.sgml
***************
*** 298,305 **** postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
       <entry><structname>pg_stat_replication</><indexterm><primary>pg_stat_replication</primary></indexterm></entry>
       <entry>One row per WAL sender process, showing process <acronym>ID</>,
        user OID, user name, application name, client's address and port number,
!       time at which the server process began execution, and transaction log
!       location.
       </entry>
      </row>
--- 298,305 ----
       <entry><structname>pg_stat_replication</><indexterm><primary>pg_stat_replication</primary></indexterm></entry>
       <entry>One row per WAL sender process, showing process <acronym>ID</>,
        user OID, user name, application name, client's address and port number,
!       time at which the server process began execution, current WAL sender
!       state and transaction log location.
       </entry>
      </row>
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
***************
*** 501,506 **** CREATE VIEW pg_stat_replication AS
--- 501,507 ----
              S.client_addr,
              S.client_port,
              S.backend_start,
+             W.state,
              W.sent_location
      FROM pg_stat_get_activity(NULL) AS S, pg_authid U,
          pg_stat_get_wal_senders() AS W
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 24,29 ****
--- 24,30 ----
  #include "libpq/pqformat.h"
  #include "nodes/pg_list.h"
  #include "replication/basebackup.h"
+ #include "replication/walsender.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
***************
*** 83,88 **** SendBaseBackup(const char *options)
--- 84,91 ----
                                          ALLOCSET_DEFAULT_MAXSIZE);
      old_context = MemoryContextSwitchTo(backup_context);
+     WalSndSetState(WALSNDSTATE_BACKUP);
+ 
      if (backup_label == NULL)
          ereport(FATAL,
                  (errcode(ERRCODE_PROTOCOL_VIOLATION),
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
***************
*** 179,184 **** WalSndHandshake(void)
--- 179,185 ----
      {
          int         firstchar;
+         WalSndSetState(WALSNDSTATE_IDLE);
          set_ps_display("idle", false);

          /* Wait for a command to arrive */
***************
*** 482,487 **** WalSndLoop(void)
--- 483,491 ----
              if (!XLogSend(output_message, &caughtup))
                  break;
          }
+ 
+         /* Update our state to indicate if we're behind or not */
+         WalSndSetState(caughtup ? WALSNDSTATE_STREAMING : WALSNDSTATE_CATCHUP);
      }

  /*
***************
*** 533,538 **** InitWalSnd(void)
--- 537,543 ----
       */
      walsnd->pid = MyProcPid;
      MemSet(&walsnd->sentPtr, 0, sizeof(XLogRecPtr));
+     walsnd->state = WALSNDSTATE_IDLE;
      SpinLockRelease(&walsnd->mutex);    /* don't need the lock anymore */
      OwnLatch((Latch *) &walsnd->latch);
***************
*** 960,965 **** WalSndWakeup(void)
--- 965,1004 ----
              SetLatch(&WalSndCtl->walsnds[i].latch);
  }
+ 
+ /* Set state for current walsender (only called in walsender) */
+ void
+ WalSndSetState(WalSndState state)
+ {
+     /* use volatile pointer to prevent code rearrangement */
+
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Tue, Jan 11, 2011 at 06:37:33AM -0500, Robert Haas wrote: On Sun, Jan 9, 2011 at 4:59 PM, Noah Misch n...@leadboat.com wrote: This begins the patch series for the design I recently proposed[1] for avoiding some table rewrites in ALTER TABLE ... ALTER COLUMN ... TYPE. I'm posting these patches today: 0 - new test cases This doesn't look right. You might be building it, but you sure aren't rebuilding it. +CREATE TABLE parent (keycol numeric PRIMARY KEY); +NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index parent_pkey for table parent +DEBUG: Rebuilding index parent_pkey Yes. I considered saying "Building" unconditionally. Differentiating the debug message by passing down the fact that the index recently existed seemed like overkill. What do you think? In general, I think this is six kinds of overkill. I don't think we really need 2000 lines of new regression tests for this feature. I'd like to see that chopped down by at least 10x. I just included all the tests I found useful to check my own work. If 200 lines of SQL and expected output is the correct amount, I'll make it so. I don't like this bit: + ereport(IsToastRelation(indexRelation) ? DEBUG2 : DEBUG1, I see no reason to set the verbosity differently depending on whether or not something's a toast relation; that seems more likely to be confusing than helpful. I guess my vote would be to make all of these messages DEBUG2, period. A quick test suggests that doesn't produce too much noise executing DDL commands. The theoretical basis is that users can do little to directly change the need to rebuild a TOAST index. We'll rebuild the TOAST index if and only if we rewrote the table. The practical basis is that the TOAST relation names contain parent relation OIDs, so we don't want them mentioned in regression test output. Thanks for reviewing. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fwd: [TESTERS] [TEST REPORT] 9.1Alpha3 Feature E.1.4.7.2 in release notes.
On tis, 2011-01-11 at 12:30 +0900, Itagaki Takahiro wrote: On Tue, Jan 11, 2011 at 11:10, Tom Lane t...@sss.pgh.pa.us wrote: Itagaki Takahiro itagaki.takah...@gmail.com writes: It was reported from a tester that we don't have casts of money from/to integer types even though we have from/to numeric type. In most locales, the idea isn't sensible. The documentation says: | Input is accepted in a variety of formats, | including integer and floating-point literals If we want to accept integers for money, we should fix the docs. | integer and floating-point string literals |~~~ Will it get better? I think adding a cast from integer to money is probably quite reasonable. The other way around, maybe not, or only an explicit cast. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Tue, Jan 11, 2011 at 7:14 AM, Noah Misch n...@leadboat.com wrote: I have a concern that by making the ALTER TABLE more complex that we might not be able to easily tell if a rewrite happens, or not. Perhaps we should add a WITHOUT REWRITE clause? That would allow a user to specify that they do not wish a rewrite, so if the AT requires them to have one it would then fail. These changes do make it harder to guess how much work the ALTER TABLE will do. Indeed, about 1/4 of my own guesses prior to writing were wrong. Something like WITHOUT REWRITE might be the way to go, though there are more questions: if it does not rewrite, does it scan the table? Which indexes, if any, does it rebuild? Which foreign key constraints, if any, does it recheck? With patch 0, you can answer all these questions by enabling DEBUG1 messages and trying the command on your test system. For this reason, I did consider adding a VERBOSE clause to show those messages at DETAIL, rather than unconditionally showing them at DEBUG1. In any case, if a WITHOUT REWRITE like you describe covers the important question, it's certainly easy enough to implement. I'm doubtful that it's worth complicating the parser logic. Our DDL is transactional, so someone can always begin a transaction, increase client_min_messages, and then look at the output from these added debug messages to see what is happening. It should be clear within a second or two if it's not what is wanted, and they can just hit ^C. True. At least we could completely document the lock choices on the ALTER TABLE reference page. The no-rewrite cases are defined at arm's length from ALTER TABLE, and users can define their own, so it's a tougher fit. I don't think it's prohibitively tough, though, and I think it would be very valuable. We may have to scratch our heads over exactly where to put the information, but we can figure out something that works. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
On 01/10/2011 06:03 PM, Kevin Grittner wrote: Due to popular request (Hey, David's popular, right?), I'm posting a patch for Serializable Snapshot Isolation (SSI), although I don't yet have everything in it that I was planning on submitting before the CF. I will probably be submitting another version before the deadline with the following, but there should be plenty here for people to test and benchmark. We're done with the major refactoring needed to address concerns raised in earlier reviews, and I don't expect the remaining work to destabilize what's there or to have a significant impact on performance.

I think I found a problem. This is using SSI v8.

The table definition:

create table test_t (id integer, val1 text, val2 integer);
create index test_idx on test_t(id) where val2 = 1;

The data:

insert into test_t (select generate_series(0, 1), 'a', 2);
insert into test_t (select generate_series(0, 10), 'a', 1);

The transactions:

T1:
hot2=> begin transaction isolation level serializable;
BEGIN
hot2=> select * from test_t where val2 = 1;
 id | val1 | val2
----+------+------
  0 | a    |    1
  1 | a    |    1
  2 | a    |    1
  3 | a    |    1
  4 | a    |    1
  5 | a    |    1
  6 | a    |    1
  7 | a    |    1
  8 | a    |    1
  9 | a    |    1
 10 | a    |    1
(11 rows)

hot2=> update test_t set val2 = 2 where val2 = 1 and id = 10;
UPDATE 1

-- The concurrent transaction:
T2:
hot2=> begin transaction isolation level serializable;
BEGIN
hot2=> select * from test_t where val2 = 1;
 id | val1 | val2
----+------+------
  0 | a    |    1
  1 | a    |    1
  2 | a    |    1
  3 | a    |    1
  4 | a    |    1
  5 | a    |    1
  6 | a    |    1
  7 | a    |    1
  8 | a    |    1
  9 | a    |    1
 10 | a    |    1
(11 rows)

hot2=> update test_t set val2 = 2 where val2 = 1 and id = 9;
UPDATE 1
hot2=> commit;
COMMIT

-- Now, T1 can commit, too, even though there is a serialization anomaly:
T1:
hot2=> commit;
COMMIT

If the test_idx is changed: (outside any transaction...) 
hot2=> drop index test_idx;
DROP INDEX
hot2=> create index test_idx on test_t(id, val2);
CREATE INDEX

T1:
hot2=> begin transaction isolation level serializable;
BEGIN
hot2=> select * from test_t where val2 = 1;
 id | val1 | val2
----+------+------
  0 | a    |    1
  1 | a    |    1
  2 | a    |    1
  3 | a    |    1
  4 | a    |    1
  5 | a    |    1
  6 | a    |    1
  7 | a    |    1
  8 | a    |    1
(9 rows)

hot2=> update test_t set val2 = 2 where val2 = 1 and id = 8;
UPDATE 1

T2:
hot2=> select * from test_t where val2 = 1;
 id | val1 | val2
----+------+------
  0 | a    |    1
  1 | a    |    1
  2 | a    |    1
  3 | a    |    1
  4 | a    |    1
  5 | a    |    1
  6 | a    |    1
  7 | a    |    1
  8 | a    |    1
(9 rows)

hot2=> update test_t set val2 = 2 where val2 = 1 and id = 7;
UPDATE 1
hot2=> commit;
ERROR:  could not serialize access due to read/write dependencies among transactions
HINT:  The transaction might succeed if retried.

T1:
hot2=> commit;
COMMIT

So, something seems to be broken when using partial indexes.

 - Anssi

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Tue, 2011-01-11 at 07:14 -0500, Noah Misch wrote: These changes do make it harder to guess how much work the ALTER TABLE will do. Indeed, about 1/4 of my own guesses prior to writing were wrong. Something like WITHOUT REWRITE might be the way to go, though there are more questions: if it does not rewrite, does it scan the table? Which indexes, if any, does it rebuild? Which foreign key constraints, if any, does it recheck? With patch 0, you can answer all these questions by enabling DEBUG1 messages and trying the command on your test system. For this reason, I did consider adding a VERBOSE clause to show those messages at DETAIL, rather than unconditionally showing them at DEBUG1. In any case, if a WITHOUT REWRITE like you describe covers the important question, it's certainly easy enough to implement. Trouble is, only superusers can set DEBUG1. You're right, its more complex than I made out, though that strengthens the feeling that we need a way to check what it does before it does it, or a way to limit your expectations. Ideally that would be a programmatic way, so that pgAdmin et al can issue a warning. Given your thoughts above, my preference would be for EXPLAIN ALTER TABLE to describe the actions that will take place. Or other ideas... -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PGCon 2011 Call for Papers - reminder
A reminder about PGCon 2011; the deadline is 19 January 2011. PGCon 2011 will be held 19-20 May 2011, in Ottawa at the University of Ottawa. It will be preceded by two days of tutorials on 17-18 May 2011. We are now accepting proposals for talks. Proposals can be quite simple. We do not require academic-style papers. If you are doing something interesting with PostgreSQL, please submit a proposal. You might be one of the backend hackers or work on a PostgreSQL-related project and want to share your know-how with others. You might be developing an interesting system using PostgreSQL as the foundation. Perhaps you migrated from another database to PostgreSQL and would like to share details. These, and other stories, are welcome. Both users and developers are encouraged to share their experiences. Here are some ideas to jump-start your proposal process:

- novel ways in which PostgreSQL is used
- migration of production systems from another database
- data warehousing
- tuning PostgreSQL for different workloads
- replication and clustering
- hacking the PostgreSQL code
- PostgreSQL derivatives and forks
- applications built around PostgreSQL
- benchmarking and performance engineering
- case studies
- location-aware and mapping software with PostGIS
- PostgreSQL 9.1 and 9.2 features in development
- research and teaching with PostgreSQL
- things the PostgreSQL project could do better
- integrating PostgreSQL with 3rd-party software
The schedule is:
1 Dec 2010   Proposal acceptance begins
19 Jan 2011  Proposal acceptance ends
19 Feb 2011  Confirmation of accepted proposals
See also http://www.pgcon.org/2011/papers.php Instructions for submitting a proposal to PGCon 2011 are available from: http://www.pgcon.org/2011/submissions.php -- Dan Langille - http://langille.org/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Tue, Jan 11, 2011 at 12:37:28PM +, Simon Riggs wrote: On Tue, 2011-01-11 at 07:14 -0500, Noah Misch wrote: These changes do make it harder to guess how much work the ALTER TABLE will do. Indeed, about 1/4 of my own guesses prior to writing were wrong. Something like WITHOUT REWRITE might be the way to go, though there are more questions: if it does not rewrite, does it scan the table? Which indexes, if any, does it rebuild? Which foreign key constraints, if any, does it recheck? With patch 0, you can answer all these questions by enabling DEBUG1 messages and trying the command on your test system. For this reason, I did consider adding a VERBOSE clause to show those messages at DETAIL, rather than unconditionally showing them at DEBUG1. In any case, if a WITHOUT REWRITE like you describe covers the important question, it's certainly easy enough to implement. Trouble is, only superusers can set DEBUG1. Setting client_min_messages in one's session works, as does ALTER ROLE myself SET client_min_messages = debug1. Still, indeed, DEBUG1 is not the usual place to send a user for information. You're right, its more complex than I made out, though that strengthens the feeling that we need a way to check what it does before it does it, or a way to limit your expectations. Ideally that would be a programmatic way, so that pgAdmin et al can issue a warning. Given your thoughts above, my preference would be for EXPLAIN ALTER TABLE to describe the actions that will take place. That does seem like the best UI. Offhand, I would guess that's a project larger than the patch series I have here. We'd need to restructure ALTER TABLE into clear planning and execution stages, if not use the actual planner and executor. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
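The workaround Noah describes can be sketched as follows (the table and column names are hypothetical, and the exact DEBUG1 messages depend on the patch under discussion):

```sql
-- Per-session: client_min_messages does not require superuser rights.
SET client_min_messages = debug1;

-- Or persistently for one role, as suggested above:
ALTER ROLE myself SET client_min_messages = debug1;

-- Then try the command on a test system and watch for rewrite/scan messages:
BEGIN;
ALTER TABLE some_table ALTER COLUMN some_col TYPE bigint;
ROLLBACK;  -- inspect the messages without changing anything
```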
Re: [HACKERS] Add function dependencies
On mån, 2011-01-10 at 23:59 +0100, Joel Jacobson wrote: It would be equally useful if it warned you when trying to drop a function other functions might depend on. This would only work for a small subset of cases, so the argument can be made that it is less surprising to say, we don't track dependencies of the function source code at all. Making it work for language SQL would be nice, though. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Tue, Jan 11, 2011 at 07:27:46AM -0500, Robert Haas wrote: On Tue, Jan 11, 2011 at 7:14 AM, Noah Misch n...@leadboat.com wrote: True. At least we could completely document the lock choices on the ALTER TABLE reference page. The no-rewrite cases are defined at arm's length from ALTER TABLE, and users can define their own, so it's a tougher fit. I don't think it's prohibitively tough, though, and I think it would be very valuable. We may have to scratch our heads over exactly where to put the information, but we can figure out something that works. Agreed. I don't mean to suggest giving up on documenting anything, just that some behaviors are clear enough by documentation alone, while others benefit from additional syntax/messages to constrain/see what actually happens. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Tue, 2011-01-11 at 08:06 -0500, Noah Misch wrote: On Tue, Jan 11, 2011 at 12:37:28PM +, Simon Riggs wrote: On Tue, 2011-01-11 at 07:14 -0500, Noah Misch wrote: These changes do make it harder to guess how much work the ALTER TABLE will do. Indeed, about 1/4 of my own guesses prior to writing were wrong. Something like WITHOUT REWRITE might be the way to go, though there are more questions: if it does not rewrite, does it scan the table? Which indexes, if any, does it rebuild? Which foreign key constraints, if any, does it recheck? With patch 0, you can answer all these questions by enabling DEBUG1 messages and trying the command on your test system. For this reason, I did consider adding a VERBOSE clause to show those messages at DETAIL, rather than unconditionally showing them at DEBUG1. In any case, if a WITHOUT REWRITE like you describe covers the important question, it's certainly easy enough to implement. Trouble is, only superusers can set DEBUG1. Setting client_min_messages in one's session works, as does ALTER ROLE myself SET client_min_messages = debug1. Still, indeed, DEBUG1 is not the usual place to send a user for information. You're right, its more complex than I made out, though that strengthens the feeling that we need a way to check what it does before it does it, or a way to limit your expectations. Ideally that would be a programmatic way, so that pgAdmin et al can issue a warning. Given your thoughts above, my preference would be for EXPLAIN ALTER TABLE to describe the actions that will take place. That does seem like the best UI. Offhand, I would guess that's a project larger than the patch series I have here. We'd need to restructure ALTER TABLE into clear planning and execution stages, if not use the actual planner and executor. Please do something that works in this release, whatever that is. I will follow your lead in putting a similar mechanism in for judging lock levels. 
I don't want to be looking through the docs each time I run this, sweating between it taking 5 secs and 5 hours on a big server. We need to be able to run stuff overnight, with certainty that we know what will happen. And I want a clear answer when the "but how can you be certain?" question gets asked. Thank you for the rest of the patch. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_describe_object
2011/1/11 Robert Haas robertmh...@gmail.com: I don't get it. If two different items that exist in the system out of the box have the same description, it seems clear that relevant piece of disambiguating information exists nowhere in the description string. I guess it is a question of prioritization. If backwards compatibility is to be guaranteed, even for functions returning text intended to be read by humans, then the function cannot be modified without violating that golden rule, if such a rule exists within the PostgreSQL development project. If it's not a golden rule, then it's a totally different story, and there is no excuse why it should return the same descriptions for different objects. Any other reasoning is just silly. -- Best regards, Joel Jacobson Glue Finance -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Tue, Jan 11, 2011 at 01:17:23PM +, Simon Riggs wrote: On Tue, 2011-01-11 at 08:06 -0500, Noah Misch wrote: On Tue, Jan 11, 2011 at 12:37:28PM +, Simon Riggs wrote: Given your thoughts above, my preference would be for EXPLAIN ALTER TABLE to describe the actions that will take place. That does seem like the best UI. Offhand, I would guess that's a project larger than the patch series I have here. We'd need to restructure ALTER TABLE into clear planning and execution stages, if not use the actual planner and executor. Please do something that works in this release, whatever that is. I will follow your lead in putting a similar mechanism in for judging lock levels. Okay; I'll see what I can come up with. The other part I was going to try to finish before the last commitfest begins is avoiding unnecessary rebuilds of indexes involving changed columns. Is that more or less important than having an EXPLAIN ALTER TABLE? I don't want to be looking through the docs each time I run this, sweating between it taking 5 secs and 5 hours on a big server. We need to be able to run stuff overnight, with certainty that we know what will happen. And I want a clear answer when the but how can you be certain? question gets asked. Just to clarify: does Robert's suggestion of starting the command in a transaction block and cancelling it after messages appear (on any other DB with the same schema, if need be) give too little certainty? You could check pg_locks to see what lock was taken, too. It's surely not the ideal user experience, but it seems dependable at least. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
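Robert's transaction-block probe amounts to something like the following sketch (object names are hypothetical); the pg_locks query runs in a second session while the first still holds its locks:

```sql
-- Session 1: start the command, but don't commit.
BEGIN;
ALTER TABLE some_table ALTER COLUMN some_col TYPE bigint;

-- Session 2: inspect which locks session 1 took.
SELECT locktype, relation::regclass, mode, granted
FROM pg_locks
WHERE pid <> pg_backend_pid();

-- Session 1: undo everything once you've seen enough.
ROLLBACK;
```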
Re: [HACKERS] Bug in pg_describe_object
On Jan 11, 2011, at 8:25 AM, Joel Jacobson j...@gluefinance.com wrote: 2011/1/11 Robert Haas robertmh...@gmail.com: I don't get it. If two different items that exist in the system out of the box have the same description, it seems clear that relevant piece of disambiguating information exists nowhere in the description string. I guess it is a question of prioritization. If backwards compatibility is to be guaranteed, even for functions returning text intended to be read by humans, then the function cannot be modified, without violating that golden rule, if such a rule exists within the PostgreSQL development project? If it's not a golden rule, then it's a totally different story and there is no excuse why it should return the same descriptions for the same objects. Any other reasoning is just silly. Well, we shouldn't change them randomly or arbitrarily, but improving them is another thing altogether. I think the contention that any user or application anywhere is depending on the exact textual representation of a pg_amproc entry is exceedingly dubious. And I think the current messages are flat-out confusing. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Jan 11, 2011, at 8:50 AM, Noah Misch n...@leadboat.com wrote: Okay; I'll see what I can come up with. The other part I was going to try to finish before the last commitfest begins is avoiding unnecessary rebuilds of indexes involving changed columns. Is that more or less important than having an EXPLAIN ALTER TABLE? I think something like EXPLAIN ALTER TABLE would be #%^* awesome (and kudos to Simon for thinking of it), but your chances of getting it into 9.1 are not very good. So I'd focus on the index rebuild issue, which is more clear-cut. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
On 01/10/2011 06:03 PM, Kevin Grittner wrote: Due to popular request (Hey, David's popular, right?), I'm posting a patch for Serializable Snapshot Isolation (SSI), although I don't yet have everything in it that I was planning on submitting before the CF. I will probably be submitting another version before the deadline with the following, but there should be plenty here for people to test and benchmark. We're done with the major refactoring needed to address concerns raised in earlier reviews, and I don't expect the remaining work to destabilize what's there or to have a significant impact on performance. A speed test showing a significant drop in performance when using SSI:

hot2=# create table test_t2 as (select generate_series(0, 100));
hot2=# \timing
hot2=# begin transaction isolation level repeatable read;
Time: 0.185 ms
hot2=# select count(*) from test_t2;
Time: 134.521 ms
hot2=# select count(*) from test_t2;
Time: 142.132 ms
hot2=# select count(*) from test_t2;
Time: 147.469 ms
hot2=# commit;
hot2=# begin transaction isolation level serializable;
Time: 0.165 ms
hot2=# select count(*) from test_t2;
Time: 349.219 ms
hot2=# select count(*) from test_t2;
Time: 354.010 ms
hot2=# select count(*) from test_t2;
Time: 355.960 ms
hot2=# commit;

So, count(*) queries are more than twice as slow compared to the old serializable transaction isolation level. The relative speed difference stays the same if testing with more rows. Also, the same speed difference is there if testing with more columns:

create table test_t4 as (select g g1, g g2, g g3, g g4, g g5, g g6 from (select generate_series as g from generate_series(0, 100)) as t1);

repeatable read times: 140.113 ms 134.628 ms 140.113 ms 156.166 ms
serializable: 353.257 ms 366.558 ms 373.914 ms 359.682 ms

- Anssi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Tue, 2011-01-11 at 07:14 -0500, Noah Misch wrote: On Tue, Jan 11, 2011 at 09:24:46AM +, Simon Riggs wrote: I have a concern that by making the ALTER TABLE more complex that we might not be able to easily tell if a rewrite happens, or not. What about adding EXPLAIN support to it? Then whoever wants to know what it does could just run EXPLAIN on it. I have no idea how hard that would be to implement and if it makes sense at all, but I sure would like to see a plan for DDLs too to estimate how long it would take. Cheers, Csaba. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
Anssi Kääriäinen anssi.kaariai...@thl.fi wrote: A speed test showing a significant drop in performance when using SSI: hot2= create table test_t2 as (select generate_series(0, 100)); hot2= \timing begin transaction isolation level repeatable read; Time: 0.185 ms hot2= select count(*) from test_t2; Time: 134.521 ms hot2= select count(*) from test_t2; Time: 142.132 ms hot2= select count(*) from test_t2; Time: 147.469 ms hot2= commit; hot2= begin transaction isolation level serializable; Time: 0.165 ms hot2= select count(*) from test_t2; Time: 349.219 ms hot2= select count(*) from test_t2; Time: 354.010 ms hot2= select count(*) from test_t2; Time: 355.960 ms hot2= commit; So, count(*) queries are more than twice as slow compared to the old serializable transaction isolation level. Thanks for the report. I'll profile that and see what can be done about it. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
Anssi Kääriäinen anssi.kaariai...@thl.fi wrote: I think I found a problem. This is using SSI v8. Thanks much for testing. You're managing to exercise some code paths I didn't think to test, which is great! I guess this is the upside of having posted yesterday. :-) So, something seems to be broken when using partial indexes. Yes there is. The predicate locks against the heap tuples should have prevented that, but obviously we're missing a call to PredicateLockTuple from the code path for the partial index which is present on the code path for complete indexes. I'll go looking for the spot to add that line of code. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autogenerating error code lists (was Re: [COMMITTERS] pgsql: Add foreign data wrapper error code values for SQL/MED.)
On 28/12/10 12:25, Jan Urbański wrote: Here's the basic errcodes.txt file and three scripts to generate errcodes.h, plerrcodes.h and part of errcodes.sgml. I tried wiring it into the build system, but failed, I can't figure out which Makefiles should be updated in order to make errcodes.h and plerrcodes.h generated headers. Could someone help with that? Trying a bit harder to make src/include/utils/errcodes.h a generated file I found that it's included so early it makes the task not trivial at all. postgres.h includes elog.h, which includes errcodes.h (after defining the MAKE_SQLSTATE macro). I'm not sure how to proceed:
1) just give up on it, and keep on maintaining 3 places where new error codes have to go
2) do not include errcodes.h in elog.h, move the MAKE_SQLSTATE definition to the top of errcodes.h, guarded by a #ifndef MAKE_SQLSTATE, fix every place that included elog.h and assumed it has the sqlstate values to explicitly include errcodes.h. This means that you can still define your own MAKE_SQLSTATE macro and then include errcodes.h if you want to redefine what it does.
3) try to wire generating errcodes.h somewhere very early in the makefiles
My preference is 2) followed by 1), because 3) seems too horrible. But I can understand if people will want to keep things as they are and just forget about generating errcodes.h Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_describe_object
Joel Jacobson j...@gluefinance.com writes: I instead propose we introduce a new function named pg_get_object_unique_identifier( classid oid, objid oid, objsubid integer ) returns text. Seems like concatenating the OIDs would accomplish that. (If you think not, well, you still haven't explained what problem you're trying to solve...) The name would make sense since we already have a pg_get_function_identity_arguments( func_oid ), for a similar purpose but solely for functions. No, that does not exist for the purpose you claim it does. And it's far from obvious what the extension of that to all object classes would look like. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_describe_object
Excerpts from Robert Haas's message of mar ene 11 10:52:12 -0300 2011: Well, we shouldn't change them randomly or arbitrarily, but improving them is another thing altogether. I think the contention that any user or application anywhere is depending on the exact textual representation of a pg_amproc entry is exceedingly dubious. And I think the current messages are flat-out confusing. Tom also mentioned that the descriptions are translated, so an application built to parse anything out of the strings is already broken. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_describe_object
2011/1/11 Tom Lane t...@sss.pgh.pa.us: Seems like concatenating the OIDs would accomplish that. (If you think not, well, you still haven't explained what problem you're trying to solve...) The OIDs can be different in two different databases sharing the same original schema, but of two different versions. In this case it is better to compare textual strings describing the objects than to compare based on oids. -- Best regards, Joel Jacobson Glue Finance -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LOCK for non-tables
Robert Haas robertmh...@gmail.com writes: On Tue, Jan 11, 2011 at 4:46 AM, Tatsuo Ishii is...@postgresql.org wrote: For query-based replication tools like pgpool-II (I don't know any other tools, for example Postgres XC falls in this category or not...), we need to be able to lock sequences. Fortunately it is allowed to: SELECT 1 FROM foo_sequence FOR UPDATE; but LOCK foo_sequence looks more appropriate syntax for me. Those aren't doing the same thing. The first is locking the one and only tuple that is contained within the sequence, while the second is locking the sequence object itself. At this point, I'm inclined to think that the pg_dump comment is just wrong, and we ought to fix it to say that we don't really want to be able to lock other relations after all, and call it good. The reason that pg_dump tries to acquire locks at all is to ensure that it dumps a consistent view of the database. The general excuse for not locking non-table objects is that (at least in most cases) they are defined by single catalog entries and so there's no way to see a non-self-consistent view of them. Tables, being defined by a collection of rows in different catalogs, are *very* risky to dump without any lock. This doesn't get noticeably better for non-table relation types. An example of the sort of risk I'm thinking about is dumping a view without any lock while someone else does a CREATE OR REPLACE VIEW on it. You could very easily see a set of attributes (in pg_attribute) that don't agree with the view rules you pulled from pg_rewrite. The risk is minimal now since we don't allow C.O.R.V. to change the column set, but as soon as somebody creates a patch that allows that, pg_dump will have a problem. Note that using a serializable transaction (with or without true serializability) doesn't fix this issue, since pg_dump depends so heavily on backend-side support functions that work in SnapshotNow mode. 
It really needs locks to ensure that the support functions see a view consistent with its own catalog reads. In the SEQUENCE example above, SELECT ... FOR UPDATE is certainly not adequate to protect the sequence against DDL-level changes. Fortunately sequences don't have too many DDL commands, but still an ALTER RENAME might be enough to confuse pg_dump. (By the way, does that SELECT ... FOR UPDATE actually accomplish anything at all? nextval() doesn't go through heap_update, and neither does ALTER SEQUENCE, so I'd be a bit surprised if it really manages to block changes to the sequence.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
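The two forms Tom contrasts look roughly like this (the sequence name is hypothetical); the first locks the single tuple inside the sequence, the second would lock the relation itself:

```sql
CREATE SEQUENCE foo_sequence;

-- Row-level lock on the one tuple contained within the sequence:
SELECT 1 FROM foo_sequence FOR UPDATE;

-- Relation-level lock on the sequence object itself; whether LOCK should
-- accept non-table relations like this is exactly what the thread debates:
LOCK TABLE foo_sequence IN ACCESS EXCLUSIVE MODE;
```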
Re: [HACKERS] SSI patch version 8
On 01/11/2011 04:53 PM, Kevin Grittner wrote: Thanks much for testing. You're managing to exercise some code paths I didn't think to test, which is great! I guess this is the up side of having posted yesterday. :-) Glad that I can help. This feature is something that is very important to our usage of PostgreSQL. Just to add some positive feedback: I tried this patch with our in-house in development EAV database. Everything was working perfectly, and time to import current and history data for 7000 employees (total of 95000 attributes) to the DB using stored procedures went from 4 minutes 20 seconds to 4 minutes 30 seconds (the procedures are doing a lot of validation...). So in this case the real-world performance difference was barely noticeable. SSI was also able to block serialization anomalies and I did not have any false rollbacks in my (very limited) testing. So, thank you for the work on this feature. And, of course, I will try to break it some more.:-) - Anssi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
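The kind of anomaly SSI blocks (and plain snapshot isolation does not) is classically illustrated by write skew. A sketch with a hypothetical table, run from two concurrent sessions:

```sql
-- Setup
CREATE TABLE doctors (name text, on_call boolean);
INSERT INTO doctors VALUES ('alice', true), ('bob', true);

-- Session 1:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT count(*) FROM doctors WHERE on_call;                -- sees 2 on call
UPDATE doctors SET on_call = false WHERE name = 'alice';

-- Session 2, concurrently:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT count(*) FROM doctors WHERE on_call;                -- also sees 2
UPDATE doctors SET on_call = false WHERE name = 'bob';

-- Session 1: COMMIT;  -- succeeds
-- Session 2: COMMIT;  -- under SSI one transaction is rolled back with a
--                        serialization failure; under 9.0-style REPEATABLE
--                        READ both commit, leaving no one on call
```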
Re: [HACKERS] Bug in pg_describe_object
On Jan 11, 2011, at 16:12, Tom Lane wrote: Joel Jacobson j...@gluefinance.com writes: I instead propose we introduce a new function named pg_get_object_unique_identifier( classid oid, objid oid, objsubid integer ) returns text. Seems like concatenating the OIDs would accomplish that. (If you think not, well, you still haven't explained what problem you're trying to solve...) I think the OP wants a unique identifier which changes neither on dump reload nor on major version upgrades. I can see the value in that if you e.g. want to compare the structures of two different postgres databases. It seems impossible to guarantee the identifier to not change between major versions, though - if the structure of that catalog changes, so will the identifier. @OP: Wouldn't it be sufficient to provide such a thing for structure objects that people are actually going to modify on a regular basis? I suggest restricting this to:
- tables (including SQL/MED foreign tables), views, sequences, types, servers (SQL/MED), tsearch configurations, and tsearch dictionaries, for which schema.name suffices;
- triggers, constraints, and columns, for which schema.table.name should work; and
- functions and operators, which need to include the argument types.
The reg* types already solve this for tables, views, sequences (regclass), tsearch configurations (regconfig), tsearch dictionaries (regdictionary), types (regtype), functions (regprocedure), and operators (regoperator), which leaves servers, triggers, constraints, and columns. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
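The reg* identities Florian lists behave roughly like this; unlike a raw OID comparison, the textual form survives a dump/reload as long as nothing is renamed:

```sql
-- Name -> OID, resolved against the current search_path:
SELECT 'pg_class'::regclass::oid;

-- Type names round-trip the same way:
SELECT 'integer'::regtype;

-- Functions and operators include argument types, disambiguating overloads:
SELECT 'lower(text)'::regprocedure;
SELECT '+(integer,integer)'::regoperator;
```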
Re: [HACKERS] casts: max double precision text double precision fails with out or range error
Excerpts from Maciej Sakrejda's message of mar ene 11 03:28:13 -0300 2011: Tried asking this in pgsql-general but I got no response, so I thought I'd give hackers a shot: postgres=# select (((1.7976931348623157081e+308)::double precision)::text)::double precision; ERROR: 1.79769313486232e+308 is out of range for type double precision I'm working on a pg driver and in my float data decoder functional tests, I ran into some errors that I eventually traced back to this behavior. Essentially, postgres seems to cast the max normal double (i.e., the bits of ~(1ULL52 | 1ULL63)) to text in such a manner that it's rounded up, and the reverse cast, text-to-double-precision, does not recognize it as being in range. Curiously, pg_dump seems to print doubles with more precision (in both COPY and INSERT modes), avoiding this issue. Yeah, it sets the extra_float_digits parameter. alvherre=# set extra_float_digits to 3; SET alvherre=# select (((1.7976931348623157081e+308)::double precision)::text)::double precision; float8 ── 1.79769313486231571e+308 (1 fila) -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_depend explained
Has anyone written an in-depth description on how to traverse the pg_depend tree? The 'a' and 'i' deptypes really make it hard to figure out the dependency order; a topological sort does not work. My latest attempt involved trying to group by all objects connected to each other via deptype 'a' or 'i', and replacing all such nodes in the tree with the source node, i.e. the node which according to the topological order could be created first. Am I on the right path, trying to fuse the internal/auto objects together, replacing them with the topmost object in the tree? Or is there a simpler way to figure out the order in which objects can be created? I need a general solution, not a custom-made query for each regclass, which is quite trivial but feels far from bulletproof; I want something only relying on pg_depend, since it should be the safest method of them all. -- Best regards, Joel Jacobson Glue Finance -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
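The fusing step Joel describes could start from something like this hedged sketch: walk the 'i' (internal) and 'a' (auto) edges transitively so each object can be mapped onto the group it is fused with, before topologically sorting the groups (WITH RECURSIVE requires 8.4+; pg_describe_object is the function discussed elsewhere in this digest):

```sql
-- For each object, find every object reachable through chains of
-- internal/auto dependencies; these belong to one fused node.
WITH RECURSIVE fused(classid, objid, refclassid, refobjid) AS (
    SELECT classid, objid, refclassid, refobjid
    FROM pg_depend
    WHERE deptype IN ('i', 'a')
  UNION
    SELECT d.classid, d.objid, f.refclassid, f.refobjid
    FROM pg_depend d
    JOIN fused f ON (d.refclassid, d.refobjid) = (f.classid, f.objid)
    WHERE d.deptype IN ('i', 'a')
)
SELECT pg_describe_object(classid, objid, 0)       AS member,
       pg_describe_object(refclassid, refobjid, 0) AS fused_with
FROM fused;
```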
Re: [HACKERS] Streaming base backups
On Mon, Jan 10, 2011 at 09:09:28AM -0500, Magnus Hagander wrote: On Sun, Jan 9, 2011 at 23:33, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: 2011/1/7 Magnus Hagander mag...@hagander.net: On Fri, Jan 7, 2011 at 01:47, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: 2011/1/5 Magnus Hagander mag...@hagander.net: On Wed, Jan 5, 2011 at 22:58, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Magnus Hagander mag...@hagander.net writes: * Stefan mentioned it might be useful to put some posix_fadvise(POSIX_FADV_DONTNEED) in the process that streams all the files out. Seems useful, as long as that doesn't kick them out of the cache *completely*, for other backends as well. Do we know if that is the case? Maybe have a look at pgfincore to only tag DONTNEED for blocks that are not already in SHM? I think that's way more complex than we want to go here. DONTNEED will remove the block from OS buffer every time. Then we definitely don't want to use it - because some other backend might well want the file. Better leave it up to the standard logic in the kernel. Looking at the patch, it is (very) easy to add the support for that in basebackup.c That supposes allowing mincore(), so mmap(), and so probably switching the fopen() to an open() (or add an open() just for mmap requirement...) Let's go? Per above, I still don't think we *should* do this. We don't want to kick things out of the cache underneath other backends, since we can't control that. Either way, it shouldn't happen in the beginning, and if it does, should be backed with proper benchmarks. Another option that occurs to me is an option to use direct IO (or another means as needed) to bypass the cache. So rather than kicking it out of the cache, we attempt just not to pollute the cache by bypassing it for cold pages and use either normal io for 'hot pages', or use a 'read()' to heat the cache afterward. Garick I've committed the backend side of this, without that. 
Still working on the client, and on cleaning up Heikki's patch for grammar/parser support. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add function dependencies
Peter Eisentraut pete...@gmx.net writes: Making it work for language SQL would be nice, though. Please consider a new DEPENDENCY_XXX constant for that though, because otherwise I think it could cause problems in the extension's dependency tracking. Even with a new DEPENDENCY_FUNCALL or other constant, the extension code would need to be aware of this new kind of not-to-follow dependencies somehow, I guess. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_describe_object
Robert Haas robertmh...@gmail.com writes: On Mon, Jan 10, 2011 at 8:56 PM, Tom Lane t...@sss.pgh.pa.us wrote: Not really. AFAIR, there are two cases that exist in practice, depending on which AM you're talking about: 1. The recorded types match the input types of the operator/function (btree, hash). 2. The recorded types are always the same as the opclass's input type (gist, gin). In neither case does printing those types really add much information. That's why it's not there now. I don't get it. If two different items that exist in the system out of the box have the same description, it seems clear that relevant piece of disambiguating information exists nowhere in the description string. The relevant piece of disambiguating information is the function name+parameters in the first case, or the opclass name in the second. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fwd: [TESTERS] [TEST REPORT] 9.1Alpha3 Feature E.1.4.7.2 in release notes.
Peter Eisentraut pete...@gmx.net writes: On tis, 2011-01-11 at 12:30 +0900, Itagaki Takahiro wrote: If we want to accept integers for money, we should fix the docs: | integer and floating-point string literals Will it get better? I think adding a cast from integer to money is probably quite reasonable. The other way around, maybe not, or only an explicit cast. As near as I can tell, this entire thread started because someone thought that the reference to numeric in the release notes implied any numerical type, not the type named numeric. We explicitly rejected the idea of providing direct casts to/from floating point types, on the grounds of not wanting any roundoff error; so I don't think this is a point that should be revisited. Perhaps it would be sufficient to clarify the release-note item. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autogenerating error code lists (was Re: [COMMITTERS] pgsql: Add foreign data wrapper error code values for SQL/MED.)
Jan Urbański wulc...@wulczer.org writes: I tried wiring it into the build system, but failed, I can't figure out which Makefiles should be updated in order to make errcodes.h and plerrcodes.h generated headers. Could someone help with that? Trying a bit harder to make src/include/utils/errcodes.h a generated file I found that it's included so early it makes the task not trivial at all. postgres.h includes elog.h, which includes errcodes.h (after defining the MAKE_SQLSTATE macro). I'm not sure how to proceed: Huh? Why in the world would the specific location of the #include have anything to do with the problem? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_describe_object
Florian Pflug f...@phlo.org writes: @OP: Wouldn't it be sufficient to provide such a thing for structure objects that people are actually going to modify on a regular basis? Yeah, I still don't see the need to argue over whether the elements of an operator class are uniquely identifiable or not. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_depend explained
Joel Jacobson j...@gluefinance.com writes: Has anyone written an in-depth description of how to traverse the pg_depend tree? Try reading the code in src/backend/catalog/dependency.c. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Streaming base backups
2011/1/11 Garick Hamlin gham...@isc.upenn.edu: On Mon, Jan 10, 2011 at 09:09:28AM -0500, Magnus Hagander wrote: On Sun, Jan 9, 2011 at 23:33, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: 2011/1/7 Magnus Hagander mag...@hagander.net: On Fri, Jan 7, 2011 at 01:47, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: 2011/1/5 Magnus Hagander mag...@hagander.net: On Wed, Jan 5, 2011 at 22:58, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Magnus Hagander mag...@hagander.net writes: * Stefan mentiond it might be useful to put some posix_fadvise(POSIX_FADV_DONTNEED) in the process that streams all the files out. Seems useful, as long as that doesn't kick them out of the cache *completely*, for other backends as well. Do we know if that is the case? Maybe have a look at pgfincore to only tag DONTNEED for blocks that are not already in SHM? I think that's way more complex than we want to go here. DONTNEED will remove the block from OS buffer everytime. Then we definitely don't want to use it - because some other backend might well want the file. Better leave it up to the standard logic in the kernel. Looking at the patch, it is (very) easy to add the support for that in basebackup.c That supposed allowing mincore(), so mmap(), and so probably switch the fopen() to an open() (or add an open() just for mmap requirement...) Let's go ? Per above, I still don't think we *should* do this. We don't want to kick things out of the cache underneath other backends, and since we can't control that. Either way, it shouldn't happen in the beginning, and if it does, should be backed with proper benchmarks. Another option that occurs to me is an option to use direct IO (or another means as needed) to bypass the cache. So rather than kicking it out of the cache, we attempt just not to pollute the cache by bypassing it for cold pages and use either normal io for 'hot pages', or use a 'read()' to heat the cache afterward. 
AFAIR, even Linus is rejecting the idea to use it seriously, except if I shuffle in my memory. Garick I've committed the backend side of this, without that. Still working on the client, and on cleaning up Heikki's patch for grammar/parser support. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_depend explained
2011/1/11 Tom Lane t...@sss.pgh.pa.us: Try reading the code in src/backend/catalog/dependency.c. I've tried but failed to figure it out anyway. The focus in dependency.c is to find out the dependencies of a given object. What I want to do is something slightly different: I need to figure out the order of creation of all objects, not just the dependencies for a single object. Basically, I want to do order by oid, but since you cannot trust the oid order (like we did in pre-7.3), I need to get the sorted list using pg_depend somehow. I guess I could make findDependentObjects() callable from sql and call it for each and every object, but that's quite a big project; I was hoping it was possible to do it in plain sql, or at least only by relying on plpgsql/plperl. I've implemented tsort() in plperl, but that didn't really help me much, since you need to jump around in the tree when you encounter 'internal' and 'auto' nodes. My last resort is to sort by oid, but I really don't want to do that since it would render the entire solution unreliable and I would never feel comfortable using it in a production environment. This is the last component I need in order to complete the work on the pov project http://www.github.com/gluefinance/pov I would highly appreciate your help; I feel a bit lame since I've spent over two days working on this. It's not difficult if you are allowed to build specialized queries for each class, but my goal is a general query only relying on pg_depend. -- Best regards, Joel Jacobson Glue Finance -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
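One way around the tsort difficulty described above is to stop treating 'internal' edges as ordinary dependencies: collapse objects joined by 'i' edges into a single creation unit first, then run a plain topological sort (Kahn's algorithm) over the remaining edges between units. The object ids, edges, and deptype handling below are a deliberately simplified toy sketch, not the real pg_depend catalog or dependency.c logic:

```c
#include <assert.h>
#include <stdio.h>

#define NOBJ 6

/* tiny union-find to merge 'internal' pairs into one creation unit */
static int parent[NOBJ];
static int find(int x) { return parent[x] == x ? x : (parent[x] = find(parent[x])); }
static void unite(int a, int b) { parent[find(a)] = find(b); }

struct edge { int dependent, referenced; char deptype; };

int main(void)
{
    /* invented objects: 0=table, 1=toast table, 2=sequence, 3=view,
     * 4=function, 5=trigger */
    struct edge edges[] = {
        {1, 0, 'i'},   /* toast table is internal to the table        */
        {3, 0, 'n'},   /* view depends on the table                   */
        {3, 2, 'n'},   /* view depends on the sequence                */
        {5, 0, 'a'},   /* trigger auto-depends on the table           */
        {5, 4, 'n'},   /* trigger depends on the function             */
    };
    int nedges = sizeof(edges) / sizeof(edges[0]);

    for (int i = 0; i < NOBJ; i++) parent[i] = i;

    /* pass 1: objects linked by 'i' must be created together */
    for (int i = 0; i < nedges; i++)
        if (edges[i].deptype == 'i')
            unite(edges[i].dependent, edges[i].referenced);

    /* pass 2: Kahn's algorithm over group representatives */
    int indeg[NOBJ] = {0};
    for (int i = 0; i < nedges; i++) {
        int d = find(edges[i].dependent), r = find(edges[i].referenced);
        if (edges[i].deptype != 'i' && d != r)
            indeg[d]++;
    }
    int order[NOBJ], n = 0, pos[NOBJ];
    int queue[NOBJ], qh = 0, qt = 0;
    for (int i = 0; i < NOBJ; i++)
        if (find(i) == i && indeg[i] == 0)
            queue[qt++] = i;
    while (qh < qt) {
        int g = queue[qh++];
        order[n++] = g;
        for (int i = 0; i < nedges; i++) {
            int d = find(edges[i].dependent), r = find(edges[i].referenced);
            if (edges[i].deptype != 'i' && r == g && d != g)
                if (--indeg[d] == 0)
                    queue[qt++] = d;
        }
    }
    for (int i = 0; i < n; i++) pos[order[i]] = i;

    /* every referenced unit comes out before the units depending on it */
    for (int i = 0; i < nedges; i++)
        if (edges[i].deptype != 'i')
            assert(pos[find(edges[i].referenced)] <= pos[find(edges[i].dependent)]);
    printf("sorted %d groups\n", n);
    return 0;
}
```

The same collapse-then-sort idea could plausibly be expressed in SQL or plperl over the real catalog; pg_dump's pg_dump_sort.c (mentioned later in the thread) is the authoritative treatment.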
Re: [HACKERS] Bug in pg_describe_object
Joel Jacobson j...@gluefinance.com writes: 2011/1/11 Tom Lane t...@sss.pgh.pa.us: Seems like concatenating the OIDs would accomplish that. (If you think not, well, you still haven't explained what problem you're trying to solve...) They can be different in two different databases sharing the same original schema, but of two different versions. In this case it is better to compare textual strings describing the objects than to compare based on oids. If that's what you're after, getObjectDescription is entirely unsuitable, because of the fact that its results are dependent on search path and language settings. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_depend explained
Joel Jacobson j...@gluefinance.com writes: I need to figure out the order of creation of all objects, not just the dependencies for a single object. In that case try pg_dump's pg_dump_sort.c. You will never get the order of creation of objects, because that isn't tracked; but you can find out what a safe order would be. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fwd: [TESTERS] [TEST REPORT] 9.1Alpha3 Feature E.1.4.7.2 in release notes.
On tis, 2011-01-11 at 11:03 -0500, Tom Lane wrote: We explicitly rejected the idea of providing direct casts to/from floating point types, on the grounds of not wanting any roundoff error; so I don't think this is a point that should be revisited. We also explicitly chose floating point as the result of the money/money operator over numeric. Seems a bit inconsistent. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Streaming base backups
On Tue, Jan 11, 2011 at 11:39:20AM -0500, Cédric Villemain wrote: 2011/1/11 Garick Hamlin gham...@isc.upenn.edu: On Mon, Jan 10, 2011 at 09:09:28AM -0500, Magnus Hagander wrote: On Sun, Jan 9, 2011 at 23:33, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: 2011/1/7 Magnus Hagander mag...@hagander.net: On Fri, Jan 7, 2011 at 01:47, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: 2011/1/5 Magnus Hagander mag...@hagander.net: On Wed, Jan 5, 2011 at 22:58, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Magnus Hagander mag...@hagander.net writes: * Stefan mentiond it might be useful to put some posix_fadvise(POSIX_FADV_DONTNEED) in the process that streams all the files out. Seems useful, as long as that doesn't kick them out of the cache *completely*, for other backends as well. Do we know if that is the case? Maybe have a look at pgfincore to only tag DONTNEED for blocks that are not already in SHM? I think that's way more complex than we want to go here. DONTNEED will remove the block from OS buffer everytime. Then we definitely don't want to use it - because some other backend might well want the file. Better leave it up to the standard logic in the kernel. Looking at the patch, it is (very) easy to add the support for that in basebackup.c That supposed allowing mincore(), so mmap(), and so probably switch the fopen() to an open() (or add an open() just for mmap requirement...) Let's go ? Per above, I still don't think we *should* do this. We don't want to kick things out of the cache underneath other backends, and since we can't control that. Either way, it shouldn't happen in the beginning, and if it does, should be backed with proper benchmarks. Another option that occurs to me is an option to use direct IO (or another means as needed) to bypass the cache. 
So rather than kicking it out of the cache, we attempt just not to pollute the cache by bypassing it for cold pages and use either normal io for 'hot pages', or use a 'read()' to heat the cache afterward. AFAIR, even Linus is rejecting the idea to use it seriously, except if I shuffle in my memory. Direct IO is generally a pain. POSIX_FADV_NOREUSE is an alternative (I think). Realistically I wasn't sure which way(s) actually worked. My gut was that direct io would likely work right on Linux and Solaris, at least. If POSIX_FADV_NOREUSE works than maybe that is the answer instead, but I haven't tested either. Garick Garick I've committed the backend side of this, without that. Still working on the client, and on cleaning up Heikki's patch for grammar/parser support. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_depend explained
On Jan11, 2011, at 16:54 , Joel Jacobson wrote: Has anyone written a in-depth description on how to traverse the pg_depend tree? The 'a' and 'i' deptype really makes it hard to figure out the dependency order, a topological sort does not work. Could you give an example of the kind of trouble you're experiencing trying to use a topological sort? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Streaming base backups
On Jan11, 2011, at 18:09 , Garick Hamlin wrote: My gut was that direct io would likely work right on Linux and Solaris, at least. Didn't we discover recently that O_DIRECT fails for ext4 on linux with data=ordered, or something like that? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Streaming base backups
2011/1/11 Garick Hamlin gham...@isc.upenn.edu: On Tue, Jan 11, 2011 at 11:39:20AM -0500, Cédric Villemain wrote: 2011/1/11 Garick Hamlin gham...@isc.upenn.edu: On Mon, Jan 10, 2011 at 09:09:28AM -0500, Magnus Hagander wrote: On Sun, Jan 9, 2011 at 23:33, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: 2011/1/7 Magnus Hagander mag...@hagander.net: On Fri, Jan 7, 2011 at 01:47, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: 2011/1/5 Magnus Hagander mag...@hagander.net: On Wed, Jan 5, 2011 at 22:58, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Magnus Hagander mag...@hagander.net writes: * Stefan mentiond it might be useful to put some posix_fadvise(POSIX_FADV_DONTNEED) in the process that streams all the files out. Seems useful, as long as that doesn't kick them out of the cache *completely*, for other backends as well. Do we know if that is the case? Maybe have a look at pgfincore to only tag DONTNEED for blocks that are not already in SHM? I think that's way more complex than we want to go here. DONTNEED will remove the block from OS buffer everytime. Then we definitely don't want to use it - because some other backend might well want the file. Better leave it up to the standard logic in the kernel. Looking at the patch, it is (very) easy to add the support for that in basebackup.c That supposed allowing mincore(), so mmap(), and so probably switch the fopen() to an open() (or add an open() just for mmap requirement...) Let's go ? Per above, I still don't think we *should* do this. We don't want to kick things out of the cache underneath other backends, and since we can't control that. Either way, it shouldn't happen in the beginning, and if it does, should be backed with proper benchmarks. Another option that occurs to me is an option to use direct IO (or another means as needed) to bypass the cache. 
So rather than kicking it out of the cache, we attempt just not to pollute the cache by bypassing it for cold pages and use either normal io for 'hot pages', or use a 'read()' to heat the cache afterward. AFAIR, even Linus rejects the idea of using it seriously, unless my memory is shuffled. Direct IO is generally a pain. POSIX_FADV_NOREUSE is an alternative (I think). Realistically I wasn't sure which way(s) actually worked. My gut was that direct io would likely work right on Linux and Solaris, at least. If POSIX_FADV_NOREUSE works then maybe that is the answer instead, but I haven't tested either. Yes, it should be the best option; unfortunately it is a ghost flag, it doesn't do anything. At some point there were a libprefetch library and a Linux fincore() syscall in the air. Unfortunately, the people behind those items stopped communicating with the open source community, AFAICS. (I didn't get answers myself, and neither did the Linux ML.) Garick I've committed the backend side of this, without that. Still working on the client, and on cleaning up Heikki's patch for grammar/parser support. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI and 2PC
On Mon, 2011-01-10 at 11:50 -0600, Kevin Grittner wrote: I'm trying not to panic here, but I haven't looked at 2PC before yesterday and am just dipping into the code to support it, and time is short. Can anyone give me a pointer to anything I should read before I dig through the 2PC code, which might accelerate this? I don't see much about 2PC outside of twophase.c. Regarding the original post, I agree that we should have two-phase commit support for SSI. We opted not to support it for notifications, but there was a fairly reasonable argument why users wouldn't value the combination of 2PC and NOTIFY. I don't expect this to be a huge roadblock for the feature though; it seems fairly contained. I haven't read the 2PC code either, but I don't expect that you'll need to change the rest of your algorithm just to support it. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autogenerating error code lists (was Re: [COMMITTERS] pgsql: Add foreign data wrapper error code values for SQL/MED.)
On 11/01/11 17:11, Tom Lane wrote: Jan Urbański wulc...@wulczer.org writes: I tried wiring it into the build system, but failed; I can't figure out which Makefiles should be updated in order to make errcodes.h and plerrcodes.h generated headers. Could someone help with that? Trying a bit harder to make src/include/utils/errcodes.h a generated file, I found that it's included so early it makes the task not trivial at all. postgres.h includes elog.h, which includes errcodes.h (after defining the MAKE_SQLSTATE macro). I'm not sure how to proceed: Huh? Why in the world would the specific location of the #include have anything to do with the problem? I'm having a hard time convincing make to generate errcodes.h before compiling any .c file that includes postgres.h. The only way I found was to make src/include/errcodes.h a dependency of the all target. For instance, I tried to copy the way we generate fmgroids.h and it turned out that it doesn't work (it tries to compile things in src/port before entering src/include, and things in src/port include postgres.h). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
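A hypothetical sketch of one way to express the desired ordering in make - forcing the generated header to exist before any subdirectory compiles a .c file - hoisted to a level every compile pass goes through, along the lines of how parser/gram.h is forced into existence. All target, script, and file names here are invented for illustration, not the actual build system:

```make
# In a Makefile that runs before src/port and friends descend,
# so errcodes.h exists before anything includes postgres.h.
all: $(top_builddir)/src/include/utils/errcodes.h

# Hypothetical generator script and source list.
$(top_builddir)/src/include/utils/errcodes.h: errcodes.txt generate-errcodes.pl
	$(PERL) generate-errcodes.pl errcodes.txt > $@
```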
Re: [HACKERS] Streaming base backups
Florian Pflug f...@phlo.org writes: On Jan11, 2011, at 18:09 , Garick Hamlin wrote: My gut was that direct io would likely work right on Linux and Solaris, at least. Didn't we discover recently that O_DIRECT fails for ext4 on linux if ordered=data, or something like that? Quite. Blithe assertions that something like this should work aren't worth the electrons they're written on. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autogenerating error code lists (was Re: [COMMITTERS] pgsql: Add foreign data wrapper error code values for SQL/MED.)
Jan Urbański wulc...@wulczer.org writes: On 11/01/11 17:11, Tom Lane wrote: Huh? Why in the world would the specific location of the #include have anything to do with the problem? I'm having a hard time convincing make to generate errcodes.h before compiling any .c file that includes postgres.h. The only way I found was to make src/include/errcodes.h a dependency of the all target. Peter would probably be a better person than me to answer that, but I imagine that what you want is similar to what src/backend/Makefile does for parser/gram.h, only applied at the src/ level or maybe even the root. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI and 2PC
On Jan10, 2011, at 18:50 , Kevin Grittner wrote: I'm trying not to panic here, but I haven't looked at 2PC before yesterday and am just dipping into the code to support it, and time is short. Can anyone give me a pointer to anything I should read before I dig through the 2PC code, which might accelerate this? It roughly works as follows: Upon PREPARE, the locks previously held by the transaction are transferred to a kind of virtual backend which only consists of a special proc array entry. The transaction will thus still appear to be running, and will still be holding its locks, even after the original backend is gone. The information necessary to reconstruct that proc array entry is also written to the 2PC state, and used to recreate the virtual backend after a restart or crash. There are also some additional pieces of transaction state which are stored in the 2PC state file, like the full list of subtransaction xids (the proc array entry may not contain all of them if it overflowed). Upon COMMIT PREPARED, the information in the 2PC state file is used to write a COMMIT wal record and to update the clog. The transaction is then committed, and the special proc array entry is removed and all lockmgr locks it held are released. For 2PC to work for SSI transactions, I guess you must check for conflicts during PREPARE - at any later point the COMMIT may only fail transiently, not permanently. Any transaction that adds a conflict with an already prepared transaction must check if that conflict completes a dangerous structure, and abort if this is the case, since the already PREPAREd transaction can no longer be aborted. COMMIT PREPARED then probably doesn't need to do anything special for SSI transactions, apart from some cleanup actions maybe. Unfortunately, it seems that doing things this way will undermine the guarantee that retrying a failed SSI transaction won't fail due to the same conflict as it did originally.
Consider:

T1: BEGIN TRANSACTION ISOLATION SERIALIZABLE
T1: SELECT * FROM T
T1: UPDATE T ...
T1: PREPARE TRANSACTION
T2: BEGIN TRANSACTION ISOLATION SERIALIZABLE
T2: SELECT * FROM T
T2: UPDATE T ... -> Serialization Error

Retrying T2 won't help as long as T1 isn't COMMITTED. There doesn't seem to be a way around that, however - any correct implementation of 2PC for SSI will have to behave that way, I fear :-( Hope this helps. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Allowing multiple concurrent base backups
Now that we have a basic over-the-wire base backup capability in walsender, it would be nice to allow taking multiple base backups at the same time. It might not seem very useful at first, but it makes it easier to set up standbys for small databases. At the moment, if you want to set up two standbys, you have to either take a single base backup and distribute it to both standbys, or somehow coordinate that they don't try to take the base backup at the same time. Also, you don't want initializing a standby to conflict with a nightly backup cron script. So, this patch modifies the internal do_pg_start/stop_backup functions so that, in addition to the traditional mode of operation where a backup_label file is created in the data directory and backed up along with all other files, the backup label file can be returned to the caller, and the caller is responsible for including it in the backup. The code in replication/basebackup.c includes it in the tar file that's streamed to the client, as backup_label. To make that safe, I've changed forcePageWrites into an integer. Whenever a backup is started, it's incremented, and when one ends, it's decremented. When forcePageWrites == 0, there's no backup in progress. The user-visible pg_start_backup() function is not changed: you can only have one backup started that way in progress at a time, but you can do streaming base backups at the same time as a traditional pg_start_backup(). I implemented this in two ways, and can't decide which I like better: 1. The contents of the backup label file are returned to the caller of do_pg_start_backup() as a palloc'd string. 2. do_pg_start_backup() creates a temporary file that the backup label is written to (instead of backup_label). Implementation 1 changes more code, as pg_start/stop_backup() need to be changed to write/read from memory instead of a file, but the result isn't any more complicated. Nevertheless, I somehow feel more comfortable with 2.
Patches for both approaches attached. They're also available in my github repository at g...@github.com:hlinnaka/postgres.git. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 5b6a230..cc4d46e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -338,7 +338,8 @@ typedef struct XLogCtlInsert
 	XLogPageHeader currpage;	/* points to header of block in cache */
 	char	   *currpos;		/* current insertion point in cache */
 	XLogRecPtr	RedoRecPtr;		/* current redo point for insertions */
-	bool		forcePageWrites;	/* forcing full-page writes for PITR? */
+	int			forcePageWrites;	/* forcing full-page writes for PITR? */
+	bool		exclusiveBackup;	/* a backup was started with pg_start_backup() */
 } XLogCtlInsert;
 
 /*
@@ -8313,7 +8314,7 @@ pg_start_backup(PG_FUNCTION_ARGS)
 
 	backupidstr = text_to_cstring(backupid);
 
-	startpoint = do_pg_start_backup(backupidstr, fast);
+	startpoint = do_pg_start_backup(backupidstr, fast, true, NULL);
 
 	snprintf(startxlogstr, sizeof(startxlogstr), "%X/%X",
 			 startpoint.xlogid, startpoint.xrecoff);
@@ -8321,7 +8322,7 @@ pg_start_backup(PG_FUNCTION_ARGS)
 }
 
 XLogRecPtr
-do_pg_start_backup(const char *backupidstr, bool fast)
+do_pg_start_backup(const char *backupidstr, bool fast, bool exclusive, char **labelfile)
 {
 	XLogRecPtr	checkpointloc;
 	XLogRecPtr	startpoint;
@@ -8332,6 +8333,7 @@ do_pg_start_backup(const char *backupidstr, bool fast)
 	uint32		_logSeg;
 	struct stat stat_buf;
 	FILE	   *fp;
+	StringInfoData labelfbuf;
 
 	if (!superuser() && !is_authenticated_user_replication_role())
 		ereport(ERROR,
@@ -8368,15 +8370,19 @@ do_pg_start_backup(const char *backupidstr, bool fast)
 	 * ensure adequate interlocking against XLogInsert().
 	 */
 	LWLockAcquire(WALInsertLock, LW_EXCLUSIVE);
-	if (XLogCtl->Insert.forcePageWrites)
+	if (exclusive)
 	{
-		LWLockRelease(WALInsertLock);
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("a backup is already in progress"),
-				 errhint("Run pg_stop_backup() and try again.")));
+		if (XLogCtl->Insert.exclusiveBackup)
+		{
+			LWLockRelease(WALInsertLock);
+			ereport(ERROR,
+					(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+					 errmsg("a backup is already in progress"),
+					 errhint("Run pg_stop_backup() and try again.")));
+		}
+		XLogCtl->Insert.exclusiveBackup = true;
 	}
-	XLogCtl->Insert.forcePageWrites = true;
+	XLogCtl->Insert.forcePageWrites++;
 	LWLockRelease(WALInsertLock);
 
 	/*
@@ -8393,7 +8399,7 @@ do_pg_start_backup(const char *backupidstr, bool fast)
 	RequestXLogSwitch();
 
 	/* Ensure we release forcePageWrites if fail below */
-	PG_ENSURE_ERROR_CLEANUP(pg_start_backup_callback, (Datum) 0);
+	PG_ENSURE_ERROR_CLEANUP(pg_start_backup_callback, (Datum) BoolGetDatum(exclusive));
 	{
 		/*
 		 * Force a CHECKPOINT. Aside from being necessary to prevent torn
@@ -8420,54
Re: [HACKERS] Streaming base backups
On Tue, Jan 11, 2011 at 12:45:02PM -0500, Tom Lane wrote: Florian Pflug f...@phlo.org writes: On Jan11, 2011, at 18:09 , Garick Hamlin wrote: My gut was that direct io would likely work right on Linux and Solaris, at least. Didn't we discover recently that O_DIRECT fails for ext4 on linux if ordered=data, or something like that? Quite. Blithe assertions that something like this should work aren't worth the electrons they're written on. Indeed. I wasn't making such a claim in case that wasn't clear. I believe, in fact, there is no single way that will work everywhere. This isn't needed for correctness of course, it is merely a tweak for performance as long as the 'not working case' on platform + filesystem X case degrades to something close to what would have happened if we didn't try. I expected POSIX_FADV_NOREUSE not to work on Linux, but haven't looked at it recently and not all systems are Linux so I mentioned it. This was why I thought direct io might be more realistic. I did not have a chance to test before I wrote this email so I attempted to make my uncertainty clear. I _know_ it will not work in some environments, but I thought it was worth looking at if it worked on more than one sane common setup, but I can understand if you feel differently about that. Garick regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI and 2PC
Florian Pflug f...@phlo.org wrote: On Jan10, 2011, at 18:50 , Kevin Grittner wrote: I'm trying not to panic here, but I haven't looked at 2PC before yesterday and am just dipping into the code to support it, and time is short. Can anyone give me a pointer to anything I should read before I dig through the 2PC code, which might accelerate this? It roughly works as follows Upon PREPARE, the locks previously held by the transaction are transferred to a kind of virtual backend which only consists of a special proc array entry. The transaction will thus still appear to be running, and will still be holding its locks, even after the original backend is gone. The information necessary to reconstruct that proc array entry is also written to the 2PC state, and used to recreate the virtual backend after a restart or crash. There are also some additional pieces of transaction state which are stored in the 2PC state file like the full list of subtransaction xids (The proc array entry may not contain all of them if it overflowed). Upon COMMIT PREPARED, the information in the 2PC state file is used to write a COMMIT wal record and to update the clog. The transaction is then committed, and the special proc array entry is removed and all lockmgr locks it held are released. For 2PC to work for SSI transaction, I guess you must check for conflicts during PREPARE - at any later point the COMMIT may only fail transiently, not permanently. Any transaction that adds a conflict with an already prepared transaction must check if that conflict completes a dangerous structure, and abort if this is the case, since the already PREPAREd transaction can no longer be aborted. COMMIT PREPARED then probably doesn't need to do anything special for SSI transactions, apart from some cleanup actions maybe. Thanks; that all makes sense. The devil, as they say, is in the details. 
As far as I've worked it out, the PREPARE must persist both the predicate locks and any conflict pointers which are to other prepared transactions. That leaves some fussy work around the coming and going of prepared transactions, because on recovery you need to be prepared to ignore conflict pointers with prepared transactions which committed or rolled back. What I haven't found yet is the right place and means to persist and recover this stuff, but that's just a matter of digging through enough source code. Any tips regarding that may save time. I'm also not clear on what, if anything, needs to be written to WAL. I'm really fuzzy on that, still. Unfortunately, it seems that doing things this way will undermine the guarantee that retrying a failed SSI transaction won't fail due to the same conflict as it did originally. I hadn't thought of that, but you're right. Of course, I can't enforce that guarantee, anyway, without some other patch first being there to allow me to cancel other transactions with serialization_failure, even if they are idle in transaction. There doesn't seems a way around that, however - any correct implementation of 2PC for SSI will have to behave that way I fear :-( I think you're right. Hope this helps best regards, It does. Even the parts which just confirm my tentative conclusions save me time in not feeling like I need to cross-check so much. I can move forward with more confidence. Thanks. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI and 2PC
Jeff Davis pg...@j-davis.com wrote: I don't expect this to be a huge roadblock for the feature though. It seems fairly contained. I haven't read the 2PC code either, but I don't expect that you'll need to change the rest of your algorithm just to support it. Agreed; but I am starting to get concerned about whether this particular area can be completed by the start of the CF. I might run a few days over on 2PC support. Unless ... Dan? Could you look into this while I chase down the issue Anssi raised? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI and 2PC
On 11.01.2011 20:08, Florian Pflug wrote: Unfortunately, it seems that doing things this way will undermine the guarantee that retrying a failed SSI transaction won't fail due to the same conflict as it did originally. Consider

T1 BEGIN TRANSACTION ISOLATION SERIALIZABLE
T1 SELECT * FROM T
T1 UPDATE T ...
T1 PREPARE TRANSACTION
T2 BEGIN TRANSACTION ISOLATION SERIALIZABLE
T2 SELECT * FROM T
T2 UPDATE T ... - Serialization Error

Retrying T2 won't help as long as T1 isn't COMMITTED.

T2 should block until T1 commits. I would be very surprised if it doesn't behave like that already. In general, a prepared transaction should be treated like an in-progress transaction - it might still abort too. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI and 2PC
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 11.01.2011 20:08, Florian Pflug wrote: Unfortunately, it seems that doing things this way will undermine the guarantee that retrying a failed SSI transaction won't fail due to the same conflict as it did originally. Consider

T1 BEGIN TRANSACTION ISOLATION SERIALIZABLE
T1 SELECT * FROM T
T1 UPDATE T ...
T1 PREPARE TRANSACTION
T2 BEGIN TRANSACTION ISOLATION SERIALIZABLE
T2 SELECT * FROM T
T2 UPDATE T ... - Serialization Error

Retrying T2 won't help as long as T1 isn't COMMITTED.

T2 should block until T1 commits. I would be very surprised if it doesn't behave like that already. In general, a prepared transaction should be treated like an in-progress transaction - it might still abort too.

It shouldn't block if the updates were to different rows, which is what I took Florian to mean; otherwise this would be handled by SI and would have nothing to do with the SSI patch. SSI doesn't introduce any new blocking (with the one exception of the READ ONLY DEFERRABLE style we invented to support long-running reports and backups, and all blocking there is at the front -- once it's running, it's going full speed ahead). -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_describe_object
On Tue, 2011-01-11 at 11:43 -0500, Tom Lane wrote: If that's what you're after, getObjectDescription is entirely unsuitable, because of the fact that its results are dependent on search path and language settings. regards, tom lane Agreed, and as long as the additional information added to the description by my patch is not useful for any other purpose I see no reason for applying it. So would anyone be confused by a description of pg_amproc not including the types? I personally have no idea since I have not had to work with indexes enough to say. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI and 2PC
On Jan11, 2011, at 19:41 , Heikki Linnakangas wrote: On 11.01.2011 20:08, Florian Pflug wrote: Unfortunately, it seems that doing things this way will undermine the guarantee that retrying a failed SSI transaction won't fail due to the same conflict as it did originally. Consider

T1 BEGIN TRANSACTION ISOLATION SERIALIZABLE
T1 SELECT * FROM T
T1 UPDATE T ...
T1 PREPARE TRANSACTION
T2 BEGIN TRANSACTION ISOLATION SERIALIZABLE
T2 SELECT * FROM T
T2 UPDATE T ... - Serialization Error

Retrying T2 won't help as long as T1 isn't COMMITTED.

T2 should block until T1 commits.

The serialization error will occur even if T1 and T2 update *different* rows. This is due to the SELECTs in the interleaved schedule above returning the state of T prior to both T1 and T2. Which is of course never the case for a serial schedule. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
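Florian's point is the classic write-skew anomaly: both SELECTs see the state of T from before either update, so the combined outcome is one that no serial order could produce, even though the two UPDATEs touch different rows. A stand-alone Python sketch of snapshot behaviour (an illustration only, not PostgreSQL code) makes that concrete:

```python
# A two-row "table" with the invariant that at least one row stays 1.
db = {"a": 1, "b": 1}

# Both transactions take their snapshot before either one writes,
# exactly like the two SELECTs in the interleaved schedule above.
snap_t1 = dict(db)
snap_t2 = dict(db)

# T1: "b is still 1, so I'm allowed to clear a" -- writes row a only.
if snap_t1["b"] == 1:
    db["a"] = 0

# T2: "a is still 1, so I'm allowed to clear b" -- writes row b only.
if snap_t2["a"] == 1:
    db["b"] = 0

print(db)  # {'a': 0, 'b': 0}: both cleared, which no serial order allows
```

Under any serial order, whichever transaction runs second sees the first one's write and skips its own update, so the invariant survives; plain snapshot isolation permits the interleaving, and SSI reports it as a serialization failure.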
Re: [HACKERS] Allowing multiple concurrent base backups
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: I implemented this in two ways, and can't decide which I like better: 1. The contents of the backup label file are returned to the caller of do_pg_start_backup() as a palloc'd string. 2. do_pg_start_backup() creates a temporary file that the backup label is written to (instead of backup_label). Implementation 1 changes more code, as pg_start/stop_backup() need to be changed to write/read from memory instead of file, but the result isn't any more complicated. Nevertheless, I somehow feel more comfortable with 2. Seems like either one of these is fairly problematic in that you have to have some monstrous kluge to get the backup_label file to appear with the right name in the tarfile. How badly do we actually need this? I don't think the use-case for concurrent base backups is all that large in practice given the I/O hit it's going to involve. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing multiple concurrent base backups
On Tue, Jan 11, 2011 at 19:51, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: I implemented this in two ways, and can't decide which I like better: 1. The contents of the backup label file are returned to the caller of do_pg_start_backup() as a palloc'd string. 2. do_pg_start_backup() creates a temporary file that the backup label is written to (instead of backup_label). Implementation 1 changes more code, as pg_start/stop_backup() need to be changed to write/read from memory instead of file, but the result isn't any more complicated. Nevertheless, I somehow feel more comfortable with 2. Seems like either one of these is fairly problematic in that you have to have some monstrous kluge to get the backup_label file to appear with the right name in the tarfile. How badly do we actually need this? I don't think the use-case for concurrent base backups is all that large in practice given the I/O hit it's going to involve. I think it can be done cleaner in the tar file injection - I've been chatting with Heikki offlist about that. Not sure, but I have a feeling it does. As for the use-case, it doesn't necessarily involve a huge I/O hit if either the entire DB is in RAM (not all that uncommon - though then the backup finishes quickly as well), or if the *client* is slow, thus making the server backlogged on sending the base backup. Having it possible to do it concurrently also makes for *much* easier use of the feature. It might be just about overlapping with a few seconds, for example - which would probably have no major effect, but takes a lot more code on the other end to work around. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_describe_object
Andreas Karlsson andr...@proxel.se writes: So would anyone be confused by a description of pg_amproc not including the types? It really shouldn't be useful to include those. Attend what it says in the fine manual for CREATE OPERATOR CLASS: In a FUNCTION clause, the operand data type(s) the function is intended to support, if different from the input data type(s) of the function (for B-tree and hash indexes) or the class's data type (for GIN and GiST indexes). These defaults are always correct, so there is no point in specifying op_type in a FUNCTION clause in CREATE OPERATOR CLASS, but the option is provided for consistency with the comparable syntax in ALTER OPERATOR FAMILY. The reason the ALTER OPERATOR FAMILY DROP syntax has to include operand types is that it lacks the full name/types of the referenced function. Since getObjectDescription *does* provide those, it doesn't serve any real purpose to repeat the information. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing multiple concurrent base backups
On 11.01.2011 20:51, Tom Lane wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: I implemented this in two ways, and can't decide which I like better: 1. The contents of the backup label file are returned to the caller of do_pg_start_backup() as a palloc'd string. 2. do_pg_start_backup() creates a temporary file that the backup label is written to (instead of backup_label). Implementation 1 changes more code, as pg_start/stop_backup() need to be changed to write/read from memory instead of file, but the result isn't any more complicated. Nevertheless, I somehow feel more comfortable with 2. Seems like either one of these is fairly problematic in that you have to have some monstrous kluge to get the backup_label file to appear with the right name in the tarfile. Oh. I'm surprised you feel that way - that part didn't feel ugly or kludgey at all to me. How badly do we actually need this? I don't think the use-case for concurrent base backups is all that large in practice given the I/O hit it's going to involve. It makes it very convenient to set up standbys, without having to worry that you'll conflict e.g with a nightly backup. I don't imagine people will use streaming base backups for very large databases anyway. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing multiple concurrent base backups
It makes it very convenient to set up standbys, without having to worry that you'll conflict e.g with a nightly backup. I don't imagine people will use streaming base backups for very large databases anyway. Also, imagine that you're provisioning a 10-node replication cluster on EC2. This would make that worlds easier. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing multiple concurrent base backups
Magnus Hagander mag...@hagander.net writes: On Tue, Jan 11, 2011 at 19:51, Tom Lane t...@sss.pgh.pa.us wrote: Seems like either one of these is fairly problematic in that you have to have some monstrous kluge to get the backup_label file to appear with the right name in the tarfile. How badly do we actually need this? I don't think the use-case for concurrent base backups is all that large in practice given the I/O hit it's going to involve. I think it can be done cleaner in the tar file injection - I've been chatting with Heikki offlist about that. Not sure, but I have a feeling it does. One point I'm particularly interested to see how you'll kluge is ensuring that the tarball contains only the desired temp data and not also the real $PGDATA/backup_label, should there be a normal base backup being done concurrently with the streamed one. The whole thing just seems too fragile and dangerous to be worth dealing with given that actual usage will be a corner case. *I* sure wouldn't trust it to work when the chips were down. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing multiple concurrent base backups
On 11.01.2011 20:17, Heikki Linnakangas wrote: Patches for both approaches attached. They're also available in my github repository at g...@github.com:hlinnaka/postgres.git. Just so people won't report the same issues again, a couple of bugs have already cropped up (thanks Magnus):

* a backup_label file in the data directory should now not be tarred up - we're injecting a different backup_label file there.
* pg_start_backup_callback needs to be updated to just decrement forcePageWrites, not reset it to 'false'
* pg_stop_backup() decrements forcePageWrites twice. oops.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: RangeTypes
On Tue, Jan 11, 2011 at 01:16:47AM -0800, Jeff Davis wrote: Ok, I have made some progress. This is still a proof-of-concept patch, but the important pieces are working together. Synopsis:

CREATE TYPE numrange AS RANGE (SUBTYPE=numeric, SUBTYPE_CMP=numeric_cmp);
SELECT range_eq('[1,2.2)'::numrange,'[1,2.2]');
SELECT range_lbound('(3.7,9]'::numrange);
SELECT range(6.7);
SELECT '-'::numrange; -- empty
SELECT '[1, NULL]'::numrange; -- ] will become )
SELECT '(INF, 3)'::numrange;

I haven't completed many of the other generic functions, because I'd like to make sure I'm on the right track first. The important thing about the functions above is that they show ANYRANGE working in conjunction with ANYELEMENT in various combinations, which was a significant part of this patch. Here are the open items:

1. Generic functions -- most of which are fairly obvious. However, I want to make sure I'm on the right track first.

2. GiST -- I'll need a mechanism to implement the penalty function, and perhaps I'll also need additional support for the picksplit function. For the penalty function, I think I'll need to require a function to convert the subtype into a float, and I can use that to find a distance (which can be the penalty). That should also satisfy anything that picksplit might need.

3. Typmod -- There is still one annoyance about typmod remaining. I need to treat it like an array in find_typmod_coercion_function(), and then create a coercion expression. Is it worth it? Would typmod on a range be confusing, or should I just finish this item up? Probably not worth it for the first round.

4. Docs Happy to help evenings this week :)

5. Tests Same. What do you have so far?

6. pg_dump -- should be pretty easy; I just want to settle some of the other stuff first.

7. Right now the parse function is quite dumb. Is there some example code I should follow to make sure I get this right? KISS is a fine principle. Do you really need it smart on the first round? :)

8. 
In order to properly support the various combinations of ANYRANGE and ANYELEMENT in a function definition (which are all important), we need to be able to determine the range type given a subtype. That means that each subtype can only have one associated range, which sounds somewhat limiting, but it can be worked around by using domains. I don't think this is a major limitation. Comments? As we get a more nuanced type system, this is one of the things that will need to get reworked, so I'd say it's better not to put too much effort into things that a refactor of the type system http://wiki.postgresql.org/wiki/Refactor_Type_System would make much better, at least right now. Also related to representation: * Right now I always align the subtypes within the range according to typalign. I could avoid that by packing the bytes tightly, and then copying them around later. Suggestions? And what should the overall alignment of the range type be? For the first cut, the simplest possible. * If it's a fixed-length type, we can save the varlena header byte on the overall range; but we lose the ability to save space when one of the boundaries of the range is missing (NULL or INF), and it would complicate the code a little. Thoughts? Probably not worth complicating the code at this stage. KISS again :) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
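On open item 7 (the "quite dumb" parse function), the literal format shown in the synopsis — '[1,2.2)' brackets, '-' for the empty range, NULL/INF bounds with ']' degrading to ')' — is simple enough that no fancy parser machinery is needed. As a rough illustration of those parsing rules only (this is invented Python, not the patch's actual C input function), a split on the first comma plus bracket inspection covers every example given above:

```python
from decimal import Decimal

def parse_range(text):
    """Parse a range literal like '[1,2.2)' into
    (lower, upper, lower_inclusive, upper_inclusive).
    Returns None for the empty range spelled '-'.  NULL/INF bounds
    become None, and an unbounded end is forced to exclusive,
    mirroring the '] will become )' note in the synopsis."""
    text = text.strip()
    if text == "-":
        return None                      # empty range
    if text[0] not in "[(" or text[-1] not in ")]":
        raise ValueError("malformed range: %r" % text)
    lo_inc = text[0] == "["
    hi_inc = text[-1] == "]"
    lo_s, hi_s = (part.strip() for part in text[1:-1].split(",", 1))

    def bound(s):
        return None if s.upper() in ("NULL", "INF", "-INF") else Decimal(s)

    lo, hi = bound(lo_s), bound(hi_s)
    if lo is None:
        lo_inc = False   # '[' degrades to '(' on an unbounded end
    if hi is None:
        hi_inc = False   # ']' becomes ')', as noted in the synopsis
    return (lo, hi, lo_inc, hi_inc)

print(parse_range("[1, NULL]"))  # (Decimal('1'), None, True, False)
```

The real implementation also has to deal with quoting, type input functions, and error reporting through ereport(), but the bracket/bound structure is this small.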
Re: [HACKERS] Allowing multiple concurrent base backups
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: Now that we have a basic over-the-wire base backup capability in walsender, it would be nice to allow taking multiple base backups at the same time. I would prefer to be able to take a base backup from a standby, or is that already possible? What about cascading the wal shipping? Maybe those ideas are much bigger projects, but if I don't ask, I don't get to know :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DISCARD ALL ; stored procedures
Excerpts from Stephen Frost's message of vie ene 07 15:29:52 -0300 2011: * Stephen Frost (sfr...@snowman.net) wrote: * Robert Haas (robertmh...@gmail.com) wrote: Making it part of DISCARD PLANS; and back-patching it to 8.3 where DISCARD was introduced would be awesome for me. :) I'd need to look at this more closely before committing anything, but at first blush I think that's reasonable. Have a patch? To be honest, I was fully expecting a response of that's hard to do. Soo, yeah, I found the this is hard part. Basically, the plan caching, etc, is all handled by the stored procedures themselves, and we haven't got any way to tell a PL to destroy all your plans. We might be able to hack up fmgr to throw away all references to functions, but that wouldn't release the memory they use up, making 'discard plans;' leak like a sieve. What this discussion suggests to me is that cached plans need to be tracked by a resource owner that's linked to the function. The problem I see with this idea is figuring out what module would keep track of resowners that need to be reset ... Other than that I think it should be straightforward :-) -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing multiple concurrent base backups
On 11.01.2011 21:50, Dimitri Fontaine wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: Now that we have a basic over-the-wire base backup capability in walsender, it would be nice to allow taking multiple base backups at the same time. I would prefer to be able to take a base backup from a standby, or is that already possible? What about cascading the wal shipping? Maybe those ideas are much bigger projects, but if I don't ask, I don't get to know :) Yeah, those are bigger projects.. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing multiple concurrent base backups
On Tue, 2011-01-11 at 20:17 +0200, Heikki Linnakangas wrote: So, this patch modifies the internal do_pg_start/stop_backup functions so that in addition to the traditional mode of operation, where a backup_label file is created in the data directory where it's backed up along with all other files, the backup label file is returned to the caller, and the caller is responsible for including it in the backup. The code in replication/basebackup.c includes it in the tar file that's streamed to the client, as backup_label. Perhaps we can use this more intelligent form of base backup to differentiate between: a. a primary that has crashed while a backup was in progress; and b. an online backup that is being restored. Allowing the user to do an unrestricted file copy as a base backup doesn't allow us to make that differentiation. That led to the two bugs that we fixed in StartupXLOG(). And right now there are still two problems remaining (albeit less severe): 1. If it's a primary recovering from a crash, and there is a backup_label file, and the WAL referenced in the backup_label exists, then it does a bunch of extra work during recovery; and 2. In the same situation, if the WAL referenced in the backup_label does not exist, then it PANICs with a HINT to tell you to remove the backup_label. Is this an opportunity to solve these problems and simplify the code? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autogenerating error code lists (was Re: [COMMITTERS] pgsql: Add foreign data wrapper error code values for SQL/MED.)
On 11/01/11 18:59, Tom Lane wrote: Jan Urbański wulc...@wulczer.org writes: On 11/01/11 17:11, Tom Lane wrote: Huh? Why in the world would the specific location of the #include have anything to do with the problem? I'm having a hard time convincing make to generate errcodes.h before compiling any .c file that includes postgres.h. The only way I found was to make src/include/errcodes.h a dependency of the all target. Peter would probably be a better person than me to answer that, but I imagine that what you want is similar to what src/backend/Makefile does for parser/gram.h, only applied at the src/ level or maybe even the root. OK, that was a nudge in the right direction. Basing the rules on src/backend/Makefile I changed src/Makefile to rebuild src/include/errcodes.h before building the subdirectories and... it failed miserably. There's some trickery beyond my understanding here. There's a rule like this in src/backend/Makefile:

$(SUBDIRS:%=%-recursive): $(top_builddir)/src/include/parser/gram.h

that triggers checking whether gram.h needs to be rebuilt before recursing into each SUBDIR. A similar trick in src/Makefile doesn't work, because it's src/backend/common.mk that is responsible for the SUBDIR-recursive calls, and src/Makefile does not include it. From what I gathered by reading Makefile.global, the $(recurse) call enters each SUBDIR and builds a target called target-subdir-recurse. And actually, if I change my rule to read:

$(SUBDIRS:%=all-%-recurse): $(top_builddir)/src/include/utils/errcodes.h

it works. Now whether that's acceptable or not is another thing entirely... Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing multiple concurrent base backups
On 11.01.2011 22:16, Jeff Davis wrote: On Tue, 2011-01-11 at 20:17 +0200, Heikki Linnakangas wrote: So, this patch modifies the internal do_pg_start/stop_backup functions so that in addition to the traditional mode of operation, where a backup_label file is created in the data directory where it's backed up along with all other files, the backup label file is be returned to the caller, and the caller is responsible for including it in the backup. The code in replication/basebackup.c includes it in the tar file that's streamed the client, as backup_label. Perhaps we can use this more intelligent form of base backup to differentiate between: a. a primary that has crashed while a backup was in progress; and b. an online backup that is being restored. Allowing the user to do an unrestricted file copy as a base backup doesn't allow us to make that differentiation. That lead to the two bugs that we fixed in StartupXLOG(). And right now there are still two problems remaining (albeit less severe): 1. If it's a primary recovering from a crash, and there is a backup_label file, and the WAL referenced in the backup_label exists, then it does a bunch of extra work during recovery; and 2. In the same situation, if the WAL referenced in the backup_label does not exist, then it PANICs with a HINT to tell you to remove the backup_label. Is this an opportunity to solve these problems and simplify the code? It won't change the situation for pg_start_backup(), but with the patch the base backups done via streaming won't have those issues, because backup_label is not created (with that name) in the master. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add function dependencies
On tis, 2011-01-11 at 16:57 +0100, Dimitri Fontaine wrote: Peter Eisentraut pete...@gmx.net writes: Making it work for language SQL would be nice, though. Please consider a new DEPENDENCY_XXX constant for that though, because otherwise I think it could cause problems in the extension's dependency tracking. Even with a new DEPENDENCY_FUNCALL or other constant, the extension code would need to be aware of this new kind of not-to-follow dependencies somehow, I guess. What's a not-to-follow dependency? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PERFORM] pgbench to the MAXINT
Em 10-01-2011 05:25, Greg Smith escreveu: Euler Taveira de Oliveira wrote: Em 07-01-2011 22:59, Greg Smith escreveu: setrandom: invalid maximum number -2147467296 It is failing at atoi() circa pgbench.c:1036. But it's just the first one. There are some variables and constants that need to be converted to int64 and some functions that must speak 64-bit, such as getrand(). Are you working on a patch? http://archives.postgresql.org/pgsql-hackers/2010-01/msg02868.php http://archives.postgresql.org/message-id/4c326f46.4050...@2ndquadrant.com Greg, I just improved your patch. I tried to work around the problems pointed out in the above threads. Also, I want to raise some points:

(i) If we want to support a scale factor greater than 21474 we have to convert some columns to bigint; it will change the test. From the portability point of view it is a pity, but as we have never supported it I'm not too worried about it. Why? Because it will use bigint columns only if the scale factor is greater than 21474. Is it a problem? I don't think so, because generally people compare tests with the same scale factor.

(ii) From the performance perspective, we need to test that the modifications don't impact performance. I didn't create another code path for the 64-bit modifications (it is too ugly) and I'm afraid some modifications affect the 32-bit performance. I'm in no position to test it, though, because I don't have a big machine ATM. Greg, could you lead these tests?

(iii) I decided to copy scanint8() (called strtoint64 there) from the backend (Robert's suggestion [1]) because Tom pointed out that strtoll() has portability issues. I replaced atoi() with strtoint64() but didn't do any performance tests. Comments? 
[1] http://archives.postgresql.org/pgsql-hackers/2010-07/msg00173.php -- Euler Taveira de Oliveira http://www.timbira.com/

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 7c2ca6e..e9eb720 100644
*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
***************
*** 60,65 ****
--- 60,67 ----
  #define INT64_MAX INT64CONST(0x7FFFFFFFFFFFFFFF)
  #endif
  
+ #define MAX_RANDOM_VALUE64 INT64_MAX
+ 
  /*
   * Multi-platform pthread implementations
   */
*************** usage(const char *progname)
*** 364,378 ****
  	progname, progname);
  }
  
  /* random number generator: uniform distribution from min to max inclusive */
! static int
! getrand(int min, int max)
  {
  	/*
  	 * Odd coding is so that min and max have approximately the same chance of
  	 * being selected as do numbers between them.
  	 */
! 	return min + (int) (((max - min + 1) * (double) random()) / (MAX_RANDOM_VALUE + 1.0));
  }
  
  /* call PQexec() and exit() on failure */
--- 366,451 ----
  	progname, progname);
  }
  
+ /*
+  * strtoint64 -- convert a string to 64-bit integer
+  *
+  * this function is a modified version of scanint8() from
+  * src/backend/utils/adt/int8.c.
+  *
+  * XXX should it have a return value?
+  *
+  */
+ static int64
+ strtoint64(const char *str)
+ {
+ 	const char *ptr = str;
+ 	int64		result = 0;
+ 	int			sign = 1;
+ 
+ 	/*
+ 	 * Do our own scan, rather than relying on sscanf which might be broken
+ 	 * for long long.
+ 	 */
+ 
+ 	/* skip leading spaces */
+ 	while (*ptr && isspace((unsigned char) *ptr))
+ 		ptr++;
+ 
+ 	/* handle sign */
+ 	if (*ptr == '-')
+ 	{
+ 		ptr++;
+ 
+ 		/*
+ 		 * Do an explicit check for INT64_MIN. Ugly though this is, it's
+ 		 * cleaner than trying to get the loop below to handle it portably.
+ 		 */
+ 		if (strncmp(ptr, "9223372036854775808", 19) == 0)
+ 		{
+ 			result = -INT64CONST(0x7fffffffffffffff) - 1;
+ 			ptr += 19;
+ 			goto gotdigits;
+ 		}
+ 		sign = -1;
+ 	}
+ 	else if (*ptr == '+')
+ 		ptr++;
+ 
+ 	/* require at least one digit */
+ 	if (!isdigit((unsigned char) *ptr))
+ 		fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
+ 
+ 	/* process digits */
+ 	while (*ptr && isdigit((unsigned char) *ptr))
+ 	{
+ 		int64		tmp = result * 10 + (*ptr++ - '0');
+ 
+ 		if ((tmp / 10) != result)	/* overflow? */
+ 			fprintf(stderr, "value \"%s\" is out of range for type bigint\n", str);
+ 		result = tmp;
+ 	}
+ 
+ gotdigits:
+ 
+ 	/* allow trailing whitespace, but not other trailing chars */
+ 	while (*ptr != '\0' && isspace((unsigned char) *ptr))
+ 		ptr++;
+ 
+ 	if (*ptr != '\0')
+ 		fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
+ 
+ 	return ((sign < 0) ? -result : result);
+ }
+ 
  /* random number generator: uniform distribution from min to max inclusive */
! static int64
! getrand(int64 min, int64 max)
  {
  	/*
  	 * Odd coding is so that min and max have approximately the same chance of
  	 * being selected as do numbers between them.
  	 */
! 	return min + (int64) (((max - min + 1) * (double) random()) / (MAX_RANDOM_VALUE64 + 1.0));
  }
  
  /* call PQexec() and exit() on failure */
*************** top:
*** 887,893 ****
  	if (commands[st->state] == NULL)
  	{
  		st->state = 0;
! 		st->use_file = getrand(0,
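For reference, the number Euler quoted upthread (`setrandom: invalid maximum number -2147467296`) is exactly what 32-bit wraparound predicts: pgbench sizes the accounts table at 100,000 rows per unit of scale factor, so scale 21475 yields 2,147,500,000 rows, just past INT_MAX. A quick Python check (illustrative only, outside pgbench) reproduces the figure the patch's int64 conversion is meant to fix:

```python
import ctypes

ROWS_PER_SCALE = 100000   # pgbench accounts rows per unit of scale factor
scale = 21475             # first scale factor whose row count exceeds INT_MAX

naccounts = ROWS_PER_SCALE * scale
wrapped = ctypes.c_int32(naccounts).value   # what 32-bit atoi()/int code sees

print(naccounts)  # 2147500000
print(wrapped)    # -2147467296 -- the exact number in the error message
```

That also explains the 21474 threshold discussed in point (i): 21474 * 100,000 = 2,147,400,000 still fits in a signed 32-bit int, while anything larger does not.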
Re: [HACKERS] Allowing multiple concurrent base backups
On Tue, 2011-01-11 at 22:56 +0200, Heikki Linnakangas wrote: 1. If it's a primary recovering from a crash, and there is a backup_label file, and the WAL referenced in the backup_label exists, then it does a bunch of extra work during recovery; and 2. In the same situation, if the WAL referenced in the backup_label does not exist, then it PANICs with a HINT to tell you to remove the backup_label. Is this an opportunity to solve these problems and simplify the code? It won't change the situation for pg_start_backup(), but with the patch the base backups done via streaming won't have those issues, because backup_label is not created (with that name) in the master. Do you think we should change the backup protocol for normal base backups to try to fix this? Or do you think that the simplicity of unrestricted file copy is worth these problems? We could probably make some fairly minor changes, like making a file on the primary and telling users to exclude it from any base backup. The danger, of course, is that they do copy it, and their backup is compromised. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers