Re: [HACKERS] patch: shared session variables
Having read through this thread, the consensus seems to be that we don't want this patch as it is (and I agree with that). As I understand it, you are trying to solve two problems:

1. Passing parameters to a DO statement. You could quote the parameters and inline them in the block itself in the client, but quoting can be difficult to do correctly and is not as readable anyway, so I agree it would be good to be able to pass them separately.

2. Returning results from a DO statement. At the moment, a DO statement is a utility statement, so it cannot return a result set.

Two different approaches to these problems have been discussed that have some support:

A) WITH FUNCTION syntax, to allow defining a temporary function for a single query, similar to the current WITH syntax. (http://archives.postgresql.org/pgsql-hackers/2012-07/msg00426.php)

B) DO ... USING parameter list syntax. This is a straightforward extension of the current DO syntax, just adding a parameter list to it. Not sure how to return a result set from this; perhaps also support a RETURNS keyword, similar to CREATE FUNCTION.

I'm OK with either of those approaches. A) would be more flexible, while B) would be a straightforward extension of what we already have.

I'm marking this patch as rejected in the commitfest app. Please pursue the WITH FUNCTION or DO ... USING syntax instead. Thanks!

On 31.08.2012 21:27, Pavel Stehule wrote:
> 2012/8/31 Dimitri Fontaine dimi...@2ndquadrant.fr:
>> Pavel Stehule pavel.steh...@gmail.com writes:
>>>> Pavel, you didn't say what you think about the WITH FUNCTION proposal?
>>> I don't like it - this proposal is too lispish - it is not SQL
>> We're not doing lambda here, only extending a facility that we rely on today. The function would be named, for one. I don't know what you mean by SQL being lispish here, and I can't imagine why it would be something to avoid. And you didn't say how you want to turn a utility statement into something that is able to return a result set.
>>> if we support real procedures a la Sybase procedures (MySQL, MSSQL, ...) - then we can return a result with the same mechanism - there is no significant difference between DO and CALL statements - you don't know what the result type will be before you call it.
>> Currently we don't have CALL, and we have DO which is not a query but a utility statement. Are you proposing to implement CALL? What would be the difference between making DO a query and having CALL?
> De facto, a CALL statement implementation can solve this issue. The core of this issue is the impossibility of using parameters for utility statements. CALL and DO are utility statements - and if we can use parameters for CALL, then we can do it for DO too. A CALL statement starts a predefined batch - inside this batch you can do anything - you can use COMMIT, ROLLBACK, SELECTs, ... DO is such a batch with an immediate start. Sure, there are relatively significant differences among the stored procedures implemented in the popular RDBMSs, and although I don't like T-SQL too much, I like the Sybase concept of stored procedures - it is strong and very useful for maintenance tasks.
>
> Regards
>
> Pavel

>> Regards,
>> --
>> Dimitri Fontaine
>> http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
On 2012-09-22 20:49, Tom Lane wrote:
> Boszormenyi Zoltan z...@cybertec.at writes:
>> new version with a lot more cleanup is attached.
> I looked at this patch, and frankly I'm rather dismayed. It's a mess.

Thank you for the kind words. :-)

> To start at the bottom level, the changes to PGSemaphoreLock broke it, and seem probably unnecessary anyway. As coded, calling the condition checker breaks handling of error returns from semop(), unless the checker is careful to preserve errno, which LmgrTimeoutCondition isn't (and really shouldn't need to be anyway).

I would call it an oversight, nothing more.

> More, if the checker does return true, it causes PGSemaphoreLock to utterly violate its contract: it returns to the caller without having acquired the semaphore, and without even telling the caller so. Worse, if we *did* acquire the semaphore, we might still exit via this path, since the placement of the condition check call ignores the comment a few lines up:
>
>  * Once we acquire the lock, we do NOT check for an interrupt before
>  * returning.  The caller needs to be able to record ownership of the lock
>  * before any interrupt can be accepted.
>
> We could possibly fix all this with a redesigned API contract for PGSemaphoreLock, but frankly I do not see a good reason to be tinkering with it at all. We never needed to get it involved with deadlock check handling, and I don't see why that needs to change for lock timeouts. One very good reason why monkeying with PGSemaphoreLock is wrong is that on some platforms a SIGALRM interrupt won't interrupt the semop() call, and thus control would never reach the checker anyway. If we're going to throw an error, it must be thrown from the interrupt handler.

Then please explain to me: how on Earth can the current deadlock_timeout report the error? Sure, I can see the PG_TRY() ... PG_END_TRY() block in lock.c, but as far as I can see, nothing in the CheckDeadLock() -> DeadLockCheck() -> DeadLockCheckRecurse() path diverts the code to return to a different address from the signal handler, i.e. there is no elog(ERROR) or ereport(ERROR) even in the DS_HARD_DEADLOCK case, so nothing calls siglongjmp(). So logically the code shouldn't end up in the PG_CATCH() branch. semop() will only get errno == EINTR when returning from the signal handler and would loop again. Then what makes it return, besides being able to lock the semaphore? The conditions in ProcSleep() that e.g. print the lock stats work somehow.

> The whole lmgrtimeout module seems to me to be far more mechanism than is warranted, for too little added functionality. In the first place, there is nothing on the horizon suggesting that we need to let any plug-in code get control here, and if anything the delicacy of what's going on leads me to not wish to expose such a possibility. In the second place, it isn't adding any actually useful functionality, it's just agglomerating some checks. The minimum thing I would want it to do is avoid calling timeout.c multiple times, which is what would happen right now (leading to four extra syscalls per lock acquisition, which is enough new overhead to constitute a strong objection to committing this patch at all). On the whole I think we could forget lmgrtimeout and just hardwire the lock timeout and deadlock check cases. But in any case we're going to need support in timeout.c for enabling/disabling multiple timeouts at once without extra setitimer calls.

OK, so you prefer the previous hardcoded PGSemaphoreTimedLock() that makes every LockAcquire() check its return code, and the detailed error message about the failure to lock the given object? I will add new functions to timeout.c to remove many timeout sources at once, to decrease the number of syscalls needed.
> I'm also not thrilled about the way in which the existing deadlock checking code has been hacked up. As an example, you added this to DeadLockReport():
>
> + if (!DeadLockTimeoutCondition())
> +     return;
>
> which again causes it to violate its contract, namely to report a deadlock, in the most fundamental way -- existing callers aren't expecting it to return *at all*.

Existing caller*s*? There is only one caller. The reasoning behind this change was that if the code reaches ReportLmgrTimeoutError() in lock.c, then at least one timeout has triggered and one *Report function in the chain will do ereport(ERROR). *THAT* would trigger the siglongjmp() ending up in PG_CATCH(). The added lines in DeadLockReport() ensure that the deadlock error is not reported if it didn't trigger.

> Surely we can decouple the deadlock and lock timeout cases better than that; or at least if we can't, it's a delusion to propose anything like lmgrtimeout in the first place. There's considerable lack of attention to updating comments, too. For instance in WaitOnLock you only bothered to update the comment immediately adjacent to the changed code, and not the two comment blocks above that, which
Re: [HACKERS] Configuration include directory
On 21.09.2012 00:10, Selena Deckelmann wrote:
> Hello! I've spent a little time with this patch and have attached revision 6. Thanks, Noah, for a fantastically detailed review. The only thing I didn't do that Noah suggested was run pgindent on guc-file.l. A cursory search did not reveal source compatible with my operating system for 'indent'. If someone points me to it, I'd happily also comply with the request to reindent. And document how to do that on my platform(s). :)
> I did just remove the references to the Apache project etc. I agree that providing best practices is good, but I'm skeptical about including best practices piecemeal. Adding it to earlier tutorial sections would probably be a bit more visible IMO.

This seems pretty much ready to commit. One tiny detail that I'd like to clarify: the docs say:

    Multiple files within an include directory are ordered by an alphanumeric sorting, so that ones beginning with numbers are considered before those starting with letters.

To be more precise, the patch uses strcmp() for the comparisons. That's also what Apache seems to do, although I couldn't find it mentioned explicitly in their docs. It's true that numbers are sorted before letters, but should we also mention that upper-case letters are sorted before lower-case ones, and that the sorting of non-ASCII characters depends on the encoding, in often surprising ways? Is there a better term for what strcmp() does? ASCII order? Is there precedent somewhere else in the PostgreSQL codebase or docs for that?

- Heikki
Re: [HACKERS] Patch for option in pg_resetxlog for restore from WAL files
On 18.07.2012 16:47, Amit kapila wrote:
> Patch implementing the design in below mail chain is attached with this mail.

This patch copies the ReadRecord() function and a bunch of related functions from xlog.c into pg_resetxlog.c. There's a separate patch in the current commitfest to make that code reusable, without having to copy-paste it to every tool that wants to parse the WAL. See https://commitfest.postgresql.org/action/patch_view?id=860. This patch should be refactored to make use of that framework, as soon as it's committed.

- Heikki
Re: [HACKERS] 64-bit API for large object
Here is the 64-bit API for large objects, version 2 patch.

> I checked this patch. It can be applied onto the latest master branch without any problems. My comments are below.
>
> 2012/9/11 Tatsuo Ishii is...@postgresql.org:
>> Ok, here is the patch to implement a 64-bit API for large objects, to allow using up to 4TB large objects (or 16TB if BLCKSZ is changed to 32KB). The patch is based on Jeremy Drake's patch posted on September 23, 2005 (http://archives.postgresql.org/pgsql-hackers/2005-09/msg01026.php) and reasonably updated/edited to adapt to PostgreSQL 9.3 by Nozomi Anzai for the backend part and Yugo Nagata for the rest (including the documentation patch). Here are the changes made in the patch:
>>
>> 1) Frontend lo_* libpq functions (fe-lobj.c) (Yugo Nagata)
>>
>> lo_initialize() gathers the oids of the backend's 64-bit large object handling functions, namely lo_lseek64, lo_tell64, lo_truncate64. If the client calls lo_*64 functions and the backend does not support them, the lo_*64 functions return an error to the caller. There might be an argument since calls to lo_*64 functions could automatically be redirected to the older 32-bit API. I don't know if this is worth the trouble though.
> I think it should definitely return an error code when the user tries to use lo_*64 functions against a v9.2 or older backend, because falling back to the 32-bit API can raise unexpected errors if the application intends to seek in the area beyond 2GB.
>> Currently lo_initialize() throws an error if one of the oids is not available. I doubt we should do the same for the 64-bit functions, since this would make a 9.3 libpq unable to access large objects stored in pre-9.2 PostgreSQL servers.
> It seems to me the situation splits into the pre-9.2 and post-9.3 cases using a condition of conn->sversion >= 90300.

Fixed so, and tested it by deleting lo_tell64's row from pg_proc.

>> To pass a 64-bit integer to PQfn, PQArgBlock is used like this: int *ptr is a pointer to the 64-bit integer, and the actual data is placed somewhere else.
>> There might be another way: add a new member to union u to store a 64-bit integer:
>>
>>     typedef struct
>>     {
>>         int len;
>>         int isint;
>>         union
>>         {
>>             int   *ptr;      /* can't use void (dec compiler barfs) */
>>             int    integer;
>>             int64  bigint;   /* 64-bit integer */
>>         } u;
>>     } PQArgBlock;
>>
>> I'm a little bit worried about this way because PQArgBlock is a public interface.
> I'm inclined to add a new field to the union; that seems to me the straightforward approach. For example, the manner in lo_seek64() seems to me confusing. It sets 1 on the isint field even though a pointer is actually delivered:
>
> + argv[1].isint = 1;
> + argv[1].len = 8;
> + argv[1].u.ptr = (int *) &len;

Your proposal was not adopted per discussion.

>> Also we add a new type pg_int64:
>>
>>     #ifndef NO_PG_INT64
>>     #define HAVE_PG_INT64 1
>>     typedef long long int pg_int64;
>>     #endif
>>
>> in postgres_ext.h per suggestion from Tom Lane: http://archives.postgresql.org/pgsql-hackers/2005-09/msg01062.php
> I'm uncertain about the context of this discussion. Does it matter if we include stdint.h and use int64_t instead of the self-defined data type?

Your proposal was not adopted per discussion. Per discussion, endianness translation was moved to fe-lobj.c.

>> 2) Backend lo_* functions (be-fsstubs.c) (Nozomi Anzai)
>>
>> Added lo_lseek64, lo_tell64, lo_truncate64 so that they can handle 64-bit seek positions and data lengths. loread64 and lowrite64 are not added because if a program tries to read/write more than 2GB at once, it would be a sign that the program needs to be re-designed anyway.
> I think that is reasonable.
>> 3) Backend inv_api.c functions (Nozomi Anzai)
>>
>> No need to add new functions. Just extend them to handle 64-bit data.
>>
>> BTW, what will happen if an older 32-bit libpq accesses large objects over 2GB?
>>
>> lo_read and lo_write: they can read or write lobjs using the 32-bit API as long as the requested read/write data length is smaller than 2GB. So I think we can safely allow them to access lobjs over 2GB.
>>
>> lo_lseek: again, as long as the requested offset is smaller than 2GB, there would be no problem.
>> lo_tell: if the current seek position is beyond 2GB, it returns an error.
> Even though iteration of lo_lseek() may move the offset to 4TB, this also makes it impossible to use lo_tell() to obtain the current offset, so I think that is reasonable behavior. However, the error code is not an appropriate one:
>
> +     if (INT_MAX < offset)
> +     {
> +         ereport(ERROR,
> +                 (errcode(ERRCODE_UNDEFINED_OBJECT),
> +                  errmsg("invalid large-object descriptor: %d", fd)));
> +         PG_RETURN_INT32(-1);
> +     }
>
> According to the manpage of lseek(2), EOVERFLOW: The resulting file
Re: [HACKERS] Patch for option in pg_resetxlog for restore from WAL files
On Monday, September 24, 2012 2:30 PM Heikki Linnakangas wrote:
> On 18.07.2012 16:47, Amit kapila wrote:
>> Patch implementing the design in below mail chain is attached with this mail.
> This patch copies the ReadRecord() function and a bunch of related functions from xlog.c into pg_resetxlog.c. There's a separate patch in the current commitfest to make that code reusable, without having to copy-paste it to every tool that wants to parse the WAL. See https://commitfest.postgresql.org/action/patch_view?id=860. This patch should be refactored to make use of that framework, as soon as it's committed.

Sure. Thanks for the feedback.

With Regards,
Amit Kapila.
[HACKERS] Re: [WIP] Performance Improvement by reducing WAL for Update Operation
From: Heikki Linnakangas [mailto:heikki(dot)linnakangas(at)enterprisedb(dot)com]
Sent: Monday, August 27, 2012 5:58 PM
To: Amit kapila

> On 27.08.2012 15:18, Amit kapila wrote:
>> I have implemented the WAL Reduction Patch for the case of HOT Update as pointed out by Simon and Robert. In this patch it only goes for Optimized WAL in case of HOT Update, with the other restrictions the same as in the previous patch. The performance numbers for this patch are attached in this mail. It has improved by 90% if the page has fillfactor 80. Now going forward I have the following options:
>> a. Upload the patch in the open CF for WAL Reduction which contains the reduction for HOT and non-HOT updates.
>> b. Upload the patch in the open CF for WAL Reduction which contains the reduction for HOT updates.
>> c. Upload both the patches as different versions.
> Let's do it for HOT updates only. Simon and Robert made good arguments on why this is a bad idea for non-HOT updates.

Okay, I shall do it that way. So now I shall send information about all the testing I have done for this patch and then upload it in the CF. Rebased version of the patch based on latest code.

With Regards,
Amit Kapila.

Attachment: wal_update_changes_v2.patch
Re: [HACKERS] Proof of concept: auto updatable views [Review of Patch]
On Sunday, September 23, 2012 12:33 AM Dean Rasheed wrote:
> On 18 September 2012 14:23, Amit kapila amit.kap...@huawei.com wrote:
>> Please find the review of the patch.
> Thanks for the review. Attached is an updated patch, and I've included some responses to specific review comments below.

I have verified your updated patch. It works fine, and in my opinion it is ready for a committer to check. I have updated the CF page to Ready for Committer.

With Regards,
Amit Kapila.
[HACKERS] DROP INDEX CONCURRENTLY is not really concurrency safe & leaves around undroppable indexes
Hi,

Problem 1: concurrency:

Testcase:

Session 1:
CREATE TABLE test_drop_concurrently(id serial primary key, data int);
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10);
CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
BEGIN;
EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)

Session 2:
BEGIN;
SELECT * FROM test_drop_concurrently WHERE data = 34343;

Session 3:
DROP INDEX CONCURRENTLY test_drop_concurrently_data;
(in-progress)

Session 2:
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10);
COMMIT;

Session 1:
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)
SET enable_bitmapscan = false;
SET enable_indexscan = false;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(2 rows)

Explanation: index_drop does:

    indexForm->indisvalid = false;    /* make unusable for queries */
    indexForm->indisready = false;    /* make invisible to changes */

Setting indisready = false is problematic because that prevents index updates, which in turn breaks READ COMMITTED semantics. I think there needs to be one more phase that waits for concurrent users of the index to finish before setting indisready = false.
Problem 2: undroppable indexes:

Session 1:
CREATE TABLE test_drop_concurrently(id serial primary key, data int);
CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
BEGIN;
EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;

Session 2:
DROP INDEX CONCURRENTLY test_drop_concurrently_data;
(waiting)
^CCancel request sent
ERROR:  canceling statement due to user request

Session 1:
ROLLBACK;
DROP TABLE test_drop_concurrently;
SELECT indexrelid, indrelid, indexrelid::regclass, indrelid::regclass, indisvalid, indisready FROM pg_index WHERE indexrelid = 'test_drop_concurrently_data'::regclass;
 indexrelid | indrelid |         indexrelid          | indrelid | indisvalid | indisready
------------+----------+-----------------------------+----------+------------+------------
      24703 |    24697 | test_drop_concurrently_data |    24697 | f          | f
(1 row)

DROP INDEX test_drop_concurrently_data;
ERROR:  could not open relation with OID 24697

Haven't looked at this one at all.

Greetings,

Andres
--
Andres Freund                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] DROP INDEX CONCURRENTLY is not really concurrency safe leaves around undroppable indexes
On Monday, September 24, 2012 01:27:54 PM Andres Freund wrote:
> Problem 2: undroppable indexes:
>
> Session 1:
> CREATE TABLE test_drop_concurrently(id serial primary key, data int);
> CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
> BEGIN;
> EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
>
> Session 2:
> DROP INDEX CONCURRENTLY test_drop_concurrently_data;
> (waiting)
> ^CCancel request sent
> ERROR:  canceling statement due to user request
>
> Session 1:
> ROLLBACK;
> DROP TABLE test_drop_concurrently;
> SELECT indexrelid, indrelid, indexrelid::regclass, indrelid::regclass, indisvalid, indisready FROM pg_index WHERE indexrelid = 'test_drop_concurrently_data'::regclass;
>  indexrelid | indrelid |         indexrelid          | indrelid | indisvalid | indisready
> ------------+----------+-----------------------------+----------+------------+------------
>       24703 |    24697 | test_drop_concurrently_data |    24697 | f          | f
> (1 row)
>
> DROP INDEX test_drop_concurrently_data;
> ERROR:  could not open relation with OID 24697
>
> Haven't looked at this one at all.

That's because it has to commit transactions in between to make the catalog changes visible. Unfortunately, at that point it has already deleted the dependencies in deleteOneObject...

Greetings,

Andres
--
Andres Freund                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On Sun, Sep 23, 2012 at 06:46:33PM -0400, Peter Eisentraut wrote:
> On Sun, 2012-09-23 at 22:20 +0800, Rural Hunter wrote:
>> Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8' works. --locale='zh_CN.UTF8' also works. But still the question is, should the encoding name be case sensitive?
> PostgreSQL treats encoding names as case insensitive. But it depends on the operating system whether locale names are case sensitive.

I can confirm that pg_upgrade does case-insensitive comparisons of encoding/locale names:

    static void
    check_locale_and_encoding(ControlData *oldctrl,
                              ControlData *newctrl)
    {
        /* These are often defined with inconsistent case, so use pg_strcasecmp(). */
        if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
            pg_log(PG_FATAL,
                   "old and new cluster lc_collate values do not match\n");
        if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
            pg_log(PG_FATAL,
                   "old and new cluster lc_ctype values do not match\n");
        if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0)
            pg_log(PG_FATAL,
                   "old and new cluster encoding values do not match\n");
    }

--
Bruce Momjian  br...@momjian.us        http://momjian.us
EnterpriseDB                           http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On 9/24/12 8:55 AM, Bruce Momjian wrote:
> I can confirm that pg_upgrade does case-insensitive comparisons of encoding/locale names:
>
>     static void
>     check_locale_and_encoding(ControlData *oldctrl,
>                               ControlData *newctrl)
>     {
>         /* These are often defined with inconsistent case, so use pg_strcasecmp(). */
>         if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
>             pg_log(PG_FATAL,
>                    "old and new cluster lc_collate values do not match\n");
>         if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
>             pg_log(PG_FATAL,
>                    "old and new cluster lc_ctype values do not match\n");

I seem to recall that at some point in the distant past, somehow some Linux distributions changed the canonical spelling of locale names from xx_YY.UTF-8 to xx_YY.utf8. So if people are upgrading old PostgreSQL instances that use the old spelling, pg_upgrade will probably fail. A fix might be to take the locale name you find in pg_control and run it through setlocale() to get the new canonical name.
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On Mon, Sep 24, 2012 at 09:06:04AM -0400, Peter Eisentraut wrote:
> On 9/24/12 8:55 AM, Bruce Momjian wrote:
>> I can confirm that pg_upgrade does case-insensitive comparisons of encoding/locale names:
>>
>>     static void
>>     check_locale_and_encoding(ControlData *oldctrl,
>>                               ControlData *newctrl)
>>     {
>>         /* These are often defined with inconsistent case, so use pg_strcasecmp(). */
>>         if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
>>             pg_log(PG_FATAL,
>>                    "old and new cluster lc_collate values do not match\n");
>>         if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
>>             pg_log(PG_FATAL,
>>                    "old and new cluster lc_ctype values do not match\n");
> I seem to recall that at some point in the distant past, somehow some Linux distributions changed the canonical spelling of locale names from xx_YY.UTF-8 to xx_YY.utf8. So if people are upgrading old PostgreSQL instances that use the old spelling, pg_upgrade will probably fail. A fix might be to take the locale name you find in pg_control and run it through setlocale() to get the new canonical name.

Or we could just remove dashes from the name before comparisons.

--
Bruce Momjian  br...@momjian.us        http://momjian.us
EnterpriseDB                           http://enterprisedb.com
[HACKERS] PostgreSQL in the French news
Dear devs,

For your information, PostgreSQL appears explicitly in the list of free software cited by the French Prime Minister, Jean-Marc Ayrault, in his Circulaire (i.e. a kind of instruction to civil servants) signed last week. See pages 9 and 18 of the pdf (not the page numbers printed in the document, which are shifted). Obviously, the document is in French...

http://circulaire.legifrance.gouv.fr/pdf/2012/09/cir_35837.pdf

The spirit of the document does not seem to be to make the use of free software mandatory, but to require that it is at least considered for any project in which the administration (i.e. the public sector, not the government as in the USA) is involved.

--
Fabien.
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On 2012/9/24 20:55, Bruce Momjian wrote:
> On Sun, Sep 23, 2012 at 06:46:33PM -0400, Peter Eisentraut wrote:
>> On Sun, 2012-09-23 at 22:20 +0800, Rural Hunter wrote:
>>> Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8' works. --locale='zh_CN.UTF8' also works. But still the question is, should the encoding name be case sensitive?
>> PostgreSQL treats encoding names as case insensitive. But it depends on the operating system whether locale names are case sensitive.
> I can confirm that pg_upgrade does case-insensitive comparisons of encoding/locale names:
> [check_locale_and_encoding() quoted above]

Strange, not sure what happened. I reviewed the log and here is what I did:

1. initdb without encoding/locale parameters:

$ initdb
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale zh_CN.UTF-8.
The default database encoding has accordingly been set to UTF8.
initdb: could not find suitable text search configuration for locale zh_CN.UTF-8
The default text search configuration will be set to "simple".

2. Run pg_upgrade:

$ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c
Performing Consistency Checks
-----------------------------
Checking current, bin, and data directories                 ok
Checking cluster versions                                   ok
Checking database user is a superuser                       ok
Checking for prepared transactions                          ok
Checking for reg* system OID user data types                ok
Checking for contrib/isn with bigint-passing mismatch       ok
old and new cluster lc_collate values do not match
Failure, exiting

3. initdb with --lc-collate:

$ initdb --lc-collate=zh_CN.utf8
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locales
  COLLATE:  zh_CN.utf8
  CTYPE:    zh_CN.UTF-8
  MESSAGES: zh_CN.UTF-8
  MONETARY: zh_CN.UTF-8
  NUMERIC:  zh_CN.UTF-8
  TIME:     zh_CN.UTF-8
The default database encoding has accordingly been set to UTF8.
initdb: could not find suitable text search configuration for locale zh_CN.UTF-8
The default text search configuration will be set to "simple".

4. Try pg_upgrade again:

$ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c
Performing Consistency Checks
-----------------------------
Checking current, bin, and data directories                 ok
Checking cluster versions                                   ok
Checking database user is a superuser                       ok
Checking for prepared transactions                          ok
Checking for reg* system OID user data types                ok
Checking for contrib/isn with bigint-passing mismatch       ok
old and new cluster lc_ctype values do not match
Failure, exiting

5. Run initdb with all those locale settings:

$ initdb --lc-collate=zh_CN.utf8 --lc-ctype=zh_CN.utf8 --lc-messages=zh_CN.utf8 --lc-monetary=zh_CN.utf8 --lc-numeric=zh_CN.utf8 --lc-time=zh_CN.utf8
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale zh_CN.utf8.
The default database encoding has accordingly been set to UTF8.
initdb: could not find suitable text search configuration for locale zh_CN.utf8
The default text search configuration will be set to "simple".

6. Run pg_upgrade. This time it worked.
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
Bruce Momjian br...@momjian.us writes:
>> I can confirm that pg_upgrade does case-insensitive comparisons of encoding/locale names:
> Or we could just remove dashes from the name before comparisons.

That would merely move the breakage somewhere else. I think you are already assuming far too much about the OS's interpretation of locale names by assuming they are case-insensitive. Assuming that dashes aren't significant seems 100% wrong.

FWIW, what I found out the last time I touched this code is that on many systems setlocale doesn't bother to return a canonicalized spelling; it just gives back the string you gave it. It might be worth doing what Peter suggests, just to be consistent with what we are doing elsewhere, but I'm not sure how much it will help.

regards, tom lane
Re: [HACKERS] Caught up
Excerpts from Thom Brown's message of Sat Sep 22 14:58:04 -0300 2012:

On 3 September 2012 15:20, Bruce Momjian br...@momjian.us wrote:

On a related note, I apologize that many of these items didn't make it into 9.2, though they are committed for 9.3. The good news is that most of my work was in documentation improvements and clarifications that could be backpatched to 9.2. I did not bring this up earlier because I didn't want to distract from the work of making the improvements. If anyone wants to look at backpatching some of these doc changes into 9.2, I will not object. ;-) I am attaching a partial list of doc changes that might be considered. (I need to improve my commit messages that reference earlier commits by including the old commit tag; my apologies.) No one has mentioned backpatching, so perhaps these are all too minor, which is fine.

That's a huge number of amendments. Thanks for giving these some attention, Bruce.

FWIW I tried to backpatch the changes listed by Bruce and they all seem to apply fine to 9.2, so maybe we should do that -- I see no reason not to.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Configuration include directory
Heikki Linnakangas hlinnakan...@vmware.com writes:

This seems pretty much ready to commit. One tiny detail that I'd like to clarify: the docs say: Multiple files within an include directory are ordered by an alphanumeric sorting, so that ones beginning with numbers are considered before those starting with letters. To be more precise, the patch uses strcmp() for the comparisons.

Just say it sorts the file names according to C locale rules.

regards, tom lane
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On Mon, Sep 24, 2012 at 09:59:02PM +0800, Rural Hunter wrote:

On 2012/9/24 20:55, Bruce Momjian wrote:

On Sun, Sep 23, 2012 at 06:46:33PM -0400, Peter Eisentraut wrote:

On Sun, 2012-09-23 at 22:20 +0800, Rural Hunter wrote:

Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8' works. --locale='zh_CN.UTF8' also works. But still the question is, should the encoding name be case sensitive?

PostgreSQL treats encoding names as case insensitive. But it depends on the operating system whether locale names are case sensitive.

I can confirm that pg_upgrade does case-insensitive comparisons of encoding/locale names:

static void
check_locale_and_encoding(ControlData *oldctrl, ControlData *newctrl)
{
    /* These are often defined with inconsistent case, so use pg_strcasecmp(). */
    if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
        pg_log(PG_FATAL, "old and new cluster lc_collate values do not match\n");
    if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
        pg_log(PG_FATAL, "old and new cluster lc_ctype values do not match\n");
    if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0)
        pg_log(PG_FATAL, "old and new cluster encoding values do not match\n");
}

Strange. Not sure what happened. I reviewed the log and here is what I did:

1. initdb without encoding/locale parameter:

$ initdb
The files belonging to this database system will be owned by user postgres. This user must also own the server process.
The database cluster will be initialized with locale zh_CN.UTF-8.
The default database encoding has accordingly been set to UTF8.
initdb: could not find suitable text search configuration for locale zh_CN.UTF-8
The default text search configuration will be set to simple.

2.
Run pg_upgrade: $ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c Performing Consistency Checks - Checking current, bin, and data directories ok Checking cluster versions ok Checking database user is a superuser ok Checking for prepared transactions ok Checking for reg* system OID user data types ok Checking for contrib/isn with bigint-passing mismatch ok old and new cluster lc_collate values do not match Failure, exiting 3. initdb with --lc-collate: $ initdb --lc-collate=zh_CN.utf8 The files belonging to this database system will be owned by user postgres. This user must also own the server process. The database cluster will be initialized with locales COLLATE: zh_CN.utf8 CTYPE: zh_CN.UTF-8 MESSAGES: zh_CN.UTF-8 MONETARY: zh_CN.UTF-8 NUMERIC: zh_CN.UTF-8 TIME: zh_CN.UTF-8 The default database encoding has accordingly been set to UTF8. initdb: could not find suitable text search configuration for locale zh_CN.UTF-8 The default text search configuration will be set to simple. 4. try pg_upgrade again: $ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c Performing Consistency Checks - Checking current, bin, and data directories ok Checking cluster versions ok Checking database user is a superuser ok Checking for prepared transactions ok Checking for reg* system OID user data types ok Checking for contrib/isn with bigint-passing mismatch ok old and new cluster lc_ctype values do not match Failure, exiting 5. Run initdb with all those locale settings: $ initdb --lc-collate=zh_CN.utf8 --lc-ctype=zh_CN.utf8 --lc-messages=zh_CN.utf8 --lc-monetary=zh_CN.utf8 --lc-numeric=zh_CN.utf8 --lc-time=zh_CN.utf8 The files belonging to this database system will be owned by user postgres. This user must also own the server process. The database cluster will be initialized with locale zh_CN.utf8. 
The default database encoding has accordingly been set to UTF8.
initdb: could not find suitable text search configuration for locale zh_CN.utf8
The default text search configuration will be set to simple.

6. Run pg_upgrade. This time it worked.

OK, that is good information. pg_upgrade gets the locale and encoding from the template0 database settings:

SELECT datcollate, datctype FROM pg_catalog.pg_database WHERE datname = 'template0';

If your operating system locale/encoding names changed after the initdb of the old cluster, this would not be reflected in template0. I think Peter is right that this might be a dash issue, utf8 vs utf-8. Look at the initdb output:

3. initdb with --lc-collate:

$ initdb --lc-collate=zh_CN.utf8
The files belonging to this database system will be owned by user postgres. This user must also own the server process.
The database cluster will be initialized with locales COLLATE:
Re: [HACKERS] Switching timeline over streaming replication
On Tuesday, September 11, 2012 10:53 PM Heikki Linnakangas wrote:

I've been working on the often-requested feature to handle timeline changes over streaming replication. At the moment, if you kill the master and promote a standby server, and you have another standby server that you'd like to keep following the new master server, you need a WAL archive in addition to streaming replication to make it cross the timeline change. Streaming replication will just error out. Having a WAL archive is usually a good idea in complex replication scenarios anyway, but it would be good to not require it.

Confirm my understanding of this feature: this feature is for the case when standby-1, which is going to be promoted to master, has archive mode 'on', as in that case only its timeline will change. If the above is right, then there are other similar scenarios where it can be used:

Scenario-1 (1 Master, 1 Standby)
1. Master (archive_mode=on) goes down.
2. Master again comes up
3. Standby tries to follow it

Now in the above scenario also, due to timeline mismatch it gives an error, but your patch should fix it.

Some parts of this patch are just refactoring that probably make sense regardless of the new functionality. For example, I split off the timeline history file related functions to a new file, timeline.c. That's not very much code, but it's fairly isolated, and xlog.c is massive, so I feel that anything that we can move off from xlog.c is a good thing. I also moved off the two functions RestoreArchivedFile() and ExecuteRecoveryCommand() to a separate file. Those are also not much code, but are fairly isolated. If no-one objects to those changes, and the general direction this work is going to, I'm going to split off those refactorings to separate patches and commit them separately. I also made the timeline history file a bit more detailed: instead of recording just the WAL segment where the timeline was changed, it now records the exact XLogRecPtr.
That was required for the walsender to know the switchpoint without having to parse the XLOG records (it reads and parses the history file instead).

IMO separating the timeline history file related functions into a new file is good. However, I am not sure about splitting RestoreArchivedFile() and ExecuteRecoveryCommand() into a separate file. How about splitting out all archive related functions:

static void XLogArchiveNotify(const char *xlog);
static void XLogArchiveNotifySeg(XLogSegNo segno);
static bool XLogArchiveCheckDone(const char *xlog);
static bool XLogArchiveIsBusy(const char *xlog);
static void XLogArchiveCleanup(const char *xlog);
..
..

In any case, it will be better if you can split it into multiple patches:
1. The new functionality of switching timeline over streaming replication
2. Refactoring related changes.

That would make my testing and review of the new feature patch a little easier.

With Regards,
Amit Kapila.
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On Mon, Sep 24, 2012 at 10:13:45AM -0400, Tom Lane wrote:

Bruce Momjian br...@momjian.us writes:

I can confirm that pg_upgrade does case-insensitive comparisons of encoding/locale names: Or we could just remove dashes from the name before comparisons.

That would merely move the breakage somewhere else. I think you are already assuming far too much about the OS' interpretation of locale names by assuming they are case-insensitive. Assuming that dashes aren't significant seems 100% wrong. FWIW, what I found out last time I touched this code is that on many systems setlocale doesn't bother to return a canonicalized spelling; it just gives back the string you gave it. It might be worth doing what Peter suggests, just to be consistent with what we are doing elsewhere, but I'm not sure how much it will help.

This comment in initdb.c doesn't sound hopeful:

 * If successful, and canonname isn't NULL, a malloc'd copy of the locale's
 * canonical name is stored there.  This is especially useful for figuring out
 * what locale name "" means (ie, the environment value).  (Actually,
 * it seems that on most implementations that's the only thing it's good for;
 * we could wish that setlocale gave back a canonically spelled version of
 * the locale name, but typically it doesn't.)

--
Bruce Momjian  br...@momjian.us        http://momjian.us
EnterpriseDB                           http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
Bruce Momjian br...@momjian.us writes:

On Mon, Sep 24, 2012 at 10:13:45AM -0400, Tom Lane wrote:

FWIW, what I found out last time I touched this code is that on many systems setlocale doesn't bother to return a canonicalized spelling; it just gives back the string you gave it. It might be worth doing what Peter suggests, just to be consistent with what we are doing elsewhere, but I'm not sure how much it will help.

This comment in initdb.c doesn't sound hopeful:

 * If successful, and canonname isn't NULL, a malloc'd copy of the locale's
 * canonical name is stored there.  This is especially useful for figuring out
 * what locale name "" means (ie, the environment value).  (Actually,
 * it seems that on most implementations that's the only thing it's good for;
 * we could wish that setlocale gave back a canonically spelled version of
 * the locale name, but typically it doesn't.)

Yeah, I wrote that. We can hope that the OP is running on a platform where setlocale does canonicalize the name, in which case doing the same thing in pg_upgrade that initdb does would fix his problem. But I'm not going to predict success.

regards, tom lane
Re: [HACKERS] Doc patch to note which system catalogs have oids
On 09/23/2012 10:14:33 PM, Tom Lane wrote: Karl O. Pinc k...@meme.com writes: The attached patch documents the oid column of those system catalogs having an oid. I think this is fundamentally wrong, or at least misleading, because it documents OID as if it were an ordinary column. Somebody who did select * from pg_class and didn't see any oid in the result would think the docs were wrong. Ok. When I went looking at querying the system catalogs I got confused some time ago because oids were not listed along with the other columns. (It didn't help that the catalog I was looking at had another column of type oid.) It's possible that it's worth expending a boilerplate paragraph in each of those pages to say this catalog has OIDs (or that it doesn't). But this isn't the way. How about modifying the (printed) table layout as attached? It begins each (printed) table documenting each catalog with a Has OID column Yes/No. Also, I note that pg_constraint and pg_collation are not collated properly in the docs. (Constraint comes before collation in the docs, although everything else is sorted by name.) A second patch (applied on top of the first) fixes this. Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. 
Heinlein diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index f999190..2dfb40f 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -332,6 +332,19 @@ table titlestructnamepg_aggregate/ Columns/title + tgroup cols=1 +thead + row + entryHas OID column/entry + /row +/thead +tbody + row + entryNo/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -415,6 +428,19 @@ table titlestructnamepg_am/ Columns/title + tgroup cols=1 +thead + row + entryHas OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -671,6 +697,19 @@ table titlestructnamepg_amop/ Columns/title + tgroup cols=1 +thead + row + entryHas OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -807,6 +846,19 @@ table titlestructnamepg_amproc/structname Columns/title + tgroup cols=1 +thead + row + entryHas OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -890,6 +942,19 @@ table titlestructnamepg_attrdef/ Columns/title + tgroup cols=1 +thead + row + entryHas OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -967,6 +1032,19 @@ table titlestructnamepg_attribute/ Columns/title + tgroup cols=1 +thead + row + entryHas OID column/entry + /row +/thead +tbody + row + entryNo/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -1247,6 +1325,19 @@ table titlestructnamepg_authid/ Columns/title + tgroup cols=1 +thead + row + entryHas OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=3 thead row @@ -1377,6 +1468,19 @@ table titlestructnamepg_auth_members/ Columns/title + tgroup cols=1 +thead + row + entryHas OID column/entry + /row +/thead +tbody + row + entryNo/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -1450,6 +1554,19 @@ table 
titlestructnamepg_cast/ Columns/title + tgroup cols=1 +thead + row + entryHas OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -1565,6 +1682,19 @@ table titlestructnamepg_class/ Columns/title + tgroup cols=1 +thead + row + entryHas OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -1877,6 +2007,19 @@ table titlestructnamepg_event_trigger/ Columns/title + tgroup cols=1 +thead + row + entryHas OID column/entry + /row +/thead +tbody + row + entryNo/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -1972,6 +2115,19 @@ table titlestructnamepg_constraint/ Columns/title + tgroup cols=1 +thead + row + entryHas OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -2238,6
Re: [HACKERS] Configuration include directory
On 24.09.2012 17:24, Tom Lane wrote:

Heikki Linnakangas hlinnakan...@vmware.com writes:

This seems pretty much ready to commit. One tiny detail that I'd like to clarify: the docs say: Multiple files within an include directory are ordered by an alphanumeric sorting, so that ones beginning with numbers are considered before those starting with letters. To be more precise, the patch uses strcmp() for the comparisons.

Just say it sorts the file names according to C locale rules.

Hmm, that's precise, but I don't think an average user necessarily knows what the C locale is. I think I'll go with: Multiple files within an include directory are processed in filename order. The filenames are ordered by C locale rules, i.e. numbers before letters, and uppercase letters before lowercase ones.

- Heikki
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On 2012/9/24 22:26, Bruce Momjian wrote:

OK, that is good information. pg_upgrade gets the locale and encoding from the template0 database settings:

SELECT datcollate, datctype FROM pg_catalog.pg_database WHERE datname = 'template0';

If your operating system locale/encoding names changed after the initdb of the old cluster, this would not be reflected in template0.

No. It's not changed. Look at my system settings:

LANG=zh_CN.UTF-8
$ cat /var/lib/locales/supported.d/local
zh_CN.UTF-8 UTF-8

I think the problem is in the options when I installed pgsql (both 9.1 and 9.2):

Select the locale to be used by the new database cluster.
Locale
[1] [Default locale]
[2] C
[3] POSIX
[4] zh_CN.utf8
[5] zh_HK.utf8
[6] zh_SG.utf8
[7] zh_TW.utf8
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
Bruce Momjian br...@momjian.us writes:

Well, if you run that query on template0 in the old and new cluster, you will see something different in the two of them. Could you have used the default in one and a non-dash name in the other? Did we change the way we canonicalize the locale between 9.1 and 9.2?

IIRC, we didn't try to canonicalize locale names at all before 9.2. That initdb code you're quoting is of fairly recent vintage.

regards, tom lane
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On Mon, Sep 24, 2012 at 11:24:04AM -0400, Peter Eisentraut wrote:

On 9/24/12 10:13 AM, Tom Lane wrote:

FWIW, what I found out last time I touched this code is that on many systems setlocale doesn't bother to return a canonicalized spelling; it just gives back the string you gave it. It might be worth doing what Peter suggests, just to be consistent with what we are doing elsewhere, but I'm not sure how much it will help.

It might not have anything to do with the current problem, but if initdb canonicalizes locale names, then pg_upgrade also has to. Otherwise, whenever an operating system changes its locale canonicalization rules, pg_upgrade will fail.

Agreed. I will work on that soon.

--
Bruce Momjian  br...@momjian.us        http://momjian.us
EnterpriseDB                           http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On Mon, Sep 24, 2012 at 11:22:22AM -0400, Peter Eisentraut wrote:

On 9/24/12 11:04 AM, Tom Lane wrote:

Bruce Momjian br...@momjian.us writes:

Well, if you run that query on template0 in the old and new cluster, you will see something different in the two of them. Could you have used the default in one and a non-dash name in the other? Did we change the way we canonicalize the locale between 9.1 and 9.2?

IIRC, we didn't try to canonicalize locale names at all before 9.2. That initdb code you're quoting is of fairly recent vintage.

initdb has changed POSIX to C with glibc at least since 8.3. The code you're quoting is just a refactoring, AFAICT.

Frankly, I assumed the values assigned in pg_database for template0 were canonical. Tom is saying that canonicalization behavior changed from 9.1 to 9.2, and the user is reporting this.

--
Bruce Momjian  br...@momjian.us        http://momjian.us
EnterpriseDB                           http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] pg_reorg in core?
Excerpts from Daniele Varrazzo's message of Sun Sep 23 22:02:51 -0300 2012:

On Mon, Sep 24, 2012 at 12:23 AM, Michael Paquier michael.paqu...@gmail.com wrote:

As proposed by Masahiko, a single organization grouping all the tools (one repository per tool) would be enough. Please note that github can also host documentation. The bug tracker would be tool-dedicated in this case.

From this PoV, pgFoundry allows your tool to be under http://yourtool.projects.postgresql.org instead of under a more generic namespace: I find it a nice and cozy place in the url space to put your project. If pgFoundry is dismissed, I hope at least a hosting service for static pages will remain.

I don't think that has been offered.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
Bruce Momjian br...@momjian.us writes:

On Mon, Sep 24, 2012 at 11:22:22AM -0400, Peter Eisentraut wrote:

initdb has changed POSIX to C with glibc at least since 8.3. The code you're quoting is just a refactoring, AFAICT.

Frankly, I assumed the values assigned in pg_database for template0 were canonical. Tom is saying that canonicalization behavior changed from 9.1 to 9.2, and the user is reporting this.

It was not just a refactoring: we now pass the locale names through setlocale(), which we didn't before. See commit c7cea267de3ca05b29a57b9d113b95ef3793c8d8.

regards, tom lane
Re: [HACKERS] External Replication
m...@rpzdesign.com writes:

You may want to consider changing the command TRIGGER into a command FILTER, and possibly a post-processing TRIGGER that is determined to be called INSIDE the FILTER. Or some way to pass information between the FILTER and the post-processing trigger.

The only current event supported by the system is the ddl_command_start one. We mean to add some more, and triggers wanting to communicate data in between ddl_command_start and ddl_command_end (for example) will have to use something like a table.

Also, some information as to whether a series of statements was ROLLED BACK would be helpful.

Event Triggers are not an autonomous transaction: any effect they have in the database is rolled back when the main transaction is rolled back. You can use LISTEN/NOTIFY or PGQ, which both know how to handle those semantics.

Regards,
--
Dimitri Fontaine                      http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On 9/24/12 11:04 AM, Tom Lane wrote:

Bruce Momjian br...@momjian.us writes:

Well, if you run that query on template0 in the old and new cluster, you will see something different in the two of them. Could you have used the default in one and a non-dash name in the other? Did we change the way we canonicalize the locale between 9.1 and 9.2?

IIRC, we didn't try to canonicalize locale names at all before 9.2. That initdb code you're quoting is of fairly recent vintage.

initdb has changed POSIX to C with glibc at least since 8.3. The code you're quoting is just a refactoring, AFAICT.
Re: [HACKERS] pg_reorg in core?
On Sat, Sep 22, 2012 at 3:25 AM, Satoshi Nagayasu sn...@uptime.jp wrote:

To solve this problem, I would like to have some umbrella project. It would be called "pg dba utils", or something like this. This umbrella project may contain several third-party tools (pg_reorg, pg_rman, pg_filedump, xlogdump, etc, etc...) as its sub-modules.

Great idea! +1

Roberto Mello
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On Mon, Sep 24, 2012 at 11:04:32AM -0400, Tom Lane wrote:

Bruce Momjian br...@momjian.us writes:

Well, if you run that query on template0 in the old and new cluster, you will see something different in the two of them. Could you have used the default in one and a non-dash name in the other? Did we change the way we canonicalize the locale between 9.1 and 9.2?

IIRC, we didn't try to canonicalize locale names at all before 9.2. That initdb code you're quoting is of fairly recent vintage.

Ah, so that would explain the change he is seeing. I will work on a patch. I am working on the information_schema patch now.

--
Bruce Momjian  br...@momjian.us        http://momjian.us
EnterpriseDB                           http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] pg_reorg in core?
On 21 September 2012 08:42, Michael Paquier michael.paqu...@gmail.com wrote:

On Fri, Sep 21, 2012 at 1:00 PM, Hitoshi Harada umi.tan...@gmail.com wrote:

I'm not familiar with pg_reorg, but I wonder why we need a separate program for this task. I know pg_reorg is ok as an external program per se, but if we could optimize CLUSTER (or VACUUM, which I'm a little pessimistic about) in the same way, it's much nicer than having an additional binary + extension. Isn't it possible to do the same thing within the CLUSTER command? Maybe CLUSTER .. CONCURRENTLY?

CLUSTER might be more adapted in this case, as the purpose is to reorder the table. The same technique used by pg_reorg (a table coupled with triggers) could lower the lock taken on the table. Also, it could be possible to control each sub-operation in the same fashion as CREATE INDEX CONCURRENTLY. By the way, whatever the operation used, VACUUM or CLUSTER, I have a couple of doubts: 1) isn't it too costly for a core operation, as pg_reorg really needs many temporary objects? It could be possible to reduce the number of objects created if added to core, though... 2) Do you think the current CLUSTER is enough, and are there wishes to implement such an optimization directly in core?

For me, the Postgres user interface should include
* REINDEX CONCURRENTLY
* CLUSTER CONCURRENTLY
* ALTER TABLE CONCURRENTLY
and also that autovacuum would be expanded to include REINDEX and CLUSTER, renaming it to automaint.

The actual implementation mechanism for those probably looks something like pg_reorg, but I don't see it as preferable to include the utility directly into core, though potentially some of the underlying code might be.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On 9/24/12 10:13 AM, Tom Lane wrote: FWIW, what I found out last time I touched this code is that on many systems setlocale doesn't bother to return a canonicalized spelling; it just gives back the string you gave it. It might be worth doing what Peter suggests, just to be consistent with what we are doing elsewhere, but I'm not sure how much it will help. It might not have anything to do with the current problem, but if initdb canonicalizes locale names, then pg_upgrade also has to. Otherwise, whenever an operating system changes its locale canonicalization rules, pg_upgrade will fail.
Re: [HACKERS] External Replication
Dimitri: Thanks for the response. I am resigned to just patching each major release with my own pile of hook code that I can quickly graft into the code base, currently grafting 9.2.0. My strategy is to let the PG code base float around with all the work of the fine hackers on this list, maybe debate a couple of things with some friendly types, but really just put my effort into logic piled into an external replication daemon which will NOT change due to anything in the PG core. If one day the code base actually feeds me the event information I need, maybe I will change it. And I have not seen anybody request my hook code, but a few have responded that they are working on things in the code base, release date unknown. Cheers, marco On 9/24/2012 10:20 AM, Dimitri Fontaine wrote: m...@rpzdesign.com m...@rpzdesign.com writes: You may want to consider changing the command TRIGGER into a command FILTER and possibly a post-processing TRIGGER that is determined to be called INSIDE the FILTER. Or some way to pass information between the FILTER and the post-processing trigger. The only current event supported by the system is the ddl_command_start one. We mean to add some more, and triggers wanting to communicate data between ddl_command_start and ddl_command_end (for example) will have to use something like a table. Also, some information as to whether a series of statements was ROLLED BACK would be helpful. Event Triggers are not an autonomous transaction: any effect they have in the database is rolled back when the main transaction is rolled back. You can use LISTEN/NOTIFY or PGQ, which both know how to handle those semantics. Regards,
[HACKERS] Re: DROP INDEX CONCURRENTLY is not really concurrency safe & leaves around undroppable indexes
On 24 September 2012 06:27, Andres Freund and...@2ndquadrant.com wrote: Problem 1: concurrency: Problem 2: undroppable indexes: Thanks for posting. I'll think some more before replying. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On Mon, Sep 24, 2012 at 10:45:34PM +0800, Rural Hunter wrote: If your operating system locale/encoding names changed after the initdb of the old cluster, this would not be reflected in template0. No, it's not changed. Look at my system settings: LANG=zh_CN.UTF-8 $ cat /var/lib/locales/supported.d/local zh_CN.UTF-8 UTF-8 I think the problem is in the options I chose when I installed pgsql (both 9.1 and 9.2): Select the locale to be used by the new database cluster. Locale [1] [Default locale] [2] C [3] POSIX [4] zh_CN.utf8 [5] zh_HK.utf8 [6] zh_SG.utf8 [7] zh_TW.utf8 Please choose an option [1] : 4 I chose 4 instead of 1. I guess the default locale (option 1) is the one with a dash. Well, if you run that query on template0 in the old and new cluster, you will see something different in the two of them. Could you have used the default in one and a non-dash locale in the other? Did we change the way we canonicalize the locale between 9.1 and 9.2? I can send you a patch to test if the setlocale canonicalization works. Can you test it if I send it? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] XLogInsert scaling, revisited
On Fri, Sep 21, 2012 at 12:29 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: I've been slowly continuing the work that I started last winter to make XLogInsert scale better. I have tried quite a few different approaches since then, and have settled on the attached. This is similar but not exactly the same as what I did in the patches I posted earlier. The basic idea, like before, is to split WAL insertion into two phases: 1. Reserve the right amount of WAL. This is done while holding just a spinlock. Thanks to the changes I made earlier to the WAL format, the space calculations are now much simpler and the critical section boils down to almost just CurBytePos += size_of_wal_record. See the ReserveXLogInsertLocation() function. 2. Copy the WAL record to the right location in the WAL buffers. This slower part can be done mostly in parallel. The difficult part is tracking which insertions are currently in progress, and being able to wait for an insertion to finish copying the record data in place. I'm using a small number (7 at the moment) of WAL insertion slots for that. The first thing that XLogInsert does is to grab one of the slots. Each slot is protected by an LWLock, and XLogInsert reserves a slot by acquiring its lock. It holds the lock until it has completely finished copying the WAL record in place. In each slot, there's an XLogRecPtr that indicates how far the current inserter has progressed with its insertion. Typically, for a short record that fits on a single page, it is updated after the insertion is finished, but if the insertion needs to wait for a WAL buffer to become available, it updates the XLogRecPtr before sleeping. To wait for all insertions up to a point to finish, you scan all the insertion slots, and observe that the XLogRecPtrs in them are >= the point you're interested in. The number of slots is a tradeoff: more slots allow more concurrency in inserting records, but make it slower to determine how far it can be safely flushed.
I did some performance tests with this, on an 8-core HP ProLiant server, in a VM running under VMware vSphere 5.1. The tests were performed with Greg Smith's pgbench-tools kit, with one of two custom workload scripts: 1. Insert 1000 rows in each transaction. This is exactly the sort of workload where WALInsertLock currently becomes a bottleneck. Without the patch, the test scales very badly, with about 420 TPS with a single client, peaking at only 520 TPS with two clients. With the patch, it scales up to about 1200 TPS, with 7 clients. I believe the test becomes I/O limited at that point; looking at iostat output while the test is running shows about 200MB/s of writes, and that is roughly what the I/O subsystem of this machine can do, according to a simple test with 'dd ...; sync'. Or perhaps having more insertion slots would allow it to go higher - the patch uses exactly 7 slots at the moment. http://hlinnaka.iki.fi/xloginsert-scaling/results-1k/ 2. Insert only 10 rows in each transaction. This simulates an OLTP workload with fairly small transactions. The patch doesn't make a huge difference with that workload. It performs somewhat worse with 4-16 clients, but then somewhat better with more than 16 clients. The patch adds some overhead to flushing the WAL; I believe that's what's causing the slowdown with 4-16 clients. But with more clients, the WALInsertLock bottleneck becomes more significant, and you start to see a benefit again. http://hlinnaka.iki.fi/xloginsert-scaling/results-10/ Overall, the results look pretty good. I'm going to take a closer look at the slowdown in the second test. I think it might be fixable with some changes to how WaitInsertionsToFinish() and WALWriteLock work together, although I'm not sure how exactly it ought to work. Comments, ideas? Sounds good. The patch applies cleanly and compiles successfully.
But when I ran initdb, I got the following assertion error: -- $ initdb -D data --locale=C --encoding=UTF-8 The files belonging to this database system will be owned by user postgres. This user must also own the server process. The database cluster will be initialized with locale C. The default text search configuration will be set to english. creating directory data ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers ... 128MB creating configuration files ... ok creating template1 database in data/base/1 ... ok initializing pg_authid ... ok initializing dependencies ... TRAP: FailedAssertion(!(((uint64) currpos) % 8192 >= (((intptr_t) ((sizeof(XLogPageHeaderData))) + ((8) - 1)) & ~((intptr_t) ((8) - 1))) || rdata_len == 0), File: xlog.c, Line: 1363) sh: line 1: 29537 Abort trap: 6 /dav/hoge/bin/postgres --single -F -O -c search_path=pg_catalog -c exit_on_error=true template1 > /dev/null child process exited with exit code 134
Re: [HACKERS] [PoC] load balancing in libpq
2012/09/24 1:07, Christopher Browne wrote: We historically have connection pooling as an external thing; with the high degree to which people keep implementing and reimplementing this, I think *something* more than we have ought to be built in. This, with perhaps better implementation, might be an apropos start. Parallel with LDAP: it takes very much this approach, where configuration typically offers a list of LDAP servers. I am not certain if OpenLDAP does round robin on the list, or if it tries targets in order, stopping when it succeeds. A decent debate fits in there. I could see this being implemented instead via something alongside PGSERVICE; that already offers a well-defined way to capture a registry of connection configuration. Specifying a list of service names would allow the command line configuration to remain short and yet very flexible. Thanks for the comment. As you pointed out, I think it would be a starting point for implementing new, simple load-balancing stuff. That's what I actually intended. My clients often ask me for an easier way to take advantage of replication and load balancing. I know there are several considerations to be discussed, such as the API compatibility issue, but it would be worth having in the core (or around the core). And I also know many people are struggling with load balancing and master failover for PostgreSQL replication. If those people are implementing their own load-balancing stuff in their apps again and again, it's time to consider implementing it in the core to leverage the potential of PostgreSQL replication. Regards, On 2012-09-23 10:01 AM, Euler Taveira eu...@timbira.com wrote: On 23-09-2012 07:50, Satoshi Nagayasu wrote: I have just written the first PoC code to enable load balancing in the libpq library. Your POC is totally broken.
Just to point out two problems: (i) semicolon (;) is a valid character for any option in the connection string and (ii) you didn't think about PQsetdb[Login](), PQconnectdbParams() and PQconnectStartParams(). If you want to pursue this idea, you should think of a way to support the same option multiple times (one idea is host1, host2, etc). Isn't it easier to add support in your application or pooling software? -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Satoshi Nagayasu sn...@uptime.jp Uptime Technologies, LLC. http://www.uptime.jp
Re: [HACKERS] pg_reorg in core?
2012/09/25 0:15, Simon Riggs wrote: On 21 September 2012 08:42, Michael Paquier michael.paqu...@gmail.com wrote: On Fri, Sep 21, 2012 at 1:00 PM, Hitoshi Harada umi.tan...@gmail.com wrote: I'm not familiar with pg_reorg, but I wonder why we need a separate program for this task. I know pg_reorg is ok as an external program per se, but if we could optimize CLUSTER (or VACUUM, which I'm a little pessimistic about) in the same way, it's much nicer than having an additional binary + extension. Isn't it possible to do the same thing above within the CLUSTER command? Maybe CLUSTER .. CONCURRENTLY? CLUSTER might be more adapted in this case, as the purpose is to reorder the table. The same technique used by pg_reorg (aka a table coupled with triggers) could lower the lock level taken on the table. Also, it could be possible to control each sub-operation in the same fashion as CREATE INDEX CONCURRENTLY. By the way, whatever the operation used, VACUUM or CLUSTER, I have a couple of doubts: 1) wouldn't it be too costly for a core operation, as pg_reorg really needs many temporary objects? It could be possible to reduce the number of objects created if added to core, though... 2) Do you think the current CLUSTER is enough, and are there wishes to implement such an optimization directly in core? For me, the Postgres user interface should include * REINDEX CONCURRENTLY * CLUSTER CONCURRENTLY * ALTER TABLE CONCURRENTLY and also that autovacuum would be expanded to include REINDEX and CLUSTER, renaming it to automaint. The actual implementation mechanism for those probably looks something like pg_reorg, but I don't see it as preferable to include the utility directly into core, though potentially some of the underlying code might be. I think it depends on what trade-off we can see. AFAIK, rebuilding tables and/or indexes basically has a trade-off between locking and disk space.
So, if we have enough disk space to build a temporary table/index when rebuilding a table/index, concurrently would be a great option, and I would love to have it in core. Regards, -- Satoshi Nagayasu sn...@uptime.jp Uptime Technologies, LLC. http://www.uptime.jp
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On Mon, Sep 24, 2012 at 11:04:32AM -0400, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Well, if you run that query on template0 in the old and new cluster, you will see something different in the two of them. Could you have used the default in one and a non-dash locale in the other? Did we change the way we canonicalize the locale between 9.1 and 9.2? IIRC, we didn't try to canonicalize locale names at all before 9.2. That initdb code you're quoting is of fairly recent vintage. OK, I have developed two patches. The first fixes the problem of toast tables having oids >= FirstNormalObjectId due to recreating the information_schema as outlined in the 9.1 release notes. In fact, there are several cases this fixes, but information_schema was the one reported. The basic problem is that TOAST tables can't be restricted by schema -- you have to gather the relations, and then get the toast tables. The good news is that pg_upgrade caught its own bug and threw an error. I was able to test this patch by testing the information_schema recreation, and I checked to see the regression database had the expected info.c relation count. The second patch canonicalizes the old cluster's collation and ctype values pulled from the template0 database. I was unable to recreate the failure on my Debian Squeeze system. Can someone suggest a way? I updated pg_database on the old 9.1 cluster to be en_US.UTF8, while the new cluster defaults to en_US.UTF-8, but pg_upgrade kept them the same after the setlocale() call and pg_upgrade threw a mismatch error. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + diff --git a/contrib/pg_upgrade/info.c b/contrib/pg_upgrade/info.c new file mode 100644 index 74b13e7..9d08f41 *** a/contrib/pg_upgrade/info.c --- b/contrib/pg_upgrade/info.c *** get_rel_infos(ClusterInfo *cluster, DbIn *** 269,302 */ snprintf(query, sizeof(query), ! SELECT c.oid, n.nspname, c.relname, !
c.relfilenode, c.reltablespace, %s FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid ! LEFT OUTER JOIN pg_catalog.pg_tablespace t ! ON c.reltablespace = t.oid ! WHERE relkind IN ('r','t', 'i'%s) AND /* exclude possible orphaned temp tables */ ((n.nspname !~ '^pg_temp_' AND n.nspname !~ '^pg_toast_temp_' AND ! n.nspname NOT IN ('pg_catalog', 'information_schema', 'binary_upgrade') AND c.oid = %u) OR (n.nspname = 'pg_catalog' AND ! relname IN ('pg_largeobject', 'pg_largeobject_loid_pn_index'%s) )) ! /* we preserve pg_class.oid so we sort by it to match old/new */ ! ORDER BY 1;, ! /* 9.2 removed the spclocation column */ ! (GET_MAJOR_VERSION(cluster-major_version) = 901) ? ! t.spclocation : pg_catalog.pg_tablespace_location(t.oid) AS spclocation, /* see the comment at the top of old_8_3_create_sequence_script() */ (GET_MAJOR_VERSION(old_cluster.major_version) = 803) ? : , 'S', - /* this oid allows us to skip system toast tables */ FirstNormalObjectId, /* does pg_largeobject_metadata need to be migrated? */ (GET_MAJOR_VERSION(old_cluster.major_version) = 804) ? : , 'pg_largeobject_metadata', 'pg_largeobject_metadata_oid_index'); res = executeQueryOrDie(conn, %s, query); ntups = PQntuples(res); --- 269,327 */ snprintf(query, sizeof(query), ! CREATE TEMPORARY TABLE info_rels (reloid) AS SELECT c.oid FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid ! WHERE relkind IN ('r', 'i'%s) AND /* exclude possible orphaned temp tables */ ((n.nspname !~ '^pg_temp_' AND n.nspname !~ '^pg_toast_temp_' AND ! /* skip pg_toast because toast index have relkind == 'i', not 't' */ ! n.nspname NOT IN ('pg_catalog', 'information_schema', ! 'binary_upgrade', 'pg_toast') AND c.oid = %u) OR (n.nspname = 'pg_catalog' AND ! relname IN ('pg_largeobject', 'pg_largeobject_loid_pn_index'%s) ));, /* see the comment at the top of old_8_3_create_sequence_script() */ (GET_MAJOR_VERSION(old_cluster.major_version) = 803) ? 
: , 'S', FirstNormalObjectId, /* does pg_largeobject_metadata need to be migrated? */ (GET_MAJOR_VERSION(old_cluster.major_version) = 804) ? : , 'pg_largeobject_metadata', 'pg_largeobject_metadata_oid_index'); + PQclear(executeQueryOrDie(conn, %s, query)); + + /* + * Get TOAST tables and indexes; we have to gather the TOAST tables in + * later steps because we can't schema-qualify TOAST tables. + */ + PQclear(executeQueryOrDie(conn, + INSERT INTO info_rels + SELECT reltoastrelid + FROM info_rels i JOIN pg_catalog.pg_class c + ON i.reloid = c.oid)); +
Re: [HACKERS] pg_reorg in core?
On Mon, Sep 24, 2012 at 10:17 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Excerpts from Daniele Varrazzo's message of dom sep 23 22:02:51 -0300 2012: On Mon, Sep 24, 2012 at 12:23 AM, Michael Paquier michael.paqu...@gmail.com wrote: As proposed by Masahiko, a single organization grouping all the tools (one repository per tool) would be enough. Please note that github can also host documentation. Bug tracker would be tool-dedicated in this case. From this PoV, pgFoundry allows your tool to be under http://yourtool.projects.postgresql.org instead of under a more generic namespace: I find it a nice and cozy place in the url space where to put your project. If pgFoundry will be dismissed I hope at least a hosting service for static pages will remain. I don't think that has been offered. But I don't think it's necessarily the case that pgFoundry is getting dismissed, either. I got a note from Marc Fournier not too long ago (sent to some probably-not-small set of people with pgFoundry accounts) indicating that they were planning to upgrade gForge as far as they could, and then switch to FusionForge http://fusionforge.org/, which is evidently the successor. It shouldn't be assumed that the upgrade process will be easy or quick. -- When confronted by a difficult problem, solve it by reducing it to the question, How would the Lone Ranger handle this?
Re: [HACKERS] Configuration include directory
On 25/09/12 02:41, Heikki Linnakangas wrote: On 24.09.2012 17:24, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: This seems pretty much ready to commit. One tiny detail that I'd like to clarify: the docs say: Multiple files within an include directory are ordered by an alphanumeric sorting, so that ones beginning with numbers are considered before those starting with letters. To be more precise, the patch uses strcmp() for the comparisons. Just say it sorts the file names according to C locale rules. Hmm, that's precise, but I don't think an average user necessarily knows what the C locale is. I think I'll go with: Multiple files within an include directory are processed in filename order. The filenames are ordered by C locale rules, i.e. numbers before letters, and uppercase letters before lowercase ones. - Heikki Even I can understand that! :-) More to the point: are full stops '.' sorted before digits? Cheers, Gavin
Re: [HACKERS] Configuration include directory
On Thu, Sep 20, 2012 at 02:10:58PM -0700, Selena Deckelmann wrote: The only thing I didn't do that Noah suggested was run pgindent on guc-file.l. A cursory search did not reveal source compatible with my operating system for 'indent'. If someone points me to it, I'd happily also comply with the request to reindent. And document how to do that on my platform(s). :) For future reference, src/tools/pgindent/README points to the pg_bsd_indent sources. If pg_bsd_indent fails to build and run, post the details.
Re: [HACKERS] External Replication
Excerpts from m...@rpzdesign.com's message of lun sep 24 14:24:31 -0300 2012: And I have not seen anybody request my hook code but a few have responded that they are working on things in the code base, release date unknown. Well, typically that's not how our development works -- people here don't *request* your changes. Instead, you submit them for inclusion, people here criticize them a lot, it morphs based on the feedback (if you have the energy and a strong enough skin), and eventually, after much pain and many tears, it gets committed. The result is typically much more solid than whatever you can build in your own dark corner, but of course it takes longer. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
Hi, On 2012-09-22 20:49, Tom Lane wrote: Boszormenyi Zoltan z...@cybertec.at writes: new version with a lot more cleanup is attached. I looked at this patch, and frankly I'm rather dismayed. It's a mess. I hope you won't find this one a mess. I tried to address all your complaints. [rather long diatribe on modifying PGSemaphoreLock improperly] I have returned to the previous version that used PGSemaphoreTimedLock, and this time I save and restore errno around calling the timeout condition checker. The whole lmgrtimeout module seems to me to be far more mechanism than is warranted, for too little added functionality. [...] lmgrtimeout is no more, back to hardcoding things. The minimum thing I would want it to do is avoid calling timeout.c multiple times, which is what would happen right now (leading to four extra syscalls per lock acquisition, which is enough new overhead to constitute a strong objection to committing this patch at all). I have added enable_multiple_timeouts() and disable_multiple_timeouts() that minimize the number of setitimer() calls. There's considerable lack of attention to updating comments, too. For instance in WaitOnLock you only bothered to update the comment immediately adjacent to the changed code, and not the two comment blocks above that, which both have specific references to deadlocks being the reason for failure. I modified the comment in question. I hope the wording is right. Also, the per-statement mode for lock timeout doesn't seem to be any such thing, because it's implemented like this: +case LOCK_TIMEOUT_PER_STMT: +enable_timeout_at(LOCK_TIMEOUT, +TimestampTzPlusMilliseconds( +GetCurrentStatementStartTimestamp(), +LockTimeout)); +break; That doesn't provide anything like "you can spend at most N milliseconds waiting for locks during a statement."
What it is is if you happen to be waiting for a lock N milliseconds after the statement starts, or if you attempt to acquire any lock more than N milliseconds after the statement starts, you lose instantly. I don't think that definition actually adds any useful functionality compared to setting statement_timeout to N milliseconds, and it's certainly wrongly documented. To do what the documentation implies would require tracking and adding up the time spent waiting for locks during a statement. Which might be a good thing to do, especially if the required gettimeofday() calls could be shared with what timeout.c probably has to do anyway at start and stop of a lock wait. But this code doesn't do it. The code now properly accumulates the time spent in waiting for LOCK_TIMEOUT. This means that if STATEMENT_TIMEOUT and LOCK_TIMEOUT are set to the same value, STATEMENT_TIMEOUT will trigger because it considers the time as one contiguous unit, LOCK_TIMEOUT only accounts the time spent in waiting, not the time spent with useful work. This means that LOCK_TIMEOUT doesn't need any special code in its handler function, it's a NOP. The relation between timeouts is only handled by the timeout.c module. Lastly, I'm not sure where is the best place to be adding the control logic for this, but I'm pretty sure postinit.c is not it. It oughta be somewhere under storage/lmgr/, no? The above change means that there is no control logic outside of storage/lmgr now. I temporarily abandoned the idea of detailed error reporting on the object type and name/ID. WaitOnLock() reports canceling statement due to lock timeout and LockAcquire() kept its previous semantics. This can be quickly revived in case of demand, it would be another ~15K patch. I hope you can find another time slot in this CF to review this one. 
Best regards, Zoltán Böszörményi -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ 2-lock_timeout-v26.patch.gz Description: Unix tar archive
Re: [HACKERS] [PoC] load balancing in libpq
On 23 September 2012 05:50, Satoshi Nagayasu sn...@uptime.jp wrote: I have just written the first PoC code to enable load balancing in the libpq library. This libpq enhancement is intended to allow PostgreSQL users to take advantage of replication in an easier way. With this patch, the PQconnectdb() function accepts multiple connection info strings, and picks one of them on a round-robin basis to connect. It's certainly worth thinking about. New ideas are good, if they have some justification. A load-balancing solution that only works at connection level isn't much of a solution, IMHO. If you have multiple connection strings in the client program, it's trivial to put them in an array and pick one randomly, then use that. So this is already available as a solution in user space. Also, any client-based solution presumes we can ship lists of cluster metadata to clients, which we have no solution for, so it places a much greater burden on the administrator. Load-balancing config needs to be held more centrally. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] pg_reorg in core?
For me, the Postgres user interface should include * REINDEX CONCURRENTLY I don't see why we don't have REINDEX CONCURRENTLY now. When I was writing out the instructions for today's update, I was thinking we already have all the commands for this. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] pg_reorg in core?
On 24 September 2012 17:36, Josh Berkus j...@agliodbs.com wrote: For me, the Postgres user interface should include * REINDEX CONCURRENTLY I don't see why we don't have REINDEX CONCURRENTLY now. Same reason for everything on (anyone's) TODO list. Lack of vision is not holding us back, we just need the vision to realise it. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] pg_reorg in core?
On 9/24/12 3:43 PM, Simon Riggs wrote: On 24 September 2012 17:36, Josh Berkus j...@agliodbs.com wrote: For me, the Postgres user interface should include * REINDEX CONCURRENTLY I don't see why we don't have REINDEX CONCURRENTLY now. Same reason for everything on (anyone's) TODO list. Yes, I'm just pointing out that it would be a very small patch for someone, and that AFAIK it didn't make it on the TODO list yet. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
[HACKERS] Oid registry
This rather overdue mail arises out of the developers' meeting back in May, where we discussed an item I raised suggesting an Oid registry. The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but it has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid, and of course currently there is no way of knowing for sure what it is unless it's a builtin type. So the proposal is to have an Oid registry, in which authors could in effect reserve an Oid (or a couple of Oids) for a type. We would guarantee that these Oids would be reserved in just the same way Oids for builtins are reserved, and #define symbolic constants for the reserved Oids. To make that viable, we'd need to extend the CREATE commands for any objects we could reserve Oids for to allow for the specification of the Oids, somewhat like: CREATE TYPE newtype ( ) WITH (TYPEOID = 123456, TYPEARRAYOID = 234567); I'm not sure what objects we would need this for other than types, but CREATE TYPE would be a good start anyway. Another suggestion that was made during the discussion was that we should also reserve a range of Oids for private use, and guarantee that those would never be allocated, somewhat analogously to RFC 1918 IP addresses. Thoughts? If there is general agreement I want to get working on this fairly soon. cheers andrew
Re: [HACKERS] pg_reorg in core?
On Tuesday, September 25, 2012 12:55:35 AM Josh Berkus wrote: On 9/24/12 3:43 PM, Simon Riggs wrote: On 24 September 2012 17:36, Josh Berkus j...@agliodbs.com wrote: For me, the Postgres user interface should include * REINDEX CONCURRENTLY I don't see why we don't have REINDEX CONCURRENTLY now. Same reason for everything on (anyone's) TODO list. Yes, I'm just pointing out that it would be a very small patch for someone, and that AFAIK it didn't make it on the TODO list yet. It's not *that* small. 1. You need more than you can do with CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY, because the index can e.g. be referenced by a foreign key constraint. So you need to replace the existing index oid with a new one by swapping the relfilenodes of both after verifying several side conditions (indcheckxmin, indisvalid, indisready). It would probably have to look like:
- build new index with indisready = false
- newindex.indisready = true
- wait
- newindex.indisvalid = true
- wait
- swap(oldindex.relfilenode, newindex.relfilenode)
- oldindex.indisvalid = false
- wait
- oldindex.indisready = false
- wait
- drop new index with old relfilenode
Every wait indicates an externally visible state which you might encounter/need to clean up... To make it viable to use that systemwide it might be necessary to batch the individual steps together for multiple indexes, because all that waiting is going to suck if you do it for every single table in the database while you also have long-running queries... 2. no support for concurrent operation on system tables (not easy for shared catalogs) 3. no support for the indexes of exclusion constraints (not hard, I think) Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] DROP INDEX CONCURRENTLY is not really concurrency safe leaves around undroppable indexes
Hi, On Monday, September 24, 2012 01:27:54 PM Andres Freund wrote: Hi, Problem 1: concurrency: Testcase:

Session 1:
CREATE TABLE test_drop_concurrently(id serial primary key, data int);
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10);
CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
BEGIN;
EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)

Session 2:
BEGIN;
SELECT * FROM test_drop_concurrently WHERE data = 34343;

Session 3:
DROP INDEX CONCURRENTLY test_drop_concurrently_data;
(in-progress)

Session 2:
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10);
COMMIT;

Session 1:
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)
SET enable_bitmapscan = false;
SET enable_indexscan = false;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(2 rows)

Explanation: index_drop does:
indexForm->indisvalid = false;	/* make unusable for queries */
indexForm->indisready = false;	/* make invisible to changes */
Setting indisready = false is problematic because that prevents index updates, which in turn breaks READ COMMITTED semantics. I think there needs to be one more phase that waits for concurrent users of the index to finish before setting indisready = false. The attached patch fixes this issue. Haven't looked at the other one in detail yet. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services From 8891fcd59496483793aecc21a096fc0119369e73 Mon Sep 17 00:00:00 2001 From: Andres Freund and...@anarazel.de Date: Tue, 25 Sep 2012 01:41:29 +0200 Subject: [PATCH] Fix concurrency issues in concurrent index drops Previously a DROP INDEX CONCURRENTLY started with unsetting indisvalid *and* indisready. That's problematic if some transaction is still looking at the index and another transaction makes changes. See the example below.
Now we do the drop in three stages, just as for a concurrent index build. First unset indisvalid, wait, unset indisready, wait, drop index. Example:

Session 1:
CREATE TABLE test_drop_concurrently(id serial primary key, data int);
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10);
CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
BEGIN;
EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)

Session 2:
BEGIN;
SELECT * FROM test_drop_concurrently WHERE data = 34343;

Session 3:
DROP INDEX CONCURRENTLY test_drop_concurrently_data;
(in-progress)

Session 2:
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10);
COMMIT;

Session 1:
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)
SET enable_bitmapscan = false;
SET enable_indexscan = false;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(2 rows)

---
 src/backend/catalog/index.c | 99 ++---
 1 file changed, 84 insertions(+), 15 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index a61b424..3e1794f 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1318,6 +1318,10 @@ index_drop(Oid indexId, bool concurrent)
 	 * table lock strong enough to prevent all queries on the table from
 	 * proceeding until we commit and send out a shared-cache-inval notice
 	 * that will make them update their index lists.
+	 *
+	 * In the concurrent case we make sure that nobody can be looking at the
+	 * indexes by dropping the index in multiple steps, so we don't need a
+	 * full-fledged AccessExclusiveLock yet.
 	 */
 	heapId = IndexGetRelation(indexId, false);
 	if (concurrent)
@@ -1338,7 +1342,19 @@ index_drop(Oid indexId, bool concurrent)
 	/*
 	 * Drop Index concurrently is similar in many ways to creating an index
-	 * concurrently, so some actions are similar to DefineIndex()
+	 * concurrently, so some actions are similar to DefineIndex() just in the
+	 * reverse order.
+	 *
+	 * First we unset indisvalid so queries starting afterwards don't use the
+	 * index to answer queries anymore. We have to keep indisready = true
+	 * though so transactions that are still using the index can continue to
+	 * see valid index contents. E.g. when they are using READ COMMITTED mode,
+	 * and another transaction that started later makes changes and commits,
+	 * they need to see those new tuples in the index.
+	 *
+	 * After all transactions that could possibly have used it for queries
+	 * ended we can unset indisready and wait till nobody could be updating it
+	 * anymore.
 	 */
 	if (concurrent)
 	{
@@ -1357,21 +1373,14 @@ index_drop(Oid indexId, bool concurrent)
 		elog(ERROR, "cache lookup failed for index %u", indexId);
[HACKERS] Patch: incorrect array offset in backend replication tar header
While researching the way streaming replication works I was examining the construction of the tar file header. By comparing documentation on the tar header format from various sources, I'm certain the following patch should be applied so that the group identifier is put into the header properly. While I realize that wikipedia isn't always the best source of information, the header offsets seem to match the other documentation I've found. The format is just easier to read on wikipedia: http://en.wikipedia.org/wiki/Tar_(file_format)#File_header Here is the trivial patch:

diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 4aaa9e3..524223e 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -871,7 +871,7 @@ _tarWriteHeader(const char *filename, const char *linktarget,
 	sprintf(&h[108], "%07o ", statbuf->st_uid);
 
 	/* Group 8 */
-	sprintf(&h[117], "%07o ", statbuf->st_gid);
+	sprintf(&h[116], "%07o ", statbuf->st_gid);
 
 	/* File size 12 - 11 digits, 1 space, no NUL */
 	if (linktarget != NULL || S_ISDIR(statbuf->st_mode))

-- Brian -- /* insert witty comment here */
Re: [HACKERS] PostgreSQL in the French news
On Mon, Sep 24, 2012 at 10:37 PM, Fabien COELHO coe...@cri.ensmp.fr wrote: Dear devs, For your information, PostgreSQL appears explicitly in the list of free software cited by the French Prime Minister, Jean-Marc Ayrault, in his Circulaire (i.e. a kind of instruction to civil servants) signed last week. See pages 9 and 18 of the pdf (not the page numbers printed in the document, which are shifted). Obviously, the document is in French... http://circulaire.legifrance.gouv.fr/pdf/2012/09/cir_35837.pdf The spirit of the document does not seem to be to make the use of free software mandatory, but to require that it is at least considered for any project in which the administration (i.e. the public sector, not the government as in the USA) is involved. +1. Such news is nice for Postgres. -- Michael Paquier http://michael.otacoo.com
Re: [HACKERS] DROP INDEX CONCURRENTLY is not really concurrency safe leaves around undroppable indexes
On Monday, September 24, 2012 01:37:59 PM Andres Freund wrote: On Monday, September 24, 2012 01:27:54 PM Andres Freund wrote: Problem 2: undroppable indexes:

Session 1:
CREATE TABLE test_drop_concurrently(id serial primary key, data int);
CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
BEGIN;
EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;

Session 2:
DROP INDEX CONCURRENTLY test_drop_concurrently_data;
waiting
^CCancel request sent
ERROR: canceling statement due to user request

Session 1:
ROLLBACK;
DROP TABLE test_drop_concurrently;
SELECT indexrelid, indrelid, indexrelid::regclass, indrelid::regclass, indisvalid, indisready FROM pg_index WHERE indexrelid = 'test_drop_concurrently_data'::regclass;
 indexrelid | indrelid |         indexrelid          | indrelid | indisvalid | indisready
------------+----------+-----------------------------+----------+------------+------------
      24703 |    24697 | test_drop_concurrently_data |    24697 | f          | f
(1 row)
DROP INDEX test_drop_concurrently_data;
ERROR: could not open relation with OID 24697

Haven't looked at this one at all. That's because it has to commit transactions in between to make the catalog changes visible. Unfortunately at that point it has already deleted the dependencies in deleteOneObject... Seems to be solvable with some minor reshuffling in deleteOneObject. We can only perform the deletion of outgoing pg_depend records *after* we have dropped the object with doDeletion in the concurrent case. As the actual drop of the relation happens in the same transaction that will then go on to drop the dependency records, that seems to be fine. I don't see any problems with that right now, will write a patch tomorrow. We will see if that's problematic... Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Patch: incorrect array offset in backend replication tar header
Actually I found one other issue while continuing my investigation. The insertion of the 'ustar' magic and version '00' has the '00' version at the wrong offset. The patch is attached. -- Brian On Mon, Sep 24, 2012 at 7:51 PM, Brian Weaver cmdrcluel...@gmail.com wrote: While researching the way streaming replication works I was examining the construction of the tar file header. [...] -- /* insert witty comment here */ pg-tar.patch Description: Binary data
Re: [HACKERS] Patch: incorrect array offset in backend replication tar header
Um, I apologize for the third e-mail on the topic. It seems that my C coding is a bit rusty from years of neglect. No sooner had I hit the send button than I realized that trying to embed a null character in a string might not work, especially when it's followed by two consecutive zeros. Here is a safer fix which is more in line with the rest of the code in the file. I guess this is what I get for being clever. Patch is attached. -- /* insert witty comment here */ pg-tar.patch Description: Binary data
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On 2012/9/24 22:57, Bruce Momjian wrote: On Mon, Sep 24, 2012 at 10:45:34PM +0800, Rural Hunter wrote: If your operating system locale/encoding names changed after the initdb of the old cluster, this would not be reflected in template0. No, it's not changed. Look at my system settings: LANG=zh_CN.UTF-8 $ cat /var/lib/locales/supported.d/local zh_CN.UTF-8 UTF-8 I think the problem is in the options I chose when I installed pgsql (both 9.1 and 9.2): Select the locale to be used by the new database cluster. Locale [1] [Default locale] [2] C [3] POSIX [4] zh_CN.utf8 [5] zh_HK.utf8 [6] zh_SG.utf8 [7] zh_TW.utf8 Please choose an option [1] : 4 I chose 4 instead of 1. I guess the default locale (option 1) is the one with the dash. Well, if you run that query on template0 in the old and new cluster, you will see something different in the two of them. Could you have used default in one and a non-dash in the other? Yes, that's true. The upgrade is fine with both fresh installs (9.1 and 9.2) using the option above (without the dash). The problem only happens when I inited the 9.2 db with the default locale (I guess that one has the dash). I'm just wondering why the pg installer provides options without the dash. Did we change the way we canonicalize the locale between 9.1 and 9.2? I can send you a patch to test if the setlocale canonicalization works. Can you test it if I send it?
Re: [HACKERS] Doc patch to note which system catalogs have oids
Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: I think this is fundamentally wrong, or at least misleading, because it documents OID as if it were an ordinary column. Somebody who did select * from pg_class and didn't see any oid in the result would think the docs were wrong. Given that it's been quite some time since we defaulted to including OIDs in tables, and the high level of confusion that individuals trying to join pg_class and pg_namespace together go through due to select * not including the oid column, I wonder if perhaps we should consider changing that. Might be possible to do for just the catalog tables (to minimize the risk of breaking poorly-written applications), or provide a GUC to control including the oid column in select *. It's possible that it's worth expending a boilerplate paragraph in each of those pages to say this catalog has OIDs (or that it doesn't). But this isn't the way. I'm afraid I disagree with this. The oid column, in the system catalog, is user-facing and I like having it included as a column in the table in the docs, so users know what to use when doing joins. Including something in the boilerplate about it not being shown by default (or in the description in the table) might be alright, if we don't change that. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Oid registry
Andrew Dunstan and...@dunslane.net writes: ... So the proposal is to have an Oid registry, in which authors could in effect reserve an Oid (or a couple of Oids) for a type. We would guarantee that these Oids would be reserved in just the same way Oids for builtins are reserved, and #define symbolic constants for the reserved Oids. To make that viable, we'd need to extend the CREATE commands for any objects we could reserve Oids for to allow for the specification of the Oids, somewhat like: CREATE TYPE newtype ( ) WITH (TYPEOID = 123456, TYPEARRAYOID = 234567); Well, of course, it's not clear how reserved an OID can be if you provide SQL syntax that allows any Joe Blow to create a type with that OID. The possibilities for security holes are interesting to say the least, especially if applications blindly assume that oid 123456 *must* be the type they think it is. Another suggestion that was made during the discussion was that we should also reserve a range of Oids for private use, and guarantee that those would never be allocated, somewhat analogously to RFC1918 IP addresses. Likewise, "would never be allocated" and "we'll provide syntax that does it trivially" seem a bit at odds. Another point to think about is that it's a few years too late to guarantee that any particular OID above 16384 is unused; we can't do that now without possibly breaking pg_upgrade-ability of existing databases. While we could hand out some subrange of the OIDs below that, there's not exactly a huge amount of space available. If there is general agreement I want to get working on this fairly soon. Turning this into something actually useful seems to me to be a bit harder than meets the eye. I'm not opposed to it in principle, but the management aspect seems much more difficult than the technical aspect. regards, tom lane
Re: [HACKERS] Oid registry
On Mon, 2012-09-24 at 18:59 -0400, Andrew Dunstan wrote: This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry. The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid, Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?
[HACKERS] plpgsql gram.y make rule
I wanted to refactor the highly redundant flex and bison rules throughout the source into common pattern rules. (Besides saving some redundant code, this could also help some occasionally flaky code in pgxs modules.) The only outlier that breaks this is in plpgsql: pl_gram.c: gram.y I would like to either rename the intermediate file(s) to gram.{c,h}, or possibly rename the source file to pl_gram.y. Any preferences or other comments?
Re: [HACKERS] Patch: incorrect array offset in backend replication tar header
Brian Weaver cmdrcluel...@gmail.com writes: While researching the way streaming replication works I was examining the construction of the tar file header. By comparing documentation on the tar header format from various sources I'm certain the following patch should be applied so that the group identifier is put into the header properly. Yeah, this is definitely wrong. While I realize that wikipedia isn't always the best source of information, the header offsets seem to match the other documentation I've found. The format is just easier to read on wikipedia. The authoritative specification can be found in the pax page in the POSIX spec, which is available here: http://pubs.opengroup.org/onlinepubs/9699919799/ I agree that the 117 number is bogus, and also that the magic ustar string is written incorrectly. What's more, it appears that the latter error has been copied from pg_dump (but the 117 seems to be just a new bug in pg_basebackup). I wonder what else might be wrong hereabouts :-( Will sit down and take a closer look. I believe what we need to do about this is: 1. fix pg_dump and pg_basebackup output to conform to spec. 2. make sure pg_restore will accept both conformant and previous-generation files. Am I right in believing that we don't have any code that's expected to read pg_basebackup output? We just feed it to tar, no? I'm a bit concerned about backwards compatibility issues. It looks to me like existing versions of pg_restore will flat out reject files that have a spec-compliant "ustar\0" MAGIC field. Is it going to be sufficient if we fix this in minor-version updates, or are we going to need to have a switch that tells pg_dump to emit the incorrect old format? (Ick.) regards, tom lane
Re: [HACKERS] plpgsql gram.y make rule
Peter Eisentraut pete...@gmx.net writes: I wanted to refactor the highly redundant flex and bison rules throughout the source into common pattern rules. (Besides saving some redundant code, this could also help some occasionally flaky code in pgxs modules.) The only outlier that breaks this is in plpgsql pl_gram.c: gram.y I would like to either rename the intermediate file(s) to gram.{c,h}, or possibly rename the source file to pl_gram.y. Any preferences or other comments? Hmmm ... it's annoyed me for a long time that that file is named the same as the core backend's gram.y. So renaming to pl_gram.y might be better. On the other hand I have very little confidence in git's ability to preserve change history if we do that. Has anyone actually done a file rename in a project with lots of history, and how well did it turn out? (For instance, does git blame still provide any useful tracking of pre-rename changes? If you try to cherry-pick a patch against the new file into a pre-rename branch, does it work?) regards, tom lane
Re: [HACKERS] Patch: incorrect array offset in backend replication tar header
Tom, I'm still investigating and I have been looking at various sources. I have checked lots of pages on the web and I was just looking at the libarchive source from github. I found an interesting sequence in libarchive that implies that the 'ustar00\0' string marks the header as GNU tar format. Here are lines 321 through 329 of 'archive_read_support_format_tar.c' from libarchive:

321	/* Recognize POSIX formats. */
322	if ((memcmp(header->magic, "ustar\0", 6) == 0)
323	    && (memcmp(header->version, "00", 2) == 0))
324		bid += 56;
325
326	/* Recognize GNU tar format. */
327	if ((memcmp(header->magic, "ustar ", 6) == 0)
328	    && (memcmp(header->version, " \0", 2) == 0))
329		bid += 56;

I'm wondering if the original committer put the 'ustar00\0' string in by design? Regardless I'll look at it more tomorrow 'cause I'm calling it a night. I need to send a note to the libarchive folks too because I *think* I found a potential buffer overrun in one of their octal conversion routines. -- Brian On Mon, Sep 24, 2012 at 10:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
I wonder what else might be wrong hereabouts :-( Will sit down and take a closer look. I believe what we need to do about this is: 1. fix pg_dump and pg_basebackup output to conform to spec. 2. make sure pg_restore will accept both conformant and previous-generation files. Am I right in believing that we don't have any code that's expected to read pg_basebackup output? We just feed it to tar, no? I'm a bit concerned about backwards compatibility issues. It looks to me like existing versions of pg_restore will flat out reject files that have a spec-compliant ustar\0 MAGIC field. Is it going to be sufficient if we fix this in minor-version updates, or are we going to need to have a switch that tells pg_dump to emit the incorrect old format? (Ick.) regards, tom lane -- /* insert witty comment here */
Re: [HACKERS] Oid registry
On 09/24/2012 09:37 PM, Peter Eisentraut wrote: On Mon, 2012-09-24 at 18:59 -0400, Andrew Dunstan wrote: This rather overdue mail arises out of the developer's meeting back in May, where we discussed an item I raised suggesting an Oid registry. The idea came from some difficulties I encountered when I did the backport of the JSON work we did in 9.2 to 9.1, but has wider application. Say someone writes an extension that defines type A. You want to write an extension that takes advantage of that type, but it's difficult if you don't know the type's Oid, Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve? Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type, all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better. cheers andrew
Re: [HACKERS] Doc patch to note which system catalogs have oids
On 09/24/2012 09:38:53 AM, Karl O. Pinc wrote: On 09/23/2012 10:14:33 PM, Tom Lane wrote: Karl O. Pinc k...@meme.com writes: The attached patch documents the oid column of those system catalogs having an oid. I think this is fundamentally wrong, or at least misleading, because it documents OID as if it were an ordinary column. How about modifying the (printed) table layout as attached? It begins each (printed) table documenting each catalog with a Has OID column Yes/No. Changed text from Has OID column to Keyed with an OID column since this explains more and there's no worry about horizontal space. I like having the documentation of oid be part of the (printed) table describing the columns, in some way or another, since that's where the eye is drawn when looking for column documentation. Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index f999190..2dfb40f 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -332,6 +332,19 @@ table titlestructnamepg_aggregate/ Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + /row +/thead +tbody + row + entryNo/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -415,6 +428,19 @@ table titlestructnamepg_am/ Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -671,6 +697,19 @@ table titlestructnamepg_amop/ Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -807,6 +846,19 @@ table titlestructnamepg_amproc/structname Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -890,6 
+942,19 @@ table titlestructnamepg_attrdef/ Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -967,6 +1032,19 @@ table titlestructnamepg_attribute/ Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + /row +/thead +tbody + row + entryNo/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -1247,6 +1325,19 @@ table titlestructnamepg_authid/ Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=3 thead row @@ -1377,6 +1468,19 @@ table titlestructnamepg_auth_members/ Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + /row +/thead +tbody + row + entryNo/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -1450,6 +1554,19 @@ table titlestructnamepg_cast/ Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -1565,6 +1682,19 @@ table titlestructnamepg_class/ Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -1877,6 +2007,19 @@ table titlestructnamepg_event_trigger/ Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + /row +/thead +tbody + row + entryNo/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -1972,6 +2115,19 @@ table titlestructnamepg_constraint/ Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + /row +/thead +tbody + row + entryYes/entry + /row +/tbody + /tgroup + tgroup cols=4 thead row @@ -2238,6 +2394,19 @@ table titlestructnamepg_collation/ Columns/title + tgroup cols=1 +thead + row + entryKeyed with an OID column/entry + 
/row +/thead +tbody + row + entryYes/entry + /row +/tbody +
Re: [HACKERS] Patch: incorrect array offset in backend replication tar header
Brian Weaver cmdrcluel...@gmail.com writes: Here are lines 321 through 329 of 'archive_read_support_format_tar.c' from libarchive:

321	/* Recognize POSIX formats. */
322	if ((memcmp(header->magic, "ustar\0", 6) == 0)
323	    && (memcmp(header->version, "00", 2) == 0))
324		bid += 56;
325
326	/* Recognize GNU tar format. */
327	if ((memcmp(header->magic, "ustar ", 6) == 0)
328	    && (memcmp(header->version, " \0", 2) == 0))
329		bid += 56;

I'm wondering if the original committer put the 'ustar00\0' string in by design? The second part of that looks to me like it matches "ustar  \0", not "ustar00\0". I think the pg_dump coding is just wrong. I've already noticed that its code for writing the checksum is pretty brain-dead too :-( Note that according to the wikipedia page, tar programs typically accept files as pre-POSIX format if the checksum is okay, regardless of what is in the magic field; and the fields that were added by POSIX are noncritical, so we'd likely never notice that they were being ignored. (In fact, looking closer, pg_dump isn't even filling those fields anyway, so the fact that it's not producing a compliant magic field may be a good thing ...) regards, tom lane
Re: [HACKERS] pg_reorg in core?
On Tue, Sep 25, 2012 at 8:13 AM, Andres Freund and...@2ndquadrant.com wrote:

On Tuesday, September 25, 2012 12:55:35 AM Josh Berkus wrote: On 9/24/12 3:43 PM, Simon Riggs wrote: On 24 September 2012 17:36, Josh Berkus j...@agliodbs.com wrote: For me, the Postgres user interface should include * REINDEX CONCURRENTLY

I don't see why we don't have REINDEX CONCURRENTLY now.

Same reason for everything on (anyone's) TODO list. Yes, I'm just pointing out that it would be a very small patch for someone, and that AFAIK it didn't make it on the TODO list yet.

It's not *that* small.

1. You need more than you can do with CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY, because the index can e.g. be referenced by a foreign key constraint. So you need to replace the existing index oid with a new one by swapping the relfilenodes of both after verifying several side conditions (indcheckxmin, indisvalid, indisready). It would probably have to look like:

- build new index with indisready = false
- newindex.indisready = true
- wait
- newindex.indisvalid = true
- wait
- swap(oldindex.relfilenode, newindex.relfilenode)
- oldindex.indisvalid = false
- wait
- oldindex.indisready = false
- wait
- drop new index with old relfilenode

Every wait indicates an externally visible state which you might encounter/need to clean up...

Could you clarify what you mean here by cleanup? I am afraid I do not get your point here.

2. No support for concurrent reindex on system tables (not easy for shared catalogs).

Doesn't this exclude all the tables that are in the system catalogs?

3. No support for the indexes of exclusion constraints (not hard, I think).

This just consists of a check of indisready in pg_index.

--
Michael Paquier http://michael.otacoo.com
Re: [HACKERS] Oid registry
On 09/24/2012 09:24 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes: ... So the proposal is to have an Oid registry, in which authors could in effect reserve an Oid (or a couple of Oids) for a type. We would guarantee that these Oids would be reserved in just the same way Oids for builtins are reserved, and #define symbolic constants for the reserved Oids. To make that viable, we'd need to extend the CREATE commands for any objects we could reserve Oids for to allow for the specification of the Oids, somewhat like:

	CREATE TYPE newtype ( ) WITH (TYPEOID = 123456, TYPEARRAYOID = 234567);

Well, of course, it's not clear how reserved an OID can be if you provide SQL syntax that allows any Joe Blow to create a type with that OID. The possibilities for security holes are interesting to say the least, especially if applications blindly assume that oid 123456 *must* be the type they think it is.

I think this would probably be a superuser-only facility.

Another suggestion that was made during the discussion was that we should also reserve a range of Oids for private use, and guarantee that those would never be allocated, somewhat analogously to RFC1918 IP addresses.

Likewise, "would never be allocated" and "we'll provide syntax that does it trivially" seem a bit at odds.

Another point to think about is that it's a few years too late to guarantee that any particular OID above 16384 is unused; we can't do that now without possibly breaking pg_upgrade-ability of existing databases. While we could hand out some subrange of the OIDs below that, there's not exactly a huge amount of space available.

We seem to have a fair bit of leeway between the top numbered Oid as reported by unused_oids (4332) and 16384. I would expect a private range of a few hundred to be more than adequate, and a range for registered Oids of a couple of thousand to last for many years. I'm expecting the use of this over quite a few years to be numbered in tens, not thousands.
cheers

andrew
Re: [HACKERS] Doc patch to note which system catalogs have oids
On 09/24/2012 08:18:00 PM, Stephen Frost wrote:

* Tom Lane (t...@sss.pgh.pa.us) wrote: It's possible that it's worth expending a boilerplate paragraph in each of those pages to say this catalog has OIDs (or that it doesn't). But this isn't the way.

I'm afraid I disagree with this. The oid column, in the system catalog, is user-facing and I like having it included as a column in the table in the docs, so users know what to use when doing joins. Including something in the boilerplate about it not being shown by default (or in the description in the table) might be alright, if we don't change that.

Having information about oid included in the (printed) table under a separate heading, as in v2 and v3 of this patch, is something of a compromise. It's hard to visualize from the sgml, so it might be worth building the docs and viewing with a file:/// url. The trouble is that it's visually ugly because the two parts of the table are of separate widths. There is almost surely a way to change this in the xsl transformation to html/etc., but I would probably do a bad job of it and can't speak to the sanity of maintaining such a thing. (So it's probably a bad idea.)

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward. -- Robert A. Heinlein
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On Tue, Sep 25, 2012 at 08:41:19AM +0800, Rural Hunter wrote:

I think the problem is on the options when I installed pgsql (both 9.1 and 9.2):

	Select the locale to be used by the new database cluster.
	Locale
	 [1] [Default locale]
	 [2] C
	 [3] POSIX
	 [4] zh_CN.utf8
	 [5] zh_HK.utf8
	 [6] zh_SG.utf8
	 [7] zh_TW.utf8
	Please choose an option [1] : 4

I chose 4 instead of 1. I guess the default locale (option 1) is with dash.

Well, if you run that query on template0 in the old and new cluster, you will see something different in the two of them. Could you have used default in one and a non-dash in the other?

Yes, that's true. The upgrade is fine with both fresh installs (9.1 and 9.2) with the option above (without dash). The problem only happens when I inited the 9.2 db with default locale (I guess that one has

OK, that is good to know. I developed the attached C program that does the setlocale canonical test. On Debian Squeeze, I could not see any change: if I pass en_US.UTF-8, I get en_US.UTF-8 returned; if I pass en_US.UTF8, I get en_US.UTF8 returned. Can anyone test this and find a case where the locale is canonicalized? Run it this way:

	$ canonical
	LC_COLLATE = 3
	LC_CTYPE = 0
	$ canonical 0 en_US.UTF8
	en_US.UTF8

We are looking for cases where the second argument produces a non-matching locale name as output. I have also attached a patch that reports the mismatching locale or encoding names --- this should at least help with debugging and show that a dash is the problem.

the dash). Just wondering why pg installer provides options without dash.

No idea.

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + It's impossible for everything to be true.
+

#include <stdio.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, char **argv)
{
	char	   *save;
	char	   *res;
	int			category;

	if (argc == 1)
	{
		printf("LC_COLLATE = %d\n", LC_COLLATE);
		printf("LC_CTYPE = %d\n", LC_CTYPE);
		return 0;
	}

	category = atoi(argv[1]);

	save = setlocale(category, NULL);
	if (!save)
	{
		printf("failed to get the current locale\n");
		return 0;
	}

	/* 'save' may be pointing at a modifiable scratch variable, so copy it. */
	save = strdup(save);

	/* set the locale with setlocale, to see if it accepts it. */
	res = setlocale(category, argv[2]);
	if (!res)
	{
		printf("failed to get system locale name for \"%s\"\n", argv[2]);
		return 0;
	}
	res = strdup(res);

	/* restore old value. */
	if (!setlocale(category, save))
	{
		printf("failed to restore old locale \"%s\"\n", save);
		return 0;
	}
	free(save);

	puts(res);
	return 0;
}

diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index beb177d..e4fec34
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*************** static void
*** 406,421 ****
  check_locale_and_encoding(ControlData *oldctrl,
  						  ControlData *newctrl)
  {
! 	/* These are often defined with inconsistent case, so use pg_strcasecmp(). */
  	if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
  		pg_log(PG_FATAL,
! 			   "old and new cluster lc_collate values do not match\n");
  	if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
  		pg_log(PG_FATAL,
! 			   "old and new cluster lc_ctype values do not match\n");
  	if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0)
  		pg_log(PG_FATAL,
! 			   "old and new cluster encoding values do not match\n");
  }
--- 406,428 ----
  check_locale_and_encoding(ControlData *oldctrl,
  						  ControlData *newctrl)
  {
! 	/*
! 	 * These are often defined with inconsistent case, so use pg_strcasecmp().
! 	 * They also often use inconsistent hyphenation, which we cannot fix, e.g.
! 	 * UTF-8 vs. UTF8, so at least we display the mismatching values.
! 	 */
  	if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0)
  		pg_log(PG_FATAL,
! 			   "lc_collate cluster values do not match:  old \"%s\", new \"%s\"\n",
! 			   oldctrl->lc_collate, newctrl->lc_collate);
  	if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0)
  		pg_log(PG_FATAL,
! 			   "lc_ctype cluster values do not match:  old \"%s\", new \"%s\"\n",
! 			   oldctrl->lc_ctype, newctrl->lc_ctype);
  	if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0)
  		pg_log(PG_FATAL,
! 			   "encoding cluster values do not match:  old \"%s\", new \"%s\"\n",
! 			   oldctrl->encoding, newctrl->encoding);
  }
Re: [HACKERS] Oid registry
Andrew Dunstan and...@dunslane.net writes: On 09/24/2012 09:24 PM, Tom Lane wrote: Another point to think about is that it's a few years too late to guarantee that any particular OID above 16384 is unused; we can't do that now without possibly breaking pg_upgrade-ability of existing databases. While we could hand out some subrange of the OIDs below that, there's not exactly a huge amount of space available.

We seem to have a fair bit of leeway between the top numbered Oid as reported by unused_oids (4332) and 16384.

It's not nearly that simple. Per transam.h:

 *		OIDs 1-9999 are reserved for manual assignment (see the files
 *		in src/include/catalog/).
 *
 *		OIDS 10000-16383 are reserved for assignment during initdb
 *		using the OID generator.  (We start the generator at 10000.)
 *
 *		OIDs beginning at 16384 are assigned from the OID generator
 *		during normal multiuser operation.  (We force the generator up to
 *		16384 as soon as we are in normal operation.)

pg_database.datlastsysoid is 12907 as of HEAD, so we're using about 2900 OIDs that are assigned during initdb. We should expect continued growth in that number. For comparison, it was 10791 in 8.1, which was the first version following this assignment scheme --- so the number of auto-assigned OIDs has more than tripled in the last seven years. And there's not room for another tripling. So I think it's entirely likely that we'll have to reduce FirstBootstrapObjectId to a smaller number in the foreseeable future, or else increase FirstNormalObjectId, which would be a pg_upgrade-breaking move. Meanwhile, we do continue to eat manually-assigned OIDs at a nontrivial pace as well. So, yeah, we could reserve a couple hundred OIDs for a scheme like this and (probably) not regret it later. But a couple thousand would scare me ... and I'm not exactly convinced that a couple hundred is enough, if there's any demand out there at all.
I suggest we could consider handing out reserved OIDs one or two at a time using the current manual assignment process, *without* any notion of a reserved block of OIDs. That is, when Joe Developer needs a couple of OIDs for joes_whizzy_datatype, he comes to us and we say "sure, you can have 4333 and 4334", which we then memorialize in a text file in include/catalog/. This doesn't preclude doing something differently later of course, but it avoids setting aside OID address space that is likely to be underutilized and fairly desperately needed at some future time.

As you might guess from this, I'm not enamored of the "reserve some private OID space" part at all. There's not been any demand for that that I've noticed. In pretty much any development context, you could use OIDs up around 4 billion for such purposes and not have any problem at all --- it's only trying to *guarantee* they're not used anywhere out in the field that is problematic.

			regards, tom lane
Re: [HACKERS] Oid registry
Andrew Dunstan and...@dunslane.net writes: On 09/24/2012 09:37 PM, Peter Eisentraut wrote: Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?

Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type, all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

Um ... that is an entirely unconvincing use case. We would not put any code into core that knows specifically about a non-core datatype, or at least I sure hope we'd not do such a klugy thing. It would be far better to invent an extensibility scheme (plug-in of some sort) for record_to_json.

My recollection of the PGCon discussion is that people wanted to allow client-side code to hard-wire type OIDs for add-on types, in more or less the same way that things like JDBC know that 25 is text. That's not unreasonable, since the alternatives are complicated and not all that reliable. I'm not sure the usage applies to anything except data types though, so at least for starters I'd only want to accept reservations of OIDs for data types.

			regards, tom lane
Re: [HACKERS] Switching timeline over streaming replication
On Monday, September 24, 2012 9:08 PM m...@rpzdesign.com wrote: What a disaster waiting to happen. Maybe the only replication should be master-master replication, so there is no need to sequence timelines or anything; all servers are ready masters, no backups or failovers. If you really do not want a master serving, then it should only be handled in the routing of traffic to that server and not the replication logic itself. The only thing that ever came about from failovers was the failure to turn over. The above is opinion only.

This feature is for users who want to use master-standby configurations. What do you mean by "then it should only be handled in the routing of traffic to that server and not the replication logic itself"? Do you have any idea other than the proposed implementation, or do you see any problem in the currently proposed solution?

On 9/24/2012 7:33 AM, Amit Kapila wrote: On Tuesday, September 11, 2012 10:53 PM Heikki Linnakangas wrote: I've been working on the often-requested feature to handle timeline changes over streaming replication. At the moment, if you kill the master and promote a standby server, and you have another standby server that you'd like to keep following the new master server, you need a WAL archive in addition to streaming replication to make it cross the timeline change. Streaming replication will just error out. Having a WAL archive is usually a good idea in complex replication scenarios anyway, but it would be good to not require it.

Confirm my understanding of this feature: this feature is for the case when standby-1, which is going to be promoted to master, has archive mode 'on', as in that case only will its timeline change. If the above is right, then there are other similar scenarios where it can be used:

Scenario-1 (1 Master, 1 Stand-by)
1. Master (archive_mode=on) goes down.
2. Master again comes up.
3. Stand-by tries to follow it.

Now in the above scenario also, due to timeline mismatch it gives an error, but your patch should fix it.
Some parts of this patch are just refactoring that probably makes sense regardless of the new functionality. For example, I split off the timeline history file related functions to a new file, timeline.c. That's not very much code, but it's fairly isolated, and xlog.c is massive, so I feel that anything that we can move off from xlog.c is a good thing. I also moved off the two functions RestoreArchivedFile() and ExecuteRecoveryCommand() to a separate file. Those are also not much code, but are fairly isolated. If no-one objects to those changes, and the general direction this work is going in, I'm going to split off those refactorings to separate patches and commit them separately.

I also made the timeline history file a bit more detailed: instead of recording just the WAL segment where the timeline was changed, it now records the exact XLogRecPtr. That was required for the walsender to know the switchpoint without having to parse the XLOG records (it reads and parses the history file instead).

IMO separating the timeline history file related functions into a new file is good. However, I am not sure about splitting RestoreArchivedFile() and ExecuteRecoveryCommand() into a separate file. How about splitting out all archive-related functions:

static void XLogArchiveNotify(const char *xlog);
static void XLogArchiveNotifySeg(XLogSegNo segno);
static bool XLogArchiveCheckDone(const char *xlog);
static bool XLogArchiveIsBusy(const char *xlog);
static void XLogArchiveCleanup(const char *xlog);
..
..

In any case, it will be better if you can split it into multiple patches:
1. Having the new functionality of switching timeline over streaming replication.
2. Refactoring-related changes.

That can make my testing and review of the new feature patch a little easier.

With Regards,
Amit Kapila.
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On 2012/9/25 11:00, Bruce Momjian wrote:

On Tue, Sep 25, 2012 at 08:41:19AM +0800, Rural Hunter wrote: I think the problem is on the options when I installed pgsql (both 9.1 and 9.2): Select the locale to be used by the new database cluster. Locale [1] [Default locale] [2] C [3] POSIX [4] zh_CN.utf8 [5] zh_HK.utf8 [6] zh_SG.utf8 [7] zh_TW.utf8 Please choose an option [1] : 4 I chose 4 instead of 1. I guess the default locale (option 1) is with dash.

Well, if you run that query on template0 in the old and new cluster, you will see something different in the two of them. Could you have used default in one and a non-dash in the other?

Yes, that's true. The upgrade is fine with both fresh installs (9.1 and 9.2) with the option above (without dash). The problem only happens when I inited the 9.2 db with default locale (I guess that one has

OK, that is good to know. I developed the attached C program that does the setlocale canonical test. On Debian Squeeze, I could not see any change: if I pass en_US.UTF-8, I get en_US.UTF-8 returned; if I pass en_US.UTF8, I get en_US.UTF8 returned. Can anyone test this and find a case where the locale is canonicalized? Run it this way:

	$ canonical
	LC_COLLATE = 3
	LC_CTYPE = 0
	$ canonical 0 en_US.UTF8
	en_US.UTF8

We are looking for cases where the second argument produces a non-matching locale name as output.
It matches on my system (ubuntu 10.10 server):

	$ ./canonical
	LC_COLLATE = 3
	LC_CTYPE = 0
	$ ./canonical 0 zh_CN.UTF-8
	zh_CN.UTF-8
	$ ./canonical 0 zh_CN.UTF8
	zh_CN.UTF8
	$ ./canonical 0 zh_CN.utf8
	zh_CN.utf8
	$ ./canonical 0 zh_CN.utf-8
	zh_CN.utf-8

I tested the checker with the patch:

	$ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c
	Performing Consistency Checks
	-----------------------------
	Checking current, bin, and data directories                 ok
	Checking cluster versions                                   ok
	Checking database user is a superuser                       ok
	Checking for prepared transactions                          ok
	Checking for reg* system OID user data types                ok
	Checking for contrib/isn with bigint-passing mismatch       ok
	lc_collate cluster values do not match:  old "zh_CN.utf8", new "zh_CN.UTF-8"
	Failure, exiting

zh_CN.utf8 is provided by the installer and zh_CN.UTF-8 is my system default.

I have also attached a patch that reports the mismatching locale or encoding names --- this should at least help with debugging and show that a dash is the problem.

the dash). Just wondering why pg installer provides options without dash.

No idea.
Re: [HACKERS] Oid registry
On Mon, Sep 24, 2012 at 8:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Andrew Dunstan and...@dunslane.net writes: On 09/24/2012 09:37 PM, Peter Eisentraut wrote: Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve?

Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better.

Um ... that is an entirely unconvincing use case. We would not put any code into core that knows specifically about a non-core datatype, or at least I sure hope we'd not do such a klugy thing. It would be far better to invent an extensibility scheme (plug-in of some sort) for record_to_json.

I don't think (and don't hope) the core should know about external data types, but I have certainly seen a lot of use cases where an external project wants to know about another external data type. Say, if plv8 wants to convert hstore into a javascript object. Users are free to define such a function that accepts hstore as an argument, but how does plv8 know the input is actually hstore? Of course you can look up the type name by consulting SysCache and see if the type name is hstore, but it's expensive to do that for every function invocation, so caching hstore's oid in plv8 is the current workaround, which is not truly safe because hstore's oid may change while it is cached. I can tell similar stories around array creation, which needs the element type oid. The core code can do some special stuff by using pre-defined type or function oid macros, and I guess it is reasonable for external developers to hope to do such things.

Thanks,
--
Hitoshi Harada
Re: [HACKERS] Doc patch to note which system catalogs have oids
On 09/23/2012 08:57:45 PM, Karl O. Pinc wrote: The attached patch documents the oid column of those system catalogs having an oid.

Don't use the first version of this patch (oid_doc.patch) without discarding the last hunk. The last hunk introduces an error by duplicating the documentation of the pg_roles.oid column.

(I am reluctant to post a revised version since there are already 3 versions floating around representing 2 different approaches and no consensus as to which approach, if any, should be taken.)

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward. -- Robert A. Heinlein