Re: [HACKERS] Synchronization levels in SR
On Tue, Sep 7, 2010 at 6:02 AM, Simon Riggs si...@2ndquadrant.com wrote:
> On Mon, 2010-09-06 at 22:32 +0200, Boszormenyi Zoltan wrote:
>> (in commit)
>> write wal record
>> release locks/etc
>> xact2 can proceed from here
>> wait for sync ack
>>
>> In the first case, the contention is obviously increased. With this, we are creating more idle time in the server instead of letting other transactions do their jobs as soon as possible. The second method was implemented in my patch. Are there any drawbacks with this?
>
> Then I respectfully suggest that you're releasing locks too early. Your proposal would allow a 2nd user to see the results of the 1st user's transaction before the 1st user knew about whether it had committed or not. I know why you want that, but I don't think it's right.

Agreed. That's why I put the wait before ProcArrayEndTransaction() is called.

Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UTF16 surrogate pairs in UTF8 encoding
On 9/7/10, Peter Eisentraut pete...@gmx.net wrote:
> On Sun, 2010-08-22 at 15:15 -0400, Tom Lane wrote:
>>> We combine the surrogate pair components to a single code point and encode that in UTF-8. We don't encode the components separately; that would be wrong.
>>
>> Oh, OK. Should the docs make that a bit clearer?
>
> Done.

This is confusing:

(When surrogate pairs are used when the server encoding is <literal>UTF8</literal>, they are first combined into a single code point that is then encoded in UTF-8.)

So something else happens if encoding is not UTF8?

I think this part can be simply removed; it does not add anything. Or say that surrogate pairs are only allowed in UTF8 encoding. The reason is that you cannot encode 0..7F codepoints with them, and only those are allowed to be given numerically. But this is already mentioned before.

--
marko
Re: [HACKERS] git: uh-oh
Tom Lane wrote:
> Well, even if the goal is to faithfully represent the bogus history shown by CVS, cvs2git isn't doing a good job of it.

Them's fightin' words :-)

> In the case of src/bin/pg_dump/po/it.po, the CVS history claims that the version added to REL8_4_STABLE on 2010-05-13 is a child of the mainline version 1.7 committed on 2010-02-19. Therefore, according to CVS the file existed on the branch from 2010-02-19, not 2010-02-28 as claimed by the cvs2git translation.

Incorrect. The CVS history implies three user-initiated events in this neighborhood:

2010.02.19: version 1.7 committed to trunk
unknown date: file added to branch REL8_4_STABLE (1.7.6)
2010.05.13: file modified on branch REL8_4_STABLE to create 1.7.6.1

The CVS history gives no reason to assume that the middle event happened on 2010-02-19, or on 2010-05-13, or on any other particular date. *If* you trust the timestamps (which cvs2git treats sceptically because they are often wrong), then you can say with certainty that the intermediate event happened sometime between the two numbered commits.

It is cvs2git policy to try to group add-branch-tag-to-file events together if such grouping is consistent with the nearby commit dates. The files contrib/xml2/expected/xml2.out and contrib/xml2/sql/xml2.sql have the following constraints:

contrib/xml2/expected/xml2.out:
2010.02.28: 1.1
unknown date: file added to branch REL8_4_STABLE (1.1.2)
2010.03.01: 1.1.2.1

contrib/xml2/sql/xml2.sql:
2010.02.28: 1.1
unknown date: file added to branch REL8_4_STABLE (1.1.2)
2010.03.01: 1.1.2.1

Since there is a date range (2010-02-28 - 2010-03-01) consistent with all of the constraints, cvs2git picks a date in that range for a commit that adds all three files to branch REL8_4_STABLE.

> I did some cvs co operations to check this and cvs does indeed retrieve the file between 02-19 and 02-28, but not before 02-19. So I don't think you can defend the cvs2git behavior by claiming that it's an exact translation.

CVS is using the same incomplete data as cvs2svn and, just like cvs2git, it has to pick a date out of its hat. It happens to choose a different date than cvs2git. *Neither CVS nor cvs2git can be sure when the file was really added to the branch, and neither is more likely to be correct than the other.* (Actually, cvs2git is arguably more likely to be correct because it uses information from multiple files in its heuristic whereas CVS considers information for only the single file.)

Robert Haas wrote:
> One thing I'm not quite clear on is how cvs2git thinks CVS should look given what we actually did vs. how it actually does look,

The crux of the problem is that there is a plethora of hypothetical true histories that are consistent with the incomplete data recorded by CVS. cvs2svn/cvs2git picks a history that is

1. Correct, which I define to mean that the chosen history is not contradicted by the CVS data (with deviations allowed only when the CVS data is internally inconsistent). Any problems with this criterion are considered serious bugs.

But (1) still leaves a vast number of possible histories. So a secondary goal is to choose a history that is

2. Plausible, meaning that the history is believable given the way that people typically develop software in a typical CVS project. This is necessarily subjective and depends a lot on project culture and policies. (A cvs2git written from scratch for the pgsql project would undoubtedly be more mindful of your project's policies.) Improvements on this criterion are also constrained by performance requirements.

Michael
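The grouping rule Michael describes can be read as an interval intersection: each file constrains the unknown branch-add date to lie between two known commits, and one commit can cover several files if a date satisfies every constraint. The sketch below is only an illustration of that idea, not cvs2git's code; the DateRange type, the intersect_ranges name and the yyyymmdd integer encoding are assumptions made for the example.

```c
#include <stdbool.h>

/* Hypothetical type: the window in which a file may have been added to a
 * branch, bounded by the commits before and after it. Dates are yyyymmdd. */
typedef struct {
    int earliest;
    int latest;
} DateRange;

/* Intersect n windows. Returns true and stores the common window in *out
 * if at least one date satisfies every constraint, false otherwise. */
bool intersect_ranges(const DateRange *ranges, int n, DateRange *out)
{
    if (n <= 0)
        return false;

    DateRange common = ranges[0];
    for (int i = 1; i < n; i++) {
        if (ranges[i].earliest > common.earliest)
            common.earliest = ranges[i].earliest;   /* latest lower bound */
        if (ranges[i].latest < common.latest)
            common.latest = ranges[i].latest;       /* earliest upper bound */
    }
    if (common.earliest > common.latest)
        return false;                               /* no common date */

    *out = common;
    return true;
}
```

Fed the three windows from the message (it.po: 2010-02-19 to 2010-05-13; both xml2 files: 2010-02-28 to 2010-03-01), the common window is 2010-02-28 to 2010-03-01, which matches the range quoted above.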
Re: [HACKERS] Synchronization levels in SR
Fujii Masao wrote:
> On Tue, Sep 7, 2010 at 6:02 AM, Simon Riggs si...@2ndquadrant.com wrote:
>> On Mon, 2010-09-06 at 22:32 +0200, Boszormenyi Zoltan wrote:
>>> (in commit)
>>> write wal record
>>> release locks/etc
>>> xact2 can proceed from here
>>> wait for sync ack
>>>
>>> In the first case, the contention is obviously increased. With this, we are creating more idle time in the server instead of letting other transactions do their jobs as soon as possible. The second method was implemented in my patch. Are there any drawbacks with this?
>>
>> Then I respectfully suggest that you're releasing locks too early. Your proposal would allow a 2nd user to see the results of the 1st user's transaction before the 1st user knew about whether it had committed or not. I know why you want that, but I don't think it's right.
>
> Agreed. That's why I put the wait before ProcArrayEndTransaction() is called.

Then there is no use to implement individual sync/async replicated transactions, period. An async replicated transaction that waits for a sync replicated transaction because of locks will become implicitly sync. It just waits for another transaction's sync ack.

Best regards,
Zoltán Böszörményi

> Regards,

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
Re: [HACKERS] UTF16 surrogate pairs in UTF8 encoding
On Wed, 2010-09-08 at 10:18 +0300, Marko Kreen wrote:
> On 9/7/10, Peter Eisentraut pete...@gmx.net wrote:
>> On Sun, 2010-08-22 at 15:15 -0400, Tom Lane wrote:
>>>> We combine the surrogate pair components to a single code point and encode that in UTF-8. We don't encode the components separately; that would be wrong.
>>>
>>> Oh, OK. Should the docs make that a bit clearer?
>>
>> Done.
>
> This is confusing:
>
> (When surrogate pairs are used when the server encoding is <literal>UTF8</literal>, they are first combined into a single code point that is then encoded in UTF-8.)
>
> So something else happens if encoding is not UTF8?

Then you can't specify surrogate pairs because they are outside of the ASCII range, per constraint mentioned earlier in the paragraph.

> I think this part can be simply removed; it does not add anything. Or say that surrogate pairs are only allowed in UTF8 encoding. The reason is that you cannot encode 0..7F codepoints with them, and only those are allowed to be given numerically. But this is already mentioned before.

Well, Tom wanted an additional explanation. I personally agree with you; this is not the place to explain encoding and Unicode internals, when really the code only does what it's supposed to.
Re: [HACKERS] Synchronization levels in SR
On Wed, Sep 8, 2010 at 7:04 PM, Boszormenyi Zoltan z...@cybertec.at wrote:
> Then there is no use to implement individual sync/async replicated transactions, period. An async replicated transaction that waits for a sync replicated transaction because of locks will become implicitly sync. It just waits for another transaction's sync ack.

Hmm.. it's the same with an async transaction (i.e., synchronous_commit = false) and a sync one (synchronous_commit = true). The async transaction cannot take the lock held by the sync one until the sync one has flushed the WAL.

Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] UTF16 surrogate pairs in UTF8 encoding
On 9/8/10, Peter Eisentraut pete...@gmx.net wrote:
> On Wed, 2010-09-08 at 10:18 +0300, Marko Kreen wrote:
>> On 9/7/10, Peter Eisentraut pete...@gmx.net wrote:
>>> On Sun, 2010-08-22 at 15:15 -0400, Tom Lane wrote:
>>>>> We combine the surrogate pair components to a single code point and encode that in UTF-8. We don't encode the components separately; that would be wrong.
>>>>
>>>> Oh, OK. Should the docs make that a bit clearer?
>>>
>>> Done.
>>
>> This is confusing:
>>
>> (When surrogate pairs are used when the server encoding is <literal>UTF8</literal>, they are first combined into a single code point that is then encoded in UTF-8.)
>>
>> So something else happens if encoding is not UTF8?
>
> Then you can't specify surrogate pairs because they are outside of the ASCII range, per constraint mentioned earlier in the paragraph.
>
>> I think this part can be simply removed; it does not add anything. Or say that surrogate pairs are only allowed in UTF8 encoding. The reason is that you cannot encode 0..7F codepoints with them, and only those are allowed to be given numerically. But this is already mentioned before.
>
> Well, Tom wanted an additional explanation. I personally agree with you; this is not the place to explain encoding and Unicode internals, when really the code only does what it's supposed to.

Ah OK, I had the impression you changed the wording before that too, so then this addition seemed unnecessary. But it seems you only changed formatting.

Anyway, this "when" makes it weird. Maybe a more concise version:

To repeat, surrogate pairs are combined to a single character and then encoded, not stored separately.

Although it does seem unnecessary.

--
marko
Re: [HACKERS] Synchronization levels in SR
Fujii Masao wrote:
> On Wed, Sep 8, 2010 at 7:04 PM, Boszormenyi Zoltan z...@cybertec.at wrote:
>> Then there is no use to implement individual sync/async replicated transactions, period. An async replicated transaction that waits for a sync replicated transaction because of locks will become implicitly sync. It just waits for another transaction's sync ack.
>
> Hmm.. it's the same with an async transaction (i.e., synchronous_commit = false) and a sync one (synchronous_commit = true). The async transaction cannot take the lock held by the sync one until the sync one has flushed the WAL.

You are right.

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
Re: [HACKERS] Synchronization levels in SR
On Wed, Sep 8, 2010 at 6:52 AM, Boszormenyi Zoltan z...@cybertec.at wrote:
> Fujii Masao wrote:
>> On Wed, Sep 8, 2010 at 7:04 PM, Boszormenyi Zoltan z...@cybertec.at wrote:
>>> Then there is no use to implement individual sync/async replicated transactions, period. An async replicated transaction that waits for a sync replicated transaction because of locks will become implicitly sync. It just waits for another transaction's sync ack.
>>
>> Hmm.. it's the same with an async transaction (i.e., synchronous_commit = false) and a sync one (synchronous_commit = true). The async transaction cannot take the lock held by the sync one until the sync one has flushed the WAL.
>
> You are right.

I still don't see why it matters whether you wait before or after releasing locks. As soon as the transaction is marked committed in CLOG, other transactions can potentially see its effects. Holding on to all the locks might mitigate that somewhat, but it's not going to eliminate the problem. And in any event, there is ALWAYS a window of time during which the client doesn't know the transaction has committed but other transactions can potentially see its effects.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] Synchronization levels in SR
On Wed, Sep 8, 2010 at 8:43 PM, Robert Haas robertmh...@gmail.com wrote:
> I still don't see why it matters whether you wait before or after releasing locks. As soon as the transaction is marked committed in CLOG, other transactions can potentially see its effects.

AFAIR, even if CLOG has been updated, until the transaction is marked as no longer running in PGPROC, probably other transactions cannot see its effects. But, if it's not true, I'd make the transaction wait for replication before CLOG update.

> And in any event, there is ALWAYS a window of time during which the client doesn't know the transaction has committed but other transactions can potentially see its effects.

Yep. The problem here is that synchronous replication is likely to make the window very big.

Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Synchronization levels in SR
On 09/08/2010 12:04 PM, Boszormenyi Zoltan wrote:
> Then there is no use to implement individual sync/async replicated transactions, period.

I disagree. Different transactions have different priorities for latency vs. failure-resistance.

> An async replicated transaction that waits for a sync replicated transaction because of locks will become implicitly sync.

Sure. But how often do your transactions wait for another one because of locks? What do we have MVCC for?

Regards

Markus Wanner
Re: [HACKERS] Synchronization levels in SR
On Wed, 2010-09-08 at 12:04 +0200, Boszormenyi Zoltan wrote:
>>> I know why you want that, but I don't think it's right.
>>
>> Agreed. That's why I put the wait before ProcArrayEndTransaction() is called.
>
> Then there is no use to implement individual sync/async replicated transactions, period. An async replicated transaction that waits for a sync replicated transaction because of locks will become implicitly sync. It just waits for another transaction's sync ack.

You aren't making any sense. You have made a general observation and deduced something specific about replication from it. Most transactions are not blocked by locks, especially in well designed applications, so the argument is not relevant to replication.

If *any* two transactions wait upon each other then t2 will always wait until t1 has completed. If t1 is slow then any tuning you do on t2 will likely be wasted. If you are concerned about performance you should first remove the dependency between t1 and t2.

The above observation isn't sufficient to conclude that tuning of t2 should not happen via the tuning feature Simon has suggested. It's not sufficient to conclude much, if anything.

As it turns out, in the scenario you outline t2 *would* return faster because you had marked it as async. But it would wait behind t1, as you say. So the performance gain will be clear and measurable. Even so, it would be best to tune the problem (lock contention), not moan that the tool you're choosing to use (tuning replication) is at fault for being inappropriate to the problem.

Mixing sync and async transactions is useful and it's a simple matter to come up with real examples where it would benefit, as well as easily testable workloads using pgbench. For example, customer table updates (sync) alongside chat messages (async).

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Synchronization levels in SR
On Wed, Sep 8, 2010 at 8:30 AM, Fujii Masao masao.fu...@gmail.com wrote:
>> And in any event, there is ALWAYS a window of time during which the client doesn't know the transaction has committed but other transactions can potentially see its effects.
>
> Yep. The problem here is that synchronous replication is likely to make the window very big.

So what? If the correctness of your application depends on the *amount of time* this window lasts, it's already broken. It seems like you're arguing that we should artificially increase lock contention to guard against possible race conditions in user applications. That doesn't make any sense to me, so one of us is confused.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] english parser in text search: support for multiple words in the same position
For the headline generation to work properly, email/file/url/host need to become skip tokens. Updating the patch with that change. -Sushant. On Sat, 2010-09-04 at 13:25 +0530, Sushant Sinha wrote: Updating the patch with emitting parttoken and registering it with snowball config. -Sushant. On Fri, 2010-09-03 at 09:44 -0400, Robert Haas wrote: On Wed, Sep 1, 2010 at 2:42 AM, Sushant Sinha sushant...@gmail.com wrote: I have attached a patch that emits parts of a host token, a url token, an email token and a file token. Further, it makes sure that a host/url/email/file token and the first part-token are at the same position in tsvector. You should probably add this patch here: https://commitfest.postgresql.org/action/commitfest_view/open Index: src/backend/snowball/snowball.sql.in === RCS file: /projects/cvsroot/pgsql/src/backend/snowball/snowball.sql.in,v retrieving revision 1.6 diff -u -r1.6 snowball.sql.in --- src/backend/snowball/snowball.sql.in 27 Oct 2007 16:01:08 - 1.6 +++ src/backend/snowball/snowball.sql.in 7 Sep 2010 01:46:55 - @@ -22,6 +22,6 @@ WITH _ASCDICTNAME_; ALTER TEXT SEARCH CONFIGURATION _CFGNAME_ ADD MAPPING -FOR word, hword_part, hword +FOR word, hword_part, hword, parttoken WITH _NONASCDICTNAME_; Index: src/backend/tsearch/ts_parse.c === RCS file: /projects/cvsroot/pgsql/src/backend/tsearch/ts_parse.c,v retrieving revision 1.17 diff -u -r1.17 ts_parse.c --- src/backend/tsearch/ts_parse.c 26 Feb 2010 02:01:05 - 1.17 +++ src/backend/tsearch/ts_parse.c 7 Sep 2010 01:46:55 - @@ -19,7 +19,7 @@ #include tsearch/ts_utils.h #define IGNORE_LONGLEXEME 1 - +#define COMPLEX_TOKEN(x) ( x == 4 || x == 5 || x == 6 || x == 18 || x == 17 || x == 18 || x == 19) /* * Lexize subsystem */ @@ -407,8 +407,6 @@ { TSLexeme *ptr = norms; - prs-pos++; /* set pos */ - while (ptr-lexeme) { if (prs-curwords == prs-lenwords) @@ -429,6 +427,10 @@ prs-curwords++; } pfree(norms); + + if (!COMPLEX_TOKEN(type)) +prs-pos++; /* set pos */ + } } while (type 0); Index: 
src/backend/tsearch/wparser_def.c === RCS file: /projects/cvsroot/pgsql/src/backend/tsearch/wparser_def.c,v retrieving revision 1.33 diff -u -r1.33 wparser_def.c --- src/backend/tsearch/wparser_def.c 19 Aug 2010 05:57:34 - 1.33 +++ src/backend/tsearch/wparser_def.c 7 Sep 2010 01:46:56 - @@ -23,7 +23,7 @@ /* Define me to enable tracing of parser behavior */ -/* #define WPARSER_TRACE */ +//#define WPARSER_TRACE /* Output token categories */ @@ -51,8 +51,9 @@ #define SIGNEDINT 21 #define UNSIGNEDINT 22 #define XMLENTITY 23 +#define PARTTOKEN 24 -#define LASTNUM 23 +#define LASTNUM 24 static const char *const tok_alias[] = { , @@ -78,7 +79,8 @@ float, int, uint, - entity + entity, + parttoken }; static const char *const lex_descr[] = { @@ -105,7 +107,8 @@ Decimal notation, Signed integer, Unsigned integer, - XML entity + XML entity, +Part of file/url/host/email }; @@ -249,7 +252,8 @@ TParserPosition *state; bool ignore; bool wanthost; - + int partstop; + TParserState afterpart; /* silly char */ char c; @@ -617,8 +621,41 @@ } return 1; } +static int +p_ispartbingo(TParser *prs) +{ + int ret = 0; + if (prs-partstop 0) + { + ret = 1; + if (prs-partstop = prs-state-posbyte) + { + prs-state-state = prs-afterpart; + prs-partstop = 0; + } + else + prs-state-state = TPS_Base; + } + return ret; +} +static int +p_ispart(TParser *prs) +{ + if (prs-partstop 0) + return 1; + else + return 0; +} +static int +p_ispartEOF(TParser *prs) +{ + if (p_ispart(prs) p_isEOF(prs)) + return 1; + else + return 0; +} /* deliberately suppress unused-function complaints for the above */ void _make_compiler_happy(void); void @@ -688,6 +725,21 @@ } static void +SpecialPart(TParser *prs) +{ + prs-partstop = prs-state-posbyte; + prs-state-posbyte -= prs-state-lenbytetoken; + prs-state-poschar -= prs-state-lenchartoken; + prs-afterpart = TPS_Base; +} +static void +SpecialUrlPart(TParser *prs) +{ + SpecialPart(prs); + prs-afterpart = TPS_InURLPathStart; +} + +static void SpecialVerVersion(TParser *prs) { 
prs-state-posbyte -= prs-state-lenbytetoken; @@ -1057,6 +1109,7 @@ {p_iseqC, '-', A_PUSH, TPS_InSignedIntFirst, 0, NULL}, {p_iseqC, '+', A_PUSH, TPS_InSignedIntFirst, 0, NULL}, {p_iseqC, '', A_PUSH, TPS_InXMLEntityFirst, 0, NULL}, + {p_ispart, 0, A_NEXT, TPS_InSpace, 0, NULL}, {p_iseqC, '~', A_PUSH, TPS_InFileTwiddle, 0, NULL}, {p_iseqC, '/', A_PUSH, TPS_InFileFirst, 0, NULL}, {p_iseqC, '.', A_PUSH, TPS_InPathFirstFirst, 0, NULL}, @@ -1065,9 +1118,11 @@ static const TParserStateActionItem actionTPS_InNumWord[] = { + {p_ispartEOF, 0,
Re: [HACKERS] Synchronization levels in SR
On Wed, Sep 8, 2010 at 10:07 PM, Robert Haas robertmh...@gmail.com wrote:
> On Wed, Sep 8, 2010 at 8:30 AM, Fujii Masao masao.fu...@gmail.com wrote:
>>> And in any event, there is ALWAYS a window of time during which the client doesn't know the transaction has committed but other transactions can potentially see its effects.
>>
>> Yep. The problem here is that synchronous replication is likely to make the window very big.
>
> So what? If the correctness of your application depends on the *amount of time* this window lasts, it's already broken. It seems like you're arguing that we should artificially increase lock contention to guard against possible race conditions in user applications. That doesn't make any sense to me, so one of us is confused.

Yep ;) On second thought, the problem here is that the effects of the transaction marked as committed but still waiting for replication can disappear after failover.

Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Synchronization levels in SR
On Wed, Sep 8, 2010 at 9:32 AM, Fujii Masao masao.fu...@gmail.com wrote:
> On Wed, Sep 8, 2010 at 10:07 PM, Robert Haas robertmh...@gmail.com wrote:
>> On Wed, Sep 8, 2010 at 8:30 AM, Fujii Masao masao.fu...@gmail.com wrote:
>>>> And in any event, there is ALWAYS a window of time during which the client doesn't know the transaction has committed but other transactions can potentially see its effects.
>>>
>>> Yep. The problem here is that synchronous replication is likely to make the window very big.
>>
>> So what? If the correctness of your application depends on the *amount of time* this window lasts, it's already broken. It seems like you're arguing that we should artificially increase lock contention to guard against possible race conditions in user applications. That doesn't make any sense to me, so one of us is confused.
>
> Yep ;) On second thought, the problem here is that the effects of the transaction marked as committed but still waiting for replication can disappear after failover.

Ah! I think that's right. So the scenario we're trying to guard against is something like this. A customer makes a withdrawal of money from an ATM; their bank balance is debited. The transaction tries to commit. After the transaction becomes visible to other backends but before WAL reaches the standby, another transaction begins and reads the customer's balance. Naturally, they get the new, lower balance. Crash, master dead. Failover. If another transaction begins and reads the customer's balance again, it's back to the old value. So we have a phantom transaction: it appeared as committed and then vanished again.

So that means we have to make sure that none of the effects of a transaction can be seen until WAL is fsync'd on the master AND the slave has acked.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
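The rule Robert arrives at can be stated as a small predicate. What follows is a toy model for reasoning about the ATM scenario, not PostgreSQL code: the CommitState struct and the three function names are made up for this sketch, and it only covers a transaction that asked for synchronous replication.

```c
#include <stdbool.h>

/* Hypothetical state of one committing, synchronously replicated
 * transaction. These are not PostgreSQL data structures. */
typedef struct {
    bool wal_fsynced_on_master;   /* commit record is durable locally */
    bool standby_acked;           /* standby confirmed it has the record */
} CommitState;

/* The rule from the message: effects may be shown to other transactions
 * only once the WAL is fsync'd on the master AND the standby has acked. */
bool may_become_visible(const CommitState *s)
{
    return s->wal_fsynced_on_master && s->standby_acked;
}

/* After a failover, only what reached the standby still exists. */
bool survives_failover(const CommitState *s)
{
    return s->standby_acked;
}

/* A "phantom" is a transaction that others could see and that would then
 * vanish on failover. Under the rule above this is never true. */
bool phantom_possible(const CommitState *s)
{
    return may_become_visible(s) && !survives_failover(s);
}
```

In the ATM example the dangerous state is "fsync'd on the master, not yet acked": releasing visibility there is what lets a reader see the lower balance that disappears after failover. With the predicate above that state is simply not visible yet.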
Re: [HACKERS] plan time of MASSIVE partitioning ...
On Tue, Sep 7, 2010 at 2:14 PM, Boszormenyi Zoltan z...@cybertec.at wrote:
> Hi,
>
> Robert Haas wrote:
>> 2010/9/3 PostgreSQL - Hans-Jürgen Schönig postg...@cybertec.at:
>>> i tried this one with 5000 unindexed tables (just one col):
>>>
>>> test=# \timing
>>> Timing is on.
>>> test=# prepare x(int4) AS select * from t_data order by id desc;
>>> PREPARE
>>> Time: 361.552 ms
>>>
>>> you will see similar or higher runtimes in case of 500 partitions and a handful of indexes.
>>
>> I'd like to see (1) a script to reproduce your test environment (as Stephen also requested) and (2) gprof or oprofile results.
>
> attached are the test scripts, create_tables.sql and childtables.sql. The following query takes 4.7 seconds according to psql with \timing on:
>
> EXPLAIN SELECT * FROM qdrs WHERE streamstart BETWEEN '2010-04-06' AND '2010-06-25' ORDER BY streamhash;

Neat. Have you checked what effect this has on memory consumption?

Also, don't forget to add it to https://commitfest.postgresql.org/action/commitfest_view/open

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] UTF16 surrogate pairs in UTF8 encoding
Marko Kreen mark...@gmail.com writes:
> Although it does seem unnecessary.

The reason I asked for this to be spelled out is that ordinarily, a backslash escape \nnn is a very low-level thing that will insert exactly what you say. To me it's quite unexpected that the system would editorialize on that to the extent of replacing two UTF16 surrogate characters by a single code point. That's necessary for correctness because our underlying storage is UTF8, but it's not obvious that it will happen. (As a counterexample, if our underlying storage were UTF16, then very different things would need to happen for the exact same SQL input.)

I think a lot of people will have this same question when reading this para, which is why I asked for an explanation there.

regards, tom lane
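For readers who want the arithmetic behind "combined into a single code point that is then encoded in UTF-8", here is a minimal sketch of the two steps as defined by the Unicode UTF-16 and UTF-8 encoding forms. It is not PostgreSQL's implementation; the function names combine_surrogates and encode_utf8 are chosen for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Combine a UTF-16 high/low surrogate pair into one code point.
 * Returns 0 if the inputs are not a valid high+low pair. */
uint32_t combine_surrogates(uint16_t high, uint16_t low)
{
    if (high < 0xD800 || high > 0xDBFF || low < 0xDC00 || low > 0xDFFF)
        return 0;
    return 0x10000u + (((uint32_t) (high & 0x3FF)) << 10) + (uint32_t) (low & 0x3FF);
}

/* Encode one code point as UTF-8 into out. Returns the number of bytes
 * written (1 to 4), or 0 if cp is a lone surrogate or beyond U+10FFFF. */
size_t encode_utf8(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;           /* surrogates are never encoded on their own */
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char) (0xF0 | (cp >> 18));
        out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char) (0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

As a worked case, the pair D83D DE00 combines to U+1F600, which encodes as the four bytes F0 9F 98 80. Encoding each half separately is rejected here, in line with the earlier remark in the thread that doing so would be wrong.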
Re: [HACKERS] plan time of MASSIVE partitioning ...
hello ...

no, we have not checked memory consumption. there is still some stuff left to optimize away - it seems we are going close to O(n^2) somewhere. equal is called really often in our sample case as well:

Each sample counts as 0.01 seconds.
  %   cumulative   self               self     total
 time   seconds   seconds     calls  s/call   s/call  name
18.87      0.80      0.80      4812    0.00     0.00  make_canonical_pathkey
15.33      1.45      0.65      4811    0.00     0.00  get_eclass_for_sort_expr
14.15      2.05      0.60   8342410    0.00     0.00  equal
 6.13      2.31      0.26    229172    0.00     0.00  SearchCatCache
 3.66      2.47      0.16   5788835    0.00     0.00  _equalList
 3.07      2.60      0.13   1450043    0.00     0.00  hash_search_with_hash_value
 2.36      2.70      0.10   2272545    0.00     0.00  AllocSetAlloc
 2.12      2.79      0.09    811460    0.00     0.00  hash_any
 1.89      2.87      0.08   3014983    0.00     0.00  list_head
 1.89      2.95      0.08    574526    0.00     0.00  _bt_compare
 1.77      3.02      0.08  11577670    0.00     0.00  list_head
 1.42      3.08      0.06      1136    0.00     0.00  tzload
 0.94      3.12      0.04   2992373    0.00     0.00  AllocSetFreeIndex

look at the number of calls ... equal is scary ... make_canonical_pathkey is fixed it seems. get_eclass_for_sort_expr seems a little more complex to fix.

great you like it ...

regards,

hans

On Sep 8, 2010, at 3:54 PM, Robert Haas wrote:
> On Tue, Sep 7, 2010 at 2:14 PM, Boszormenyi Zoltan z...@cybertec.at wrote:
>> Hi,
>>
>> Robert Haas wrote:
>>> 2010/9/3 PostgreSQL - Hans-Jürgen Schönig postg...@cybertec.at:
>>>> i tried this one with 5000 unindexed tables (just one col):
>>>>
>>>> test=# \timing
>>>> Timing is on.
>>>> test=# prepare x(int4) AS select * from t_data order by id desc;
>>>> PREPARE
>>>> Time: 361.552 ms
>>>>
>>>> you will see similar or higher runtimes in case of 500 partitions and a handful of indexes.
>>>
>>> I'd like to see (1) a script to reproduce your test environment (as Stephen also requested) and (2) gprof or oprofile results.
>>
>> attached are the test scripts, create_tables.sql and childtables.sql. The following query takes 4.7 seconds according to psql with \timing on:
>>
>> EXPLAIN SELECT * FROM qdrs WHERE streamstart BETWEEN '2010-04-06' AND '2010-06-25' ORDER BY streamhash;
>
> Neat. Have you checked what effect this has on memory consumption?
>
> Also, don't forget to add it to https://commitfest.postgresql.org/action/commitfest_view/open
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
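To make the O(n^2) suspicion concrete, here is a generic illustration of how a "scan the whole list for an equal entry before appending" pattern produces a quadratic number of comparisons. It is not the planner's code; count_dedup_comparisons and the use of plain integers as keys are assumptions made for the sketch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Append the distinct keys 0..n-1 to a list, checking the existing
 * entries for an equal one before each append. Returns how many
 * comparisons were made (0 if the list could not be allocated). */
unsigned long long count_dedup_comparisons(size_t n)
{
    unsigned long long comparisons = 0;
    size_t len = 0;
    int *list = malloc(n * sizeof *list);

    if (list == NULL)
        return 0;

    for (size_t key = 0; key < n; key++) {
        int found = 0;
        for (size_t i = 0; i < len; i++) {
            comparisons++;                  /* one equality check per entry */
            if (list[i] == (int) key) {
                found = 1;
                break;
            }
        }
        if (!found)
            list[len++] = (int) key;        /* every key is new, so list grows */
    }
    free(list);
    return comparisons;
}
```

With all keys distinct the count is n(n-1)/2. For n = 4812, the number of make_canonical_pathkey calls in the profile, that is 11,575,266 comparisons, the same order of magnitude as the call counts reported for equal and list_head. Whether the planner's hot path has exactly this shape is not something the profile alone establishes.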
Re: [HACKERS] plan time of MASSIVE partitioning ...
* Hans-Jürgen Schönig (postg...@cybertec.at) wrote:
> no, we have not checked memory consumption. there is still some stuff left to optimize away - it seems we are going close to O(n^2) somewhere. equal is called really often in our sample case as well:

Did the mail with the scripts, etc, get hung up due to size or something..? I didn't see it on the mailing list nor in the archives.. If so, could you post them somewhere so others could look..?

Thanks,

Stephen
Re: [HACKERS] git: uh-oh
Michael Haggerty mhag...@alum.mit.edu writes: Tom Lane wrote: Well, even if the goal is to faithfully represent the bogus history shown by CVS, cvs2git isn't doing a good job of it. Them's fightin' words :-) Yeah ;-), but they were mainly directed at Robert, who AIUI was asserting that the behavior of cvs co -D ought to be taken as defining what the CVS history means. I don't particularly buy that, and clearly you don't either. Incorrect. The CVS history implies three user-initiated events in this neighborhood: 2010.02.19: version 1.7 committed to trunk unknown date: file added to branch REL8_4_STABLE (1.7.6) 2010.05.13: file modified on branch REL8_4_STABLE to create 1.7.6.1 Right. The problem I've got is that cvs2git takes unknown as meaning I can do whatever I want, the more random the better. It would seem to me to be good software engineering to recognize that you don't have enough information and to provide some way for cvs2git's users to modify its behavior on this point. Anyway I think the solution path for us is probably going to be to retroactively add the information, along the lines suggested by Max. I was hoping that somebody would have tried a conversion by now with the partial patch I suggested last night, but maybe I'm going to have to do it myself. Where can I find the version of cvs2git we're using? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On Wed, 2010-09-08 at 09:50 -0400, Robert Haas wrote: So that means we have to make sure that none of the effects of a transaction can be seen until WAL is fsync'd on the master AND the slave has acked. Yes, that's right. And I like your example; one for the docs. There is a slight complexity there: An application might connect to the standby and see the changes made by the transaction, even though the master has not yet been notified, but will be in a moment. I don't see that as an issue though, but worth mentioning cos its just the Byzantine Generals problem. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UTF16 surrogate pairs in UTF8 encoding
On 9/8/10, Tom Lane t...@sss.pgh.pa.us wrote:

Marko Kreen mark...@gmail.com writes: Although it does seem unnecessary.

The reason I asked for this to be spelled out is that ordinarily, a backslash escape \nnn is a very low-level thing that will insert exactly what you say. To me it's quite unexpected that the system would editorialize on that to the extent of replacing two UTF16 surrogate characters by a single code point. That's necessary for correctness because our underlying storage is UTF8, but it's not obvious that it will happen. (As a counterexample, if our underlying storage were UTF16, then very different things would need to happen for the exact same SQL input.) I think a lot of people will have this same question when reading this para, which is why I asked for an explanation there.

Ok, but I still don't like the whens. How about:

-6-digit form technically makes this unnecessary. (When surrogate
-pairs are used when the server encoding is <literal>UTF8</literal>, they
-are first combined into a single code point that is then encoded
-in UTF-8.)
+6-digit form technically makes this unnecessary. (Surrogate
+pairs are not stored directly, but combined into a single
+code point that is then encoded in UTF-8.)

-- marko
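For readers following the surrogate-pair discussion, here is a minimal sketch of the "combine, then encode" step the proposed wording describes. It is not the PostgreSQL source; the function names combine_surrogates and encode_utf8 are hypothetical and exist only for this illustration. It assumes the standard UTF-16 surrogate ranges (high: D800..DBFF, low: DC00..DFFF).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: combine a UTF-16 surrogate pair into a single
 * code point.  Returns 0 if the inputs are not a valid high/low pair
 * (0 can never be the result of a valid pair).
 */
uint32_t combine_surrogates(uint16_t hi, uint16_t lo)
{
    if (hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF)
        return 0;
    return 0x10000u + ((uint32_t) (hi - 0xD800) << 10) + (uint32_t) (lo - 0xDC00);
}

/*
 * Hypothetical helper: encode one code point as UTF-8 into buf, which
 * must have room for 4 bytes.  Returns the number of bytes written,
 * or 0 for a lone surrogate or an out-of-range value.
 */
size_t encode_utf8(uint32_t cp, unsigned char *buf)
{
    if (cp <= 0x7F)
    {
        buf[0] = (unsigned char) cp;
        return 1;
    }
    if (cp <= 0x7FF)
    {
        buf[0] = (unsigned char) (0xC0 | (cp >> 6));
        buf[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF)
    {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;           /* surrogates are never encoded separately */
        buf[0] = (unsigned char) (0xE0 | (cp >> 12));
        buf[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        buf[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF)
    {
        buf[0] = (unsigned char) (0xF0 | (cp >> 18));
        buf[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
        buf[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        buf[3] = (unsigned char) (0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

With these helpers, the pair D83D/DE00 combines to U+1F600 and encodes as the four bytes F0 9F 98 80, whereas encoding either half on its own is rejected.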
Re: [HACKERS] plan time of MASSIVE partitioning ...
here is the patch again. we accidentally attached a wrong test file to the original posting so it grew too big. we had to revoke it from the moderator (this happens if you code from 8am to 10pm). here is just the patch - it is nice and small.

you can easily test it by making yourself a nice parent table, many subtables (hundreds or thousands) and a decent number of indexes per partition. then run PREPARE with \timing to see what happens. you should get scary planning times. the more potential indexes and tables the more scary it will be. using this wonderful RB tree the time for this function call goes down to basically zero.

i hope this is something which is useful to some folks out there.

many thanks, hans

canon-pathkeys-as-rbtree-3-ctxdiff.patch Description: Binary data

On Sep 8, 2010, at 4:18 PM, Stephen Frost wrote:

* Hans-Jürgen Schönig (postg...@cybertec.at) wrote: no, we have not checked memory consumption. there is still some stuff left to optimize away - it seems we are going close to O(n^2) somewhere. equal is called really often in our sample case as well:

Did the mail with the scripts, etc, get hung up due to size or something..? I didn't see it on the mailing list nor in the archives.. If so, could you post them somewhere so others could look..? Thanks, Stephen

-- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de
Re: [HACKERS] git: uh-oh
On Wed, Sep 8, 2010 at 16:21, Tom Lane t...@sss.pgh.pa.us wrote: Michael Haggerty mhag...@alum.mit.edu writes: Tom Lane wrote: Well, even if the goal is to faithfully represent the bogus history shown by CVS, cvs2git isn't doing a good job of it. Them's fightin' words :-) Yeah ;-), but they were mainly directed at Robert, who AIUI was asserting that the behavior of cvs co -D ought to be taken as defining what the CVS history means. I don't particularly buy that, and clearly you don't either. Incorrect. The CVS history implies three user-initiated events in this neighborhood: 2010.02.19: version 1.7 committed to trunk unknown date: file added to branch REL8_4_STABLE (1.7.6) 2010.05.13: file modified on branch REL8_4_STABLE to create 1.7.6.1 Right. The problem I've got is that cvs2git takes unknown as meaning I can do whatever I want, the more random the better. It would seem to me to be good software engineering to recognize that you don't have enough information and to provide some way for cvs2git's users to modify its behavior on this point. Anyway I think the solution path for us is probably going to be to retroactively add the information, along the lines suggested by Max. I was hoping that somebody would have tried a conversion by now with the partial patch I suggested last night, but maybe I'm going to have to do it myself. Where can I find the version of cvs2git we're using? I'm using svn trunk revision 5244 from http://cvs2svn.tigris.org/svn/cvs2svn/trunk. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On Wed, Sep 08, 2010 at 03:22:46PM +0100, Simon Riggs wrote: On Wed, 2010-09-08 at 09:50 -0400, Robert Haas wrote: So that means we have to make sure that none of the effects of a transaction can be seen until WAL is fsync'd on the master AND the slave has acked. Yes, that's right. And I like your example; one for the docs. There is a slight complexity there: An application might connect to the standby and see the changes made by the transaction, even though the master has not yet been notified, but will be in a moment. I don't see that as an issue though, but worth mentioning cos its just the Byzantine Generals problem. For completeness, a reference to the aforementioned Byzantine Generals: http://en.wikipedia.org/wiki/Byzantine_fault_tolerance Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git: uh-oh
Magnus Hagander mag...@hagander.net writes: On Wed, Sep 8, 2010 at 16:21, Tom Lane t...@sss.pgh.pa.us wrote: Where can I find the version of cvs2git we're using? I'm using svn trunk revision 5244 from http://cvs2svn.tigris.org/svn/cvs2svn/trunk. [ blink... ] That URL seems to want a password. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git: uh-oh
On Wed, Sep 8, 2010 at 16:44, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: On Wed, Sep 8, 2010 at 16:21, Tom Lane t...@sss.pgh.pa.us wrote: Where can I find the version of cvs2git we're using? I'm using svn trunk revision 5244 from http://cvs2svn.tigris.org/svn/cvs2svn/trunk. [ blink... ] That URL seems to want a password. Oh, right, it does. It'll tell you that on the website, but I forgot it :-) Username guest, blank password. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On Wed, Sep 8, 2010 at 10:22 AM, Simon Riggs si...@2ndquadrant.com wrote:

On Wed, 2010-09-08 at 09:50 -0400, Robert Haas wrote: So that means we have to make sure that none of the effects of a transaction can be seen until WAL is fsync'd on the master AND the slave has acked.

Yes, that's right. And I like your example; one for the docs. There is a slight complexity there: An application might connect to the standby and see the changes made by the transaction, even though the master has not yet been notified, but will be in a moment. I don't see that as an issue though, but worth mentioning cos its just the Byzantine Generals problem.

I think that's OK too, because there's no way we can guarantee that the transaction becomes visible exactly simultaneously on both nodes. What we do need to guarantee is that it is known committed on both nodes before it becomes visible on either, so that even if there is a crash or failover it can't uncommit itself. So the order of events must be:

- fsync WAL on master
- send WAL to slave
- wait for ack from slave
- allow transaction's effects to become visible on master

If the slave is only guaranteeing *receipt* of the WAL rather than fsync or replay of the WAL, then there is still a possibility of a disappearing transaction if the master and standby fail simultaneously AND a failover then occurs. So don't pick that mode if a disappearing transaction will result in someone dying or your $20B company going bankrupt or ...

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
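The ordering rule in the message above can be written down as a tiny model. This is only a sketch of the reasoning, not code from any patch in this thread: the struct, the enum, and both function names are hypothetical, and the three acknowledgement levels (receipt, fsync, replay) are taken from the wording of the message rather than from an actual configuration setting.

```c
#include <stdbool.h>

/* Hypothetical names for what the slave promises before it sends its ack. */
typedef enum
{
    ACK_ON_RECEIPT,             /* WAL received, not yet on disk */
    ACK_ON_FSYNC,               /* WAL fsync'd on the slave */
    ACK_ON_REPLAY               /* WAL replayed on the slave */
} AckLevel;

/* Hypothetical progress record for one committing transaction. */
typedef struct
{
    bool        master_wal_fsynced;
    bool        wal_sent_to_slave;
    bool        slave_acked;
} SyncCommitState;

/*
 * The transaction's effects may become visible on the master only after
 * every earlier step in the list has completed.
 */
bool commit_may_become_visible(const SyncCommitState *s)
{
    return s->master_wal_fsynced && s->wal_sent_to_slave && s->slave_acked;
}

/*
 * Can an acknowledged commit still disappear if master and standby fail
 * simultaneously and a failover follows?  Per the message above, only the
 * receipt-only mode leaves that window open.
 */
bool commit_can_disappear(AckLevel level)
{
    return level == ACK_ON_RECEIPT;
}
```

The point of the model is that visibility is gated on the conjunction of all three steps, while the durability guarantee after a double failure depends only on what the ack means.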
Re: [HACKERS] plan time of MASSIVE partitioning ...
* Robert Haas (robertmh...@gmail.com) wrote: Neat. Have you checked what effect this has on memory consumption? Also, don't forget to add it to https://commitfest.postgresql.org/action/commitfest_view/open Would be good to have the patch updated to be against HEAD before posting to the commitfest. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] plan time of MASSIVE partitioning ...
On Sep 8, 2010, at 4:57 PM, Stephen Frost wrote: * Robert Haas (robertmh...@gmail.com) wrote: Neat. Have you checked what effect this has on memory consumption? Also, don't forget to add it to https://commitfest.postgresql.org/action/commitfest_view/open Would be good to have the patch updated to be against HEAD before posting to the commitfest. Thanks, Stephen we will definitely provide something which is for HEAD. but, it seems the problem we are looking is not sufficiently fixed yet. in our case we shaved off some 18% of planning time or so - looking at the other top 2 functions i got the feeling that more can be done to reduce this. i guess we have to attack this as well. regards, hans -- Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plan time of MASSIVE partitioning ...
* Hans-Jürgen Schönig (postg...@cybertec.at) wrote: but, it seems the problem we are looking is not sufficiently fixed yet. in our case we shaved off some 18% of planning time or so - looking at the other top 2 functions i got the feeling that more can be done to reduce this. i guess we have to attack this as well. An 18% increase is certainly nice, provided it doesn't slow down or break other things.. I'm looking through the patch now actually and I'm not really happy with the naming, comments, or some of the code flow, but I think the concept looks reasonable. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] plan time of MASSIVE partitioning ...
2010/9/8 Hans-Jürgen Schönig postg...@cybertec.at: On Sep 8, 2010, at 4:57 PM, Stephen Frost wrote: * Robert Haas (robertmh...@gmail.com) wrote: Neat. Have you checked what effect this has on memory consumption? Also, don't forget to add it to https://commitfest.postgresql.org/action/commitfest_view/open Would be good to have the patch updated to be against HEAD before posting to the commitfest. we will definitely provide something which is for HEAD. but, it seems the problem we are looking is not sufficiently fixed yet. in our case we shaved off some 18% of planning time or so - looking at the other top 2 functions i got the feeling that more can be done to reduce this. i guess we have to attack this as well. Just remember that four small patches (say) are apt to get committed faster than one big one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plan time of MASSIVE partitioning ...
* Robert Haas (robertmh...@gmail.com) wrote:

2010/9/8 Hans-Jürgen Schönig postg...@cybertec.at: but, it seems the problem we are looking is not sufficiently fixed yet. in our case we shaved off some 18% of planning time or so - looking at the other top 2 functions i got the feeling that more can be done to reduce this. i guess we have to attack this as well.

Just remember that four small patches (say) are apt to get committed faster than one big one.

Indeed, but code like this makes me wonder if this is really working the way it's supposed to:

+ val1 = (long)pk_left->pk_eclass;
+ val2 = (long)pk_right->pk_eclass;
+
+ if (val1 < val2)
+     return -1;
+ else if (val1 > val2)
+     return 1;

Have you compared how big the tree gets to the size of the list it's supposed to be replacing..?

Thanks, Stephen

signature.asc Description: Digital signature
Re: [HACKERS] plan time of MASSIVE partitioning ...
Excerpts from Stephen Frost's message of mié sep 08 11:26:55 -0400 2010: * Hans-Jürgen Schönig (postg...@cybertec.at) wrote: but, it seems the problem we are looking is not sufficiently fixed yet. in our case we shaved off some 18% of planning time or so - looking at the other top 2 functions i got the feeling that more can be done to reduce this. i guess we have to attack this as well. An 18% increase is certainly nice, provided it doesn't slow down or break other things.. I'm looking through the patch now actually and I'm not really happy with the naming, comments, or some of the code flow, but I think the concept looks reasonable. I don't understand the layering between pg_tree and rbtree. Why does it exist at all? At first I thought this was another implementation of rbtrees, but then I noticed it sits on top of it. Is this really necessary? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plan time of MASSIVE partitioning ...
=?iso-8859-1?Q?Hans-J=FCrgen_Sch=F6nig?= postg...@cybertec.at writes: here is the patch again. This patch seems remarkably documentation-free. What is it trying to accomplish and what is it doing to the planner data structures? (Which do have documentation BTW.) Also, what will it do to runtime in normal cases where the pathkeys list isn't that long? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plan time of MASSIVE partitioning ...
Stephen Frost wrote:

* Robert Haas (robertmh...@gmail.com) wrote: 2010/9/8 Hans-Jürgen Schönig postg...@cybertec.at: but, it seems the problem we are looking is not sufficiently fixed yet. in our case we shaved off some 18% of planning time or so - looking at the other top 2 functions i got the feeling that more can be done to reduce this. i guess we have to attack this as well.

Just remember that four small patches (say) are apt to get committed faster than one big one.

Indeed, but code like this makes me wonder if this is really working the way it's supposed to:

+ val1 = (long)pk_left->pk_eclass;
+ val2 = (long)pk_right->pk_eclass;
+
+ if (val1 < val2)
+     return -1;
+ else if (val1 > val2)
+     return 1;

The original code checked for pointers being equal among other conditions. This was an (almost) straight conversion to a comparison function for rbtree. Do you mean casting the pointer to long? Yes, e.g. on 64-bit Windows it wouldn't work. Back to plain pointer comparison.

Have you compared how big the tree gets to the size of the list it's supposed to be replacing..?

No, but I think it's obvious. Now we have one TreeCell where we had one ListCell.

Best regards, Zoltán Böszörményi

-- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Re: [HACKERS] plan time of MASSIVE partitioning ...
Alvaro Herrera wrote:

Excerpts from Stephen Frost's message of Wed Sep 08 11:26:55 -0400 2010:

* Hans-Jürgen Schönig (postg...@cybertec.at) wrote: but, it seems the problem we are looking is not sufficiently fixed yet. in our case we shaved off some 18% of planning time or so - looking at the other top 2 functions i got the feeling that more can be done to reduce this. i guess we have to attack this as well.

An 18% increase is certainly nice, provided it doesn't slow down or break other things.. I'm looking through the patch now actually and I'm not really happy with the naming, comments, or some of the code flow, but I think the concept looks reasonable.

I don't understand the layering between pg_tree and rbtree. Why does it exist at all? At first I thought this was another implementation of rbtrees, but then I noticed it sits on top of it. Is this really necessary?

No, if it's acceptable to omit PlannerInfo from outfuncs.c. Or at least its canon_pathkeys member. Otherwise yes, it's necessary. We need to store (Node *) in a fast searchable way. This applies to anything else that may need to be converted from list to tree to decrease planning time. Like ec_members in EquivalenceClass.

Best regards, Zoltán Böszörményi

-- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Re: [HACKERS] plan time of MASSIVE partitioning ...
Stephen Frost sfr...@snowman.net writes: Indeed, but code like this makes me wonder if this is really working the way it's supposed to:

+ val1 = (long)pk_left->pk_eclass;
+ val2 = (long)pk_right->pk_eclass;

Ugh. Casting a pointer to long? We do have portable ways to do what this is trying to do, but that is not one. (For example, this is guaranteed to misbehave on 64-bit Windows.) Offhand I think PointerGetDatum might be the best way.

regards, tom lane
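To illustrate the portability complaint: on 64-bit Windows, long is 32 bits wide while pointers are 64 bits, so the cast to long can discard the upper half of the address. Below is a minimal standalone sketch of a comparator that avoids this by going through uintptr_t. It is not the fix that was applied to the patch and it deliberately does not use PointerGetDatum, which was the suggestion in the message above; the function name compare_pointers is hypothetical, and the sketch assumes a platform that provides uintptr_t.

```c
#include <stdint.h>

/*
 * Hypothetical three-way comparator on pointer identity, suitable as a
 * sort or tree ordering key.  uintptr_t is wide enough to hold any object
 * pointer on platforms that define it, so no address bits are lost.
 * The resulting order is arbitrary but consistent within one process,
 * which is all a search tree keyed on pointer identity needs.
 */
int compare_pointers(const void *a, const void *b)
{
    uintptr_t   ua = (uintptr_t) a;
    uintptr_t   ub = (uintptr_t) b;

    if (ua < ub)
        return -1;
    if (ua > ub)
        return 1;
    return 0;
}
```

A comparator for the pathkey tree would call this on the two pk_eclass pointers first and fall through to the remaining fields only when it returns 0.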
Re: [HACKERS] serializable in comments and names
Excerpts from Kevin Grittner's message of vie sep 03 19:06:17 -0400 2010: Tom Lane t...@sss.pgh.pa.us wrote: +1 for adding parens; we might want to make a function of it someday. How about IsolationUsesXactSnapshot Patch attached. I find this name confusing :-( Doesn't a READ COMMITTED transaction use transaction snapshots as well? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plan time of MASSIVE partitioning ...
Boszormenyi Zoltan z...@cybertec.at writes: This applies to anything else that may need to be converted from list to tree to decrease planning time. Like ec_members in EquivalenceClass. AFAIR, canonical pathkeys are the *only* thing in the planner where pure pointer equality is interesting. So I doubt this hack is of any use for EquivalenceClass, even if the lists were likely to be long which they aren't. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable in comments and names
Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Kevin Grittner's message of vie sep 03 19:06:17 -0400 2010: How about IsolationUsesXactSnapshot I find this name confusing :-( Doesn't a READ COMMITTED transaction use transaction snapshots as well? AFAIR it doesn't keep the first snapshot around. If it did, most of your work on snapshot list trimming would have been useless, no? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plan time of MASSIVE partitioning ...
Stephen Frost sfr...@snowman.net writes: I'm not really happy with the naming, comments, or some of the code flow, but I think the concept looks reasonable. There seems to be a lot of unnecessary palloc/pfree traffic in this implementation. Getting rid of that might help the speed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable in comments and names
Excerpts from Tom Lane's message of mié sep 08 12:12:31 -0400 2010: Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Kevin Grittner's message of vie sep 03 19:06:17 -0400 2010: How about IsolationUsesXactSnapshot I find this name confusing :-( Doesn't a READ COMMITTED transaction use transaction snapshots as well? AFAIR it doesn't keep the first snapshot around. If it did, most of your work on snapshot list trimming would have been useless, no? That's my point precisely. The name IsolationUsesXactSnapshot makes it sound like it applies to any transaction that uses snapshots for isolation, doesn't it? How about IsolationUses1stXactSnapshot, or something else that makes it clearer that there's a difference between that and read committed transactions? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable in comments and names
Tom Lane wrote:

Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Kevin Grittner's message of Fri Sep 03 19:06:17 -0400 2010: How about IsolationUsesXactSnapshot

I find this name confusing :-( Doesn't a READ COMMITTED transaction use transaction snapshots as well?

AFAIR it doesn't keep the first snapshot around. If it did, most of your work on snapshot list trimming would have been useless, no?

Technically, serializable uses a single transaction snapshot and read committed uses statement snapshots.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] serializable in comments and names
Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Tom Lane's message of mié sep 08 12:12:31 -0400 2010: AFAIR it doesn't keep the first snapshot around. If it did, most of your work on snapshot list trimming would have been useless, no? That's my point precisely. The name IsolationUsesXactSnapshot makes it sound like it applies to any transaction that uses snapshots for isolation, doesn't it? I don't think so, at least not when compared to the alternative IsolationUsesStmtSnapshot. How about IsolationUses1stXactSnapshot This just seems longer, not really better. In particular, we have *always* adhered to the phraseology that a transaction snapshot is the first one taken in a transaction, so I don't see exactly why it's confusing you now. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plan time of MASSIVE partitioning ...
Tom Lane wrote:

Boszormenyi Zoltan z...@cybertec.at writes: This applies to anything else that may need to be converted from list to tree to decrease planning time. Like ec_members in EquivalenceClass.

AFAIR, canonical pathkeys are the *only* thing in the planner where pure pointer equality is interesting. So I doubt this hack is of any use for EquivalenceClass, even if the lists were likely to be long which they aren't. regards, tom lane

No, for EquivalenceClass->ec_members, I need to do something funnier, like implement compare(Node *, Node *) and use that instead of equal(Node *, Node *)... Something like nodeToString() on both Node * and strcmp() the resulting strings.

-- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Re: [HACKERS] plan time of MASSIVE partitioning ...
Tom Lane wrote:

Stephen Frost sfr...@snowman.net writes: I'm not really happy with the naming, comments, or some of the code flow, but I think the concept looks reasonable.

There seems to be a lot of unnecessary palloc/pfree traffic in this implementation. Getting rid of that might help the speed. regards, tom lane

This was a first WIP implementation, I will look into it, thanks.

-- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Re: [HACKERS] plan time of MASSIVE partitioning ...
Boszormenyi Zoltan z...@cybertec.at writes:

Tom Lane wrote: AFAIR, canonical pathkeys are the *only* thing in the planner where pure pointer equality is interesting. So I doubt this hack is of any use for EquivalenceClass, even if the lists were likely to be long which they aren't.

No, for EquivalenceClass->ec_members, I need to do something funnier, like implement compare(Node *, Node *) and use that instead of equal(Node *, Node *)... Something like nodeToString() on both Node * and strcmp() the resulting strings.

Well, (a) that doesn't work (hint: there are fields in nodes that are intentionally ignored by equal()), and (b) I still don't believe that there's an actual bottleneck there. ECs generally aren't very big.

regards, tom lane
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
Hi,

On 09/06/2010 11:03 PM, Tom Lane wrote: I don't entirely see the point of opening ourselves up to the risk of using a pselect that's not safe under the hood.

It should be possible to reliably determine the platforms that provide an atomic pselect(). For those, I'm hesitant to use a trick, where pselect() clearly provides a simpler and more official alternative. Especially considering that those platforms form the vast majority of the ones Postgres runs on.

What I'm most concerned about is the write() syscall within the signal handler. If that fails for another reason than those covered, we miss the signal. As Heikki points out in the comment, it's hard to deal with such a failure.

Regarding the exact implementation, the positioning of drainSelfPipe in Heikki's implementation seems strange to me. Most descriptions of the self-pipe trick [1] [2] [4] put the drainSelfPipe() just after the select(), where you can be sure there actually is something to read. (Except [3], which recommends putting it inside the signal handler, which I find even more frightening). Maybe you can read more than one byte at a time in drainSelfPipe(), to save some syscalls?

Talking about the trick itself again: I found a lot of descriptions and mentions of the self-pipe trick, but so far I only found an unknown window manager [5] and the custom inetd that's mentioned in the LWN article [4] which really use that trick. Digging deeper revealed that there's a sigsafe library [6] as well as the bglibs [7] which both seem to use the self-pipe trick as well (of which the latter doesn't even care about the write()'s return value in the signal handler). None of these two libraries seems to be used in any project of relevance.

Overall I got the impression that people like to describe the trick, because it sounds so nifty and clever. However, I'd feel more comfortable if I knew there were some medium to large projects that actually use that trick. But AFAICT not even Bernstein's qmail does.

In any case, on most modern platforms poll() is preferable to any variant of select(). Only Linux provides a ppoll() variant. And poll() itself doesn't replace pselect().

Overall, I'm glad this gets addressed. Note that this is a long standing issue for Postgres-R and it's covered with a lengthy comment in its TODO file [8].

Regards

Markus Wanner

[1] D. J. Bernstein, The self-pipe trick http://cr.yp.to/docs/selfpipe.html
[2] Emile van Bergen, Avoiding races with Unix signals and select() http://www.xs4all.nl/~evbergen/unix-signals.html
[3] Alex Pennace, Safe UNIX Signal Handling Tips http://osiris.978.org/~alex/safesignalhandling.html
[4] LWN Article: The new pselect() system call, mentions the self-pipe trick in a comment http://lwn.net/Articles/176911/
[5] Karmen: a window manager http://freshmeat.net/projects/karmen
[6] sigsafe library http://www.slamb.org/projects/sigsafe/
[7] Bruce Guenter, one stop library package http://untroubled.org/bglibs/
[8] Postgres-R TODO entry http://git.postgres-r.org/?p=Postgres-R;a=blob;f=src/backend/replication/TODO;h=7bfc37ee9629943b9ff052d571b9d933ab38a0a8;hb=HEAD#l12
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
On 08/09/10 20:36, Markus Wanner wrote:
> On 09/06/2010 11:03 PM, Tom Lane wrote:
>> I don't entirely see the point of opening ourselves up to the risk of using a pselect that's not safe under the hood.
>
> It should be possible to reliably determine the platforms that provide an atomic pselect(). For those, I'm hesitant to use a trick, where pselect() clearly provides a simpler and more official alternative. Especially considering that those platforms form the vast majority for running Postgres on.

Perhaps, but I'm equally concerned that having different implementations for different platforms means that all implementations get less testing than if we use only one.

Because of that I'm actually reluctant to even use poll() where available instead of select(). At least in the first phase, until someone demonstrates that there's a measurable difference in performance. We only call poll/select when we're about to sleep, so it's not really that performance critical anyway.

> What I'm most concerned about is the write() syscall within the signal handler. If that fails for another reason than those covered, we miss the signal. As Heikki points out in the comment, it's hard to deal with such a failure.

Yeah, there isn't much you can do about it. Perhaps you could set a mayday flag (a global boolean variable) if it fails, and check that in the main loop, elogging a warning there instead. But I don't think we need to go to such lengths; realistically the write() will never fail or you have bigger problems.

> Maybe you can read more than one byte at a time in drainSelfPipe(), to save some syscalls?

Perhaps, although it should be very rare to have more than one byte in the pipe. SetLatch doesn't write another byte if the latch is already set, so you only get multiple bytes in the pipe if many processes set the latch at the same instant.
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
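
For readers following the thread, here is a minimal sketch of the self-pipe trick as discussed in the message above, including the "mayday" flag and a drain that reads several bytes per call. It is not the code from the patch under review: all names (`init_self_pipe`, `set_latch`, `wait_latch`, `wakeup_write_failed`, and so on) are hypothetical, it handles a single latch in a single process, and it restarts the full timeout after an unrelated interruption.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

static int selfpipe_readfd = -1;
static int selfpipe_writefd = -1;
static volatile sig_atomic_t latch_is_set = 0;
static volatile sig_atomic_t wakeup_write_failed = 0;   /* the "mayday" flag */

/* Create the pipe and make both ends non-blocking. Returns 0 on success. */
static int init_self_pipe(void)
{
    int fds[2];

    if (pipe(fds) < 0)
        return -1;
    if (fcntl(fds[0], F_SETFL, O_NONBLOCK) < 0 ||
        fcntl(fds[1], F_SETFL, O_NONBLOCK) < 0)
        return -1;
    selfpipe_readfd = fds[0];
    selfpipe_writefd = fds[1];
    return 0;
}

/* Async-signal-safe. Writes a wakeup byte only if the latch was not yet set. */
static void set_latch(void)
{
    int save_errno = errno;
    char dummy = 0;

    if (!latch_is_set)
    {
        latch_is_set = 1;
        while (write(selfpipe_writefd, &dummy, 1) < 0)
        {
            if (errno == EINTR)
                continue;                /* interrupted: retry */
            if (errno != EAGAIN)
                wakeup_write_failed = 1; /* unexpected: report from main loop */
            break;                       /* EAGAIN: pipe full, wakeup pending */
        }
    }
    errno = save_errno;
}

static void reset_latch(void)
{
    latch_is_set = 0;
}

/* Empty the pipe, reading up to 16 bytes per syscall. */
static void drain_self_pipe(void)
{
    char buf[16];

    for (;;)
    {
        ssize_t rc = read(selfpipe_readfd, buf, sizeof(buf));

        if (rc > 0 || (rc < 0 && errno == EINTR))
            continue;
        break;                  /* EAGAIN (pipe empty), EOF, or hard error */
    }
}

/* Returns 1 once the latch is set, 0 on timeout, -1 on select() failure. */
static int wait_latch(long timeout_ms)
{
    assert(selfpipe_readfd >= 0);
    while (!latch_is_set)
    {
        fd_set rfds;
        struct timeval tv;
        int rc;

        FD_ZERO(&rfds);
        FD_SET(selfpipe_readfd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        /* A signal arriving before this call leaves a byte in the pipe,
         * so select() returns at once instead of sleeping. */
        rc = select(selfpipe_readfd + 1, &rfds, NULL, NULL, &tv);
        if (rc == 0)
            return 0;
        if (rc < 0 && errno != EINTR)
            return -1;
        if (rc > 0)
            drain_self_pipe();  /* drain right after select(), as in [1] [2] */
    }
    return 1;
}

static void latch_signal_handler(int signo)
{
    (void) signo;
    set_latch();
}

/* Route signo to the latch. Returns 0 on success. */
static int install_latch_handler(int signo)
{
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_handler = latch_signal_handler;
    sigemptyset(&act.sa_mask);
    return sigaction(signo, &act, NULL);
}
```

A main loop would call `wait_latch()`, then `reset_latch()`, then do its work, and check `wakeup_write_failed` at that point to log a warning outside the signal handler.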
Re: [HACKERS] git: uh-oh
Magnus Hagander mag...@hagander.net writes:
> I'm using svn trunk revision 5244 from http://cvs2svn.tigris.org/svn/cvs2svn/trunk.

Just to make sure everybody is on the same page: I've installed svn revision 5270, which is the version currently available from that URL, and is also what Max indicated he was using in his test conversion. Suggest you update too.

regards, tom lane
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
> On 08/09/10 20:36, Markus Wanner wrote:
>> It should be possible to reliably determine the platforms that provide an atomic pselect(). For those, I'm hesitant to use a trick, where pselect() clearly provides a simpler and more official alternative. Especially considering that those platforms form the vast majority for running Postgres on.
>
> Perhaps, but I'm equally concerned that having different implementations for different platforms means that all implementations get less testing than if we use only one.

There's that, and there's also that Markus' premise is full of holes. Exactly how will you determine that pselect is safe at compile time? Even if you correctly determine that, how can you be sure that the finished executable will only be run against a version of libc that has a safe implementation? Considering that we know that major platforms such as FreeBSD have changed their implementations *very* recently, it seems foolish to assume that an executable built on a machine with corrected pselect could not be run on one with an older implementation.

> Because of that I'm actually reluctant to even use poll() where available instead of select(). At least in the first phase, until someone demonstrates that there's a measurable difference in performance.

select() is demonstrably a loser whenever the process has a lot of open files. Also, we have plenty of experience with substituting poll() for select(), so I'm not too worried about copy-and-pasting such code.

regards, tom lane
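
To make the poll()-for-select() substitution mentioned above concrete, here is a small sketch of the same "wait until one descriptor is readable" operation written both ways. The function names are hypothetical and this is not PostgreSQL code. One reason select() suffers with many open files is visible here: it cannot watch a descriptor numbered FD_SETSIZE or higher at all, whereas poll() takes only the descriptors of interest.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Both functions return 1 if fd has something to report, 0 on timeout,
 * and -1 on error with errno set. timeout_ms must not be negative.
 */
static int wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(timeout_ms >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    /* rc > 0 also covers POLLHUP/POLLERR; a following read() reports those. */
    return rc > 0 ? 1 : 0;
}

static int wait_readable_select(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    assert(timeout_ms >= 0);
    if (fd < 0 || fd >= FD_SETSIZE)
    {
        errno = EINVAL;         /* fd_set is a fixed-size bitmap */
        return -1;
    }
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}
```

Keeping both variants behind one function signature is one way to limit the testing concern raised earlier in the thread, since callers never see which syscall is in use.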
Re: [HACKERS] plan time of MASSIVE partitioning ...
Tom Lane wrote:
> Boszormenyi Zoltan z...@cybertec.at writes:
>> Tom Lane wrote:
>>> AFAIR, canonical pathkeys are the *only* thing in the planner where pure pointer equality is interesting. So I doubt this hack is of any use for EquivalenceClass, even if the lists were likely to be long which they aren't.
>>
>> No, for EquivalenceClass->ec_member, I need to do something funnier, like implement compare(Node *, Node *) and use that instead of equal(Node *, Node *)... Something like nodeToString() on both Node * and strcmp() the resulting strings.
>
> Well, (a) that doesn't work (hint: there are fields in nodes that are intentionally ignored by equal()),

Then this compare() needs to work like equal(), which can ignore the fields that are ignored by equal(), too. nodeToString would need more space anyway and comparing non-equal Nodes can return the !0 result faster.

> and (b) I still don't believe that there's an actual bottleneck there. ECs generally aren't very big.

Actually, PlannerInfo->eq_classes needs to be a Tree somehow, the comparator function is not yet final in my head.

equal() is called over 8 million times with or without our patch:

without

  %   cumulative   self               self     total
 time   seconds   seconds     calls  s/call   s/call  name
 18.87      0.80     0.80      4812    0.00     0.00  make_canonical_pathkey
 15.33      1.45     0.65      4811    0.00     0.00  get_eclass_for_sort_expr
 14.15      2.05     0.60   8342410    0.00     0.00  equal
  6.13      2.31     0.26    229172    0.00     0.00  SearchCatCache
  3.66      2.47     0.16   5788835    0.00     0.00  _equalList
  3.07      2.60     0.13   1450043    0.00     0.00  hash_search_with_hash_value
  2.36      2.70     0.10   2272545    0.00     0.00  AllocSetAlloc
  2.12      2.79     0.09    811460    0.00     0.00  hash_any
  1.89      2.87     0.08   3014983    0.00     0.00  list_head
  1.89      2.95     0.08    574526    0.00     0.00  _bt_compare
  1.77      3.02     0.08  11577670    0.00     0.00  list_head
  1.42      3.08     0.06      1136    0.00     0.00  tzload
  0.94      3.12     0.04   2992373    0.00     0.00  AllocSetFreeIndex
  0.94      3.16     0.04     91427    0.00     0.00  _bt_checkkeys
 ...

with

  %   cumulative   self               self     total
 time   seconds   seconds     calls  s/call   s/call  name
 24.51      0.88     0.88      4811    0.00     0.00  get_eclass_for_sort_expr
 14.21      1.39     0.51   8342410    0.00     0.00  equal
  8.22      1.69     0.30   5788835    0.00     0.00  _equalList
  5.29      1.88     0.19    229172    0.00     0.00  SearchCatCache
  2.51      1.97     0.09      1136    0.00     0.00  tzload
  2.23      2.05     0.08   3014983    0.00     0.00  list_head
  2.23      2.13     0.08   2283130    0.00     0.00  AllocSetAlloc
  2.09      2.20     0.08    811547    0.00     0.00  hash_any
  2.09      2.28     0.08  11577670    0.00     0.00  list_head
  1.95      2.35     0.07   1450180    0.00     0.00  hash_search_with_hash_value
  1.39      2.40     0.05    640690    0.00     0.00  _bt_compare
  1.39      2.45     0.05    157944    0.00     0.00  LockAcquireExtended
  1.39      2.50     0.05     11164    0.00     0.00  AllocSetCheck
  1.11      2.54     0.04   3010547    0.00     0.00  AllocSetFreeIndex
  1.11      2.58     0.04    874975    0.00     0.00  AllocSetFree
  1.11      2.62     0.04     66211    0.00     0.00  heap_form_tuple
  0.84      2.65     0.03    888128    0.00     0.00  LWLockRelease
 ...

The number of calls is the same for equal and _equalList in both cases as you can see. And if you compare the number of AllocSetAlloc calls, it's actually smaller with our patch, so it's not that the conversion to Tree caused this.

Best regards,
Zoltán Böszörményi

-- 
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de http://www.postgresql.at/
Re: [HACKERS] function_name.parameter_name
Sergey Konoplev wrote:
> Hi,
>
> On 7 September 2010 20:35, Tom Lane t...@sss.pgh.pa.us wrote:
>> How does $subject differ from what we already do? See
>> http://www.postgresql.org/docs/9.0/static/plpgsql-structure.html
>
> So will it be possible to do things like this?
>
> 1.
> CREATE FUNCTION func_name(arg_name text) RETURNS integer AS $$
> BEGIN
>     RAISE INFO '%', func_name.arg_name;
> ...
>
> 2.
> CREATE FUNCTION func_name() RETURNS integer AS $$
> DECLARE
>     var_name text := 'bla';
> BEGIN
>     RAISE INFO '%', func_name.var_name;
> ...
>
> 3.
> CREATE FUNCTION func_very_very_very_very_long_name() RETURNS integer AS $$
> <<func_alias>>
> DECLARE
>     var_name text := 'bla';
> BEGIN
>     RAISE INFO '%', func_alias.var_name;
> ...

In my testing #1 works, but #2 does not:

    -- #1
    test=> CREATE OR REPLACE FUNCTION xxx(yyy INTEGER) RETURNS void AS $$
    BEGIN
        xxx.yyy := 4;
    END;$$ LANGUAGE plpgsql;
    CREATE FUNCTION

    -- #2
    test=> CREATE OR REPLACE FUNCTION xxx() RETURNS void AS $$
    DECLARE yyy integer; BEGIN
        xxx.yyy := 4;
    END;$$ LANGUAGE plpgsql;
    ERROR:  "xxx.yyy" is not a known variable
    LINE 3: xxx.yyy := 4;
            ^

#2 works only if you specify a label above the DECLARE section and use that label (not the function name) as a variable qualifier:

    test=> CREATE OR REPLACE FUNCTION xxx() RETURNS void AS $$
    <<zzz>>
    DECLARE yyy INTEGER;
    BEGIN
        zzz.yyy := 4;
    END;$$ LANGUAGE plpgsql;
    CREATE FUNCTION

Interestingly, I can use a label that matches the function name:

    test=> CREATE OR REPLACE FUNCTION xxx() RETURNS void AS $$
    <<xxx>>
    DECLARE yyy INTEGER;
    BEGIN
        xxx.yyy := 4;
    END;$$ LANGUAGE plpgsql;
    CREATE FUNCTION

but if you supply parameters to the function, it does not work:

    test=> CREATE OR REPLACE FUNCTION xxx(aaa INTEGER) RETURNS void AS $$
    <<xxx>>
    DECLARE yyy INTEGER;
    BEGIN
        xxx.yyy := 4;
    END;$$ LANGUAGE plpgsql;
    ERROR:  cannot change name of input parameter "yyy"
    HINT:  Use DROP FUNCTION first.

so this is not something we can recommend to users.

Note the text Tom quoted from our docs:

    http://www.postgresql.org/docs/9.0/static/plpgsql-structure.html

    There is actually a hidden "outer block" surrounding the body of any PL/pgSQL function. This block provides the declarations of the function's parameters (if any), as well as some special variables such as FOUND (see plpgsql-statements-diagnostics). The outer block is labeled with the function's name, meaning that parameters and special variables can be qualified with the function's name.

This talks about the parameters, but not about the DECLARE block. The idea of adding a label to DECLARE blocks is mentioned in our docs:

    http://www.postgresql.org/docs/9.0/static/plpgsql-implementation.html#PLPGSQL-VAR-SUBST

    Alternatively you can qualify ambiguous references to make them clear. In the above example, src.foo would be an unambiguous reference to the table column. To create an unambiguous reference to a variable, declare it in a labeled block and use the block's label (see Section 39.2).

-- 
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Re: [HACKERS] git: uh-oh
On Wed, Sep 8, 2010 at 20:11, Tom Lane t...@sss.pgh.pa.us wrote:
> Magnus Hagander mag...@hagander.net writes:
>> I'm using svn trunk revision 5244 from http://cvs2svn.tigris.org/svn/cvs2svn/trunk.
>
> Just to make sure everybody is on the same page: I've installed svn revision 5270, which is the version currently available from that URL, and is also what Max indicated he was using in his test conversion. Suggest you update too.

Done, thanks for the reminder.

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] function_name.parameter_name
Bruce Momjian br...@momjian.us writes:
> ... but if you supply parameters to the function, it does not work:
>
> test=> CREATE OR REPLACE FUNCTION xxx(aaa INTEGER) RETURNS void AS $$
> ERROR:  cannot change name of input parameter "yyy"
> HINT:  Use DROP FUNCTION first.

This is failing because you tried to redeclare xxx(int) with a different name for its parameter, which is no longer allowed. It has nothing to do with the question at hand.

regards, tom lane
Re: [HACKERS] function_name.parameter_name
Bruce Momjian wrote:
> Sergey Konoplev wrote:
>> 1.
>> CREATE FUNCTION func_name(arg_name text) RETURNS integer AS $$
>> BEGIN
>>     RAISE INFO '%', func_name.arg_name;
>> ...
>>
>> 2.
>> CREATE FUNCTION func_name() RETURNS integer AS $$
>> DECLARE
>>     var_name text := 'bla';
>> BEGIN
>>     RAISE INFO '%', func_name.var_name;
>> ...
>>
>> 3.
>> CREATE FUNCTION func_very_very_very_very_long_name() RETURNS integer AS $$
>> <<func_alias>>
>> DECLARE
>>     var_name text := 'bla';
>> BEGIN
>>     RAISE INFO '%', func_alias.var_name;
>> ...

I suggest that it might be reasonable to introduce a new syntax, that isn't already valid for something inside a routine, and use that as a terse way to reference the current function and/or its parameters. This may best be a simple constant syntax.

For example, iff it isn't already valid for a qualified name to have a leading period/full-stop/radix-marker, then this could be introduced as a valid way to refer to the current routine. Then in the above examples you can say:

    RAISE INFO '%', .arg_name;
    RAISE INFO '%', .var_name;

... without explicitly declaring a func_alias.

In a tangent, you can also use a new constant syntax (unless you have one?) to allow a routine to invoke itself without knowing its own name, which could be nice in a simple recursive routine. Maybe .(arg,arg) would do it?

I would think this should be non-intrusive and useful and could go in 9.1.

-- Darren Duncan
Re: [HACKERS] plan time of MASSIVE partitioning ...
Boszormenyi Zoltan z...@cybertec.at writes:
> equal() is called over 8 million times with or without our patch:

From where, though? You've provided not a shred of evidence that searching large ec_member lists is the problem.

Also, did the test case you're using ever make it to the list?

regards, tom lane
Re: [HACKERS] git: uh-oh
OK, so I tried a conversion with the it.po hack I showed before; not trying to fix any of the other instances yet, but just see what happens with the 8.4.3/8.4.4 case. It's definitely better:

* Marc's 8.4.3 tag commit is now the last ancestor of REL8_4_3, and the previous commits in the branch are earlier ancestors. No more 8.4.3 as a stub branch.

* it.po is shown as added, not modified, in Peter's 8.4-branch commit of 2010-05-13.

* The cherrypick additions of xml2.out and xml2.sql no longer reference it.po too.

But we're not quite there yet. What I find for it.po is these two commits, which immediately follow the addition of it.po on the main branch:

    commit fd0c9e8bbf50f65a6d03a5d5d59e19cf67c7bc94 refs/tags/REL8_4_3
    Author: Peter Eisentraut pete...@gmx.net
    Date:   Fri Feb 19 00:40:07 2010 +

        log addition on branch

    D       src/bin/pg_dump/po/it.po

    commit f345298286359f666211c7555420d147222888bf refs/tags/REL8_4_3
    Author: PostgreSQL Daemon webmas...@postgresql.org
    Date:   Fri Feb 19 00:40:06 2010 +

        This commit was manufactured by cvs2svn to create branch 'REL8_4_STABLE'.

        Cherrypick from master 2010-02-19 00:40:05 UTC Peter Eisentraut pete...@gmx.net 'Translation updates for 9.0alpha4':
            src/bin/pg_dump/po/it.po

    A       src/bin/pg_dump/po/it.po

The first of these is the made-up deletion commit that I patched into it.po,v. But why are we getting the manufactured commit anyway?

Max, is this what you expected to happen? Can we do better?

regards, tom lane
Re: [HACKERS] testing plpython3u on 9.0beta3
On Tue, 2010-07-13 at 20:28 -0500, Chris wrote:
> So if I have to explicitly set the python interpreter, how would you ever have a plpython2u and plpython3u function in the same DB (regardless of the fact that they can't run in the same session)? The manual implies you could have them both installed since it says that there's a per session limitation. After specifying the python3 interpreter, I can indeed now run plpython3u, but I (rather obviously) can't createlang plpython2u now. I would think that the plpython section of the manual may want some reference to the fact that that compile flag needs to be set.

Added documentation about that.

> Additionally, the "What's New In Python 3.0" link in the beta 3 docs on http://www.postgresql.org/docs/9.0/static/plpython-python23.html is dead.

And fixed that.
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
Martijn van Oosterhout klep...@svana.org writes:
> If the issue is just that select() doesn't get interrupted and we don't care about a couple of syscalls, would it not be better to simply use sigaction to turn on SA_RESTART just prior to the select() and turn it off just after. Or are these systems so broken that select() won't be interrupted, even if the signal handler is explicitly configured to do so?

I think you mean turn *off* SA_RESTART. We'd have to do that for each signal that we were concerned about allowing to interrupt the select(), so it's more than just two added calls. Another small problem is that the latch code doesn't/shouldn't know what handlers are active, and AFAICS you can't use sigaction() to flip that flag without setting the handler address too. So while maybe we could do it that way, it'd be pretty dang messy.

In my mind the main value of the Latch code will be to have a clean platform-independent API for waiting. Why all the angst about whether the implementation underneath is clean or not? It's more important that it *works* and we don't have to worry about whether it will break on platform XYZ.

regards, tom lane
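
As an illustration of why flipping SA_RESTART gets messy, here is a sketch of what it would take for one signal. sigaction() always installs a complete action, so the only way to change the flag without knowing the handler is to fetch the current action and write it back modified. The helper name `set_sa_restart` is hypothetical. That is two syscalls per signal per toggle, so four per signal around every sleep, and a signal delivered between the fetch and the store could race with another change to the action.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/*
 * Enable (restart != 0) or disable SA_RESTART for signo while keeping
 * whatever handler and mask are currently installed.
 * Returns 0 on success, -1 on error with errno set.
 */
static int set_sa_restart(int signo, int restart)
{
    struct sigaction act;

    assert(signo > 0);

    /* Fetch the current action: handler address, mask and flags. */
    if (sigaction(signo, NULL, &act) < 0)
        return -1;

    if (restart)
        act.sa_flags |= SA_RESTART;
    else
        act.sa_flags &= ~SA_RESTART;

    /* Store it back; the handler address is re-set to the fetched value. */
    return sigaction(signo, &act, NULL);
}
```

A caller would invoke `set_sa_restart(sig, 0)` for every signal of interest before the select() and `set_sa_restart(sig, 1)` afterwards, which matches the "more than just two added calls" objection above.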
[HACKERS] Postgres 9.0.0 release scheduled
The core committee has decided that it's time to press forward with releasing 9.0. Barring catastrophic bug reports in the next week, 9.0.0 will be wrapped Thursday 9/16 for public announcement Monday 9/20.

regards, tom lane
Re: [HACKERS] function_name.parameter_name
On Sep 8, 2010, at 3:17 PM, Darren Duncan dar...@darrenduncan.net wrote:
> Bruce Momjian wrote:
>> Sergey Konoplev wrote:
>>> 1.
>>> CREATE FUNCTION func_name(arg_name text) RETURNS integer AS $$
>>> BEGIN
>>>     RAISE INFO '%', func_name.arg_name;
>>> ...
>>>
>>> 2.
>>> CREATE FUNCTION func_name() RETURNS integer AS $$
>>> DECLARE
>>>     var_name text := 'bla';
>>> BEGIN
>>>     RAISE INFO '%', func_name.var_name;
>>> ...
>>>
>>> 3.
>>> CREATE FUNCTION func_very_very_very_very_long_name() RETURNS integer AS $$
>>> <<func_alias>>
>>> DECLARE
>>>     var_name text := 'bla';
>>> BEGIN
>>>     RAISE INFO '%', func_alias.var_name;
>>> ...
>
> I suggest that it might be reasonable to introduce a new syntax, that isn't already valid for something inside a routine, and use that as a terse way to reference the current function and/or its parameters. This may best be a simple constant syntax.

This has been proposed in the past and Tom has rejected it, but I agree that it would be useful. The key word in this proposal is terse.

...Robert
Re: [HACKERS] function_name.parameter_name
Robert Haas wrote:
> On Sep 8, 2010, at 3:17 PM, Darren Duncan dar...@darrenduncan.net wrote:
>> Bruce Momjian wrote:
>>> Sergey Konoplev wrote:
>>>> 3.
>>>> CREATE FUNCTION func_very_very_very_very_long_name() RETURNS integer AS $$
>>>> <<func_alias>>
>>>> DECLARE
>>>>     var_name text := 'bla';
>>>> BEGIN
>>>>     RAISE INFO '%', func_alias.var_name;
>>>> ...
>>
>> I suggest that it might be reasonable to introduce a new syntax, that isn't already valid for something inside a routine, and use that as a terse way to reference the current function and/or its parameters. This may best be a simple constant syntax.
>
> This has been proposed in the past and Tom has rejected it, but I agree that it would be useful. The key word in this proposal is terse.

Absolutely. In fact I'm not particularly enamored with my .foo example suggestion because I would actually prefer for that particular syntax to be left unused and available for other possible future uses that are better thought out.

I think instead that something akin to an explicit alias would both be more future-proofed and be the least surprising to existing users, as per #3. If the alias was very short, then we have something terse for usage.

I should also say that this subject has some bearing on the topic of aliases or synonyms in general. In the situations where one wants an entity to be referenceable by more than one name, and knows this at the time of declaring said entity, there could be a syntax for declaring the extra names inline with the original.

For example, if it wouldn't conflict with anything, one could use the | symbol (the mnemonic being that it means alternation in regular expressions) like this:

    CREATE FUNCTION func_very_very_very_very_long_name|short_name() ...

... but this could use some work since I also see that being useful for declaring synonyms inline, which are public names like the original, not just internal private names.

When used for synonyms, this would still be represented in the system catalog as a function named func_very... and a synonym named short_name, this synonym being akin to a Unix soft link or a C symbolic alias in semantics.

Similarly, and mainly for use with named argument syntax, a named parameter could have several names it could go by, declared with | also. Example:

    CREATE FUNCTION func_name(arg_name|altnm text) ...

It doesn't have to be that syntax, but I demonstrated a principle, and I personally like | for the mnemonic.

-- Darren Duncan
Re: [HACKERS] function_name.parameter_name
Excerpts from Darren Duncan's message of Wed Sep 08 17:41:40 -0400 2010:
> For example, if it wouldn't conflict with anything, one could use the | symbol (mnemonic is that means alternation in regular expressions) like this:
>
> CREATE FUNCTION func_very_very_very_very_long_name|short_name() ...

If you can name the function short_name, why not use just that in the first place?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] function_name.parameter_name
Alvaro Herrera wrote:
> Excerpts from Darren Duncan's message of Wed Sep 08 17:41:40 -0400 2010:
>> For example, if it wouldn't conflict with anything, one could use the | symbol (mnemonic is that means alternation in regular expressions) like this:
>>
>> CREATE FUNCTION func_very_very_very_very_long_name|short_name() ...
>
> If you can name the function short_name, why not use just that in the first place?

More realistic examples would be any of:

1. Offer users the choice of a longer, more self-describing name and a terser name. For example:

    function is_member_of|in (...)

2. Offer users the choice of similar-length but different names. For example:

    function sum|add(x integer, y integer) returns integer

3. Make it easier to change your mind on a name while providing backwards compatibility for a while. For example:

    function new_name|old_name (...)

Personally I like the idea of developers not always being forced to choose between two equally good names, and making a wrapper function would be overkill for this feature.

-- Darren Duncan
Re: [HACKERS] function_name.parameter_name
Excerpts from Darren Duncan's message of Wed Sep 08 18:29:35 -0400 2010:
> Personally I like the idea of developers not always having to be forced to choose among two equally good names, and making a wrapper function would be overkill for this feature.

While I don't agree with the idea of providing extra names that are probably mostly going to increase the confusion of someone trying to understand such a system, I think this use case would be well covered by synonyms. But these would be defined by a new SQL command, say CREATE SYNONYM, not by funny notation on the initial CREATE FUNCTION call.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] function_name.parameter_name
Alvaro Herrera wrote:
> Excerpts from Darren Duncan's message of Wed Sep 08 18:29:35 -0400 2010:
>> Personally I like the idea of developers not always having to be forced to choose among two equally good names, and making a wrapper function would be overkill for this feature.
>
> While I don't agree with the idea of providing extra names that are probably mostly going to increase the confusion of someone trying to understand such a system, I think this use case would be well covered by synonyms. But these would be defined by a new SQL command, say CREATE SYNONYM, not by funny notation on the initial CREATE FUNCTION call.

Yes, and having a more general solution like CREATE SYNONYM is more important to have anyway. My | is simply a syntactic shorthand for a special case of CREATE SYNONYM, with respect to schema objects, and would parse into the same thing. I don't feel any need now for me to push this shorthand further.

-- Darren Duncan
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
On Mon, Sep 06, 2010 at 07:24:59PM +0200, Markus Wanner wrote:
> Do I understand correctly that the purpose of this patch is to work around the brokenness of select() on very few platforms? Or is there any additional feature that plain signals don't give us?

If the issue is just that select() doesn't get interrupted and we don't care about a couple of syscalls, would it not be better to simply use sigaction to turn on SA_RESTART just prior to the select() and turn it off just after. Or are these systems so broken that select() won't be interrupted, even if the signal handler is explicitly configured to do so?

Have a nice day,

-- 
Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/
Patriotism is when love of your own people comes first; nationalism, when hate for people other than your own comes first. - Charles de Gaulle
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
On 08/09/10 23:07, Martijn van Oosterhout wrote:
> On Mon, Sep 06, 2010 at 07:24:59PM +0200, Markus Wanner wrote:
>> Do I understand correctly that the purpose of this patch is to work around the brokenness of select() on very few platforms? Or is there any additional feature that plain signals don't give us?
>
> If the issue is just that select() doesn't get interrupted and we don't care about a couple of syscalls, would it not be better to simply use sigaction to turn on SA_RESTART just prior to the select() and turn it off just after. Or are these systems so broken that select() won't be interrupted, even if the signal handler is explicitly configured to do so?

I don't know if SA_RESTART is portable. But in any case, that will do nothing about the race condition where the signal arrives just *before* the select() call.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
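
The race mentioned above, and the pselect() alternative argued for earlier in the thread, can be sketched as follows. The idea is to keep the signal blocked everywhere except inside pselect(), so the handler can never run between the flag check and the sleep. The names `setup_wakeup_signal` and `wait_for_wakeup` are hypothetical, and the sketch is only correct where pselect() swaps the signal mask atomically, which is exactly the property the thread says cannot be assumed on every platform or libc version.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t got_wakeup = 0;

static void wakeup_handler(int signo)
{
    (void) signo;
    got_wakeup = 1;
}

/* Install the handler and keep signo blocked outside of pselect(). */
static int setup_wakeup_signal(int signo)
{
    struct sigaction act;
    sigset_t block;

    memset(&act, 0, sizeof(act));
    act.sa_handler = wakeup_handler;
    sigemptyset(&act.sa_mask);
    if (sigaction(signo, &act, NULL) < 0)
        return -1;
    sigemptyset(&block);
    sigaddset(&block, signo);
    return sigprocmask(SIG_BLOCK, &block, NULL);
}

/*
 * Sleep until signo is delivered or timeout_ms elapses.
 * Returns 1 if woken by signo, 0 on timeout or unrelated interruption,
 * -1 on error.
 */
static int wait_for_wakeup(int signo, long timeout_ms)
{
    sigset_t during;
    struct timespec ts;

    assert(timeout_ms >= 0);

    /* Mask to use while sleeping: the current mask minus signo. */
    if (sigprocmask(SIG_SETMASK, NULL, &during) < 0)
        return -1;
    sigdelset(&during, signo);

    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;

    /*
     * signo is blocked here, so the handler cannot run between this check
     * and the sleep. A signal sent in that window stays pending and makes
     * pselect() return EINTR as soon as it unblocks signo.
     */
    if (!got_wakeup)
    {
        if (pselect(0, NULL, NULL, NULL, &ts, &during) < 0 && errno != EINTR)
            return -1;
    }
    if (got_wakeup)
    {
        got_wakeup = 0;
        return 1;
    }
    return 0;
}
```

With plain select() and an unblocked signal, the same check-then-sleep sequence would sleep for the full timeout if the signal landed between the two steps; the self-pipe trick closes that window without relying on pselect().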