Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
On 5 September 2016 at 21:58, Claudio Freire wrote:

How long does that part ever take? Is there any substantial gain from this?

> Btw, without a further patch to prefetch pages on the backward scan
> for truncate, however, my patience ran out before it finished
> truncating. I haven't submitted that patch because there was an
> identical patch in an older thread that was discussed and more or less
> rejected since it slightly penalized SSDs.

OK, that's enough context. Sorry for being forgetful on that point.

Please post that new patch also.

This whole idea of backwards scanning to confirm truncation seems wrong.
What we want is an O(1) solution. Thinking.

-- 
Simon Riggs                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for changes to recovery.conf API
On Tue, Sep 6, 2016 at 2:18 PM, Abhijit Menon-Sen wrote:
> One open issue is the various assign_recovery_target_xxx functions,
> which Michael noted in his earlier review:
> [...]
> I don't like this code, but I'm not yet sure what to replace it with. I
> think we should address the underlying problem—that the UI doesn't map
> cleanly to what the code wants. There's been some discussion about this
> earlier, but not any consensus that I could see.

By moving all the recovery parameters to GUCs, we will need to define a
hierarchy among the target types. I see no way to avoid that except by
changing the parametrization, but that would be more confusing for the
users. So it will no longer be last-one-wins; instead, the first target
type in the hierarchy that is enabled wins.

I think that we could leverage that a little bit, though, and reduce the
complexity of the patch: my best advice here is to make all those
recovery_target_* parameters PGC_POSTMASTER so that they are loaded only
once when the server starts, and then determine the recovery target type
in the startup process instead of trying to do so at GUC level. This
does not change the fact that we'd still need a hierarchy among the
target types, but it slightly reduces the complexity of the patch. And
this does not prevent further enhancements by switching them later to be
PGC_SIGHUP, really. I'd really like to see that improvement, but I don't
think it belongs in this change, which is complicated enough, and will
likely introduce its lot of bugs even after several reviews.

> Do we want something like this (easy to implement and document, perhaps
> not especially convenient to use):
>
>   recovery_target = 'xid' # or 'time'/'name'/'lsn'/'immediate'
>   recovery_target_xid = xxx? # the only setting we care about now
>   recovery_target_otherthings = parsed_but_ignored
>
> Or something like this (a bit harder to implement):
>
>   recovery_target = 'xid:xxx' # or 'time:xxx' etc.
Interesting ideas, particularly the last one. Mixing both:

  recovery_target_type = 'foo'           # can be 'immediate'
  recovery_target_value = 'value_of_foo' # does not matter for 'immediate'

> Alternatively, the do-nothing option is to move the tests from guc.c to
> StartupXLOG and do it only once in some defined order (which would also
> break the current last-mention-wins behaviour).

My vote goes in favor of that.

+else if (recovery_target_lsn != 0)
+    recovery_target = RECOVERY_TARGET_LSN;

This needs to check for InvalidXLogRecPtr.
-- 
Michael
Re: [HACKERS] Proposal for changes to recovery.conf API
Hi. Here's an updated version of my patch, which now applies on top of
the patch that Simon posted earlier (recovery_startup_r10_api.v1b.patch).

A few notes:

1. I merged in recovery_target_lsn as a new GUC setting.

2. I fixed various minor nits in the earlier patch, not worth mentioning
   individually.

3. I haven't added back trigger_file, because Simon's patch removes it.
   I can add it back in separately after discussion (otherwise Simon's
   and my patches will conflict).

4. I've tested this to the extent that setting things in postgresql.conf
   works, and recovery.conf is still read if it exists, and so on.

One open issue is the various assign_recovery_target_xxx functions,
which Michael noted in his earlier review:

+static void
+assign_recovery_target_xid(const char *newval, void *extra)
+{
+    recovery_target_xid = *((TransactionId *) extra);
+
+    if (recovery_target_xid != InvalidTransactionId)
+        recovery_target = RECOVERY_TARGET_XID;
+    else if (recovery_target_name && *recovery_target_name)
+        recovery_target = RECOVERY_TARGET_NAME;
+    else if (recovery_target_time != 0)
+        recovery_target = RECOVERY_TARGET_TIME;
+    else if (recovery_target_lsn != 0)
+        recovery_target = RECOVERY_TARGET_LSN;
+    else
+        recovery_target = RECOVERY_TARGET_UNSET;
+}

(Note how recovery_target_lsn caused this—and three other functions
besides—to grow an extra branch.)

I don't like this code, but I'm not yet sure what to replace it with. I
think we should address the underlying problem—that the UI doesn't map
cleanly to what the code wants. There's been some discussion about this
earlier, but not any consensus that I could see.

Do we want something like this (easy to implement and document, perhaps
not especially convenient to use):

  recovery_target = 'xid' # or 'time'/'name'/'lsn'/'immediate'
  recovery_target_xid = xxx?
# the only setting we care about now
  recovery_target_otherthings = parsed_but_ignored

Or something like this (a bit harder to implement):

  recovery_target = 'xid:xxx' # or 'time:xxx' etc.

Alternatively, the do-nothing option is to move the tests from guc.c to
StartupXLOG and do it only once in some defined order (which would also
break the current last-mention-wins behaviour).

Thoughts?

(I've added Fujii to the Cc: list, in case he has any comments, since
this is based on his earlier patch.)

-- Abhijit

commit c20d735648f5ea867fda5afc499a63d3877536a3
Author: Abhijit Menon-Sen
Date:   Thu Sep 1 09:01:04 2016 +0530

    Convert recovery.conf settings to GUCs

    Based on unite_recoveryconf_postgresqlconf_v3.patch by Fujii Masao.

diff --git a/contrib/pg_standby/pg_standby.c b/contrib/pg_standby/pg_standby.c
index e4136f9..8fcb85c 100644
--- a/contrib/pg_standby/pg_standby.c
+++ b/contrib/pg_standby/pg_standby.c
@@ -520,7 +520,7 @@ usage(void)
 	printf("  -w MAXWAITTIME     max seconds to wait for a file (0=no limit) (default=0)\n");
 	printf("  -?, --help         show this help, then exit\n");
 	printf("\n"
-		   "Main intended use as restore_command in recovery.conf:\n"
+		   "Main intended use as restore_command in postgresql.conf:\n"
 		   "  restore_command = 'pg_standby [OPTION]... ARCHIVELOCATION %%f %%p %%r'\n"
 		   "e.g.\n"
 		   "  restore_command = 'pg_standby /mnt/server/archiverdir %%f %%p %%r'\n");
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 0f09d82..f43e41e 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1174,8 +1174,15 @@ SELECT pg_stop_backup();
-     Create a recovery command file recovery.conf in the cluster
-     data directory (see ). You might
+     Set up recovery parameters in postgresql.conf (see
+      and
+     ).
+
+
+
+
+     Create a recovery trigger file recovery.trigger
+     in the cluster data directory. You might
      also want to temporarily modify pg_hba.conf to
      prevent ordinary users from connecting until you are
      sure the recovery was successful.
@@ -1187,7 +1194,7 @@ SELECT pg_stop_backup();
      recovery be terminated because of an external error, the server can
      simply be restarted and it will continue recovery. Upon completion
      of the recovery process, the server will rename
-     recovery.conf to recovery.done (to prevent
+     recovery.trigger to recovery.done (to prevent
      accidentally re-entering recovery mode later) and then commence
      normal database operations.
@@ -1203,12 +1210,11 @@ SELECT pg_stop_backup();
-The key part of all this is to set up a recovery configuration file that
-describes how you want to recover and how far the recovery should
-run. You can use recovery.conf.sample (normally
-located in the installation's share/ directory) as a
-prototype. The one thing that you absolutely must specify in
-recovery.conf is the restore_command,
+The key part of all this
Re: [HACKERS] Patch: Implement failover on libpq connect level.
On Tue, 6 Sep 2016 07:58:28 +0530 Mithun Cy wrote:

> Now if only one host is in the connection string and it asks for a
> read_write connection (connect to master), I mean read_only is set to 0
> explicitly. With the above logic will we allow it to connect to a
> standby? I still think a psql connection to a standby should be handled
> by changing the default value of readonly to 1 (which means connect to
> any).

It would definitely be more logical, but I haven't found an easy way to
do it. Maybe I should return to this place and rethink. I don't like the
idea of explicitly ignoring a connection string parameter due to some
condition. Really, I think the test for a replication connection can
simply be removed from this part of the code now.

> Thinking further, probably replacing the readonly parameter with
> targetServerType=any|master (with default being any) should clear some
> confusion and bring consistency, since the same is used in the JDBC
> multi-host connection string.

It seems that in this context changing readonly=0|1 to
targetServerType=any|master makes sense.

> 2)
> @@ -1460,33 +1538,80 @@ connectDBStart(PGconn *conn)
>            portstr,
>            (int) (UNIXSOCK_PATH_BUFLEN - 1));
>    conn->options_valid = false;
> +  free(nodes->port);
>
> nodes->port was not allocated at this point.

I'll recheck it.
-- 
Victor Wagner
Re: [HACKERS] patch: function xmltable
On 4 September 2016 at 16:06, Pavel Stehule wrote:
> Hi
>
> minor update - using DefElem instead of own private parser type

I'm really glad that you're doing this and I'll take a look at it for
this CF. It's quite a big patch, so I expect this will take a few rounds
of review and updating.

Patch applies cleanly and builds cleanly on master, both with and
without --with-xml.

Overall, I think this needs to be revised with appropriate comments.
Whitespace/formatting needs fixing since it's all over the place.
Documentation is insufficient (per notes below).

Re identifier naming, some of this code uses XmlTable naming patterns,
some uses TableExpr prefixes. Is that intended to indicate a boundary
between things re-usable for other structured-data-ingesting functions?
Do you expect a "JSONEXPR" or similar in future? That's alluded to by

+/*--
+ * TableExpr - used for XMLTABLE function
+ *
+ * This can be used for json_table, jsonb_table functions in future
+ *--
+ */
+typedef struct TableExpr
+{
...

If so, should this really be two patches, one to add the table
expression infrastructure and another to add XMLTABLE that uses it?
Also, why in that case does so much of the TableExpr code call directly
into XmlTable code? It doesn't look very generic.

Overall I find the identifier naming to be a bit inconsistent, and think
it's necessary to make it clear that all the "TableExpr" stuff is for
XMLTABLE specifically, if that's the case, or make the delineation
clearer if not.

I'd also like to see tests that exercise the ruleutils get_rule_expr
parts of the code for the various XMLTABLE variants. Similarly, since
this seems to add a new xpath parser, that needs comprehensive tests.
Maybe re-purpose an existing xpath test data set?

More detailed comments:

Docs comments:

The xmltable produces [a] table based on [the] passed XML value.

The docs are pretty minimal and don't explain the various clauses of
XMLTABLE. What is "BY REF"? Is PATH an xpath expression?
If so, is there a good cross-reference link available? The PASSING
clause? etc.

How does XMLTABLE decide what to iterate over, and how to iterate over
it? Presumably the FOR ORDINALITY clause makes a column emit a numeric
counter.

What standard, if any, does this conform to? Does it resemble
implementations elsewhere? What limitations or unsupported features does
it have relative to those standards?

execEvalTableExpr seems to be defined twice, with a difference in case.
This is probably not going to fly:

+static Datum
+execEvalTableExpr(TableExprState *tstate,
+        ExprContext *econtext,
+        bool *isNull, ExprDoneCond *isDone)
+{

+static Datum
+ExecEvalTableExpr(TableExprState *tstate,
+        ExprContext *econtext,
+        bool *isNull, ExprDoneCond *isDone)
+{

It looks like you've split the function into a "guts" and "wrapper"
part, with the error-handling PG_TRY / PG_CATCH block in the wrapper.
That seems reasonable for readability, but the naming isn't. A comment
is needed to explain what ExecEvalTableExpr is / does. If it's
XMLTABLE-specific (which it looks like based on the code), its name
should reflect that.

This pattern is repeated elsewhere; e.g. TableExprState is really the
state for an XMLTABLE expression. But PostgreSQL actually has TABLE
statements, and in future we might want to support table-expressions, so
I don't think this naming is appropriate. This is made worse by the lack
of comments on things like the definition of TableExprState. Please use
something that makes it clear it's for XMLTABLE, and add appropriate
comments.

Formatting of variables, arguments, function signatures etc. is
random/haphazard and doesn't follow project convention. It's neither
aligned nor unaligned in the normal way; I don't understand the random
spacing at all. Maybe you should try to run pgindent and then extract
just the changes related to your patch? Or run your IDE/editor's indent
function on your changes? Right now it's actually kind of hard to read.
Do you edit with tabstop set to 1 normally, or something like that?

There's a general lack of comments throughout the added code.

In execEvalTableExpr, why are we looping over namespaces? What's that
for? A comment would be nice.

Typo: Path caclulation => Path calculation

What does XmlTableSetRowPath() do? It seems to copy its argument.
Nothing further is done with the row_path argument after it's called by
execEvalTableExpr, so what context is that memory in, and do we have to
worry about it if it's large?

execEvalTableExpr says it's doing "path calculation". What it actually
appears to do is evaluate the path expressions, if provided, and
otherwise use the column name as the implied path expression. (The docs
should mention that.)

It wasn't immediately obvious to me what the branch around
tstate->for_ordinality_col is for and what the alternate path's purpose
is in terms of XMLTABLE's behavio
Re: [HACKERS] Refactoring of heapam code.
On Mon, Aug 8, 2016 at 3:13 PM, Anastasia Lubennikova
<a.lubennik...@postgrespro.ru> wrote:
>
> Thank you for the review, I'll fix these problems in final version.
>
> Posting the first message I intended to raise the discussion. Patches
> were attached mainly to illustrate the problem and to be more concrete.

I started looking at the first patch posted above, but it seems you'll
rewrite these patches after the discussion on the heapam refactoring
ideas you posted at https://wiki.postgresql.org/wiki/HeapamRefactoring
concludes.

Some comments anyway on the first patch:

1. It does not apply cleanly with git-apply - many whitespace errors.

2. I don't understand the difference between PageGetItemHeapHeaderOnly()
   and PageGetItemHeap(). They seem to do exactly the same thing and can
   be used interchangeably.

3. While I like the idea of using separate interfaces to get heap/index
   tuples from a page, maybe they can internally use a common interface
   instead of duplicating what PageGetItem() does already.

4. Should we rename PageGetItemHeap() to PageGetHeapTuple() or something
   like that?

5. If we do this, we should probably have corresponding wrapper
   functions/macros for the remaining callers of PageGetItem().

I also looked at the refactoring design doc. Looks like a fine approach
to me, but the API will need much more elaborate discussion. I am not
sure if the interfaces as presented are enough to support everything
that even heapam can do today. And much more thought will be required to
ensure we don't miss out on things that a new primary access method may
need.

A few points regarding how the proposed API maps to heapam:

- How do we support parallel scans on heap?

- Does the primary AM need to support locking of rows?

- Would there be a separate API to interpret the PAM tuple itself?
  (something that htup_details.h does today). It seems natural that
  every PAM will have its own interpretation of tuple structure and we
  would like to hide that inside the PAM implementation.
- There are many upper modules that need access to system attributes
  (xmin, xmax for starters). How do you plan to handle that? Do you
  think we can provide enough abstraction so that the callers don't need
  to know the tuple structures of individual PAMs?

I don't know what to do with the CF entry itself. I could change the
status to "waiting on author", but then the design draft may not get
enough attention. So I'm leaving it in the current state for others to
look at.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] IF (NOT) EXISTS in psql-completion
Hello,

At Sun, 4 Sep 2016 12:54:57 +0200, Pavel Stehule wrote:

> This patch needs rebase.

Thank you. I'll rebase the following patch and repost as soon as
possible.

https://www.postgresql.org/message-id/20160407.211917.147996130.horiguchi.kyot...@lab.ntt.co.jp

regards,
-- 
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [HACKERS] Comment Typo
On Tue, Sep 6, 2016 at 10:52:22AM +0900, Amit Langote wrote:
> Attached fixes a typo in header comment in libpq-be.h.
>
> s/libpq_be.h/libpq-be.h/g

Applied.

-- 
Bruce Momjian    http://momjian.us
EnterpriseDB     http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                  Ancient Roman grave inscription    +
Re: [HACKERS] Supporting SJIS as a database encoding
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro
> HORIGUCHI
> Implementing radix tree code, then redefining the format of the mapping
> table to support radix trees, then modifying the mapping generator
> script are needed.
>
> If no one opposes this, I'll do that.

+100

Great analysis, and great guts. I very much appreciate your taking this
on!

Regards
Takayuki Tsunakawa
Re: [HACKERS] Supporting SJIS as a database encoding
Hello,

At Mon, 5 Sep 2016 19:38:33 +0300, Heikki Linnakangas wrote in
<529db688-72fc-1ca2-f898-b0b99e300...@iki.fi>:

> On 09/05/2016 05:47 PM, Tom Lane wrote:
> > "Tsunakawa, Takayuki" writes:
> >> Before digging into the problem, could you share your impression on
> >> whether PostgreSQL can support SJIS? Would it be hopeless?
> >
> > I think it's pretty much hopeless.
>
> Agreed.

+1, even as a user of SJIS :)

> But one thing that would help a little, would be to optimize the UTF-8
> -> SJIS conversion. It uses a very generic routine, with a binary
> search over a large array of mappings. I bet you could do better than
> that, maybe using a hash table or a radix tree instead of the large
> binary-searched array.

I'm very impressed by the idea. The mean number of iterations for a
binary search on the current conversion table, with about 8000
characters, is about 13, and the table size is under 100 kB (maybe). A
three-level array with 2-byte values would take about 1.6~2 MB of
memory.

A radix tree for UTF-8 -> some-encoding conversion requires about, or up
to (using a 1-byte index to point to the next level):

  1 * ((0x7f + 1)
       + (0xdf - 0xc2 + 1) * (0xbf - 0x80 + 1)
       + (0xef - 0xe0 + 1) * (0xbf - 0x80 + 1)^2) = 67 kB

SJIS characters are at most 2 bytes long, so about 8000 characters take
an extra 16 kB. And some padding space will be added on top of that. As
a result, the radix tree seems promising because of its small additional
memory requirement and far fewer comparisons. Big5 and other encodings,
including EUC-*, would also benefit from it.

Implementing the radix tree code, then redefining the format of the
mapping table to support radix trees, then modifying the mapping
generator script are needed.

If no one opposes this, I'll do that.

regards,
-- 
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [HACKERS] Supporting SJIS as a database encoding
"Tsunakawa, Takayuki" writes: > Using multibyte-functions like mb... to process characters would solve > the problem? Well, sure. The problem is (1) finding all the places that need that (I'd estimate dozens to hundreds of places in the core code, and then there's the question of extensions); (2) preventing new non-multibyte-aware code from being introduced after you've fixed those places; and (3) the performance penalties you'd take, because a lot of those places are bottlenecks and it's much cheaper to not worry about character lengths in an inner loop. > Isn't the current implementation blocking the support of > other character sets that have similar characteristics? Sure, SJIS is not the only encoding that we consider frontend-only. See https://www.postgresql.org/docs/devel/static/multibyte.html#MULTIBYTE-CHARSET-SUPPORTED regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Supporting SJIS as a database encoding
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Heikki
> But one thing that would help a little, would be to optimize the UTF-8
> -> SJIS conversion. It uses a very generic routine, with a binary
> search over a large array of mappings. I bet you could do better than
> that, maybe using a hash table or a radix tree instead of the large
> binary-searched array.

That sounds worth pursuing. Thanks!

Regards
Takayuki Tsunakawa
Re: [HACKERS] Cache Hash Index meta page.
On Tue, Sep 6, 2016 at 12:20 AM, Mithun Cy wrote:
> On Sep 2, 2016 7:38 PM, "Jesper Pedersen" wrote:
>> Could you provide a rebased patch based on Amit's v5?
>
> Please find the patch, based on Amit's v5.

I think you mean based on the patch in the mail below:
https://www.postgresql.org/message-id/CAA4eK1J6b8O4PcEPqRxNYbLVbfToNMJEEm%2Bqn0jZX31-obXrJw%40mail.gmail.com

It is better if we can provide the link for the patch on which the
current patch is based; that will help people easily identify the
dependent patches.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Supporting SJIS as a database encoding
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> "Tsunakawa, Takayuki" writes:
> > Before digging into the problem, could you share your impression on
> > whether PostgreSQL can support SJIS? Would it be hopeless?
>
> I think it's pretty much hopeless. Even if we were willing to make
> every bit of code that looks for '\' and other specific at-risk
> characters multi-byte aware (with attendant speed penalties), we could
> expect that third-party extensions would still contain vulnerable code.
> More, we could expect that new bugs of the same ilk would get
> introduced all the time. Many such bugs would amount to security
> problems. So the amount of effort and vigilance required seems out of
> proportion to the benefits.

Hmm, this sounds like a death sentence. But as I don't have good
knowledge of character set handling yet, I'm not completely convinced
about why PostgreSQL cannot support SJIS. I wonder why and how other
DBMSs support SJIS, and what the difference in implementation is. Would
using multibyte functions like mb... to process characters solve the
problem? Isn't the current implementation blocking the support of other
character sets that have similar characteristics? I'll learn about
character set handling...

> Most of the recent discussion about allowed backend encodings has run
> more in the other direction, ie, "why don't we disallow everything but
> UTF8 and get rid of all the infrastructure for multiple backend
> encodings?". I'm not personally in favor of that, but there are very
> few hackers who want to add any more overhead in this area.

Personally, I totally agree. I want non-Unicode character sets to
disappear from the world. But real-world business doesn't seem to
forgive the lack of SJIS...

Regards
Takayuki Tsunakawa
Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
On Mon, Sep 5, 2016 at 11:34 PM, Tomas Vondra wrote:
>
> On 09/05/2016 06:03 AM, Amit Kapila wrote:
>> So, in short we have to compare three approaches here.
>>
>> 1) Group mode to reduce CLOGControlLock contention
>> 2) Use granular locking model
>> 3) Use atomic operations
>>
>> For approach-1, you can use patch [1]. For approach-2, you can use the
>> 0001-Improve-64bit-atomics-support patch [2] and the patch attached
>> with this mail. For approach-3, you can use the
>> 0001-Improve-64bit-atomics-support patch [2] and the patch attached
>> with this mail with USE_CONTENT_LOCK commented out. If the third
>> doesn't work for you, then for now we can compare approach-1 and
>> approach-2.
>
> OK, I can compile all three cases - but only with gcc 4.7 or newer.
> Sadly the 4-socket 64-core machine runs Debian Jessie with just gcc 4.6
> and my attempts to update to a newer version were unsuccessful so far.

So which of the patches are you able to compile on the 4-socket machine?
I think it is better to measure the performance on the bigger machine.

>> I have done some testing of these patches for a read-write pgbench
>> workload and don't find a big difference. Now the interesting test
>> case could be to use a few sub-transactions (maybe 4-8) for each
>> transaction, as with that we can see more contention for
>> CLOGControlLock.
>
> Understood. So a bunch of inserts/updates interleaved by savepoints?

Yes.

> I presume you started looking into this based on a real-world
> performance issue, right? Would that be a good test case?

I had started looking into it based on LWLOCK_STATS data for a
read-write workload (pgbench tpc-b). I think it will depict many of the
real-world read-write workloads.

>> Few points to note for performance testing: one should use --unlogged
>> tables, else the WAL writing and WALWriteLock contention masks the
>> impact of this patch. The impact of this patch is visible at
>> higher client counts (say at 64~128).
>>
> Even on good hardware (say, PCIe SSD storage that can do thousands of
> fsyncs per second)?

Not sure, because it could be masked by WALWriteLock contention.

> Does it then make sense to try optimizing this if the effect can only
> be observed without the WAL overhead (so almost never in practice)?

It is not that there is no improvement with the WAL overhead (one can
observe that via LWLOCK_STATS apart from TPS), but it is clearly visible
with unlogged tables. The situation is not that simple, because let us
say we don't do anything about the remaining contention on
CLOGControlLock; then when we try to reduce the contention around other
locks like WALWriteLock or maybe ProcArrayLock, there is a chance that
contention will shift to CLOGControlLock. So the basic idea is: to get
the big benefits, we need to eliminate contention around each of the
locks.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] pg_sequence catalog
Andres Freund writes:
> On September 5, 2016 7:26:42 AM PDT, Tom Lane wrote:
>> The main problem I can see with this is that serial columns will have
>> default expressions that are written out as
>> "nextval('foo_f1_seq'::regclass)". I do not think we can afford to
>> break dumps containing that, but I'm not sure how to get the regclass
>> cast replaced with a regsequence cast.

> Why not just continue having a pg_class entry, but no relfilenode?

Yeah, maybe. I was hoping to dispense with the pg_attribute rows, but
maybe that's not enough overhead to worry about.

In this viewpoint, we'd keep the sequence-specific data in a pg_sequence
catalog. pg_sequence rows would be extensions of the associated pg_class
rows in much the same way that pg_index rows extend the pg_class entries
for indexes. We should supply a view pg_sequences that performs the
implied join, and encourage users to select from that rather than
directly from pg_sequence (compare the pg_indexes view).

			regards, tom lane
Re: [HACKERS] Patch: Implement failover on libpq connect level.
On Mon, Sep 5, 2016 at 4:33 PM, Aleksander Alekseev
<a.aleks...@postgrespro.ru> wrote:
> After a brief examination I've found following ways to improve the
> patch.

Adding to the above comments.

1)
+    /*
+     * consult connection options and check if RO connection is OK
+     * RO connection is OK if readonly connection is explicitely
+     * requested or if replication option is set, or just one
+     * host is specified in the connect string.
+     */
+    if ((conn->pghost==NULL || strchr(conn->pghost,',')==NULL) ||
+        ((conn->read_only && conn->read_only[0] > '0')) ||
+        (conn->replication && conn->replication[0])
+        )
+    {

Now if only one host is in the connection string and it asks for a
read_write connection (connect to master), I mean read_only is set to 0
explicitly. With the above logic will we allow it to connect to a
standby? I still think a psql connection to a standby should be handled
by changing the default value of readonly to 1 (which means connect to
any). Thinking further, probably replacing the readonly parameter with
targetServerType=any|master (with default being any) should clear some
confusion and bring consistency, since the same is used in the JDBC
multi-host connection string.

2)
@@ -1460,33 +1538,80 @@ connectDBStart(PGconn *conn)
            portstr,
            (int) (UNIXSOCK_PATH_BUFLEN - 1));
    conn->options_valid = false;
+   free(nodes->port);

nodes->port was not allocated at this point.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Re: Install extensions using update scripts (was Re: [HACKERS] Remove superuser() checks from pgstattuple)
On 2016-09-05 22:24:09 -0400, Tom Lane wrote:
> Ordinarily I'd be willing to stick this on the queue for the next
> commitfest, but it seems like we ought to try to get it pushed now
> so that Stephen can make use of the feature for his superuser changes.
> Thoughts?

Seems sensible to me. I can have a look at it one of the next few days
if you want.

Andres
Install extensions using update scripts (was Re: [HACKERS] Remove superuser() checks from pgstattuple)
Andres Freund writes: > On September 4, 2016 6:33:30 PM PDT, Tom Lane wrote: >> I think nearly all of the >> infrastructure for this is already there in extension.c. > Yes, it doesn't sound very hard... I poked at this a bit, and indeed it's not that hard. Attached is a proposed patch (code-complete, I think, but no docs as yet). Aside from touching CREATE EXTENSION itself, this modifies the pg_available_extension_versions view to show versions that are installable on the basis of update chains. I think pg_extension_update_paths() needs no changes, though. Ordinarily I'd be willing to stick this on the queue for the next commitfest, but it seems like we ought to try to get it pushed now so that Stephen can make use of the feature for his superuser changes. Thoughts? regards, tom lane diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index fa861e6..4d82527 100644 *** a/src/backend/commands/extension.c --- b/src/backend/commands/extension.c *** typedef struct ExtensionVersionInfo *** 100,105 --- 100,106 static List *find_update_path(List *evi_list, ExtensionVersionInfo *evi_start, ExtensionVersionInfo *evi_target, + bool reject_indirect, bool reinitialize); static void get_available_versions_for_extension(ExtensionControlFile *pcontrol, Tuplestorestate *tupstore, *** identify_update_path(ExtensionControlFil *** 1071,1077 evi_target = get_ext_ver_info(newVersion, &evi_list); /* Find shortest path */ ! result = find_update_path(evi_list, evi_start, evi_target, false); if (result == NIL) ereport(ERROR, --- 1072,1078 evi_target = get_ext_ver_info(newVersion, &evi_list); /* Find shortest path */ ! result = find_update_path(evi_list, evi_start, evi_target, false, false); if (result == NIL) ereport(ERROR, *** identify_update_path(ExtensionControlFil *** 1086,1091 --- 1087,1095 * Apply Dijkstra's algorithm to find the shortest path from evi_start to * evi_target. 
* + * If reject_indirect is true, ignore paths that go through installable + * versions (presumably, caller will consider starting from such versions). + * * If reinitialize is false, assume the ExtensionVersionInfo list has not * been used for this before, and the initialization done by get_ext_ver_info * is still good. *** static List * *** 1097,1102 --- 1101,1107 find_update_path(List *evi_list, ExtensionVersionInfo *evi_start, ExtensionVersionInfo *evi_target, + bool reject_indirect, bool reinitialize) { List *result; *** find_update_path(List *evi_list, *** 1105,1110 --- 1110,1117 /* Caller error if start == target */ Assert(evi_start != evi_target); + /* Caller error if reject_indirect and target is installable */ + Assert(!(reject_indirect && evi_target->installable)); if (reinitialize) { *** find_update_path(List *evi_list, *** 1131,1136 --- 1138,1146 ExtensionVersionInfo *evi2 = (ExtensionVersionInfo *) lfirst(lc); int newdist; + /* if reject_indirect, treat installable versions as unreachable */ + if (reject_indirect && evi2->installable) + continue; newdist = evi->distance + 1; if (newdist < evi2->distance) { *** find_update_path(List *evi_list, *** 1167,1172 --- 1177,1239 } /* + * Given a target version that is not directly installable, find the + * best installation sequence starting from a directly-installable version. + * + * evi_list: previously-collected version update graph + * evi_target: member of that list that we want to reach + * + * Returns the best starting-point version, or NULL if there is none. + * On success, *best_path is set to the path from the start point. + * + * If there's more than one possible start point, prefer shorter update paths, + * and break any ties arbitrarily on the basis of strcmp'ing the starting + * versions' names. 
+ */ + static ExtensionVersionInfo * + find_install_path(List *evi_list, ExtensionVersionInfo *evi_target, + List **best_path) + { + ExtensionVersionInfo *evi_start = NULL; + ListCell *lc; + + /* Target should not be installable */ + Assert(!evi_target->installable); + + *best_path = NIL; + + /* Consider all installable versions as start points */ + foreach(lc, evi_list) + { + ExtensionVersionInfo *evi1 = (ExtensionVersionInfo *) lfirst(lc); + List *path; + + if (!evi1->installable) + continue; + + /* + * Find shortest path from evi1 to evi_target; but no need to consider + * paths going through other installable versions. + */ + path = find_update_path(evi_list, evi1, evi_target, true, true); + if (path == NIL) + continue; + + /* Remember best path */ + if (evi_start == NULL || + list_length(path) < list_leng
[HACKERS] Comment Typo
Attached fixes a typo in header comment in libpq-be.h. s/libpq_be.h/libpq-be.h/g Thanks, Amit diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h index ecdfbc6..b91eca5 100644 --- a/src/include/libpq/libpq-be.h +++ b/src/include/libpq/libpq-be.h @@ -1,6 +1,6 @@ /*- * - * libpq_be.h + * libpq-be.h * This file contains definitions for structures and externs used * by the postmaster during client authentication. * -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Speedup twophase transactions
On Sat, Sep 3, 2016 at 10:26 PM, Michael Paquier wrote: > On Fri, Sep 2, 2016 at 5:06 AM, Simon Riggs wrote: >> On 13 April 2016 at 15:31, Stas Kelvich wrote: >> >>> Fixed patch attached. There already was infrastructure to skip currently >>> held locks during replay of standby_redo() and I’ve extended that with >>> check for >>> prepared xids. >> >> Please confirm that everything still works on current HEAD for the new >> CF, so review can start. > > The patch does not apply cleanly. Stas, could you rebase? I am > switching the patch to "waiting on author" for now. So, I have just done the rebase myself and did an extra round of reviews of the patch. Here are a couple of comments after this extra review. LockGXactByXid() is meant to be used only in recovery, so it seems appropriate to add an assertion on RecoveryInProgress(). Using this routine in non-recovery code paths is risky because it assumes that a PGXACT could be missing, which is fine in recovery as prepared transactions could be moved to twophase files because of a checkpoint, but not in normal cases. We could also add an assertion of the kind gxact->locking_backend == InvalidBackendId before locking the PGXACT but as this routine is normally used only by the startup process (in non-standalone mode!) that's fine without. The handling of invalidation messages and removal of relfilenodes done in FinishGXact can be grouped together, checking only once for !RecoveryInProgress(). + * + * The same procedure happens during replication and crash recovery. * "during WAL replay" is more generic and applies here. + +next_file: + continue; + That can be removed and replaced by a "continue;". + /* +* At first check prepared tx that wasn't yet moved to disk. 
+*/ + LWLockAcquire(TwoPhaseStateLock, LW_SHARED); + for (i = 0; i < TwoPhaseState->numPrepXacts; i++) + { + GlobalTransaction gxact = TwoPhaseState->prepXacts[i]; + PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno]; + + if (TransactionIdEquals(pgxact->xid, xid)) + { + LWLockRelease(TwoPhaseStateLock); + return true; + } + } + LWLockRelease(TwoPhaseStateLock); This overlaps with TwoPhaseGetGXact but I'd rather keep both things separated: it does not seem worth complicating the existing interface, and putting that in cache during recovery has no meaning. I have also reworked the test format, and fixed many typos and grammar problems in the patch as well as in the tests. After review the result is attached. Perhaps a committer could get a look at it? -- Michael diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index 9f55adc..eb7c339 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -45,8 +45,8 @@ * fsynced * * If COMMIT happens after checkpoint then backend reads state data from * files - * * In case of crash replay will move data from xlog to files, if that - * hasn't happened before. XXX TODO - move to shmem in replay also + * + * The same procedure happens during WAL replay. * *- */ @@ -578,6 +578,45 @@ LockGXact(const char *gid, Oid user) } /* + * LockGXactByXid + * + * Find prepared transaction by xid and lock corresponding GXACT. + * This is used during recovery as an alternative to LockGXact(), and + * should only be used in recovery. No entries found means that a checkpoint + * has moved the searched prepared transaction data to a twophase file. + * + * Returns the transaction data if found, or NULL if nothing has been locked. 
+ */ +static GlobalTransaction +LockGXactByXid(TransactionId xid) +{ + int i; + GlobalTransaction gxact = NULL; + + Assert(RecoveryInProgress()); + + LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE); + for (i = 0; i < TwoPhaseState->numPrepXacts; i++) + { + PGXACT *pgxact; + + gxact = TwoPhaseState->prepXacts[i]; + pgxact = &ProcGlobal->allPgXact[gxact->pgprocno]; + + if (TransactionIdEquals(xid, pgxact->xid)) + { + /* ok to lock it */ + gxact->locking_backend = MyBackendId; + MyLockedGxact = gxact; + break; + } + } + LWLockRelease(TwoPhaseStateLock); + + return gxact; +} + +/* * RemoveGXact * Remove the prepared transaction from the shared memory array. * @@ -1241,9 +1280,9 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings) * Reads 2PC data from xlog. During checkpoint this data will be moved to * twophase files and ReadTwoPhaseFile should be used instead. * - * Note clearly that this function accesses WAL during normal operation, similarly - * to the way WALSender or Logical Decoding would do. It does not run during - * crash recovery or standby processing. + * Note that this function accesses WAL not only during recovery but also + * during normal operation, similarly to the way WALSender or Logical + * Decoding would do. */ static void XlogReadT
Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan
> -Original Message- > From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai > Sent: Monday, September 05, 2016 12:58 PM > To: Jeevan Chalke > Cc: pgsql-hackers@postgresql.org; Etsuro Fujita > Subject: Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan > > > On Mon, Aug 29, 2016 at 7:25 AM, Kouhei Kaigai wrote: > > > > > > Hello, > > > > The attached patch adds an optional callback to support special > > optimization > > if ForeignScan/CustomScan are located under the Limit node in plan-tree. > > > > Our sort node wisely switches the behavior when we can preliminary know > > exact number of rows to be produced, because all the Sort node has to > > return is the top-k rows when it is located under the Limit node. > > It is much lightweight workloads than sorting of entire input rows when > > nrows is not small. > > > > In my case, this information is very useful because GPU can complete its > > sorting operations mostly on L1-grade memory if we can preliminary know > > the top-k value is enough small and fits to size of the fast memory. > > > > Probably, it is also valuable for Fujita-san's case because this > > information > > allows to attach "LIMIT k" clause on the remote query of postgres_fdw. > > It will reduce amount of the network traffic and remote CPU consumption > > once we got support of sort pushdown. > > > > > > > > One thing we need to pay attention is cost estimation on the planner > > stage. > > In the existing code, only create_ordered_paths() and > > create_merge_append_path() > > considers the limit clause for cost estimation of sorting. They use the > > 'limit_tuples' of PlannerInfo; we can reference the structure when > > extension > > adds ForeignPath/CustomPath, so I think we don't need a special > > enhancement > > on the planner stage. > > > Thanks for your comments. > > > I believe this hook is gets called at execution time. 
> > So to push LIMIT clause like you said above we should use "limit_tuples" at > > the time > > of planning and then use this hook to optimize at runtime, right? > > > Yes. For more correctness, a valid "limit_tuples" of PlannerInfo is set only > when > LIMIT clause takes constant values; it is true for most of use case. > Then, the hook I added shall be called at execution time for more exact > optimization. > > If FDW/CSP cannot accept uncertain number of rows to generate on planning > time, > it is not a duty to provide its own path which is optimized for small number > of > LIMIT clause. > > > Apart from that, attached patch applies cleanly on latest sources and found > > no issues > > with make or with regressions. > > > > However this patch is an infrastructure for any possible optimization when > > foreign/customscan is under LIMIT. > > > > So look good to me. > > > > I quickly tried adding a hook support in postgres_fdw, and it gets called > > correctly > > when we have foreignscan with LIMIT (limit being evaluated on local server). > > > > So code wise no issue. Also add this hook details in documentation. > > > OK, I'll try to write up some detailed documentation stuff; not only API > specification. > The v2 patch is attached. It documents the role of this hook and how an extension can utilize the LIMIT clause for further optimization at planning and execution time. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei pgsql-v10-fdw-css-limit-bound.v2.patch Description: pgsql-v10-fdw-css-limit-bound.v2.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] COPY vs \copy HINT
On 5 September 2016 at 16:32, Christoph Berg wrote: > The part about the server permissions might be useful to hint at. > What about > > "COPY TO instructs the PostgreSQL server to write to a file on the > server. " > "You may want a client-side facility such as psql's \\copy." > > "COPY FROM instructs the PostgreSQL server to read from a file on the > server. " > "You may want a client-side facility such as psql's \\copy." > > This stresses more explicitly that the file is opened by the server > process. (I'm not particularly happy that it says "server" in there > twice, but couldn't think of anything else.) Tom, any preference here? I'm probably inclined to go for your original wording and accept that it's just too hard to hint at the client/server process split in a single short message. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stopping logical replication protocol
On 25 August 2016 at 13:04, Craig Ringer wrote: > By the way, I now think that the second part of your patch, to allow > interruption during ReorderBufferCommit processing, is also very > desirable. I've updated your patch, rebasing it on top of 10.0 master and splitting it into two pieces. I thought you were planning on following up with an update to the second part, but maybe that was unclear from our prior discussion, so I'm doing this to help get this moving again. The first patch is what I posted earlier - the part of your patch that allows the walsender to exit COPY BOTH mode and return to command mode in response to a client-initiated CopyDone when it's in the WalSndLoop. For logical decoding this means that it will notice a client CopyDone when it's between transactions or while it's ingesting transaction changes. It won't react to CopyDone while it's in ReorderBufferCommit and sending a transaction to an output plugin though. The second piece is the other half of your original patch. Currently unmodified from what you wrote. It adds a hook in ReorderBufferCommit that calls back into the walsender to check for new client input and interrupt processing. I haven't yet investigated this in detail. Review comments on the 2nd patch, i.e. the 2nd half of your original patch: * Other places in logical decoding use the CB suffix for callback types. This should do the same. * I'm not too keen on the name is_active for the callback. We discussed the name continue_decoding_cb in our prior conversation. * Instead of having LogicalContextAlwaysActive, would it be better to test if the callback pointer is null and skip it if so? * Lots of typos in LogicalContextAlwaysActive comments, and wording is unclear * comment added to arguments of pg_logical_slot_get_changes_guts is not in PostgreSQL style - it goes way wide on the line. Probably better to put the comment above the call and mention which argument it refers to. 
* A comment somewhere is needed - probably on the IsStreamingActive() callback - to point readers at the fact that WalSndWriteData() calls ProcessRepliesIfAny() and that's how IsStreamingActive can get set. I had to look at the email discussion history to remind myself how this worked when I was re-reading the code. * I'm not keen on typedef ReorderBufferIsActive LogicalDecondingContextIsActive; I think instead a single callback name that encompasses both should be used. ContinueDecodingCB ? * There are no tests. I don't see any really simple way to test this, though. I suggest that you apply these updated/split patches to a fresh copy of master then see if you can update it based on this feedback. Something like: git checkout master git pull git checkout -b stop-decoding-v10 git am /path/to/0001-Respect-client-initiated-CopyDone-in-walsender.patch git am /path/to/0002-Second-part-of-Vladimir-Gordiychuk-s-patch.patch then modify the code per the above, and "git commit --amend" the changes and send an updated set of patches with "git format-patch -2" . I am setting this patch as "waiting on author" in the commitfest. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 81b3f61b0235e1b936e3336e43ce55c00bc230c4 Mon Sep 17 00:00:00 2001 From: Craig Ringer Date: Tue, 10 May 2016 10:34:10 +0800 Subject: [PATCH 1/2] Respect client-initiated CopyDone in walsender The walsender never looked for CopyDone sent by the client unless it had already decided it was done sending data and dispatched its own CopyDone message. Check for client-initiated CopyDone when in COPY BOTH mode, returning to command mode. In logical decoding this will allow the client to end a logical decoding session between transactions without just unilaterally closing its connection. This change does not allow a client to end COPY BOTH session in the middle of processing a logical decoding block. TODO effect on physical walsender? 
--- src/backend/replication/walsender.c | 32 +++- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 1ea2a5c..c43310c 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -759,6 +759,13 @@ logical_read_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int req /* make sure we have enough WAL available */ flushptr = WalSndWaitForWal(targetPagePtr + reqLen); + /* + * If the client sent CopyDone while we were waiting, + * bail out so we can wind up the decoding session. + */ + if (streamingDoneSending) + return -1; + /* more than one block available */ if (targetPagePtr + XLOG_BLCKSZ <= flushptr) count = XLOG_BLCKSZ; @@ -1220,8 +1227,11 @@ WalSndWaitForWal(XLogRecPtr loc) * It's important to do this check after the recomputation of * RecentFlushPtr, so we can send all remaining data before shutting * down. + *
Re: [HACKERS] System load consideration before spawning parallel workers
On Fri, Sep 2, 2016 at 3:01 AM, Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > On 8/16/16 3:39 AM, Haribabu Kommi wrote: > > Yes, we need to consider many parameters as a system load, not just only > > the CPU. Here I attached a POC patch that implements the CPU load > > calculation and decide the number of workers based on the available CPU > > load. The load calculation code is not an optimized one, there are many > ways > > that can used to calculate the system load. This is just for an example. > > I see a number of discussion points here: > > We don't yet have enough field experience with the parallel query > facilities to know what kind of use patterns there are and what systems > for load management we need. So I think building a highly specific > system like this seems premature. We have settings to limit process > numbers, which seems OK as a start, and those knobs have worked > reasonably well in other areas (e.g., max connections, autovacuum). We > might well want to enhance this area, but we'll need more experience and > information. > Yes, I agree that parallel query is a new feature and we cannot yet judge its effect. > If we think that checking the CPU load is a useful way to manage process > resources, why not apply this to more kinds of processes? I could > imagine that limiting connections by load could be useful. Parallel > workers is only one specific niche of this problem. > Yes, I agree that parallel query is only one part of the problem. How about the postmaster calculating the CPU (and other) load on the system and updating it in a shared location where every backend can access the details? Using that, we can decide what operations to control. Using some GUC-specified interval, the postmaster would update the system load, so this would not affect the performance of other backends. > As I just wrote in another message in this thread, I don't trust system > load metrics very much as a gatekeeper. They are reasonable for > long-term charting to discover trends, but there are numerous potential > problems for using them for this kind of resource control thing. > > All of this seems very platform specific, too. You have > Windows-specific code, but the rest seems very Linux-specific. The > dstat tool I had never heard of before. There is stuff with cgroups, > which I don't know how portable they are across different Linux > installations. Something about Solaris was mentioned. What about the > rest? How can we maintain this in the long term? How do we know that > these facilities actually work correctly and not cause mysterious problems? > The CPU load calculation patch is a POC patch; I didn't evaluate its behavior on all platforms. > Maybe a couple of hooks could be useful to allow people to experiment > with this. But the hooks should be more general, as described above. > But I think a few GUC settings that can be adjusted at run time could be > sufficient as well. With the parallel GUC settings it is possible to get better performance from more parallel workers when there is very little load on the system. If the system load increases, using more parallel workers can add overhead instead of improving on the existing behavior. In such cases, the number of parallel workers needs to be reduced by changing the GUC settings. Instead of that, I just thought: how about doing the same automatically? Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] Suggestions for first contribution?
On Mon, Sep 05, 2016 at 01:25:03PM -0400, Christian Convey wrote: > Hi guys, > > Can anyone suggest a project for my first PG contribution? How about adding PIVOT tables? MS SQL Server and Oracle both have them. If you're interested, I have some ideas about the UI parts and a few about the implementation. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_sequence catalog
On September 5, 2016 7:26:42 AM PDT, Tom Lane wrote: >Simon Riggs writes: >> On 4 September 2016 at 23:17, Greg Stark wrote: >>> So? Clients expect changes like this between major releases surely. >>> Subtle changes that cause silent breakage for end-users seems >scarier >>> than unsubtle breakage that tool authors can fix. > >> Agreed; some change in the behaviour if SELECT * FROM sequence is >> effectively part of this proposal. I was going to make the same >> comment myself. > >Well, if we're going to blow off compatibility on that score, I suggest >that we blow it off all the way. Make sequences not be relations >anymore, >and what you do instead of "SELECT * FROM sequence" is "SELECT * FROM >pg_sequences WHERE seqname = 'sequence'". Or more likely, since >sequences >should still belong to schemas, we need a "regsequence" OID-alias type >like "regclass" and you do "SELECT * FROM pg_sequences WHERE oid = >'foo.bar'::regsequence". > >The main problem I can see with this is that serial columns will >have default expressions that are written out as >"nextval('foo_f1_seq'::regclass)". I do not think we can afford to >break >dumps containing that, but I'm not sure how to get the regclass cast >replaced with a regsequence cast. Why not just continue having a pgclass entry, but no relfilenode? -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Showing parallel status in \df+
I wrote: > Pavel Stehule writes: >> Using footer for this purpose is little bit strange. What about following >> design? >> 1. move out source code of PL functions from \df+ >> 2. allow not unique filter in \sf and allow to display multiple functions > Wasn't that proposed and rejected upthread? So ... why did you put this patch in "Waiting on Author" state? AFAIK, we had dropped the idea of relying on \sf for this, mainly because Peter complained about \df+ no longer providing source code. I follow his point: if you're used to using \df+ to see source code, you probably can figure it out quickly if that command shows the source in a different place than before. But if it doesn't show it at all, using \sf instead might not occur to you right away. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Bug in 9.6 tuplesort batch memory growth logic
While working on my parallel CREATE INDEX patch, I came across a problem. I initially assumed that what I saw was just a bug in my development branch. During a final merge in a parallel worker, with very little maintenance_work_mem, workers attempted to allocate an amount of memory slightly less than 2 ^ 64, and fail, since that exceeds MaxAllocHugeSize. I noticed that this happened immediately following batchmemtuples() leaving us in a LACKMEM() state due to a fault in its calculation (or, if you prefer, the lack of any defense against that fault). Further analysis led me to believe that this is a 9.6 bug. We should have a check for sane memtuples growth, in line with the grow_memtuples() check, where "We need to be sure that we do not cause LACKMEM to become true, else the space management algorithm will go nuts". Because of the way grow_memtuples() is called by batchmemtuples() in practice (in particular, because availMemLessRefund may be < 0 in these corner cases), grow_memtuples()'s own protections may not save us as previously assumed. FREEMEM() *takes away* memory from the availMem budget when "availMemLessRefund < 0", just after the conventional grow_memtuples() checks run. Attached patch adds a defense. FWIW, there is no reason to think that more careful accounting could fix this problem, since in general a LACKMEM() condition may not immediately lead to tuples being dumped out. I haven't been able to reproduce this on 9.6, but that seems unnecessary. I can reproduce it with my parallel CREATE INDEX patch applied, with just the right test case and right number of workers (it's rather delicate). After careful consideration, I can think of no reason why 9.6 would be unaffected. 
-- Peter Geoghegan From 850af5f9f7b1b85f98a54f5b4a757bbd00df77ec Mon Sep 17 00:00:00 2001 From: Peter Geoghegan Date: Sun, 4 Sep 2016 16:05:49 -0700 Subject: [PATCH] Defend against faulty batch memtuples calculation Add a new test to batchmemtuples() that must be passed in order to proceed with actually growing memtuples. This avoids problems with spurious target allocation sizes in corner cases with low memory. Backpatch to 9.6, where the tuplesort batch memory optimization was added. --- src/backend/utils/sort/tuplesort.c | 20 +++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c index c8fbcf8..3478277 100644 --- a/src/backend/utils/sort/tuplesort.c +++ b/src/backend/utils/sort/tuplesort.c @@ -2868,6 +2868,7 @@ batchmemtuples(Tuplesortstate *state) /* For simplicity, assume no memtuples are actually currently counted */ Assert(state->memtupcount == 0); + Assert(state->activeTapes > 0); /* * Refund STANDARDCHUNKHEADERSIZE per tuple. @@ -2880,6 +2881,20 @@ batchmemtuples(Tuplesortstate *state) availMemLessRefund = state->availMem - refund; /* + * We need to be sure that we do not cause LACKMEM to become true, else a + * negative number of bytes could be calculated as batch allocation size, + * causing havoc (a negative availMemLessRefund cannot be accepted). Note + * that we are never reliant on a tape's batch allocation being large + * enough to fit even a single tuple; the use of the constant + * ALLOCSET_DEFAULT_INITSIZE here is not critical (a threshold like this + * avoids a useless repalloc that only grows memtuples by a tiny amount, + * which seems worth avoiding here in passing). 
+ */ + if (availMemLessRefund <= + (int64) state->activeTapes * ALLOCSET_DEFAULT_INITSIZE) + return; + + /* * To establish balanced memory use after refunding palloc overhead, * temporarily have our accounting indicate that we've allocated all * memory we're allowed to less that refund, and call grow_memtuples() to @@ -2889,8 +2904,11 @@ batchmemtuples(Tuplesortstate *state) USEMEM(state, availMemLessRefund); (void) grow_memtuples(state); /* Should not matter, but be tidy */ + state->growmemtuples = false; + /* availMem must stay accurate for spacePerTape calculation */ FREEMEM(state, availMemLessRefund); - state->growmemtuples = false; + if (LACKMEM(state)) + elog(ERROR, "unexpected out-of-memory situation in tuplesort"); #ifdef TRACE_SORT if (trace_sort) -- 2.7.4 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logical Replication WIP
On 09/05/2016 03:58 PM, Steve Singer wrote: On 08/31/2016 04:51 PM, Petr Jelinek wrote: Hi, and one more version with bug fixes, improved code docs and couple more tests, some general cleanup and also rebased on current master for the start of CF. A few more things I noticed when playing with the patches 1, Creating a subscription to yourself ends pretty badly, the 'CREATE SUBSCRIPTION' command seems to get stuck, and you can't kill it. The background process seems to be waiting for a transaction to commit (I assume the create subscription command). I had to kill -9 the various processes to get things to stop. Getting confused about hostnames and ports is a common operator error. 2. Failures during the initial subscription aren't recoverable For example on db1 create table a(id serial4 primary key,b text); insert into a(b) values ('1'); create publication testpub for table a; on db2 create table a(id serial4 primary key,b text); insert into a(b) values ('1'); create subscription testsub connection 'host=localhost port=5440 dbname=test' publication testpub; I then get in my db2 log ERROR: duplicate key value violates unique constraint "a_pkey" DETAIL: Key (id)=(1) already exists. LOG: worker process: logical replication worker 16396 sync 16387 (PID 10583) exited with exit code 1 LOG: logical replication sync for subscription testsub, table a started ERROR: could not crate replication slot "testsub_sync_a": ERROR: replication slot "testsub_sync_a" already exists LOG: worker process: logical replication worker 16396 sync 16387 (PID 10585) exited with exit code 1 LOG: logical replication sync for subscription testsub, table a started ERROR: could not crate replication slot "testsub_sync_a": ERROR: replication slot "testsub_sync_a" already exists and it keeps looping. If I then truncate "a" on db2 it doesn't help. 
(I'd expect at that point the initial subscription to work)

If I then do on db2

drop subscription testsub cascade;

I still see a slot in use on db1:

select * FROM pg_replication_slots ;
   slot_name    |  plugin  | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
----------------+----------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
 testsub_sync_a | pgoutput | logical   |  16384 | test     | f      |            |      |         1173 | 0/1566E08   | 0/1566E40

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
On Mon, Sep 5, 2016 at 5:36 PM, Simon Riggs wrote:
> On 5 September 2016 at 15:50, Claudio Freire wrote:
>> On Sun, Sep 4, 2016 at 3:46 AM, Simon Riggs wrote:
>>> On 3 September 2016 at 04:25, Claudio Freire wrote:
>>>> The patch also makes vacuum free the dead_tuples before starting
>>>> truncation. It didn't seem necessary to hold onto it beyond that
>>>> point, and it might help give the OS more cache, especially if work
>>>> mem is configured very high to avoid multiple index scans.
>>>
>>> How long does that part ever take? Is there any substantial gain from this?
>>>
>>> Lets discuss that as a potential second patch.
>>
>> In the test case I mentioned, it takes longer than the vacuum part itself.
>
> Please provide a test case and timings so we can see what's happening.

The referenced test case is the one I mentioned in the OP:

- createdb pgbench
- pgbench -i -s 4000 pgbench
- psql pgbench -c 'delete from pgbench_accounts;'
- vacuumdb -v -t pgbench_accounts pgbench

with fsync=off, autovacuum=off, maintenance_work_mem=4GB.

From what I remember, it used ~2.7GB of RAM up until the truncate phase, where it freed it. It performed a single index scan over the PK.

I don't remember timings, and I didn't take them, so I'll have to repeat the test to get them. It takes all day and makes my laptop unusably slow, so I'll post them later, but they're not very interesting. The only interesting bit is that it does a single index scan instead of several, which on TB-or-more tables is kinda nice.

Btw, without a further patch to prefetch pages on the backward scan for truncate, my patience ran out before it finished truncating. I haven't submitted that patch because there was an identical patch in an older thread that was discussed and more or less rejected since it slightly penalized SSDs. While I'm confident my version of the patch is a little easier on SSDs, I haven't got an SSD at hand to confirm, so that patch is still waiting in the queue until I get access to an SSD.
The tests I run will include that patch, so the truncate timings won't be representative of HEAD (I really can't afford to run the tests on a scale=4000 pgbench without that patch, since it crawls, and smaller scales don't fill the dead_tuples array).

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
On 5 September 2016 at 15:50, Claudio Freire wrote:
> On Sun, Sep 4, 2016 at 3:46 AM, Simon Riggs wrote:
>> On 3 September 2016 at 04:25, Claudio Freire wrote:
>>> The patch also makes vacuum free the dead_tuples before starting
>>> truncation. It didn't seem necessary to hold onto it beyond that
>>> point, and it might help give the OS more cache, especially if work
>>> mem is configured very high to avoid multiple index scans.
>>
>> How long does that part ever take? Is there any substantial gain from this?
>>
>> Lets discuss that as a potential second patch.
>
> In the test case I mentioned, it takes longer than the vacuum part itself.

Please provide a test case and timings so we can see what's happening.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logical Replication WIP
On 08/31/2016 04:51 PM, Petr Jelinek wrote:
> Hi, and one more version with bug fixes, improved code docs and couple
> more tests, some general cleanup and also rebased on current master for
> the start of CF.

To get the 'subscription' TAP tests to pass I need to set

export PGTZ=+02

Shouldn't the expected output be with reference to PST8PDT?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
On Mon, Sep 5, 2016 at 11:50 AM, Claudio Freire wrote: > On Sun, Sep 4, 2016 at 3:46 AM, Simon Riggs wrote: >> On 3 September 2016 at 04:25, Claudio Freire wrote: >>> The patch also makes vacuum free the dead_tuples before starting >>> truncation. It didn't seem necessary to hold onto it beyond that >>> point, and it might help give the OS more cache, especially if work >>> mem is configured very high to avoid multiple index scans. >> >> How long does that part ever take? Is there any substantial gain from this? >> >> Lets discuss that as a potential second patch. > > In the test case I mentioned, it takes longer than the vacuum part itself. > > Other than freeing RAM there's no gain. I didn't measure any speed > difference while testing, but that's probably because the backward > scan doesn't benefit from the cache, but other activity on the system > might. So, depending on the workload on the server, extra available > RAM may be a significant gain on its own or not. It just didn't seem > there was a reason to keep that RAM reserved, especially after making > it a huge allocation. > > I'm fine either way. I can remove that from the patch or leave it > as-is. It just seemed like a good idea at the time. 
Rebased and split versions attached

From 7b47b59d89bf3edaeb11c4ffe3660f3d5b7b86de Mon Sep 17 00:00:00 2001
From: Claudio Freire
Date: Fri, 2 Sep 2016 23:21:01 -0300
Subject: [PATCH 1/2] Vacuum: allow using more than 1GB work mem

Turns the palloc for dead_tuples into a huge allocation to allow using
more than 1GB worth of dead_tuples, saving index scans on heavily
bloated tables
---
 src/backend/commands/vacuumlazy.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 231e92d..2c98da4 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -1952,7 +1952,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
-		maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
+		maxtuples = Min(maxtuples, MaxAllocHugeSize / sizeof(ItemPointerData));

 		/* curious coding here to ensure the multiplication can't overflow */
 		if ((BlockNumber) (maxtuples / LAZY_ALLOC_TUPLES) > relblocks)
@@ -1969,7 +1969,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 	vacrelstats->num_dead_tuples = 0;
 	vacrelstats->max_dead_tuples = (int) maxtuples;
 	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+		MemoryContextAllocHuge(CurrentMemoryContext, maxtuples * sizeof(ItemPointerData));
 }

 /*
-- 
2.6.6

From 24fa0815d915cc31ae455ef60585f54a33dbd09c Mon Sep 17 00:00:00 2001
From: Claudio Freire
Date: Fri, 2 Sep 2016 23:21:01 -0300
Subject: [PATCH 2/2] Vacuum: free dead_tuples sooner

Frees the dead_tuples array sooner, decreasing impact of huge settings
for autovacuum_work_mem or maintainance_work_mem during lazy vacuum.
---
 src/backend/commands/vacuumlazy.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 2c98da4..dbe2040 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -155,6 +155,7 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 						 LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
+static void lazy_space_dealloc(LVRelStats *vacrelstats);
 static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
 					   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
@@ -1310,6 +1311,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	for (i = 0; i < nindexes; i++)
 		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);

+	lazy_space_dealloc(vacrelstats);
+
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
 		ereport(elevel,
@@ -1973,6 +1976,18 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 }

 /*
+ * lazy_space_dealloc - free lazy vacuum work mem
+ */
+static void
+lazy_space_dealloc(LVRelStats *vacrelstats)
+{
+	if (vacrelstats->dead_tuples != NULL) {
+		pfree(vacrelstats->dead_tuples);
+		vacrelstats->dead_tuples = NULL;
+	}
+}
+
+/*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-- 
2.6.6

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cache Hash Index meta page.
On Sep 2, 2016 7:38 PM, "Jesper Pedersen" wrote:
> Could you provide a rebased patch based on Amit's v5 ?

Please find attached the patch, based on Amit's v5. I have fixed the following things:

1. Now in _hash_first we check whether (opaque->hasho_prevblkno == InvalidBlockNumber) to see if the bucket is from an older-version hash index and has been upgraded. Since InvalidBlockNumber is currently one greater than the maximum value that metap->hashm_maxbucket can be set to (see _hash_expandtable), we can distinguish it from the rest. I tested the upgrade issue reported by Amit; it is fixed now.

2. One case in which a bucket's hasho_prevblkno is used is the backward scan. So now, before testing the previous block number, I test whether the current page is a bucket page; if so, we end the bucket scan (see changes in _hash_readprev). In the other places where hasho_prevblkno is used it is not for a bucket page, so I have not added any extra check to verify whether it is a bucket page.

cache_hash_index_metapage_onAmit_v5_01.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Alter or rename enum value
Emre Hasegeli writes: >> I started looking at this patch. I'm kind of unhappy with having *both* >> IF EXISTS and IF NOT EXISTS options on the statement, especially since >> the locations of those phrases in the syntax seem to have been chosen >> with a dartboard. This feels way more confusing than it is useful. >> Is there really a strong use-case for either option? I note that >> ALTER TABLE RENAME COLUMN, which is probably used a thousand times >> more often than this will be, has so far not grown either kind of option, >> which sure makes me think that this proposal is getting ahead of itself. > I think they can be useful. I am writing a lot of migration scripts > for small projects. It really helps to be able to run parts of them > again. ALTER TYPE ... ADD VALUE already have IF NOT EXISTS option. I > don't think we would lose anything to support both of them in here. The opportunity cost here is potential user confusion. The only closely parallel rename operation we have is ALTER TABLE RENAME COLUMN, and that doesn't have a column-level IF EXISTS option; it has a table-level IF EXISTS option. So I think it would be weird and confusing for ALTER TYPE RENAME VALUE to be different from that. And again, it's hard to get excited about having these options for RENAME VALUE when no one has felt a need for them yet in RENAME COLUMN. I'm especially dubious about IF NOT EXISTS against the destination name, considering that there isn't *any* variant of RENAME that has an equivalent of that. If it's really useful, why hasn't that happened? I think we should leave these features out for now and wait to see if there's really field demand for them. If there is, it probably will make sense to add them in more than just this one kind of RENAME. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
On 09/05/2016 06:03 AM, Amit Kapila wrote: > On Mon, Sep 5, 2016 at 3:18 AM, Tomas Vondra > wrote: >> Hi, >> >> This thread started a year ago, different people contributed various >> patches, some of which already got committed. Can someone please post a >> summary of this thread, so that it's a bit more clear what needs >> review/testing, what are the main open questions and so on? >> > > Okay, let me try to summarize this thread. This thread started off to > ameliorate the CLOGControlLock contention with a patch to increase the > clog buffers to 128 (which got committed in 9.6). Then the second > patch was developed to use Group mode to further reduce the > CLOGControlLock contention, latest version of which is upthread [1] (I > have checked that version still gets applied). Then Andres suggested > to compare the Group lock mode approach with an alternative (more > granular) locking model approach for which he has posted patches > upthread [2]. There are three patches on that link, the patches of > interest are 0001-Improve-64bit-atomics-support and > 0003-Use-a-much-more-granular-locking-model-for-the-clog-. I have > checked that second one of those doesn't get applied, so I have > rebased it and attached it with this mail. In the more granular > locking approach, actually, you can comment USE_CONTENT_LOCK to make > it use atomic operations (I could not compile it by disabling > USE_CONTENT_LOCK on my windows box, you can try by commenting that as > well, if it works for you). So, in short we have to compare three > approaches here. > > 1) Group mode to reduce CLOGControlLock contention > 2) Use granular locking model > 3) Use atomic operations > > For approach-1, you can use patch [1]. For approach-2, you can use > 0001-Improve-64bit-atomics-support patch[2] and the patch attached > with this mail. For approach-3, you can use > 0001-Improve-64bit-atomics-support patch[2] and the patch attached > with this mail by commenting USE_CONTENT_LOCK. 
> If the third doesn't work for you then for now we can compare
> approach-1 and approach-2.

OK, I can compile all three cases - but only with gcc 4.7 or newer. Sadly the 4-socket 64-core machine runs Debian Jessie with just gcc 4.6, and my attempts to update to a newer version have been unsuccessful so far.

> I have done some testing of these patches for read-write pgbench
> workload and doesn't find big difference. Now the interesting test
> case could be to use few sub-transactions (may be 4-8) for each
> transaction as with that we can see more contention for
> CLOGControlLock.

Understood. So a bunch of inserts/updates interleaved by savepoints? I presume you started looking into this based on a real-world performance issue, right? Would that be a good test case?

> Few points to note for performance testing, one should use --unlogged
> tables, else the WAL writing and WALWriteLock contention masks the
> impact of this patch. The impact of this patch is visible at
> higher-client counts (say at 64~128).

Even on good hardware (say, PCIe SSD storage that can do thousands of fsyncs per second)? Does it then make sense to try optimizing this if the effect can only be observed without the WAL overhead (so almost never in practice)?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Suggestions for first contribution?
Hi 2016-09-05 19:25 GMT+02:00 Christian Convey : > Hi guys, > > Can anyone suggest a project for my first PG contribution? > > My first two ideas didn't pan out: Yury doesn't seem to need help > with CMake, and the TODO list's "-Wcast-align" project (now deleted) > appeared impractical. > > I can continue trying things from the TODO list, but if someone knows > of a project that's definitely worth doing, all the better. > > I'd prefer doing back-end coding, but I'm flexible. The only two > caveats are: (1) I can't work on speedup-focused changes without my > employer's permission, and (2) I only have access to Linux boxes. > > Any advice would be greatly appreciated. I wrote XMLTABLE function, and I am thinking about JSON_TABLE function. But there is one blocker - missing JsonPath support in our JSON implementation. So one idea - implement JsonPath support and related JSON query functions. This can help with better standard conformance. Regards Pavel > > Kind regards, > Christian > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
Re: [HACKERS] Better locale-specific-character-class handling for regexps
Heikki Linnakangas writes: > On 09/05/2016 07:10 PM, Tom Lane wrote: >> In any case, this is getting very far afield from the current patch. >> I'm willing to add a regexp.linux.ut8.sql test file if you think it's >> important to have some canned tests that exercise this new code, but >> otherwise I don't see any near-term solution. > Ok, that'll do. OK, I'll put something together. Thanks again for reviewing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Suggestions for first contribution?
Hi guys, Can anyone suggest a project for my first PG contribution? My first two ideas didn't pan out: Yury doesn't seem to need help with CMake, and the TODO list's "-Wcast-align" project (now deleted) appeared impractical. I can continue trying things from the TODO list, but if someone knows of a project that's definitely worth doing, all the better. I'd prefer doing back-end coding, but I'm flexible. The only two caveats are: (1) I can't work on speedup-focused changes without my employer's permission, and (2) I only have access to Linux boxes. Any advice would be greatly appreciated. Kind regards, Christian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Better locale-specific-character-class handling for regexps
On 09/05/2016 07:10 PM, Tom Lane wrote: Heikki Linnakangas writes: On 09/04/2016 08:44 PM, Tom Lane wrote: I guess I could follow the lead of collate.linux.utf8.sql and produce a test that's only promised to pass on one platform with one encoding, but I'm not terribly excited by that. AFAIK that test file does not get run at all in the buildfarm or in the wild. I'm not too worried if the tests don't get run regularly, but I don't like the idea that only works on one platform. Well, it would work on any platform that reports high Unicode letters as letters. The problem for putting this into the regular regression tests is that the generic tests don't even assume UTF8 encoding, let alone a Unicode-ish locale. Ah, ok. I thought there were some more special requirements. Since we're now de facto maintainers of this regexp library, and our version could be used somewhere else than PostgreSQL too, it would actually be nice to have a regression suite that's independent from the pg_regress infrastructure, and wouldn't need a server to run. If anyone ever really picks up the challenge of making the regexp library a standalone project, I think one of the first orders of business would be to pull out the Tcl project's regexp-related regression tests. There's a pretty extensive set of tests written by Henry Spencer himself, and more that they added over the years; it's far more comprehensive than our tests. (I've looked at stealing that test set in toto, but it requires some debug APIs that we don't expose in SQL, and probably don't want to.) Oh, interesting. In any case, this is getting very far afield from the current patch. I'm willing to add a regexp.linux.ut8.sql test file if you think it's important to have some canned tests that exercise this new code, but otherwise I don't see any near-term solution. Ok, that'll do. 
- Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fun fact about autovacuum and orphan temp tables
On Mon, Sep 5, 2016 at 12:48:32PM -0300, Alvaro Herrera wrote: > > The least invasive solution would be to have a guc, something like > > 'keep_orphan_temp_tables' with boolean value. > > Which would determine a autovacuum worker policy toward encountered orphan > > temp tables. > > The stated reason for keeping them around is to ensure you have time to > do some forensics research in case there was something useful in the > crashing backend. My feeling is that if the reason they are kept around > is not a crash but rather some implementation defect that broke end-time > cleanup, then they don't have their purported value and I would rather > just remove them. > > I have certainly faced my fair share of customers with dangling temp > tables, and would like to see this changed in some way or another. I don't think we look at those temp tables frequently enough to justify keeping them around for all users. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Minor fix to comments
On Sun, Sep 4, 2016 at 05:30:57PM -0500, Jim Nasby wrote: > I noticed some imbalanced '-'s in execnodes.h. Though, ISTM newer code > doesn't use -'s in comments anymore, so maybe it'd be better to just ditch > them? Patch applied. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Obsolete TODO item "-Wcast-align" ?
On Sun, Sep 4, 2016 at 06:10:23PM -0400, Christian Convey wrote: > On Sun, Sep 4, 2016 at 5:56 PM, Tom Lane wrote: > > Christian Convey writes: > >> I chose this item from the TODO page: "[E] Remove warnings created by > >> -Wcast-align". It didn't have a check-mark after the "[E]", which I > >> took to mean it's an outstanding issue. > >> However, I'm starting to wonder if it's already been fixed: > > > > No, but you only see it on some platforms/compilers. On my OS X laptop > > (clang-based not gcc-based compiler), turning that on generates just a > > hair short of 13000 warnings :-( > > > > I think that TODO item is indeed obsolete, but more in the direction > > of "we're never gonna do that". There are too many places where we > > do need to cast up from generic pointer to specific structure pointer, > > and there doesn't seem to be any practical way to tell a compiler which > > instances are useful to warn about. > > Thanks for the response. I'm unclear about how the TODO list is > curated. Is there someone whose attention I should direct to this > thread? Someone has removed the item. It is a wiki so anyone can add/remove things. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Supporting SJIS as a database encoding
On 09/05/2016 05:47 PM, Tom Lane wrote: "Tsunakawa, Takayuki" writes: Before digging into the problem, could you share your impression on whether PostgreSQL can support SJIS? Would it be hopeless? I think it's pretty much hopeless. Agreed. But one thing that would help a little, would be to optimize the UTF-8 -> SJIS conversion. It uses a very generic routine, with a binary search over a large array of mappings. I bet you could do better than that, maybe using a hash table or a radix tree instead of the large binary-searched array. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] less expensive pg_buffercache on big shmem
On 09/03/2016 05:04 AM, Tomas Vondra wrote:
> This patch needs a rebase, as 06d7fd6e bumped the version to 1.2.

Thank you for the valuable hint.

>> If we will replace consistent method, then we should replace it with the
>> partially consistent method (called "nonconsistent") because:
>> 1) it's based on fast spinlocks (it's not fully up to its name, though)
>
> In other words, it does exactly the thing proposed up-thread, i.e.
> locking only buffer headers.
>
> What do you mean by fast spinlocks? And why aren't they up to the name?

It is not the spinlocks; it is the name "nonconsistent" that is somewhat misleading. At the moment we can't implement concurrent shared-memory access without locks in general, so the most inconsistent method that has been proposed is the "nonconsistent" one. Roughly speaking, "nonconsistent" doesn't quite live up to its name, because it still contains a locking mechanism (spinlocks).

>> 2) it's *almost* the fastest one (the less time needed for execution of
>> method, the less data core will change and as a consequence the more
>> consistent snapshot will be)
>
> I'm not sure I understand what you're saying here? What do you mean by
> "almost the fastest one"?

I mean the fastest of the methods that have been proposed ("consistent" | "semiconsistent" | "nonconsistent").

> I'm a bit confused by the results you reported before, i.e. that
>
> 1) nonconsistent is 10% faster than consistent method
> 2) semiconsistent is 5-times slower than nonconsistent method
>
> What does that mean? Are you referring to duration of the queries reading
> data from pg_buffercache, or to impact on the other queries?

Here I mean the execution time of the queries reading from pg_buffercache.

> How can semiconsistent be 5x slower than nonconsistent? Wouldn't that
> make it significantly slower than the consistent method?
Firstly, when we want to measure the quality of pg_buffercache, we must measure several things:
1) execution time (duration of the queries reading data from pg_buffercache);
2) how it influences the system (and other queries) while it runs.

Secondly, the semiconsistent method is slower than both the nonconsistent and consistent methods, but it has less impact on other queries than the consistent one.

Thirdly, it is slower because it locks the shared-memory partitions differently from the consistent and nonconsistent methods. The semiconsistent strategy implies that only one buffer is locked at a time. It therefore has no significant effect on the rest of the system, but it takes more time.

---
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Better locale-specific-character-class handling for regexps
Heikki Linnakangas writes: > On 09/04/2016 08:44 PM, Tom Lane wrote: >> I guess I could follow the lead of collate.linux.utf8.sql and produce >> a test that's only promised to pass on one platform with one encoding, >> but I'm not terribly excited by that. AFAIK that test file does not >> get run at all in the buildfarm or in the wild. > I'm not too worried if the tests don't get run regularly, but I don't > like the idea that only works on one platform. Well, it would work on any platform that reports high Unicode letters as letters. The problem for putting this into the regular regression tests is that the generic tests don't even assume UTF8 encoding, let alone a Unicode-ish locale. > Since we're now de facto maintainers of this regexp library, and our > version could be used somewhere else than PostgreSQL too, it would > actually be nice to have a regression suite that's independent from the > pg_regress infrastructure, and wouldn't need a server to run. If anyone ever really picks up the challenge of making the regexp library a standalone project, I think one of the first orders of business would be to pull out the Tcl project's regexp-related regression tests. There's a pretty extensive set of tests written by Henry Spencer himself, and more that they added over the years; it's far more comprehensive than our tests. (I've looked at stealing that test set in toto, but it requires some debug APIs that we don't expose in SQL, and probably don't want to.) In any case, this is getting very far afield from the current patch. I'm willing to add a regexp.linux.ut8.sql test file if you think it's important to have some canned tests that exercise this new code, but otherwise I don't see any near-term solution. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fun fact about autovacuum and orphan temp tables
Grigory Smolkin wrote:
> On 09/05/2016 04:34 PM, Alvaro Herrera wrote:
>> Grigory Smolkin wrote:
>>
>>> Funny part is that it never drops them. So when backend is finally
>>> terminated, it tries to drop them and fails with error:
>>>
>>> FATAL: out of shared memory
>>> HINT: You might need to increase max_locks_per_transaction
>>>
>>> If I understand that right, we are trying to drop all these temp tables in
>>> one transaction and running out of locks to do so.
>>
>> Hmm, yeah, I suppose it does that, and it does seem pretty inconvenient.
>> It is certainly pointless to hold onto these locks for temp tables. I
>> wonder how ugly it would be to fix this problem ...
>
> Thank you for your interest in this problem.
> I don't think this is the source of the problem. An ugly fix here would only
> force the backend to terminate properly.
> It will not help at all in case of a server crash or power outage.
> We need a way to tell autovacuum that we don't need orphan temp tables, so
> they can be removed using the existing routine.

It is always possible to drop the containing schemas; and as soon as some other backend uses the BackendId 15 (in your example) the tables would be removed anyway. This only becomes a longstanding problem when the crashing backend uses a high-numbered BackendId that's not reused promptly enough.

> The least invasive solution would be to have a GUC, something like
> 'keep_orphan_temp_tables' with a boolean value,
> which would determine an autovacuum worker's policy toward encountered
> orphan temp tables.

The stated reason for keeping them around is to ensure you have time to do some forensics research in case there was something useful in the crashing backend. My feeling is that if the reason they are kept around is not a crash but rather some implementation defect that broke end-time cleanup, then they don't have their purported value and I would rather just remove them.
I have certainly faced my fair share of customers with dangling temp tables, and would like to see this changed in some way or another.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for lazy_scan_heap
On Mon, Sep 5, 2016 at 6:27 PM, Simon Riggs wrote: > On 4 August 2016 at 05:57, Masahiko Sawada wrote: > >> While reviewing freeze map code, Andres pointed out[1] that >> lazy_scan_heap could access the visibility map twice and its logic >> seems a bit tricky. >> As discussed before, it's not nice especially when a large relation is >> entirely frozen. >> >> I posted the patch for that before but since this is an optimization, >> not a bug fix, I'd like to propose it again. >> Please give me feedback. >> >> [1] >> https://www.postgresql.org/message-id/20160505000840.epatoq6d2e556446%40alap3.anarazel.de > > If we have a freeze map now, surely tables will no longer be entirely frozen? Well, if a table is completely frozen, freezing will be skipped for all its pages. > What performance difference does this make, in a realistic use case? I have yet to measure the performance effect, but it would have an effect for very large frozen tables. > How would we test this to check it is exactly correct? One possible idea is to emit the number of pages skipped according to the visibility map as a VACUUM VERBOSE message, and check that. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
On Sun, Sep 4, 2016 at 3:46 AM, Simon Riggs wrote: > On 3 September 2016 at 04:25, Claudio Freire wrote: >> The patch also makes vacuum free the dead_tuples before starting >> truncation. It didn't seem necessary to hold onto it beyond that >> point, and it might help give the OS more cache, especially if work >> mem is configured very high to avoid multiple index scans. > > How long does that part ever take? Is there any substantial gain from this? > > Let's discuss that as a potential second patch. In the test case I mentioned, it takes longer than the vacuum part itself. Other than freeing RAM there's no gain. I didn't measure any speed difference while testing, but that's probably because the backward scan doesn't benefit from the cache, but other activity on the system might. So, depending on the workload on the server, extra available RAM may be a significant gain on its own or not. It just didn't seem there was a reason to keep that RAM reserved, especially after making it a huge allocation. I'm fine either way. I can remove that from the patch or leave it as-is. It just seemed like a good idea at the time. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Supporting SJIS as a database encoding
"Tsunakawa, Takayuki" writes: > Before digging into the problem, could you share your impression on > whether PostgreSQL can support SJIS? Would it be hopeless? I think it's pretty much hopeless. Even if we were willing to make every bit of code that looks for '\' and other specific at-risk characters multi-byte aware (with attendant speed penalties), we could expect that third-party extensions would still contain vulnerable code. More, we could expect that new bugs of the same ilk would get introduced all the time. Many such bugs would amount to security problems. So the amount of effort and vigilance required seems out of proportion to the benefits. Most of the recent discussion about allowed backend encodings has run more in the other direction, ie, "why don't we disallow everything but UTF8 and get rid of all the infrastructure for multiple backend encodings?". I'm not personally in favor of that, but there are very few hackers who want to add any more overhead in this area. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
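The hazard Tom describes is easy to demonstrate outside the server: in Shift-JIS, the second byte of a two-byte character can be 0x5C, the ASCII code for '\', so byte-oriented code that scans for backslashes will misparse valid text. A minimal illustration in plain Python (nothing PostgreSQL-specific):

```python
# The classic example: '表' (U+8868) encodes in Shift-JIS as 0x95 0x5C.
# The trailing byte is the ASCII backslash, so naive byte-at-a-time
# escape handling would treat the middle of this character as '\'.
encoded = "表".encode("shift_jis")
assert encoded == b"\x95\x5c"
assert encoded[1:2] == b"\\"   # second byte IS the backslash byte
```

This is also why EUC-family encodings are acceptable as server encodings while SJIS is not: in EUC-JP every byte of a multi-byte character has the high bit set, so it can never collide with an ASCII metacharacter.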
Re: [HACKERS] new autovacuum criterion for visible pages
On 12 August 2016 at 01:01, Tom Lane wrote: > Michael Paquier writes: >> In short, autovacuum will need to scan by itself the VM of each >> relation and decide based on that. > > That seems like a worthwhile approach to pursue. The VM is supposed to be > small, and if you're worried it isn't, you could sample a few pages of it. > I do not think any of the ideas proposed so far for tracking the > visibility percentage on-the-fly are very tenable. Sounds good, but we can't scan the VM for every table, every minute. We need to record something that will tell us how many VM bits have been cleared, which will then allow autovac to do a simple SELECT to decide what needs vacuuming. Vik's proposal to keep track of the rows inserted seems like the best approach to this issue. -- Simon Riggs    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
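Tom's sampling suggestion can be sketched abstractly. Modeling the VM as one all-visible bit per heap page (the function and names below are illustrative, not server APIs), a sampled estimate of the visibility fraction looks like:

```python
import random

def estimate_all_visible_fraction(vm_bits, sample_size=1024):
    # vm_bits: a toy stand-in for the visibility map, one 0/1 flag per
    # heap page. Sampling bounds the cost for huge relations while
    # still giving autovacuum a usable visibility percentage.
    if len(vm_bits) <= sample_size:
        sample = vm_bits
    else:
        sample = random.sample(vm_bits, sample_size)
    return sum(sample) / len(sample)

# A relation with 3 of its 4 pages all-visible:
print(estimate_all_visible_fraction([1, 1, 0, 1]))  # 0.75
```

Simon's objection still applies: even sampling per table per minute has a cost, which is why a maintained counter of cleared bits (or of inserted rows) is attractive.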
Re: [HACKERS] pg_sequence catalog
Simon Riggs writes: > On 4 September 2016 at 23:17, Greg Stark wrote: >> So? Clients expect changes like this between major releases surely. >> Subtle changes that cause silent breakage for end-users seems scarier >> than unsubtle breakage that tool authors can fix. > Agreed; some change in the behaviour if SELECT * FROM sequence is > effectively part of this proposal. I was going to make the same > comment myself. Well, if we're going to blow off compatibility on that score, I suggest that we blow it off all the way. Make sequences not be relations anymore, and what you do instead of "SELECT * FROM sequence" is "SELECT * FROM pg_sequences WHERE seqname = 'sequence'". Or more likely, since sequences should still belong to schemas, we need a "regsequence" OID-alias type like "regclass" and you do "SELECT * FROM pg_sequences WHERE oid = 'foo.bar'::regsequence". The main problem I can see with this is that serial columns will have default expressions that are written out as "nextval('foo_f1_seq'::regclass)". I do not think we can afford to break dumps containing that, but I'm not sure how to get the regclass cast replaced with a regsequence cast. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index Onlys Scan for expressions
Hi Tomas, On 03.09.2016 14:37, Tomas Vondra wrote: Hi Ildar, I've looked at this patch again today to do a bit more thorough review, and I think it's fine. There are a few comments (particularly in the new code in check_index_only) that need improving, and also a few small tweaks in the walkers. Attached is a modified v5 patch with the proposed changes - it's probably easier than explaining what the changes should/might be. I've added an XXX comment in check_index_only_expr_walker - ISTM we're first explicitly matching Vars to index attributes, and then dealing with expressions. But we loop over all index columns (including those that can only really match Vars). Not sure if it makes any difference, but is it worth differentiating between Var and non-Var expressions? Why not simply call match_index_to_operand() in both cases? Thank you! Yes, you're right and it's probably not worth it. I intended to optimize just a little bit since we already have the attribute numbers that can be returned by the index. But as you have already noted in the comment, it is a cheap check and it would barely be noticeable. I've also tweaked a few comments to match project code style, and moved a few variables into the block where they are used. But the latter is probably a matter of personal taste, I guess. regards Thanks for that, I have some difficulties in expressing myself in English, so it was very helpful. Best regards, -- Ildar Musin i.mu...@postgrespro.ru -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_sequence catalog
On 4 September 2016 at 23:17, Greg Stark wrote: > On Wed, Aug 31, 2016 at 3:01 PM, Tom Lane wrote: >> >> Uh, not as subtly as all that, because "select * from sequence" will >> now return a different set of columns, which will flat out break a >> lot of clients that use that method to get sequence properties. > > So? Clients expect changes like this between major releases surely. > Subtle changes that cause silent breakage for end-users seems scarier > than unsubtle breakage that tool authors can fix. Agreed; some change in the behaviour of SELECT * FROM sequence is effectively part of this proposal. I was going to make the same comment myself. I think we should balance the problems caused by not having a sequence catalog against the problems caused by changing the currently somewhat broken situation. I vote in favour of applying Peter's idea/patch, though if that is not acceptable I don't think it's worth spending further time on for any of us, so if that's the case let's Return with Feedback and move on. -- Simon Riggs    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT .. SET syntax
On 09/05/2016 03:58 PM, Simon Riggs wrote: > On 3 July 2016 at 20:36, Marko Tiikkaja wrote: > >> Here's a patch for $SUBJECT. I'll probably work on the docs a bit more >> before the next CF, but I thought I'd post it anyway. > > I think this should be Returned With Feedback. You're probably right (even though you're quoting the wrong message), so I've changed it to that. Marko, please resubmit an updated patch. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fun fact about autovacuum and orphan temp tables
On 09/05/2016 04:34 PM, Alvaro Herrera wrote: Grigory Smolkin wrote: Funny part is that it never drops them. So when backend is finally terminated, it tries to drop them and fails with error: FATAL: out of shared memory HINT: You might need to increase max_locks_per_transaction If I understand that right, we are trying to drop all these temp tables in one transaction and running out of locks to do so. Hmm, yeah, I suppose it does that, and it does seem pretty inconvenient. It is certainly pointless to hold onto these locks for temp tables. I wonder how ugly it would be to fix this problem ... Thank you for your interest in this problem. I don't think this is the source of the problem. An ugly fix here would only force the backend to terminate properly. It will not help at all in case of server crash or power outage. We need a way to tell autovacuum that we don't need orphan temp tables, so they can be removed using the existing routine. The least invasive solution would be to have a GUC, something like 'keep_orphan_temp_tables' with a boolean value, which would determine an autovacuum worker's policy toward encountered orphan temp tables. -- Grigory Smolkin Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] INSERT .. SET syntax
On 3 July 2016 at 20:36, Marko Tiikkaja wrote: > Here's a patch for $SUBJECT. I'll probably work on the docs a bit more > before the next CF, but I thought I'd post it anyway. I think this should be Returned With Feedback. -- Simon Riggs    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index Onlys Scan for expressions
Hi Vladimir, On 03.09.2016 19:31, Vladimir Sitnikov wrote: Ildar>The reason why this doesn't work is that '~~' operator (which is a Ildar>synonym for 'like') isn't supported by operator class for btree. Since Ildar>the only operators supported by btree are <, <=, =, >=, >, you can use Ildar>it with queries like: Ildar>And in 3rd query 'OFFSET' statement prevents rewriter from Ildar>transforming the query, so it is possible to use index only scan on Ildar>subquery and then filter the result of subquery with '~~' operator. I'm afraid I do not follow you. Note: query 3 is 100% equivalent to query 2, however query 3 takes 55 times fewer reads. It looks like either an optimizer bug, or some missing feature in the "index only scan" logic. Here's a quote from "query 2" (note % are at both ends): ... where type=42) as x where upper_vc like '%ABC%'; Note: I do NOT use "indexed scan" for the like operator. I'm very well aware that LIKE patterns with leading % cannot be optimized to a btree range scan. What I want is "use the first indexed column as index scan, then use the second column for filtering". As shown in "query 2" vs "query 3", PostgreSQL cannot come up with such a plan on its own for some reason. This is not a theoretical issue, but it is something that I use a lot with Oracle DB (it just creates a good plan for "query 2"). Vladimir Thanks, I get it now. The reason why it acts like this is that I used the match_clause_to_index() function to determine if IOS can be used with the specified clauses. This function among other things checks whether the operator matches the index opfamily. Apparently this isn't correct. I wrote another prototype to test your case and it seems to work. But it's not ready for the public yet; I'll publish it in 1-2 days. -- Ildar Musin i.mu...@postgrespro.ru -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fun fact about autovacuum and orphan temp tables
Grigory Smolkin wrote: > Funny part is that it never drops them. So when backend is finally > terminated, it tries to drop them and fails with error: > > FATAL: out of shared memory > HINT: You might need to increase max_locks_per_transaction > > If I understand that right, we are trying to drop all these temp tables in > one transaction and running out of locks to do so. Hmm, yeah, I suppose it does that, and it does seem pretty inconvenient. It is certainly pointless to hold onto these locks for temp tables. I wonder how ugly it would be to fix this problem ... -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fun fact about autovacuum and orphan temp tables
On 09/05/2016 01:54 PM, Grigory Smolkin wrote: > What is the purpose of keeping orphan tables around and not dropping > them on the spot? You can read the discussion about it here: https://www.postgresql.org/message-id/flat/3507.1214581513%40sss.pgh.pa.us -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Fun fact about autovacuum and orphan temp tables
Hello, hackers! We were testing how well some application works with PostgreSQL and stumbled upon an autovacuum behavior which I fail to understand. The application in question has a habit of heavily using temporary tables in funny ways. For example it creates A LOT of them. Which is ok. The funny part is that it never drops them. So when the backend is finally terminated, it tries to drop them and fails with error: FATAL: out of shared memory HINT: You might need to increase max_locks_per_transaction If I understand that right, we are trying to drop all these temp tables in one transaction and running out of locks to do so. After that postgresql.log is flooded at the rate 1k/s with messages like that: LOG: autovacuum: found orphan temp table "pg_temp_15"."tt38147" in database "DB_TEST" It produces a noticeable load on the system and it's getting worse with every terminated backend or restart. I did some RTFS and it appears that autovacuum has no intention of cleaning those orphan tables unless it's wraparound time (src/backend/postmaster/autovacuum.c):

    /* We just ignore it if the owning backend is still active */
    if (backendID == MyBackendId || BackendIdGetProc(backendID) == NULL)
    {
        /*
         * We found an orphan temp table (which was probably left
         * behind by a crashed backend).  If it's so old as to need
         * vacuum for wraparound, forcibly drop it.  Otherwise just
         * log a complaint.
         */
        if (wraparound)
        {
            ObjectAddress object;

            ereport(LOG,
                    (errmsg("autovacuum: dropping orphan temp table \"%s\".\"%s\" in database \"%s\"",
                            get_namespace_name(classForm->relnamespace),
                            NameStr(classForm->relname),
                            get_database_name(MyDatabaseId))));
            object.classId = RelationRelationId;
            object.objectId = relid;
            object.objectSubId = 0;
            performDeletion(&object, DROP_CASCADE, PERFORM_DELETION_INTERNAL);
        }
        else
        {
            ereport(LOG,
                    (errmsg("autovacuum: found orphan temp table \"%s\".\"%s\" in database \"%s\"",
                            get_namespace_name(classForm->relnamespace),
                            NameStr(classForm->relname),
                            get_database_name(MyDatabaseId))));
        }
    }

What is more troubling is that pg_statistic is starting to bloat badly.

LOG: automatic vacuum of table "DB_TEST.pg_catalog.pg_statistic": index scans: 0
    pages: 0 removed, 68225 remain, 0 skipped due to pins
    tuples: 0 removed, 2458382 remain, 2408081 are dead but not yet removable
    buffer usage: 146450 hits, 31 misses, 0 dirtied
    avg read rate: 0.010 MB/s, avg write rate: 0.000 MB/s
    system usage: CPU 3.27s/6.92u sec elapsed 23.87 sec

What is the purpose of keeping orphan tables around and not dropping them on the spot? -- Grigory Smolkin Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
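The "out of shared memory" failure above follows from how the shared lock table is sized. Per the documented sizing rule it holds roughly max_locks_per_transaction * (max_connections + max_prepared_transactions) locks in total, and a single transaction may consume far more than its nominal share until the table fills. A back-of-the-envelope check (plain Python, just arithmetic):

```python
def lock_table_capacity(max_locks_per_transaction=64,
                        max_connections=100,
                        max_prepared_transactions=0):
    # Approximate total size of the shared lock table. A backend
    # dropping every temp table in one transaction takes at least one
    # lock per table, so it hits this ceiling after a few thousand
    # tables even though each setting individually looks generous.
    return max_locks_per_transaction * (max_connections +
                                        max_prepared_transactions)

print(lock_table_capacity())  # 6400 with default settings
```

That matches the symptom: with defaults, a session holding thousands of never-dropped temp tables cannot drop them all in a single transaction at exit.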
Re: [HACKERS] OpenSSL 1.1 breaks configure and more
On 09/05/2016 03:12 AM, Andreas Karlsson wrote: On 08/30/2016 08:42 AM, Heikki Linnakangas wrote: There's the ResourceOwner mechanism, see src/backend/utils/resowner/. That would be the proper way to do this. Call RegisterResourceReleaseCallback() when the context is allocated, and have the callback free it. One pitfall to watch out for is that RegisterResourceReleaseCallback() itself calls palloc(), and can error out, so you have to do things in such an order that you don't leak in that case either. Want to take a stab at that? Another approach is put each allocated context in a list or array in a global variable, and to register a callback to be called at end-of-(sub)transaction, which closes all the contexts. But the resource owner mechanism is probably easier. There's also PG_TRY-CATCH, that you could maybe use in the callers of px_find_digest(), to make sure they call px_free_digest() even on error. But that also seems difficult to use with the pgp_encrypt() pipeline. Sure, I have attached a patch where I try to use it. Thanks! Unfortunately the callback mechanism is a bit more complicated to use than that. The list of registered callbacks is global, so the callback gets called for every ResourceOwner that's closed, not just the one that was active when you registered it. Also, unregistering the callback from within the callback is not safe. I fixed those things in the (first) attached patch. On 09/05/2016 03:23 AM, Tom Lane wrote: Judging by the number of people who have popped up recently with their own OpenSSL 1.1 patches, I think there is going to be a lot of demand for back-patching some sort of 1.1 support into our back branches. All this talk of refactoring does not sound very back-patchable. Should we be thinking of what we can extract that is back-patchable? Yes, I think you're right. The minimum set of changes is the first of the attached patches. 
It includes Andreas' first patch, with the configure changes and other changes needed to compile, and a working version of the resourceowner callback mechanism to make sure we don't leak OpenSSL handles in pgcrypto. Maybe we could get away without the resourceowner mechanism, and just accept the minor memory leaks on the rare error case (out-of-memory might be the only situation where that happens). Or #ifdef it so that we keep the old embed-in-palloced-struct approach for OpenSSL versions older than 1.1. But on the whole, I'd prefer to keep the code the same in all branches. The second patch attached here includes Andreas' second and third patches, to silence deprecation warnings. That's not strictly required, but seems safe enough that I think we should back-patch that too. It needs one additional #ifdef version check in generate_dh_params(), if we want it to still work with OpenSSL 0.9.7, but that's easy. I'm assuming we want to still support it in back-branches, even though we just dropped it from master. - Heikki

From 4534d01a0471d7378100cddebb5641800950e6c6 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas
Date: Mon, 5 Sep 2016 13:46:15 +0300
Subject: [PATCH 1/2] Fix compilation with OpenSSL 1.1.

- Check for SSL_new in configure, now that SSL_library_init is a macro.
- Do not access struct members directly. This includes some new code in
  pgcrypto, to use the resource owner mechanism to ensure that we don't
  leak OpenSSL handles, now that we don't palloc them as part of other
  structs anymore.
- RAND_SSLeay was renamed to RAND_OpenSSL

Andreas Karlsson, with some changes by me.
---
 configure                                |  44 ++--
 configure.in                             |   4 +-
 contrib/pgcrypto/openssl.c               | 115 +++
 contrib/sslinfo/sslinfo.c                |  14 +---
 src/backend/libpq/be-secure-openssl.c    |  39 +++
 src/interfaces/libpq/fe-secure-openssl.c |  39 +++
 6 files changed, 181 insertions(+), 74 deletions(-)

diff --git a/configure b/configure
index 45c8eef..caf6f26 100755
--- a/configure
+++ b/configure
@@ -9538,9 +9538,9 @@ else
   as_fn_error $? "library 'crypto' is required for OpenSSL" "$LINENO" 5
 fi
-  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for SSL_library_init in -lssl" >&5
-$as_echo_n "checking for SSL_library_init in -lssl... " >&6; }
-if ${ac_cv_lib_ssl_SSL_library_init+:} false; then :
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for SSL_new in -lssl" >&5
+$as_echo_n "checking for SSL_new in -lssl... " >&6; }
+if ${ac_cv_lib_ssl_SSL_new+:} false; then :
   $as_echo_n "(cached) " >&6
 else
   ac_check_lib_save_LIBS=$LIBS
@@ -9554,27 +9554,27 @@ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 #ifdef __cplusplus
 extern "C"
 #endif
-char SSL_library_init ();
+char SSL_new ();
 int
 main ()
 {
-return SSL_library_init ();
+return SSL_new ();
   ;
   return 0;
 }
 _ACEOF
 if ac_fn_c_try_link "$LINENO"; then :
-  ac_cv_lib_ssl_SSL_library_init=yes
+  ac_cv_lib_ssl_SSL_new=yes
 else
-
Re: [HACKERS] Patch: Implement failover on libpq connect level.
Hello, Victor. > As community shows an interest with this patch, and it has been > included in the current commitfest, I've to submit it. Naturally we show interest in this patch. It's a great feature that would be very nice to have! > This version of patch includes > > 1. Most important - a tap test suite (really small enough and has a > room for improvements). > > 2. Compatibility with older versions - now readonly check is skipped if > only one hostname is specified in the connect string > > 3. pg_host and pg_port variables now show host name and port library > was connected to. > > 4. Your fix with compilation of ecpg tests. > > Patch is attached. After a brief examination I've found the following ways to improve the patch.

1) It looks like the code is not properly formatted.

2) > + readonly > + > + > + If this parameter is 0 (the default), upon successful connection > + library checks if host is in recovery state, and if it is so, > + tries next host in the connect string. If this parameter is > + non-zero, connection to warm standby nodes are considered > + successful. IIRC there are "warm" and "hot" standbys in PostgreSQL terminology. The difference is that you can't execute any queries on a "warm" standby. So the description is probably wrong. I suggest fixing it to just "... connection to standby nodes are...".

3) > + falover_timeout > + > + > + Maximum time to cycilically retry all the hosts in commect string. > + (as decimal integer number of seconds). If not specified, then > + hosts are tried just once. > + Typos: cycilically, commect

4) > static int > connectDBStart(PGconn *conn) > { > + struct nodeinfo > + { > + char *host; > + char *port; > + }; Not sure if all compilers we support can parse this. I suggest declaring the nodeinfo structure outside of the procedure, just to be safe. 
5) > + nodes = calloc(sizeof(struct nodeinfo), 2); > + nodes->host = strdup(conn->pghostaddr); > hint.ai_family = AF_UNSPEC; > + /* Parse comma-separated list of host-port pairs into > + function-local array of records */ > + nodes = malloc(sizeof(struct nodeinfo) * 4); > /* pghostaddr and pghost are NULL, so use Unix domain > * socket */ > - node = NULL; > + nodes = calloc(sizeof(struct nodeinfo), 2); > + sprintf(port,"%d", htons(((struct sockaddr_in > *)ai->ai_addr)->sin_port)); > + return strdup(port); You should check the return values of the malloc, calloc and strdup calls, or use the safe pg_* equivalents.

6) > + for (p = addrs; p->ai_next != NULL; p = > p->ai_next); > + p->ai_next = this_node_addrs; Looks a bit scary. I suggest adding an empty scope ( {} ) and a comment to make our intention clear here.

7) The code compiles with the following warnings (Clang 3.8): > 1 warning generated. > fe-connect.c:5239:39: warning: if statement has empty body > [-Wempty-body] > > errorMessage, > false, true)); > > ^ > fe-connect.c:5239:39: note: put the semicolon on a separate line to > silence this warning > 1 warning generated.

8) The get_next_element implementation is way too smart (read: "complicated"). You could probably just store the current list length and generate a random number between 0 and length-1. -- Best regards, Aleksander Alekseev -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
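Point 8's suggestion amounts to uniform selection by index. A sketch of the idea in Python rather than libpq C (names purely illustrative):

```python
import random

def choose_host(nodes):
    # Instead of a clever get_next_element iterator, store the list
    # length and draw one index in [0, len(nodes) - 1]; every host is
    # then equally likely to be tried first.
    return nodes[random.randrange(len(nodes))]

hosts = ["db1:5432", "db2:5432", "db3:5432"]
print(choose_host(hosts))  # one of the three hosts, chosen uniformly
```

In C the same shape is a single `nodes[random() % n]` (with the usual caveats about modulo bias for large n), which is far easier to review than a stateful iterator.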
[HACKERS] CF app and patch series
Hi all Now that it's becoming more common to post patch series, not just standalone patches, it might be worth looking at how the CF app can help manage them. Any ideas? Especially since the patch series may not get committed all in one go, but progressively rebased on top of the bits that did get committed until it's all in? Making one CF app entry for this doesn't work well. It's confusing, spammy in terms of number of entries involved, confusing for potential reviewers, and pretty useless unless each patch is independent of all the others. In which case they shouldn't be a series in the first place. A "partly committed" status doesn't help much; it doesn't signal whether the rest is waiting for author response, awaiting review after revision, etc. So far I've just not been flagging it committed at all until everything is. Or creating new CF entries that are immediately set to "committed" for each part that goes in. But neither seems ideal. Ideas? -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assert(LWLockHeldByMeInMode(lock, LW_EXCLUSIVE))
On 05/09/2016 11:55, Julien Rouhaud wrote: > On 20/06/2016 06:28, Thomas Munro wrote: >> On Mon, Jun 20, 2016 at 3:43 PM, Craig Ringer wrote: >>> On 18 June 2016 at 11:28, Thomas Munro >>> wrote: Several times now when reading, debugging and writing code I've wished that LWLockHeldByMe assertions specified the expected mode, especially where exclusive locking is required. What do you think about something like the attached? See also an example of use. I will add this to the next commitfest. >>> >>> I've wanted this before too [...] >> > > same here. > >> Before ab5194e6f (25 December 2014) held_lwlocks didn't record the mode. >> > > I just reviewed both patches. They apply cleanly on current HEAD, > work as intended, and make check runs smoothly. Patches are pretty > straightforward, so I don't have much to say. > > My only remark is on the following comment: > > + * LWLockHeldByMeInMode - test whether my process holds a lock in mode X > > Maybe something like "test whether my process holds a lock in given > mode" would be better? > > Otherwise, I think they're ready for committer. > Didn't see that Simon had just committed it, sorry about that. -- Julien Rouhaud http://dalibo.com - http://dalibo.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assert(LWLockHeldByMeInMode(lock, LW_EXCLUSIVE))
On 20/06/2016 06:28, Thomas Munro wrote: > On Mon, Jun 20, 2016 at 3:43 PM, Craig Ringer wrote: >> On 18 June 2016 at 11:28, Thomas Munro >> wrote: >>> Several times now when reading, debugging and writing code I've wished >>> that LWLockHeldByMe assertions specified the expected mode, especially >>> where exclusive locking is required. >>> >>> What do you think about something like the attached? See also an >>> example of use. I will add this to the next commitfest. >> >> I've wanted this before too [...] > same here. > Before ab5194e6f (25 December 2014) held_lwlocks didn't record the mode. > I just reviewed both patches. They apply cleanly on current HEAD, work as intended, and make check runs smoothly. Patches are pretty straightforward, so I don't have much to say. My only remark is on the following comment: + * LWLockHeldByMeInMode - test whether my process holds a lock in mode X Maybe something like "test whether my process holds a lock in given mode" would be better? Otherwise, I think they're ready for committer. -- Julien Rouhaud http://dalibo.com - http://dalibo.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] condition variables
On Mon, Aug 15, 2016 at 10:35 PM, Robert Haas wrote: >>> Don't you need to set proc->cvSleeping = false in ConditionVariableSignal? >> >> I poked at this a bit... OK, a lot... and have some feedback: >> >> 1. As above, we need to clear cvSleeping before setting the latch. > > Right, OK. > I have independently faced this problem while using your patch and for now I have updated my local copy. If possible, please send an updated patch as this patch could be used for development of various parallelism projects. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
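The fix being discussed — clearing cvSleeping before setting the latch — can be sketched with ordinary threads. This is a hypothetical model, not the patch's actual code: the Waiter object plays the role of a PGPROC, the Event its latch.

```python
import threading

class Waiter:
    def __init__(self):
        self.cv_sleeping = True          # analogous to proc->cvSleeping
        self.latch = threading.Event()   # analogous to the process latch

def condition_variable_signal(wait_queue):
    # Wake one waiter. The sleeping flag is cleared *before* the latch
    # is set, so that when the woken process re-checks the flag it sees
    # it has genuinely been signalled, rather than being left enrolled
    # on the condition variable after a spurious latch wakeup.
    if wait_queue:
        waiter = wait_queue.pop(0)
        waiter.cv_sleeping = False   # clear first ...
        waiter.latch.set()           # ... then wake
        return waiter
    return None
```

Doing it in the other order leaves a window where the waiter observes its latch set but cvSleeping still true, which is exactly the confusion the review comment points at.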
Re: [HACKERS] Assert(LWLockHeldByMeInMode(lock, LW_EXCLUSIVE))
On 18 June 2016 at 04:28, Thomas Munro wrote: > Hi hackers, > > Several times now when reading, debugging and writing code I've wished > that LWLockHeldByMe assertions specified the expected mode, especially > where exclusive locking is required. > > What do you think about something like the attached? See also an > example of use. I will add this to the next commitfest. Committed, thanks. -- Simon Riggs    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong
On 5 September 2016 at 16:33, Petr Jelinek wrote: >> The better alternative is to add a variant on >> pg_logical_slot_get_changes(...) etc that accepts a start LSN. But >> it's not convenient or easy for SQL callers to extract the last commit >> LSN received from the last call to pass it to the next one, so this >> isn't simple, and is probably best tackled as part of the SQL >> interface API change Petr and Andres discussed in this thread. >> > > It isn't? I thought lsn is first column of the output of that function? It is. While it's a pain to use that from SQL (psql, etc) as well as doing something with the results, there's no point since anything working in simple SQL will get terminated when the server restarts anyway. So really my point above is moot. That doesn't reflect on the slot dirty patch, it just means there's no need to do anything extra when we add the cursor-like interface later in order to fully solve this. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Send catalog_xmin separately in hot standby feedback
On 5 September 2016 at 14:44, Craig Ringer wrote: > On 5 September 2016 at 12:40, Craig Ringer wrote: >> Hi all >> >> Currently hot standby feedback sends GetOldestXmin()'s result to the >> upstream as the required xmin. GetOldestXmin() returns a slot's >> catalog_xmin if that's the lowest xmin on the system. > > > Note that this patch changes the API to GetOldestXmin(), adding a new > boolean to allow it to disregard the catalog_xmin of slots. > > Per Simon's feedback I'm going to split that out into a separate > patch, so will post a follow-up split one soon as the series. Now formatted as a series:

1. Send catalog_xmin in hot standby feedback protocol
2. Make walsender respect catalog_xmin in hot standby feedback messages
3. Allow GetOldestXmin(...) to optionally disregard the catalog_xmin
4. Send catalog_xmin separately in hot_standby_feedback messages

Descriptions are in the patch headers. 1 adds the protocol field only. The value is at this point always sent as 0 by walreceiver and ignored by walsender. There's no need to handle backward compatibility in the addition to the hot standby protocol message here, as only the same major version of Pg has any business sending us hot standby feedback. pg_receivexlog, pg_basebackup etc don't use hs feedback. Includes protocol docs change. 2 makes walsender now pay attention to the sent catalog_xmin. walreceiver doesn't set it yet and has no way to get it separately. 3 provides a way to get the global xmin without considering the catalog_xmin so walreceiver can use it. 4 makes walsender use the modified GetOldestXmin(). (3) needs additional attention: by ignoring slot catalog_xmin in the GetOldestXmin() call then separately calling ProcArrayGetReplicationSlotXmin() to get the catalog_xmin, we might produce a catalog xmin slightly later than what was in the procarray at the time we previously called GetOldestXmin() to examine backend/session state. ProcArrayLock is released so it can be advanced in-between the calls.
That's harmless - it isn't necessary for the reported catalog_xmin to be exactly consistent with backend state. If it advances, it's safe to report the new position since we know the confirmed positions are on-disk locally. The alternative here is to extend GetOldestXmin() to take an out-param to report the slot catalog xmin. That really just duplicates the functionality of ProcArrayGetReplicationSlotXmin but means we can grab it within a single ProcArray lock. Variants of GetOldestXmin and ProcArrayGetReplicationSlotXmin that take an already-locked flag would work too, but would hold the lock across parts of GetOldestXmin() that currently don't retain it. I could also convert the current boolean param ignoreVacuum into a flags argument instead of adding another boolean. No real preference from me. I cut out some comment changes to be submitted separately; otherwise this series is much the same as the original patch upthread. Also available at https://github.com/2ndQuadrant/postgres/tree/dev/feedback-catalog-xmin (and tagged dev/feedback-catalog-xmin). Branch subject to rebasing. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 275c6422962f1fc326cc5f0c92186de0b127c472 Mon Sep 17 00:00:00 2001 From: Craig Ringer Date: Mon, 5 Sep 2016 15:30:53 +0800 Subject: [PATCH 1/6] Send catalog_xmin in hot standby feedback protocol Add catalog_xmin to the hot standby feedback protocol so a read replica that has logical slots can use its physical slot to the master to hold down the master's catalog_xmin. This information will let a replica prevent vacuuming of catalog tuples still required by the replica's logical slots. This is the hot standby feedback protocol change, the new value is always set to zero by the walreceiver and is ignored by the walsender. 
--- doc/src/sgml/protocol.sgml| 33 - src/backend/replication/walreceiver.c | 20 ++-- src/backend/replication/walsender.c | 14 -- 3 files changed, 54 insertions(+), 13 deletions(-) diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index 68b0941..c4e41ca 100644 --- a/doc/src/sgml/protocol.sgml +++ b/doc/src/sgml/protocol.sgml @@ -1783,10 +1783,11 @@ The commands accepted in walsender mode are: - The standby's current xmin. This may be 0, if the standby is - sending notification that Hot Standby feedback will no longer - be sent on this connection. Later non-zero messages may - reinitiate the feedback mechanism. + The standby's current global xmin, excluding the catalog_xmin from any + replication slots. If both this value and the following + catalog_xmin are 0 this is treated as a notification that Hot Standby + feedback will no longer be sent on this connection. Later non-zero + messages may rein
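Putting the pieces of the series together, the feedback-message semantics described in the doc change can be sketched as follows. The struct and function names here are illustrative only (the real message is a streaming-replication wire message, not a C struct):

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified view of the hot standby feedback message after this
 * series: the standby now reports a global xmin (with slots'
 * catalog_xmins excluded) plus a separate catalog_xmin. */
typedef struct HSFeedback {
    uint32_t xmin;          /* global xmin, slot catalog_xmin excluded */
    uint32_t catalog_xmin;  /* lowest catalog_xmin of the standby's slots */
} HSFeedback;

/* Per the protocol docs change: only when *both* values are zero is
 * the message a notification that feedback is being switched off.
 * A nonzero catalog_xmin alone keeps feedback active. */
static bool feedback_disables(const HSFeedback *msg)
{
    return msg->xmin == 0 && msg->catalog_xmin == 0;
}
```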
Re: [HACKERS] Optimization for lazy_scan_heap
On 4 August 2016 at 05:57, Masahiko Sawada wrote: > While reviewing freeze map code, Andres pointed out[1] that > lazy_scan_heap could access the visibility map twice and its logic > seems a bit tricky. > As discussed before, it's not nice especially when a large relation is > entirely frozen. > > I posted the patch for that before but since this is an optimization, > not a bug fix, I'd like to propose it again. > Please give me feedback. > > [1] > https://www.postgresql.org/message-id/20160505000840.epatoq6d2e556446%40alap3.anarazel.de If we have a freeze map now, surely tables will no longer be entirely frozen? What performance difference does this make, in a realistic use case? How would we test this to check it is exactly correct? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
On Mon, Sep 5, 2016 at 2:00 PM, Pavan Deolasee wrote: > > > On Mon, Sep 5, 2016 at 3:18 AM, Tomas Vondra > wrote: >> >> Hi, >> >> This thread started a year ago, different people contributed various >> patches, some of which already got committed. Can someone please post a >> summary of this thread, so that it's a bit more clear what needs >> review/testing, what are the main open questions and so on? >> >> I'm interested in doing some tests on the hardware I have available, but >> I'm not willing spending my time untangling the discussion. >> > > I signed up for reviewing this patch. But as Amit explained later, there are > two different and independent implementations to solve the problem. Since > Tomas has volunteered to do some benchmarking, I guess I should wait for the > results because that might influence which approach we choose. > > Does that sound correct? > Sounds correct to me. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Unused function arguments
While reading the logical replication (and related) code, I found a few unused function arguments: * XactLogCommitRecord() - unused argument forceSync * SnapBuildBuildSnapshot() - xid * TeardownHistoricSnapshot() - is_error No idea which ones are intended for future use and which ones can be removed. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Yet another small patch - reorderbuffer.c:1099
> I looked at this and can see some of the argument on both sides, but > if it's setting off static-analyzer warnings for some people, that > seems like a sufficient reason to change it. We certainly make more > significant changes than this in order to silence warnings. > > I rewrote the comment a bit more and pushed it. Tom, thank you for committing this patch. And thanks everyone for your contribution! -- Best regards, Aleksander Alekseev -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong
On 5 September 2016 at 03:41, Craig Ringer wrote: > On 2 September 2016 at 17:49, Craig Ringer wrote: > >> So the main change becomes the one-liner in my prior mail. > > Per feedback from Simon, updated with a new test in src/test/recovery . > > If you revert the change to > src/backend/replication/logical/logicalfuncs.c then the test will > start failing. > > I'd quite like to backpatch the fix since it's simple and safe, but I > don't feel that it's hugely important to do so. This is an annoyance > not a serious data risking bug. I've committed to HEAD only. We can discuss backpatching elsewhere also. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong
On 05/09/16 04:41, Craig Ringer wrote:
> On 2 September 2016 at 17:49, Craig Ringer wrote:
>> So the main change becomes the one-liner in my prior mail.
>
> Per feedback from Simon, updated with a new test in src/test/recovery .
>
> If you revert the change to src/backend/replication/logical/logicalfuncs.c
> then the test will start failing.
>
> I'd quite like to backpatch the fix since it's simple and safe, but I
> don't feel that it's hugely important to do so. This is an annoyance
> not a serious data risking bug.
>
> My only concern here is that we still lose position on crash. So
> SQL-interface callers should still be able to cope with it. But they
> won't see it happen if they're only testing with normal shutdowns, host
> reboots, etc. In practice users aren't likely to notice this anyway,
> though, since most people don't restart the server all the time. I think
> it's better than what we have.
>
> This issue could be eliminated completely by calling
> ReplicationSlotSave() after ReplicationSlotMarkDirty(). This would force
> an immediate flush after a non-peeked SQL interface decoding call. It
> means more fsync()ing, though, and the SQL interface can only be used by
> polling so that could be a significant performance hit. We'd want to
> only flush when the position changed, but even then it'll mean a sync
> every time anything gets returned.
>
> The better alternative is to add a variant on
> pg_logical_slot_get_changes(...) etc that accepts a start LSN. But it's
> not convenient or easy for SQL callers to extract the last commit LSN
> received from the last call to pass it to the next one, so this isn't
> simple, and is probably best tackled as part of the SQL interface API
> change Petr and Andres discussed in this thread.

It isn't? I thought lsn is first column of the output of that function? 
-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] COPY vs \copy HINT
Re: Craig Ringer 2016-09-05 > To cover the same-host case we could try something like: > >COPY runs on the PostgreSQL server, using the PostgreSQL server's > directories and permissions, it doesn't run on the client. > > ... but I think that's actually less helpful for the users who'll need this. The part about the server permissions might be useful to hint at. > > I don't suppose there's any easy way for COPY to distinguish local > > from remote connections > > Not that I see, since "local" can be unix socket or tcp to localhost. > Not cleanly anyway. > > I don't think it matters. Many hosts have enough paths in common that > in practice the hint on EACCES will be useful anyway. It'd be nice to > work in something about running with the permissions of the PostgreSQL > server, but I don't see a way to do that without making it all more > complex. > > I don't think it's worth tons of time anyway. This will be better than > what we have, lets do it. It's probably not helpful trying to distinguish local and remote connections, the set of possible problems is diverse and this is just one of them. > I'm fairly happy with the wording you came up with: > > "COPY copies to a file on the PostgreSQL server, not on the client. " > "You may want a client-side facility such as psql's \\copy." What about "COPY TO instructs the PostgreSQL server to write to a file on the server. " "You may want a client-side facility such as psql's \\copy." "COPY FROM instructs the PostgreSQL server to read from a file on the server. " "You may want a client-side facility such as psql's \\copy." This stresses more explicitly that the file is opened by the server process. (I'm not particularly happy that it says "server" in there twice, but couldn't think of anything else.) Christoph -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
On Mon, Sep 5, 2016 at 3:18 AM, Tomas Vondra wrote: > Hi, > > This thread started a year ago, different people contributed various > patches, some of which already got committed. Can someone please post a > summary of this thread, so that it's a bit more clear what needs > review/testing, what are the main open questions and so on? > > I'm interested in doing some tests on the hardware I have available, but > I'm not willing spending my time untangling the discussion. > > I signed up for reviewing this patch. But as Amit explained later, there are two different and independent implementations to solve the problem. Since Tomas has volunteered to do some benchmarking, I guess I should wait for the results because that might influence which approach we choose. Does that sound correct? Or do we already know which implementation is more likely to be pursued, in which case I can start reviewing that patch. Thanks, Pavan -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Supporting SJIS as a database encoding
> Before digging into the problem, could you share your impression on whether > PostgreSQL can support SJIS? Would it be hopeless? Can't we find any > direction to go? Can I find relevant source code by searching specific words > like "ASCII", "HIGH_BIT", "\\" etc? For starters, you could grep "multibyte". Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench - allow to store select results into variables
Hi Fabien, On 2016/09/03 2:47, Fabien COELHO wrote: >> This patch needs to be rebased because of commit 64710452 (on 2016-08-19). > > Here it is! Thanks for sending the updated patch. Here are some (mostly cosmetic) comments. Before the comments, let me confirm whether the following result is odd (a bug) or whether I am just using it wrong: Custom script looks like: \; select a \into a from tab where a = 1; \set i debug(:a) I get the following error: undefined variable "a" client 0 aborted in state 1; execution of meta-command failed Even the following script gives the same result: \; select a from a where a = 1; \into a \set t debug(:a) However, with the following there is no error and a gets set to 2: select a from a where a = 1 \; select a+1 from a where a = 1; \into a \set t debug(:a) Comments on the patch follow: + + + Stores the first fields of the resulting row from the current or preceding + SELECT command into these variables. Any command returning rows ought to work, no? For example, the following works: update a set a = a+1 returning *; \into a \set t debug(:a) -char *line; /* text of command line */ +char *line; /* first line for short display */ +char *lines; /* full multi-line text of command */ When I looked at this and related hunks (and also the hunks related to semicolons), it made me wonder whether this patch contains two logical changes. Is this just a refactoring for the \into implementation or does this provide some new externally visible feature? char *argv[MAX_ARGS]; /* command word list */ +int compound; /* last compound command (number of \;) */ +char***intos; /* per-compound command \into variables */ Need an extra space for intos to align with earlier fields. Also I wonder if intonames or intoargs would be a slightly better name for the field. +static bool +read_response(CState *st, char ** intos[]) Usual style seems to be to use ***intos here. 
+fprintf(stderr, +"client %d state %d compound %d: " +"cannot apply \\into to non SELECT statement\n", +st->id, st->state, compound); How about making this error message read like the other messages related to \into, perhaps something like: "\\into cannot follow non SELECT statements\n" /* * Read and discard the query result; note this is not included in - * the statement latency numbers. + * the statement latency numbers (above), thus if reading the + * response fails the transaction is counted nevertheless. */ Does this comment need to mention that the result is not discarded when \into is specified? +my_command->intos = pg_malloc0(sizeof(char**) * (compounds+1)); Need space: s/char**/char **/g This comment applies also to a couple of nearby places. -my_command->line = pg_malloc(nlpos - p + 1); +my_command->line = pg_malloc(nlpos - p + 1 + 3); A comment mentioning what the above means would be helpful. +bool sql_command_in_progress = false; This variable's name could be slightly confusing to readers. If I understand its purpose correctly, perhaps it could be called sql_command_continues. +if (index == 0) +syntax_error(desc, lineno, NULL, NULL, + "\\into cannot start a script", + NULL, -1); How about: "\\into cannot be at the beginning of a script" ? Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
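As a side note on the `compound` field reviewed above: it counts the number of \; separators in a command, and the intos array is sized per compound. A stripped-down, hypothetical version of that counting (the real pgbench parser also has to respect quoting and comments):

```c
#include <stddef.h>

/* Count compound-command separators ("\;") in an SQL line, the way the
 * patch's `compound` field does.  count_compounds is an illustrative
 * name, not pgbench's; quoting rules are deliberately ignored here. */
static int count_compounds(const char *line)
{
    int n = 0;
    for (const char *p = line; *p != '\0'; p++)
        if (p[0] == '\\' && p[1] == ';')
        {
            n++;
            p++;  /* step over the ';' as well */
        }
    return n;
}
```

A script line with N separators contains N+1 compound commands, which matches the `pg_malloc0(sizeof(char **) * (compounds+1))` allocation in the patch.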
Re: [HACKERS] pg_basebackup stream xlog to tar
On Sat, Sep 3, 2016 at 10:35 PM, Magnus Hagander wrote: > Ugh. That would be nice to have, but I think that's outside the scope of > this patch. A test for this patch that could have value would be to use pg_basebackup -X stream -Ft, then untar pg_xlog.tar and look at the size of the segments. If you have an idea for untarring something without the in-core perl support (we need the TAP stuff to run on at least perl 5.8.8), I'd welcome it. Honestly I still have none, and that's why the recovery tests do not use tarballs in their tests when using pg_basebackup. In short let's not add something more for this patch. > PFA is an updated version of this patch that: > * documents a magic value passed to zlib (which is in their documentation as > being a magic value, but has no define) > * fixes the padding of tar files > * adds a most basic test that the -X stream -Ft does produce a tarfile So I had a more serious look at this patch, and it basically makes the operations done for the plain mode more generic by adding a set of routines that can be used by both tar and plain mode to work on the WAL files streamed. Elegant. + +The transaction log files will be written to + the base.tar file. + Nit: number of spaces here. -mark_file_as_archived(const char *basedir, const char *fname) +mark_file_as_archived(StreamCtl *stream, const char *fname) Just passing WalMethod as argument would be enough, but... My patch adding the fsync calls to pg_basebackup could just make use of StreamCtl, so let's keep it as you suggest. static bool existsTimeLineHistoryFile(StreamCtl *stream) { [...] + return stream->walmethod->existsfile(histfname); } existsfile always returns false for the tar method. This does not matter much because pg_basebackup exits immediately in case of a failure, but I think that this deserves a comment in ReceiveXlogStream where existsTimeLineHistoryFile is called. 
I find the use of existsfile() in open_walfile() rather confusing because this relies on the fact that existsfile() always returns false for the tar mode. We could add an additional field in WalMethod to store the method type and use that more, but that may make the code more confusing than what you propose. What do you think? + int (*unlink) (const char *pathname); The unlink method is used nowhere. This could just be removed. -static void +void print_tar_number(char *s, int len, uint64 val) This could be an independent patch. Or not. I think that I found another bug regarding the contents of the segments. I did pg_basebackup -F t -X stream, then untarred pg_xlog.tar which contained segment 1/0/2, then: $ pg_xlogdump 00010002 pg_xlogdump: FATAL: could not find a valid record after 0/200 I'd expect this segment to have records, up to an XLOG_SWITCH record. > As for using XLOGDIR to drive the name of the tarfile. pg_basebackup is > already hardcoded to use pg_xlog. And so are the tests. We probably want to > fix that, but that's a separate step and this patch will be easier to review > and test if we keep it out for now. Yes. 
pg_basebackup is not the only thing here missing the point here, here is most of the list: $ git grep "pg_xlog\"" -- *.c *.h src/backend/access/transam/xlog.c:static const char *xlogSourceNames[] = {"any", "archive", "pg_xlog", "stream"}; src/backend/replication/basebackup.c: dir = AllocateDir("pg_xlog"); src/backend/replication/basebackup.c:(errmsg("could not open directory \"%s\": %m", "pg_xlog"))); src/backend/replication/basebackup.c: while ((de = ReadDir(dir, "pg_xlog")) != NULL) src/backend/replication/basebackup.c: if (strcmp(pathbuf, "./pg_xlog") == 0) src/backend/storage/file/fd.c: if (lstat("pg_xlog", &st) < 0) src/backend/storage/file/fd.c: "pg_xlog"))); src/backend/storage/file/fd.c: if (pgwin32_is_junction("pg_xlog")) src/backend/storage/file/fd.c: walkdir("pg_xlog", pre_sync_fname, false, DEBUG1); src/backend/storage/file/fd.c: walkdir("pg_xlog", datadir_fsync_fname, false, LOG); src/bin/initdb/initdb.c:snprintf(pg_xlog, MAXPGPATH, "%s/pg_xlog", pg_data); src/bin/initdb/initdb.c:subdirloc = psprintf("%s/pg_xlog", pg_data); src/bin/pg_basebackup/pg_basebackup.c: snprintf(param->xlogdir, sizeof(param->xlogdir), "%s/pg_xlog", basedir); src/bin/pg_basebackup/pg_basebackup.c: if (!((pg_str_endswith(filename, "/pg_xlog") || src/bin/pg_basebackup/pg_basebackup.c: linkloc = psprintf("%s/pg_xlog", basedir); src/bin/pg_rewind/copy_fetch.c: strcmp(path, "pg_xlog") == 0) src/bin/pg_rewind/filemap.c:if (strcmp(path, "pg_xlog") == 0 && type == FILE_TYPE_SYMLINK) src/bin/pg_rewind/filemap.c:if (exists && !S_ISDIR(statbuf.st_mode) && strcmp(path, "pg_xlog") != 0) src/bin/pg_rewind/filemap.c:if (strcmp(path, "pg_xlog") == 0 && type == FILE_TYPE_SYMLINK) src/bin/pg_upgrade/exec.c: "pg
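Back on the earlier review point about the padding of tar files: tar stores member data in 512-byte blocks and encodes header numeric fields as zero-padded octal strings, which is what a print_tar_number-style helper has to produce. A minimal sketch of both (helper names here are illustrative, not pg_basebackup's actual code):

```c
#include <stdint.h>
#include <stdio.h>

/* A tar member of arbitrary length must be followed by zero padding up
 * to the next 512-byte block boundary -- the "padding of tar files"
 * fix discussed above. */
static uint64_t tar_padding(uint64_t len)
{
    return (512 - (len % 512)) % 512;
}

/* Header numeric fields (size, mtime, ...) are NUL-terminated,
 * zero-padded octal strings; this mirrors what a print_tar_number
 * style helper produces. */
static void format_tar_number(char *s, int len, uint64_t val)
{
    snprintf(s, (size_t) len, "%0*llo", len - 1, (unsigned long long) val);
}
```

Note that a full 16MB WAL segment is already block-aligned, so only partial files (like a history file, or a segment cut short) actually need padding bytes.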
Re: [HACKERS] Supporting SJIS as a database encoding
> From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tatsuo Ishii > > But what I'm wondering is why PostgreSQL doesn't support SJIS. Was there > any technical difficulty? Is there anything you are worried about if adding > SJIS? > > Yes, there's a technical difficulty with backend code. In many places it > is assumed that any string is "ASCII compatible", which means no ASCII > character is used as a part of multi byte string. Here is such a random > example from src/backend/util/adt/varlena.c: > > /* Else, it's the traditional escaped style */ > for (bc = 0, tp = inputText; *tp != '\0'; bc++) > { > if (tp[0] != '\\') > tp++; > > Sometimes SJIS uses '\' as the second byte of it. Thanks, I'll try to understand the seriousness of the problem as I don't have good knowledge of character sets. But your example seems to be telling everything about the difficulty... Before digging into the problem, could you share your impression on whether PostgreSQL can support SJIS? Would it be hopeless? Can't we find any direction to go? Can I find relevant source code by searching specific words like "ASCII", "HIGH_BIT", "\\" etc? Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Supporting SJIS as a database encoding
> But what I'm wondering is why PostgreSQL doesn't support SJIS. Was there any > technical difficulty? Is there anything you are worried about if adding SJIS? Yes, there's a technical difficulty with backend code. In many places it is assumed that any string is "ASCII compatible", which means no ASCII character is used as a part of multi byte string. Here is such a random example from src/backend/util/adt/varlena.c: /* Else, it's the traditional escaped style */ for (bc = 0, tp = inputText; *tp != '\0'; bc++) { if (tp[0] != '\\') tp++; Sometimes SJIS uses '\' as the second byte of it. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
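The quoted varlena.c loop illustrates the hazard concretely: in Shift-JIS the second byte of a two-byte character can be 0x5C, the ASCII backslash. A self-contained sketch (not PostgreSQL code; the lead-byte ranges follow common Shift-JIS/CP932 tables) of how a naive byte scan miscounts:

```c
#include <stdbool.h>

/* In Shift-JIS, first bytes 0x81-0x9F and 0xE0-0xFC introduce a
 * two-byte character whose second byte may itself be an ASCII code
 * point such as 0x5C ('\\').  For example U+8868 is encoded 0x95 0x5C. */
static bool sjis_first_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Count backslashes byte-at-a-time, the "ASCII compatible" assumption
 * made by code like the varlena.c loop quoted above ... */
static int count_backslash_naive(const unsigned char *s)
{
    int n = 0;
    for (; *s != '\0'; s++)
        if (*s == '\\')
            n++;
    return n;
}

/* ... and with SJIS awareness, stepping over whole characters. */
static int count_backslash_sjis(const unsigned char *s)
{
    int n = 0;
    while (*s != '\0')
    {
        if (sjis_first_byte(*s) && s[1] != '\0')
            s += 2;             /* skip the whole two-byte character */
        else if (*s++ == '\\')
            n++;
    }
    return n;
}
```

Encodings currently allowed on the server side (EUC variants, UTF-8, etc.) never use bytes below 0x80 as trailing bytes, which is why the naive scan is safe today and why adding SJIS would require auditing every such loop.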
[HACKERS] Supporting SJIS as a database encoding
Hello, I'd like to propose adding SJIS as a database encoding. You may wonder why SJIS is still necessary in the world of Unicode. The purpose is to achieve comparable performance when migrating legacy database systems from other DBMSs with little modification of applications. Recently, we failed to migrate some customer's legacy database from DBMS-X to PostgreSQL. That customer wished for PostgreSQL, but PostgreSQL couldn't meet the performance requirement. The system uses DBMS-X with the database character set being SJIS. The main applications are written in embedded SQL, which require SJIS in their host variables. They insisted they cannot use UTF8 for the host variables because that would require large modification of applications due to character handling. So no character set conversion is necessary between the clients and the server. On the other hand, PostgreSQL doesn't support SJIS as a database encoding. Therefore, character set conversion from UTF-8 to SJIS has to be performed. The batch application runs millions of SELECTs, each of which retrieves more than 100 columns. And many of those columns are of character type. If PostgreSQL supports SJIS, PostgreSQL will match or outperform the performance of DBMS-X with regard to the applications. We confirmed it by using psql to run a subset of the batch processing. When the client encoding is SJIS, one FETCH of 10,000 rows took about 500ms. When the client encoding is UTF8 (the same as the database encoding), the same FETCH took 270ms. Supporting SJIS may regain some attention for PostgreSQL here in Japan, in the context of database migration. BTW, MySQL supports SJIS as a database encoding. PostgreSQL used to be the most popular open source database in Japan, but MySQL is now more popular. But what I'm wondering is why PostgreSQL doesn't support SJIS. Was there any technical difficulty? Is there anything you are worried about in adding SJIS? 
I'd like to write a patch for adding SJIS if there's no strong objection. I'd appreciate it if you could let me know good design information to add a server encoding (e.g. the URL of the most recent patch to add a new server encoding) Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel build with MSVC
* Michael Paquier wrote: On Mon, Sep 5, 2016 at 9:18 AM, Noah Misch wrote: Every vcbuild and msbuild invocation ought to recognize this variable, so please update the two places involving ecpg_regression.proj. Apart from that, the patch looks good. Good catch. I did not notice those during my lookups of those patches. Not my intention to bump into Christian's feet, but here are updated patches. My feet feel fine. Thanks for updating. -- Christian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_hba_file_settings view patch
On Sun, Sep 4, 2016 at 1:44 AM, Simon Riggs wrote: > On 15 August 2016 at 12:17, Haribabu Kommi > wrote: > > > comments? > > This looks like a good feature contribution, thanks. > > At present the patch doesn't apply cleanly, please rebase. > Rebased patch is attached. > The patch doesn't contain any tests, which means I can't see what the > output looks like, so I can't judge the exact usefulness of this > particular patch. ISTM likely that there will be some detailed points > to review in there somewhere. > Added a test in the regress and also in the docs. > Do we want line number, or priority order? i.e. do we want to count > comment lines or just main rule lines? I prefer the latter. > Various other questions along those lines needed, once I can see the > output. > I preferred the line number that includes the comment lines also. This way it will be easy to edit the file if it contains any errors, by going directly to that line number. > What is push_jsonb_string_value() etc..? > If there is required infrastructure there is no explanation of why. > Suggest you explain and/or split into two. > I added a JSONB type column to display the hba.conf option values. To store the options data in JSONB format, I didn't find any functions currently available in core, so I added key/value functions to add the data to a JSONB object. The function-related code is split into a different patch and attached. > pg_hba_file_settings seems a clumsy name. I'd prefer pg_hba_settings, > since that name could live longer than the concept of pg_hba.conf, > which seems likely to become part of ALTER SYSTEM in future, so we > wouldn't really want the word "file" in there. > Yes, I also agree that *file* is not required. The hba rules are not available in memory in the backends either. I changed the view name to pg_hba_rules as per the other mail from Christoph. 
Regards, Hari Babu Fujitsu Australia pg_hba_rules_1.patch Description: Binary data jsonb_utilitiy_funcs_1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LSN as a recovery target
On Mon, Sep 5, 2016 at 4:02 PM, Simon Riggs wrote:
> On 5 September 2016 at 06:55, Michael Paquier wrote:
>> On Sun, Sep 4, 2016 at 11:30 PM, Simon Riggs wrote:
>>> I noticed we don't mention what LSN is anywhere, so I'd like to apply
>>> the following doc patch also.
>>
>> +1 for the idea. What do you think about adding a mention to the
>> system data type pg_lsn?
>>
>> +    WAL records are appended to the WAL
>> +    logs as each new record is written. The insert position is described by
>> +    a Log Sequence Number (LSN) that is a byte offset into
>> +    the logs, increasing monotonically with each new record. Two
>> +    LSNs can be compared to calculate the volume of
>> +    WAL data that separates them, so are used to measure
>> +    progress of WAL during replication and recovery.
>> +
>> Here we could for example append a sentence like "The system data type
>> pg_lsn is an internal representation of the LSN".
>
> Good input, thanks. v2 attached.

Looks good to me. Thanks for the wordsmithing.
-- 
Michael
Re: [HACKERS] Better locale-specific-character-class handling for regexps
On 09/04/2016 08:44 PM, Tom Lane wrote:
> Heikki Linnakangas writes:
>> On 08/23/2016 03:54 AM, Tom Lane wrote:
>> +1 for this patch in general. Some regression test cases would be nice.
>
> I'm not sure how to write such tests without introducing insurmountable
> platform dependencies --- particularly on platforms with weak support
> for UTF8 locales, such as OS X. All the interesting cases require
> knowing what iswalpha() etc will return for some high character codes.
>
> What I did to test it during development was to set MAX_SIMPLE_CHR to
> something in the ASCII range, so that the high-character-code paths
> could be tested without making any assumptions about locale
> classifications for non-ASCII characters. I'm not sure that's a helpful
> idea for regression testing purposes, though.
>
> I guess I could follow the lead of collate.linux.utf8.sql and produce a
> test that's only promised to pass on one platform with one encoding,
> but I'm not terribly excited by that. AFAIK that test file does not get
> run at all in the buildfarm or in the wild.

I'm not too worried if the tests don't get run regularly, but I don't
like the idea of a test that only works on one platform. This code is
unlikely to be accidentally broken by unrelated changes in the backend,
as the regexp code is very well isolated. But for someone who hacks on
the regexp library in the future, having a test suite to tickle all
these corner cases would be very handy.

Another class of regressions would be that something changes in the way
a locale treats some characters, and that breaks an application. That
would be very difficult to test for in a platform-independent way. That
wouldn't really be our bug, though, but the locale's.

Since we're now de facto maintainers of this regexp library, and our
version could be used somewhere other than PostgreSQL too, it would
actually be nice to have a regression suite that's independent of the
pg_regress infrastructure and wouldn't need a server to run.
Perhaps a stand-alone C program that compiles the regexp code with mock
versions of the pg_wc_is* functions. Or perhaps a magic collation OID
that makes the pg_wc_is* functions return hard-coded values for
particular inputs.

- Heikki
Re: [HACKERS] LSN as a recovery target
On 5 September 2016 at 06:55, Michael Paquier wrote:
> On Sun, Sep 4, 2016 at 11:30 PM, Simon Riggs wrote:
>> I noticed we don't mention what LSN is anywhere, so I'd like to apply
>> the following doc patch also.
>
> +1 for the idea. What do you think about adding a mention to the
> system data type pg_lsn?
>
> +    WAL records are appended to the WAL
> +    logs as each new record is written. The insert position is described by
> +    a Log Sequence Number (LSN) that is a byte offset into
> +    the logs, increasing monotonically with each new record. Two
> +    LSNs can be compared to calculate the volume of
> +    WAL data that separates them, so are used to measure
> +    progress of WAL during replication and recovery.
> +
> Here we could for example append a sentence like "The system data type
> pg_lsn is an internal representation of the LSN".

Good input, thanks. v2 attached.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment: doc_lsn.v2.patch