Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?
On 2014-02-18 18:10:02 -0800, Peter Geoghegan wrote:
> On Tue, Feb 18, 2014 at 5:50 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
>> Peter Geoghegan wrote:
>>> I've had multiple complaints of apparent data loss on 9.3.2 customer
>>> databases. There are 2 total, both complaints from the past week, one of
>>> which I was able to confirm. The customer's complaint is that certain rows
>>> are either visible or invisible, depending on whether an index scan is used
>>> or a sequential scan (I confirmed this with an EXPLAIN ANALYZE).
>>
>> The multixact bugs would cause tuples to be hidden at the heap level. If the
>> tuples are visible in a seqscan, then these are more likely to be related to
>> index problems, not a multixact problem.
>
> I guess I wasn't clear enough here: the row in question was visible with a
> sequential scan but *not* with an index scan. So you have it backwards here
> (understandably).

Was there an index-only scan or just an index scan? Any chance of a corrupted index? Do you still have the page from before you did the VACUUM?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?
On Wed, Feb 19, 2014 at 12:40 AM, Andres Freund and...@2ndquadrant.com wrote:
> Was there an index-only scan or just an index scan? Any chance of a corrupted index?

Just an index scan. I think it's unlikely to be a corrupt index, because the customer said that he dropped and restored the index, and it's difficult to imagine how VACUUM FREEZE could then have impacted a plan with only a sequential scan.

> Do you still have the page from before you did the VACUUM?

It seems likely that I could get it, given that the problem persisted over several days (apparently there was an earlier manual attempt to repair the damage by hand). I'm reasonably confident that I could get back an affected database by performing a point-in-time recovery. That would also give me the benefit of an environment that I could break as needed.

--
Peter Geoghegan
Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?
On 2014-02-19 00:55:03 -0800, Peter Geoghegan wrote:
> On Wed, Feb 19, 2014 at 12:40 AM, Andres Freund and...@2ndquadrant.com wrote:
>> Was there an index-only scan or just an index scan? Any chance of a corrupted index?
>
> Just an index scan. I think it's unlikely to be a corrupt index, because the
> customer said that he dropped and restored the index, and it's difficult to
> imagine how VACUUM FREEZE could then have impacted a plan with only a
> sequential scan.

I am wondering whether it's possibly either the visibility map or the all-visible flag on the page...

Greetings,

Andres Freund
Re: [HACKERS] inherit support for foreign tables
(2014/02/19 12:12), Kyotaro HORIGUCHI wrote:
> At Tue, 18 Feb 2014 19:24:50 +0900, Etsuro Fujita wrote:
>> From: Shigeru Hanada [mailto:shigeru.han...@gmail.com]
>>> I'm not sure that ALTER TABLE against a parent table should be allowed to
>>> affect its descendants when some of them are foreign tables. I think the
>>> rule should be simple enough for users to understand; of course, it should
>>> also be consistent and backward compatible.
>>
>> Yeah, understandability is important. But I think flexibility is also
>> important. In other words, I think it is a bit too inflexible that we
>> disallow e.g. SET STORAGE on an inheritance tree that contains foreign
>> table(s) just because we disallow SET STORAGE on foreign tables directly.
>
> What use case needs such flexibility badly enough to outweigh the lost
> predictability of ALTER TABLE's behavior and/or code maintainability? I don't
> agree with the idea that ALTER TABLE implicitly affects foreign children, for
> the reasons given upthread. Also, a turn-on/off feature implemented as
> special syntax seems to have little hope.

It is just my personal opinion, but I think it would be convenient for users to be able to alter inheritance trees that contain foreign tables the same way as inheritance trees that don't, without having to be conscious of whether an inheritance tree contains foreign tables or not. Imagine we have an inheritance tree that contains only plain tables, and we then add a foreign table as a child of that tree. Without this flexibility, we would need to switch to a totally different way of altering the structure of the inheritance tree (e.g., for ADD CONSTRAINT) the moment the foreign table is added. I don't think that would be easy to use.
What I think should be newly allowed to be set on foreign tables is:

* ADD table_constraint
* DROP CONSTRAINT

> As of 9.3, http://www.postgresql.org/docs/9.3/static/sql-alterforeigntable.html says:
>> Consistency with the foreign server is not checked when a column is added or
>> removed with ADD COLUMN or DROP COLUMN, a NOT NULL constraint is added, or a
>> column type is changed with SET DATA TYPE. It is the user's responsibility
>> to ensure that the table definition matches the remote side.
> So I believe implicit and automatic application of any constraint on foreign
> children is considerably dangerous.

We spent a lot of time discussing this issue, and the consensus is that it's the user's fault if there are some tuples on the remote side violating a given constraint, as mentioned in the documentation.

* [NO] INHERIT parent_table

> Is this usable for foreign children in an inheritance tree? NO INHERIT
> removes all foreign children, but INHERIT is a no-op.

I didn't express this clearly. Sorry for that. Let me explain:

* ALTER FOREIGN TABLE target_table *INHERIT* parent_table: adds the target table as a new child of the parent table.
* ALTER FOREIGN TABLE target_table *NO INHERIT* parent_table: removes the target table from the list of children of the parent table.

Thanks,

Best regards,
Etsuro Fujita
Re: [HACKERS] GiST support for inet datatypes
On Mon, Feb 17, 2014 at 3:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Emre Hasegeli e...@hasegeli.com writes:
>> How about only removing the inet and the cidr operator classes from
>> btree_gist? btree-gist-drop-inet-v2.patch does that.
>
> I'm not sure which part of "no" you didn't understand, but to be clear: you
> don't get to break existing installations. Assuming that this opclass is
> sufficiently better than the existing one, it would sure be nice if it could
> become the default; but I've not seen any proposal in this thread that would
> allow that without serious upgrade problems. I think the realistic
> alternatives so far are (1) the new opclass is not the default, or (2) this
> patch gets rejected.
>
> We should probably expend some thought on a general approach to replacing
> the default opclass for a datatype, because I'm sure this will come up
> again. Right now I don't see a feasible way.

It seems odd for "default" to be a property of the opclass rather than a separate knob. What comes to mind is something like:

ALTER TYPE sniffle SET DEFAULT OPERATOR CLASS FOR btree TO achoo_ops;

Not sure how much that helps.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] WAL Rate Limiting
On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs si...@2ndquadrant.com wrote:
> Agreed; that was the original plan, but implementation delays prevented the
> whole vision/discussion/implementation. Requirements from various areas
> include WAL rate limiting for replication, I/O rate limiting, hard CPU and
> I/O limits for security, and mixed-workload coexistence. I'd still like to
> get something on this into 9.4 that alleviates the replication issues,
> leaving wider changes for later releases.

My first reaction was that we should just have generic I/O resource throttling. I was only convinced this was a reasonable idea by the replication use case. It would help me to understand the specific situations where replication breaks down due to WAL bandwidth starvation. Heroku has had some problems with slaves falling behind, though the immediate problem that causes is the slave filling up its disk, which we could solve more directly by switching to archive mode rather than slowing down the master. But I would suggest you focus on a specific use case that's problematic, so we can judge better whether the implementation is really fixing it.

> The vacuum_* parameters don't allow any control over WAL production, which
> is often the limiting factor. I could, for example, introduce a new
> parameter for vacuum_cost_delay that provides a weighting for each new
> BLCKSZ chunk of WAL, then rename all the parameters to a more general form.
> Or I could forget that and just press ahead with the patch as is, providing
> a cleaner interface in the next release.

> It's also interesting to wonder about the relationship to
> CHECK_FOR_INTERRUPTS --- although I think that currently, we assume that
> that's *cheap* (1 test and branch) as long as nothing is pending. I don't
> want to see a bunch of arithmetic added to it.

Good point. I think it should be possible to actually merge it into CHECK_FOR_INTERRUPTS.
Have a single global flag, io_done_since_check_for_interrupts, which is set to 0 after each CHECK_FOR_INTERRUPTS and set to 1 whenever any WAL is written. Then CHECK_FOR_INTERRUPTS turns into two tests and branches instead of one in the normal case. In fact, you could do all the arithmetic when you do the WAL write: only set the flag if the bandwidth consumed is above the budget. Then the flag should only ever be set when you're about to sleep.

I would dearly love to see generic I/O bandwidth limits, so it would be nice to see a nicely general pattern here that could be extended, even if we only target WAL this release.

I'm going to read the existing patch now; do you think it's ready to go, or did you want to do more work based on the feedback?
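[Editorial note] The flag scheme described above can be sketched in a few lines of C. All of the names here (note_wal_written, check_for_interrupts, the budget arithmetic) are invented for illustration and do not exist in the PostgreSQL tree; this is only a model of the proposal, with the accounting done on the WAL-write path and nothing but flag tests on the fast path:

```c
#include <stdbool.h>

/* Sketch of the idea above; all names are invented for illustration. */
static volatile bool InterruptPending = false;
static volatile bool WalBudgetExceeded = false;

static long wal_bytes_this_interval = 0;
static const long wal_budget_bytes = 8192;  /* assumed per-interval budget */

/* Called from the WAL-write path: the arithmetic happens here, and the
 * flag is set only once the budget is blown, so the flag is set exactly
 * when a sleep is due. */
void note_wal_written(long nbytes)
{
    wal_bytes_this_interval += nbytes;
    if (wal_bytes_this_interval > wal_budget_bytes)
        WalBudgetExceeded = true;
}

bool wal_budget_exceeded(void)
{
    return WalBudgetExceeded;
}

/* The fast path stays at two tests-and-branches in the common case. */
void check_for_interrupts(void)
{
    if (InterruptPending)
    {
        /* ProcessInterrupts() would run here in the real tree */
    }
    if (WalBudgetExceeded)
    {
        /* sleep here to pay back the budget, then reset the accounting */
        wal_bytes_this_interval = 0;
        WalBudgetExceeded = false;
    }
}
```

The point of the design is that the per-call cost on the hot path never grows beyond a flag test, regardless of how elaborate the rate accounting becomes.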
Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?
On Fri, Feb 14, 2014 at 9:32 PM, Michael Paquier michael.paqu...@gmail.com wrote:
> On Thu, Feb 6, 2014 at 11:26 AM, Michael Paquier michael.paqu...@gmail.com wrote:
>> Here are updated patches to use pg_lsn instead of pglsn...
> Should I register this patch somewhere to avoid having it lost in the void?

Well, I committed this, but the buildfarm's deeply unhappy with it. Apparently the use of GET_8_BYTES() and SET_8_BYTES() is no good on some platforms... and I'm not sure what to do about that, right off-hand.
Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?
On 2014-02-19 09:24:03 -0500, Robert Haas wrote:
> Well, I committed this, but the buildfarm's deeply unhappy with it.
> Apparently the use of GET_8_BYTES() and SET_8_BYTES() is no good on some
> platforms... and I'm not sure what to do about that, right off-hand.

The relevant bit probably is:

pg_lsn.c: In function 'pg_lsn_out':
pg_lsn.c:59:2: warning: implicit declaration of function 'GET_8_BYTES' [-Wimplicit-function-declaration]

GET_8_BYTES only exists on 64-bit systems.

Greetings,

Andres Freund
Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?
On Wed, Feb 19, 2014 at 9:30 AM, Andres Freund and...@2ndquadrant.com wrote:
> The relevant bit probably is:
>
> pg_lsn.c: In function 'pg_lsn_out':
> pg_lsn.c:59:2: warning: implicit declaration of function 'GET_8_BYTES' [-Wimplicit-function-declaration]
>
> GET_8_BYTES only exists on 64-bit systems.

Right, I got that far. So it looks like float8, int8, timestamp, timestamptz, and money all have behavior contingent on USE_FLOAT8_BYVAL, making that symbol a misnomer in every way. But since we've already marched pretty far down that path, I suppose we should keep marching.
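[Editorial note] For readers following along, the portability hazard being discussed looks roughly like this. This is a simplified standalone sketch, not the actual postgres.h definitions: a Datum is pointer-sized, so an 8-byte value fits in it directly only on 64-bit builds (the case where macros like GET_8_BYTES exist); a 32-bit build must pass a pointer to separately allocated storage instead, which is why code that reaches for the by-value macros unconditionally fails to compile there:

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified sketch of the by-value/by-reference split; the real
 * definitions live in postgres.h and allocate with palloc, not malloc. */
typedef uintptr_t Datum;

#if UINTPTR_MAX == UINT64_MAX
#define SIMULATED_8BYTE_BYVAL 1
#endif

#ifdef SIMULATED_8BYTE_BYVAL
/* 64-bit: the value fits in the Datum itself; this is the only build
 * where a direct GET_8_BYTES-style cast is possible. */
static Datum UInt64GetDatum(uint64_t x) { return (Datum) x; }
static uint64_t DatumGetUInt64(Datum d) { return (uint64_t) d; }
#else
/* 32-bit: the Datum can only hold a pointer, so the value goes through
 * separately allocated storage and by-value macros are not defined. */
static Datum UInt64GetDatum(uint64_t x)
{
    uint64_t *p = malloc(sizeof(uint64_t));
    *p = x;
    return (Datum) p;
}
static uint64_t DatumGetUInt64(Datum d) { return *(uint64_t *) d; }
#endif
```

Code like pg_lsn_out has to go through an accessor that works in both builds rather than assuming the by-value representation.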
Re: [HACKERS] Do you know the reason for increased max latency due to xlog scaling?
From: Jeff Janes jeff.ja...@gmail.com
> I thought that this was the point I was making, not the point I was missing.
> You have the same hard drives you had before, but now due to a software
> improvement you are cramming 5 times more stuff through them. Yeah, you will
> get bigger latency spikes. Why wouldn't you? You are now beating the snot
> out of your hard drives, whereas before you were not. If you need 10,000
> TPS, then you need to upgrade to 9.4. If you need it with low maximum
> latency as well, then you probably need to get better IO hardware as well
> (maybe not--maybe more tuning could help). With 9.3 you didn't need better
> IO hardware, because you weren't capable of maxing out what you already had.
> With 9.4 you can max it out, and this is a good thing. If you need 10,000
> TPS but only 2000 TPS are completing under 9.3, then what is happening to
> the other 8000 TPS? Whatever is happening to them, it must be worse than a
> latency spike. On the other hand, if you don't need 10,000 TPS, then
> measuring max latency at 10,000 TPS is the wrong thing to measure.

Thank you, I've probably got the point --- you mean the hard disk for WAL is the bottleneck. But I still wonder a bit why the latency spike became so much bigger even with fewer clients than CPU cores; I would expect requests to be processed more smoothly when the number of simultaneous requests is small. Anyway, I want to believe the latency spike would become significantly smaller on an SSD.

Regards,
MauMau
Re: [HACKERS] GiST support for inet datatypes
Robert Haas robertmh...@gmail.com writes:
> On Mon, Feb 17, 2014 at 3:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> We should probably expend some thought on a general approach to replacing
>> the default opclass for a datatype, because I'm sure this will come up
>> again. Right now I don't see a feasible way.
>
> It seems odd for "default" to be a property of the opclass rather than a
> separate knob. What comes to mind is something like:
>
> ALTER TYPE sniffle SET DEFAULT OPERATOR CLASS FOR btree TO achoo_ops;
>
> Not sure how much that helps.

Not at all, AFAICS. If it were okay to decide that some formerly-default opclass is no longer default, then having such a command would be better than manually manipulating pg_opclass.opcdefault --- but extension upgrade scripts could certainly do the latter if they had to. The problem here is whether it's upgrade-safe to make such a change at all; having syntactic sugar doesn't make it safer.

regards, tom lane
Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?
Robert Haas robertmh...@gmail.com writes:
> On Wed, Feb 19, 2014 at 9:30 AM, Andres Freund and...@2ndquadrant.com wrote:
>> GET_8_BYTES only exists on 64-bit systems.
>
> Right, I got that far. So it looks like float8, int8, timestamp,
> timestamptz, and money all have behavior contingent on USE_FLOAT8_BYVAL,
> making that symbol a misnomer in every way. But since we've already marched
> pretty far down that path, I suppose we should keep marching.

You need somebody to help you with getting that working on 32-bit platforms? Because it needs to get fixed, or reverted, PDQ.

regards, tom lane
Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?
On Wed, Feb 19, 2014 at 10:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> You need somebody to help you with getting that working on 32-bit platforms?
> Because it needs to get fixed, or reverted, PDQ.

Hopefully the commit I just pushed will fix it. It now works on my machine with and without --disable-float8-byval.
Re: [HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.
--On 18. Februar 2014 22:23:59 +0200 Heikki Linnakangas hlinn...@iki.fi wrote:
> I considered it a new feature, so not back-patching was the default. If you
> want to back-patch it, I won't object.

That was my original feeling, too, but +1 for back-patching.

--
Thanks,
Bernd
Re: [HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.
On Tue, Feb 18, 2014 at 04:39:27PM -0300, Alvaro Herrera wrote:
> Heikki Linnakangas wrote:
>> Add a GUC to report whether data page checksums are enabled.
>
> Is there a reason this wasn't back-patched to 9.3? I think it should be.

+1 for back-patching.

Cheers,
David.

--
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] WAL Rate Limiting
On Wed, Feb 19, 2014 at 8:28 AM, Greg Stark st...@mit.edu wrote:
> [...]
> I would dearly love to see generic I/O bandwidth limits, so it would be nice
> to see a nicely general pattern here that could be extended, even if we only
> target WAL this release. I'm going to read the existing patch now; do you
> think it's ready to go, or did you want to do more work based on the
> feedback?

Well, *I* don't think this is ready to go. A WAL rate limit that only limits WAL sometimes still doesn't impress me.
Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?
On Wed, Feb 19, 2014 at 3:11 PM, Robert Haas robertmh...@gmail.com wrote:
> Hopefully the commit I just pushed will fix it. It now works on my machine
> with and without --disable-float8-byval.

It builds and passes here on 32 bits.

--
greg
Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?
On Wed, Feb 19, 2014 at 11:03 AM, Greg Stark st...@mit.edu wrote:
> It builds and passes here on 32 bits.

The buildfarm seems happy so far, too.
Re: [HACKERS] Description for pg_replslot in docs
On Mon, Feb 17, 2014 at 10:50 PM, Michael Paquier michael.paqu...@gmail.com wrote:
> On Tue, Feb 18, 2014 at 12:43 PM, Amit Kapila amit.kapil...@gmail.com wrote:
>> The description of the contents of PGDATA is given at the following page in
>> the docs: http://www.postgresql.org/docs/devel/static/storage-file-layout.html
>> Isn't it better to have a description of pg_replslot in the same place?
> Definitely. +1.

Yep, good catch. Fixed.
Re: [HACKERS] [bug fix] pg_ctl stop times out when it should respond quickly
On Mon, Feb 17, 2014 at 11:29 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> The pg_regress part is ugly. However, pg_regress is doing something unusual
> when starting postmaster itself, so the ugly coding to stop it seems to
> match. If we wanted to avoid the ugliness here, the right fix would be to
> use pg_ctl to start postmaster as well as to stop it.

I wonder if this would change the behavior in cases where we hit ^C during the regression tests. Right now I think that kills the postmaster as well as pg_regress, but if we used pg_ctl, it might not, because pg_regress uses fork()+exec(), but pg_ctl uses system() to launch a shell which is in turn instructed to background the postmaster.
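[Editorial note] The signal-delivery point above can be illustrated with a small POSIX sketch (invented for illustration; this is not pg_regress code): a child started directly with fork() inherits the parent's process group, so a terminal ^C, which sends SIGINT to the whole foreground process group, reaches it without any extra plumbing. A program launched through an intermediate shell that backgrounds it, as system()-based pg_ctl does, has no such guarantee of sharing the test driver's fate.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Minimal POSIX sketch: verify that a fork()ed child is in the parent's
 * process group, which is what makes a terminal ^C (SIGINT to the
 * foreground process group) reach it. */
int child_shares_process_group(void)
{
    pid_t parent_pgrp = getpgrp();
    pid_t pid = fork();

    if (pid == 0)
    {
        /* Child: exit 0 iff we are still in the parent's process group. */
        _exit(getpgrp() == parent_pgrp ? 0 : 1);
    }

    int status;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

An exec() after the fork would not change this: the process group survives exec, which is why the directly-forked postmaster dies with the test run.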
Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?
On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier michael.paqu...@gmail.com wrote:
> On Thu, Feb 6, 2014 at 3:48 AM, Peter Eisentraut pete...@gmx.net wrote:
>> On 2/5/14, 1:31 PM, Robert Haas wrote:
>>> On Tue, Feb 4, 2014 at 3:26 PM, Peter Eisentraut pete...@gmx.net wrote:
>>>> Perhaps this type should be called pglsn, since it's an
>>>> implementation-specific detail and not a universal concept like int,
>>>> point, or uuid.
>>> If we're going to do that, I suggest pg_lsn rather than pglsn. We already
>>> have pg_node_tree, so using underscores for separation would be more
>>> consistent.
>> Yes, that's a good precedent in multiple ways.
> Here are updated patches to use pg_lsn instead of pglsn...

OK, so I think this stuff is all committed now, with assorted changes. Thanks for your work on this.
Re: [HACKERS] Changeset Extraction v7.6.1
On Sat, Feb 15, 2014 at 6:59 PM, Andres Freund and...@2ndquadrant.com wrote:
> On 2014-02-15 17:29:04 -0500, Robert Haas wrote:
>> On Fri, Feb 14, 2014 at 4:55 AM, Andres Freund and...@2ndquadrant.com wrote:
>>> [ new patches ]
>> 0001 already needs minor
> Hm? If there are conflicts, I'll push/send a rebased version tomorrow or
> Monday.

As you guessed, the missing word was "rebasing". It's a trivial conflict, though, so please don't feel the need to repost just for that.

>> +     * the transaction's invalidations. This currently won't be needed if
>> +     * we're just skipping over the transaction, since that currently only
>> +     * happens when starting decoding, after we invalidated all caches, but
>> +     * transactions in other databases might have touched shared relations.
>> I'm having trouble understanding what this means, especially the part after
>> the "but".
> Let me rephrase, maybe that will help: this currently won't be needed if
> we're just skipping over the transaction, because currently we only do so
> during startup, to get to the first transaction the client needs. As we have
> reset the catalog caches before starting to read WAL, and we haven't yet
> touched any catalogs, there can't be anything to invalidate. But if we're
> forgetting this commit because it happened in another database, the
> invalidations might be important, because they could be for shared catalogs
> and we might have loaded data into the relevant syscaches.
> Better?

Much! Please include that text, or something like it.

>> + if (IsTransactionState() &&
>> +     GetTopTransactionIdIfAny() != InvalidTransactionId)
>> +     ereport(ERROR,
>> +             (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
>> +              errmsg("cannot perform changeset extraction in transaction that has performed writes")));
>> This is sort of an awkward restriction, as it makes it hard to compose this
>> feature with others. What underlies the restriction, can we get rid of it,
>> and if not, can we include a comment here explaining why it exists?
> Well, you can write the result into a table if you're halfway careful... :)
> I think it should be fairly easy to relax the restriction to creating a
> slot, but not getting data from it. Do you think that would be sufficient?

That would be a big improvement, for sure, and might be entirely sufficient.

>> + /* register output plugin name with slot */
>> + strncpy(NameStr(slot->data.plugin), plugin,
>> +         NAMEDATALEN);
>> + NameStr(slot->data.plugin)[NAMEDATALEN - 1] = '\0';
>> If it's safe to do this without a lock, I don't know why.
> Well, the worst that could happen is that somebody else doing a SELECT *
> FROM pg_replication_slots gets an incomplete plugin name... But we certainly
> can hold the spinlock during it if you think that's better.

Isn't the worst thing that can happen that they copy garbage out of the buffer, because the new name is longer than the old one and only partially written?

>> More broadly, I wonder why this is_init stuff is in here at all. Maybe the
>> stuff that happens in the is_init case should be done in the caller, or
>> another helper function.
> The reason I moved it in there is that after the walsender patch there are
> two callers, and the stuff is sufficiently complex that having it twice led
> to bugs. The reason it's currently the same function is that sufficiently
> much of the code would have to be shared that I found it ugly to split. I'll
> have a look whether I can figure something out.

I was thinking that the is_init portion could perhaps be done first, in a *previous* function call, and adjusted in such a way that the non-is_init path can be used for both cases here.

I don't think this is a very good idea. The problem with doing things during error recovery that can themselves fail is that you'll lose the original error, which is not cool, and may even blow out the error stack. Many people confuse PG_TRY()/PG_CATCH() with an exception-handling system, but it's not one. One way to fix this is to put some of the initialization logic in ReplicationSlotCreate() just prior to calling CreateSlotOnDisk(). If the work that needs to be done is too complex or protracted to be done there, then I think it should be pulled out of the act of creating the replication slot and made to happen as part of first use, or as a separate operation like PrepareForLogicalDecoding.

> I think what should be done here is adding a drop_on_release flag. As soon
> as everything important is done, it gets unset.

That might be more elegant, but I don't think it really fixes anything, because backing stuff out from on disk can fail. AIUI, your whole concern here is that you don't want the slot creation to fail halfway through and leave behind the slot, but what you've got here doesn't prevent that; it just makes it less likely. The more I think about it,
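[Editorial note] A side note on the strncpy() call quoted in the exchange above: the explicit NUL store after it is load-bearing, because strncpy() does not terminate the destination when the source is at least as long as the size limit. A small standalone sketch, with NAMEDATALEN hard-coded to its usual default purely for illustration:

```c
#include <string.h>

#define NAMEDATALEN 64   /* the usual PostgreSQL default */

/* strncpy() leaves the buffer unterminated when strlen(src) >= n, which
 * is why the quoted slot code follows it with an explicit NUL store. */
void set_plugin_name(char *buf, const char *plugin)
{
    strncpy(buf, plugin, NAMEDATALEN);
    buf[NAMEDATALEN - 1] = '\0';
}
```

Without the final store, a 64-byte-or-longer plugin name would leave the Name field unterminated, and any later reader would run off the end of it.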
Re: [HACKERS] Changeset Extraction v7.6.1
On Sun, Feb 16, 2014 at 12:12 PM, Andres Freund and...@2ndquadrant.com wrote:
> On 2014-02-15 17:29:04 -0500, Robert Haas wrote:
>> On Fri, Feb 14, 2014 at 4:55 AM, Andres Freund and...@2ndquadrant.com wrote:
>>> [ new patches ]
>>> + * copied stuff from tuptoaster.c. Perhaps there should be toast_internal.h?
>> Yes, please. If you can submit a separate patch creating this file and
>> relocating this stuff there, I will commit it.
> I started to work on that, but I am not sure we actually need it anymore.
> tuptoaster.h isn't included in that many places, so perhaps we should just
> add it there?

That seems fine to me.
Re: Fwd: [HACKERS] patch: make_timestamp function
Pavel Stehule wrote: 7) Why do the functions accept only the timezone abbreviation, not the full name? I find it rather confusing, because the 'timezone' option uses the full name, and we're using this as the default. But doing 'show timezone' and using the returned value fails. Is it possible to fix this somehow? Only an abbreviation is allowed for the timetz type. Timestamp can work with full time zone names. The rules (behavior) should be the same as for the input functions of the types timestamptz and timetz. postgres=# select '10:10:10 CET'::timetz; timetz ─ 10:10:10+01 (1 row) postgres=# select '10:10:10 Europe/Prague'::timetz; ERROR: invalid input syntax for type time with time zone: "10:10:10 Europe/Prague" LINE 1: select '10:10:10 Europe/Prague'::timetz; ^ This limit is due to the limits of the routines used. I think this is a strange limitation, and perhaps it should be fixed rather than inflicting the limitation on the new function. I tweaked your patch a bit, attached; other than defining what to do about full TZ names in timetz, this seems ready to commit.
-- Álvaro Herrera                http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index be548d7..69eb45f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6725,6 +6725,32 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
       <row>
        <entry>
         <indexterm>
+         <primary>make_interval</primary>
+        </indexterm>
+        <literal>
+         <function>
+          make_interval(<parameter>years</parameter> <type>int</type> DEFAULT 0,
+          <parameter>months</parameter> <type>int</type> DEFAULT 0,
+          <parameter>weeks</parameter> <type>int</type> DEFAULT 0,
+          <parameter>days</parameter> <type>int</type> DEFAULT 0,
+          <parameter>hours</parameter> <type>int</type> DEFAULT 0,
+          <parameter>mins</parameter> <type>int</type> DEFAULT 0,
+          <parameter>secs</parameter> <type>double precision</type> DEFAULT 0.0)
+         </function>
+        </literal>
+       </entry>
+       <entry><type>interval</type></entry>
+       <entry>
+        Create interval from years, months, weeks, days, hours, minutes and
+        seconds fields
+       </entry>
+       <entry><literal>make_interval(days := 10)</literal></entry>
+       <entry><literal>10 days</literal></entry>
+      </row>
+
+      <row>
+       <entry>
+        <indexterm>
          <primary>make_time</primary>
         </indexterm>
         <literal>
@@ -6746,6 +6772,81 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
       <row>
        <entry>
         <indexterm>
+         <primary>make_timetz</primary>
+        </indexterm>
+        <literal>
+         <function>
+          make_timetz(<parameter>hour</parameter> <type>int</type>,
+          <parameter>min</parameter> <type>int</type>,
+          <parameter>sec</parameter> <type>double precision</type>,
+          <optional> <parameter>timezone</parameter> <type>text</type> </optional>)
+         </function>
+        </literal>
+       </entry>
+       <entry><type>time with time zone</type></entry>
+       <entry>
+        Create time with time zone from hour, minute and seconds fields. When
+        <parameter>timezone</parameter> is not specified, then current time zone
+        is used.
+       </entry>
+       <entry><literal>make_timetz(8, 15, 23.5)</literal></entry>
+       <entry><literal>08:15:23.5+01</literal></entry>
+      </row>
+
+      <row>
+       <entry>
+        <indexterm>
+         <primary>make_timestamp</primary>
+        </indexterm>
+        <literal>
+         <function>
+          make_timestamp(<parameter>year</parameter> <type>int</type>,
+          <parameter>month</parameter> <type>int</type>,
+          <parameter>day</parameter> <type>int</type>,
+          <parameter>hour</parameter> <type>int</type>,
+          <parameter>min</parameter> <type>int</type>,
+          <parameter>sec</parameter> <type>double precision</type>)
+         </function>
+        </literal>
+       </entry>
+       <entry><type>timestamp</type></entry>
+       <entry>
+        Create timestamp from year, month, day, hour, minute and seconds fields
+       </entry>
+       <entry><literal>make_timestamp(2013, 7, 15, 8, 15, 23.5)</literal></entry>
+       <entry><literal>2013-07-15 08:15:23.5</literal></entry>
+      </row>
+
+      <row>
+       <entry>
+        <indexterm>
+         <primary>make_timestamptz</primary>
+        </indexterm>
+        <literal>
+         <function>
+          make_timestamptz(<parameter>year</parameter> <type>int</type>,
+          <parameter>month</parameter> <type>int</type>,
+          <parameter>day</parameter> <type>int</type>,
+          <parameter>hour</parameter> <type>int</type>,
+          <parameter>min</parameter> <type>int</type>,
+          <parameter>sec</parameter> <type>double precision</type>,
+          <optional> <parameter>timezone</parameter> <type>text</type> </optional>)
+         </function>
+        </literal>
+       </entry>
+       <entry><type>timestamp with time zone</type></entry>
+       <entry>
+        Create timestamp with time zone from year,
Re: [HACKERS] Changeset Extraction v7.6.1
On Tue, Feb 18, 2014 at 4:07 AM, Andres Freund and...@2ndquadrant.com wrote: 2. I think the snapshot-export code is fundamentally misdesigned. As I said before, the idea that we're going to export one single snapshot at one particular point in time strikes me as extremely short-sighted. I don't think so. It's precisely what you need to implement a simple replication solution. Yes, there are use cases that could benefit from more possibilities, but that's always the case. For example, consider one-to-many replication where clients may join or depart the replication group at any time. Whenever somebody joins, we just want a (snapshot, LSN) pair such that they can apply all changes after the LSN except for XIDs that would have been visible to the snapshot. And? They need to create individual replication slots, each of which will get a snapshot. So we have to wait for startup N times, and transmit the change stream N times, instead of once? Blech. And in fact, we don't even need any special machinery for that; the client can just make a connection and *take a snapshot* once decoding is initialized enough. No, they can't. Two reasons: for one, the commit order between snapshots and WAL isn't necessarily the same. So what? For another, clients would then need logic to detect whether a transaction's contents have already been applied or not, and that's nontrivial. My point is, I think we should be trying to *make* that trivial, rather than doing this. 3. As this feature is proposed, the only plugin we'll ship with 9.4 is a test_decoding plugin which, as its own documentation says, doesn't do anything especially useful. What exactly do we gain by forcing users who want to make use of these new capabilities to write C code? It gains us an output plugin in which we can easily demonstrate features so they can be tested in the regression tests, which I find rather important. Just like e.g. the test_shm_mq stuff doesn't do anything really useful.
It definitely doesn't, but this patch is a lot closer to being done than parallel query is, so I'm not sure it's a fair comparison. The test_decoding plugin doesn't seem tremendously much simpler than something that someone could actually use, so why not make that the goal? For one, its being a designated toy plugin allows us to easily change it, to showcase/test new features. For another, I still don't agree that it's easy to agree on an output format. I think we should include some that have matured in 9.5. I regret that more effort has not been made in that area. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes up for review
On Sun, Feb 16, 2014 at 10:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: Any comments before I start transposing them into the back branches? Sorry I'm late. Shore up GRANT ... WITH ADMIN OPTION restrictions (Noah Misch) I'm not familiar with the phrase "shore up"; I think the note should use more precise language: are the privilege checks getting more strict or less strict? Wow, there are quite a lot of items this time. Have you considered grouping the items by their impact, for example security/data corruption/crash/correctness/other? I think that would make it easier for readers to find items they're interested in. Most changes seem pretty straightforward to categorize; there are always outliers, but even if a few items are miscategorized, that's an improvement over what we have now. Of course someone has to be willing to do that work. If this warrants more discussion, I can draft a proposal in a new topic. Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changeset Extraction v7.6.1
On Tue, Feb 18, 2014 at 4:33 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-02-17 21:35:23 -0500, Robert Haas wrote: What I don't understand is why we're not taking the test_decoding module, polishing it up a little to produce some nice, easily machine-parseable output, calling it basic_decoding, and shipping that. Then people who want something else can build it, but people who are happy with something basic will already have it. Because every project is going to need their own plugin *anyway*. Londiste and Slony are surely going to ignore changes to relations they don't need, querying their own metadata. They will want compatibility with the earlier formats as far as possible. Sometime not too far away they will want to optionally support binary output because it's so much faster. There's just not much chance that either of these will be able to agree on a format short term. Ah, so part of what you're expecting the output plugin to do is filtering. I can certainly see where there might be considerable variation between solutions in that area - but I think that's separate from the question of formatting per se. Although I think we should have an in-core output plugin with filtering capabilities eventually, I'm happy to define that as out of scope for 9.4. But isn't there a way that we can ship something that will do for people who want to just see the database's entire change stream float by? TBH, as compared to what you've got now, I think this mostly boils down to a question of quoting and escaping. I'm not really concerned with whether we ship something that's perfectly efficient, or that has filtering capabilities, or that has a lot of fancy bells and whistles. What I *am* concerned about is that if the user updates a text field that contains characters like " or ' or : or [ or ] or , that somebody might be using as delimiters in the output format, a program can still parse that output format and reliably determine what the actual change was.
I don't care all that much whether we use JSON or CSV or something custom, but the data that gets spit out should not have SQL-injection-like vulnerabilities. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: Partial sort
On Wed, Feb 12, 2014 at 11:54 PM, Marti Raudsepp ma...@juffo.org wrote: With partial-sort-basic-1 and this fix on the same test suite, the planner overhead is now a more manageable 0.5% to 1.3%; one test is faster by 0.5%. Ping, Robert or anyone, does this overhead seem bearable or is it still too much? Do these numbers look conclusive enough or should I run more tests? I think the 1st patch now has a bug in initial_cost_mergejoin; you still pass the presorted_keys argument to cost_sort, making it calculate a partial sort cost. Ping, Alexander? Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Fwd: [HACKERS] patch: make_timestamp function
2014-02-19 19:01 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule wrote: 7) Why do the functions accept only the timezone abbreviation, not the full name? I find it rather confusing, because the 'timezone' option uses the full name, and we're using this as the default. But doing 'show timezone' and using the returned value fails. Is it possible to fix this somehow? Only an abbreviation is allowed for the timetz type. Timestamp can work with full time zone names. The rules (behavior) should be the same as for the input functions of the types timestamptz and timetz. postgres=# select '10:10:10 CET'::timetz; timetz ─ 10:10:10+01 (1 row) postgres=# select '10:10:10 Europe/Prague'::timetz; ERROR: invalid input syntax for type time with time zone: "10:10:10 Europe/Prague" LINE 1: select '10:10:10 Europe/Prague'::timetz; ^ This limit is due to the limits of the routines used. I think this is a strange limitation, and perhaps it should be fixed rather than inflicting the limitation on the new function. I tweaked your patch a bit, attached; other than defining what to do about full TZ names in timetz, this seems ready to commit. I have no objection - thank you. Pavel -- Álvaro Herrera                http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Changeset Extraction v7.6.1
On 18-02-2014 06:33, Andres Freund wrote: I really hope there will be nicer ones by the time 9.4 is released. Euler did send in a json plugin http://archives.postgresql.org/message-id/52A5BFAE.1040209%2540timbira.com.br , but there hasn't too much feedback yet. It's hard to start discussing something that needs a couple of patches to pg before you can develop your own patch... BTW, I've updated that code to reflect the recent changes in the API and publish it in [1]. This version is based on the Andres' branch xlog-decoding-rebasing-remapping. I'll continue to polish this code. Regards, [1] https://github.com/eulerto/wal2json -- Euler Taveira Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Fwd: [HACKERS] patch: make_timestamp function
2014-02-19 19:01 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule wrote: 7) Why do the functions accept only the timezone abbreviation, not the full name? I find it rather confusing, because the 'timezone' option uses the full name, and we're using this as the default. But doing 'show timezone' and using the returned value fails. Is it possible to fix this somehow? Only an abbreviation is allowed for the timetz type. Timestamp can work with full time zone names. The rules (behavior) should be the same as for the input functions of the types timestamptz and timetz. postgres=# select '10:10:10 CET'::timetz; timetz ─ 10:10:10+01 (1 row) postgres=# select '10:10:10 Europe/Prague'::timetz; ERROR: invalid input syntax for type time with time zone: "10:10:10 Europe/Prague" LINE 1: select '10:10:10 Europe/Prague'::timetz; ^ This limit is due to the limits of the routines used. I think this is a strange limitation, and perhaps it should be fixed rather than inflicting the limitation on the new function. I thought about it, and I now think that a timezone in the format 'Europe/Prague' together with a time is ambiguous. We can do it, but we have to expect that the calculation will be relative to the current date - and I am not sure that is correct, because someone can write something like make_date(x,x,x) + make_timetz(..) - and the result will be damaged. I tweaked your patch a bit, attached; other than defining what to do about full TZ names in timetz, this seems ready to commit. -- Álvaro Herrera                http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: Fwd: [HACKERS] patch: make_timestamp function
Pavel Stehule wrote: I thought about it, and I now think that a timezone in the format 'Europe/Prague' together with a time is ambiguous. We can do it, but we have to expect that the calculation will be relative to the current date - and I am not sure that is correct, because someone can write something like make_date(x,x,x) + make_timetz(..) - and the result will be damaged. Hmm, I see your point --- the make_timetz() call would use today's timezone displacement, which might be different from the one used in the make_date() result. That would result in a botched timestamptz sometimes, but it might escape testing because it's subtle and depends on the input data. However, your proposal is to use an abbreviated timezone, thereby forcing the user to select the correct timezone, i.e. the one that matches the make_date() arguments. I'm not sure this is much of an improvement, because then the user is faced with the difficult problem of figuring out the correct abbreviation in the first place. I think there is little we can do to solve the problem at this level; it seems to me that the right solution here is to instruct users to use make_date() only in conjunction with make_time(), that is, produce a timezone-less timestamp, and then apply an AT TIME ZONE operator to the result. That could take a full timezone name, and that would always work correctly. My conclusion here is that the time with time zone datatype is broken in itself, because of this kind of ambiguity. Maybe we should just avoid offering more functionality on top of it, that is, get rid of make_timetz() in this patch? -- Álvaro Herrera                http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Fwd: [HACKERS] patch: make_timestamp function
On 19. 2. 2014 21:20, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Pavel Stehule escribió: I though about it, and now I am thinking so timezone in format 'Europe/Prague' is together with time ambiguous We can do it, but we have to expect so calculation will be related to current date - and I am not sure if it is correct, because someone can write some like make_date(x,x,x) + make_timetz(..) - and result will be damaged. Hmm, I see your point --- the make_timetz() call would use today's timezone displacement, which might be different from the one used in the make_date() result. That would result in a botched timestamptz sometimes, but it might escape testing because it's subtle and depends on the input data. However, your proposal is to use an abbreviation timezone, thereby forcing the user to select the correct timezone i.e. the one that matches the make_date() arguments. I'm not sure this is much of an improvement, because then the user is faced with the difficult problem of figuring out the correct abbreviation in the first place. I think there is little we can do to solve the problem at this level; it seems to me that the right solution here is to instruct users to use make_date() only in conjunction with make_time(), that is, produce a timezone-less timestamp; and then apply a AT TIME ZONE operator to the result. That could take a full timezone name, and that would always work correctly. My conclusion here is that the time with time zone datatype is broken in itself, because of this kind of ambiguity. Maybe we should just avoid offering more functionality on top of it, that is get rid of make_timetz() in this patch? +1 Pavel -- Álvaro Herrera                http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: Fwd: [HACKERS] patch: make_timestamp function
Alvaro Herrera alvhe...@2ndquadrant.com writes: My conclusion here is that the time with time zone datatype is broken in itself, because of this kind of ambiguity. That's the conclusion that's been arrived at by pretty much everybody who's looked at it with any care. Maybe we should just avoid offering more functionality on top of it, that is get rid of make_timetz() in this patch? +1. We don't need to encourage people to use that type. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_standby: Question about truncation of trigger file in fast failover
I was trying to understand (and then perhaps mimic) how pg_standby does a fast failover. My current understanding is that when a secondary db is in standby mode, it will exhaust all the archived log to be replayed from the primary and then start streaming. It is at this point that xlog.c checks for the existence of a trigger file to promote the secondary. This has been a cause of some irritation for some of our customers, who do not really care about catching up all the way. I want to achieve the exact semantics of pg_standby's fast failover option. I manipulated the restore command to return 'failure' when the word fast is present in the trigger file (see below), hoping that when I want a secondary database to come out fast, I can just echo the word fast into the trigger file, thereby simulating pg_standby's fast failover behavior. However, that did not work. Technically, I did not truncate the trigger file the way pg_standby does. New restore_command = ! fgrep -qsi fast trigger_file Old restore_command And that is where I have a question. I noticed that in pg_standby.c, when we detect the word fast in the trigger file, we truncate the file. https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L456 There is also a comment above it about not upsetting the server. https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L454 What is the purpose of truncating the file? To do a smart failover once you come out of standby? But when I look at xlog.c, when we come out of standby due to a failure returned by restore_command, we call CheckForStandbyTrigger() here: https://github.com/postgres/postgres/blob/REL9_1_11/src/backend/access/transam/xlog.c#L10441 Now, CheckForStandbyTrigger() unlinks the trigger file. I noticed through the debugger that the unlinking happens before xlog.c makes a call to the next restore_command.
So, what is the reason for truncating the fast word from the trigger file if the file is going to be deleted soon after it is discovered? How will we upset the server if we don't? Assuming this question is answered and I get a better understanding, I have a follow up question. If truncation is indeed necessary, can I simulate the truncation by manipulating restore_command and achieve the same effect as a fast failover in pg_standby? Thanks in advance for the help. Neil
Re: [HACKERS] GiST support for inet datatypes
2014-02-19 16:52, Tom Lane t...@sss.pgh.pa.us: Not at all, AFAICS. If it were okay to decide that some formerly-default opclass is no longer default, then having such a command would be better than manually manipulating pg_opclass.opcdefault --- but extension upgrade scripts could certainly do the latter if they had to. The problem here is whether it's upgrade-safe to make such a change at all; having syntactic sugar doesn't make it safer. It is not hard to support the != operator on the new operator class. That would make it functionally compatible with the ones in btree_gist. That way, changing the default would not break pg_dump dumps; it would only upgrade the index to the new one. pg_upgrade dumps the current objects in the extension. It fails to restore them if the new operator class exists as the default. I do not really understand how pg_upgrade handles extension upgrades. I do not have a solution to this. It would be sad if we could not make the new operator class the default because of the btree_gist implementation, which is not useful for inet data types. You wrote about it on 2010-10-11 [1]: Well, actually the btree_gist implementation for inet is a completely broken piece of junk: it thinks that convert_network_to_scalar is 100% trustworthy and can be used as a substitute for the real comparison functions, which isn't even approximately true. I'm not sure why Teodor implemented it like that instead of using the type's real comparison functions, but it's pretty much useless if you want the same sort order that the type itself defines. [1] http://www.postgresql.org/message-id/8973.1286841...@sss.pgh.pa.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_standby: Question about truncation of trigger file in fast failover
On 02/19/2014 11:15 PM, Neil Thombre wrote: And that is where I have a question. I noticed that in pg_standby.c when we detect the word fast in the trigger file we truncate the file. https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L456 There is also a comment above it about not upsetting the server. https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L454 What is the purpose of truncating the file? To do a smart failover once you come out of standby? But, when I look at xlog.c, when we come out of standby due to a failure returned by restore_command, we call CheckForStandbyTrigger() here: https://github.com/postgres/postgres/blob/REL9_1_11/src/backend/access/transam/xlog.c#L10441 Now, CheckForStandbyTrigger() unlinks the trigger file. I noticed through the debugger that the unlinking happens before xlog.c makes a call to the next restore_command. So, what is the reason for truncating the fast word from the trigger file if the file is going to be deleted soon after it is discovered? How will we upset the server if we don't? At end-of-recovery, the server will fetch again the last WAL file that was replayed. If it can no longer find it, because restore_command now returns an error even though it succeeded for the same file a few seconds earlier, it will throw an error and refuse to start up. That's the way it used to be until 9.2, anyway. In 9.2, the behavior was changed so that the server keeps all the files restored from the archive in pg_xlog, so that it can access them again. I haven't tried, but it's possible that the truncation is no longer necessary. Try it, with 9.1 and 9.3, and see what happens. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GiST support for inet datatypes
Emre Hasegeli e...@hasegeli.com writes: [ cites bug #5705 ] Hm. I had forgotten how thoroughly broken btree_gist's inet and cidr opclasses are. There was discussion at the time of just ripping them out despite the compatibility hit. We didn't do it, but if we had then life would be simpler now. Perhaps it would be acceptable to drop the btree_gist implementation and teach pg_upgrade to refuse to upgrade if the old database contains any such indexes. I'm not sure that solves the problem, though, because I think pg_upgrade will still fail if the opclass is in the old database even though unused (because you'll get the complaint about multiple default opclasses anyway). The only simple way to avoid that is to not have btree_gist loaded at all in the old DB, and I doubt that's an acceptable upgrade restriction. It would require dropping indexes of the other types supported by btree_gist, which would be a pretty hard sell since those aren't broken. Really what we'd need here is for btree_gist to be upgraded to a version that doesn't define gist_inet_ops (or at least doesn't mark it as default) *before* pg_upgrading to a server version containing the proposed new implementation. Not sure how workable that is. Could we get away with having pg_upgrade unset the default flag in the old database? (Ick, but there are no very good alternatives here ...) BTW, I'd not been paying any attention to this thread, but now that I scan through it, I think the proposal to change the longstanding names of the inet operators has zero chance of being accepted. Consistency with the names chosen for range operators is a mighty weak argument compared to backwards compatibility. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?
On 2014-02-19 12:47:40 -0500, Robert Haas wrote: On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier michael.paqu...@gmail.com wrote: Yes, that's a good precedent in multiple ways. Here are updated patches to use pg_lsn instead of pglsn... OK, so I think this stuff is all committed now, with assorted changes. Thanks for your work on this. Cool, thanks, you two. I wonder if pg_stat_replication shouldn't be updated to use it as well? SELECT * FROM pg_attribute WHERE attname ~ '(location|lsn)'; only shows those as possible candidates for conversion. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Priority table or Cache table
Hi, I want to propose a new feature called priority table or cache table. This is the same as a regular table, except that the pages of such tables have higher priority than those of normal tables. These tables are very useful where faster query processing on some particular tables is expected. The same faster query processing can be achieved by placing the tables in a tablespace on a RAM disk, but in that case there is a problem of data loss on system shutdown. To avoid this, continuous backup of that tablespace and the WAL files is required. The priority table feature will solve these problems by providing similar functionality. The user needs to decide carefully which tables require faster access and declare those as priority tables; these tables should also be small, in both number of columns and size. New syntax: create [priority] Table ...; or Create Table .. [ buffer_pool = priority | default ]; i.e. adding a new storage parameter buffer_pool to specify the type of buffer pool this table can use. The same can be extended to indexes also. Solution 1: This solution may not be a proper one, but it is simple. While placing pages of these tables into the buffer pool, the usage count is set to double the maximum buffer usage count, instead of 1 as for normal tables. Because of this, there is less chance of these pages being moved out of the buffer pool. The queries which operate on these tables will be faster because of less I/O. If the tables are not used for a long time, then only the first query on the table will be slower and the rest of the queries will be faster. Just for a test, a new bool member can be added to the RELFILENODE structure to indicate whether the table type is priority or not. Using this, the usage count can be modified while loading the page.
The pg_buffercache output for a priority table:

postgres=# select * from pg_buffercache where relfilenode=16385;
 bufferid | relfilenode | reltablespace | reldatabase | relforknumber | relblocknumber | isdirty | usagecount
----------+-------------+---------------+-------------+---------------+----------------+---------+------------
      270 |       16385 |          1663 |       12831 |             0 |              0 | t       |         10

Solution 2: Keep an extra flag in the buffer descriptor recording whether the buffer is used for a priority table or not. Using this flag, replacing a buffer used for a priority table requires some extra steps, such as: 1. Only another page of a priority table can replace this priority page. 2. Only after at least two complete cycles of the clock sweep can a normal table page replace it. In this case the priority buffers stay in memory for a long time, similar to solution 1, but it is not always guaranteed. Solution 3: Create another buffer pool, called the priority buffer pool, alongside the shared buffer pool, to hold the priority table pages. A new GUC parameter called priority_buffers can be added to get the priority buffer pool size from the user. The maximum limit of these buffers can be kept at a small value so that the pool is used properly. As extra care, whenever any page needs to move out of the priority buffer pool, a warning is issued, so the user can check whether the configured priority_buffers size is too small or the priority tables have grown more than expected. In this case all the pages are always loaded into memory, and thus queries get faster processing. IBM DB2 has the facility of creating additional buffer pools and fixing specific tables and indexes into them. Oracle also has a facility to specify the buffer pool option as keep or recycle. I prefer syntax 2 and solution 3. Please provide your suggestions/improvements. Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] narwhal and PGDLLIMPORT
Hiroshi Inoue in...@tpf.co.jp writes: (2014/02/12 3:03), Tom Lane wrote: Also, the only remaining usage of dllwrap is in src/bin/pgevent/Makefile. Do we need that either? Sorry for the late reply. Attached is a patch to remove dllwrap from pgevent/Makefile. Pushed, thanks. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Priority table or Cache table
Haribabu Kommi kommi.harib...@gmail.com writes: I want to propose a new feature called priority table or cache table. This is same as regular table except the pages of these tables are having high priority than normal tables. These tables are very useful, where a faster query processing on some particular tables is expected. Why exactly does the existing LRU behavior of shared buffers not do what you need? I am really dubious that letting DBAs manage buffers is going to be an improvement over automatic management. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?
On Thu, Feb 20, 2014 at 2:47 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Feb 6, 2014 at 3:48 AM, Peter Eisentraut pete...@gmx.net wrote: On 2/5/14, 1:31 PM, Robert Haas wrote: On Tue, Feb 4, 2014 at 3:26 PM, Peter Eisentraut pete...@gmx.net wrote: Perhaps this type should be called pglsn, since it's an implementation-specific detail and not a universal concept like int, point, or uuid. If we're going to do that, I suggest pg_lsn rather than pglsn. We already have pg_node_tree, so using underscores for separation would be more consistent. Yes, that's a good precedent in multiple ways. Here are updated patches to use pg_lsn instead of pglsn... OK, so I think this stuff is all committed now, with assorted changes. Thanks for your work on this. Thanks! Oops, it looks like I am coming after the battle (time difference does not help). I'll be more careful to test such patches on 32b platforms as well in the future. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.
On Thu, Feb 20, 2014 at 1:01 AM, David Fetter da...@fetter.org wrote: On Tue, Feb 18, 2014 at 04:39:27PM -0300, Alvaro Herrera wrote: Heikki Linnakangas wrote: Add a GUC to report whether data page checksums are enabled. Is there are reason this wasn't back-patched to 9.3? I think it should be. +1 for back-patching. Back-patching would be interesting for existing applications, but -1 as it is a new feature :) -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Priority table or Cache table
On Thu, Feb 20, 2014 at 11:38 AM, Tom Lane t...@sss.pgh.pa.us wrote: Haribabu Kommi kommi.harib...@gmail.com writes: I want to propose a new feature called priority table or cache table. This is same as regular table except the pages of these tables are having high priority than normal tables. These tables are very useful, where a faster query processing on some particular tables is expected. Why exactly does the existing LRU behavior of shared buffers not do what you need?

Let's assume a database with 3 tables, all accessed regularly. The user expects faster query results on one of those tables. Because of the LRU behavior, that is sometimes not what happens. So if we just separate that table's pages into another buffer pool, then all of its pages reside in memory and queries on it get faster processing.

Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?
On Thu, Feb 20, 2014 at 9:43 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Feb 20, 2014 at 2:47 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Feb 6, 2014 at 3:48 AM, Peter Eisentraut pete...@gmx.net wrote: On 2/5/14, 1:31 PM, Robert Haas wrote: On Tue, Feb 4, 2014 at 3:26 PM, Peter Eisentraut pete...@gmx.net wrote: Perhaps this type should be called pglsn, since it's an implementation-specific detail and not a universal concept like int, point, or uuid. If we're going to do that, I suggest pg_lsn rather than pglsn. We already have pg_node_tree, so using underscores for separation would be more consistent. Yes, that's a good precedent in multiple ways. Here are updated patches to use pg_lsn instead of pglsn... OK, so I think this stuff is all committed now, with assorted changes. Thanks for your work on this. Thanks! Oops, it looks like I am coming after the battle (time difference does not help). I'll be more careful to test such patches on 32-bit platforms as well in the future.

After re-reading the code, I found two incorrect comments in the new regression tests. Patch fixing them is attached.
Thanks,
-- Michael

diff --git a/src/backend/utils/adt/pg_lsn.c b/src/backend/utils/adt/pg_lsn.c
index e2b528a..fae12e1 100644
--- a/src/backend/utils/adt/pg_lsn.c
+++ b/src/backend/utils/adt/pg_lsn.c
@@ -19,7 +19,7 @@
 #include "utils/pg_lsn.h"

 #define MAXPG_LSNLEN			17
-#define MAXPG_LSNCOMPONENT	8
+#define MAXPG_LSNCOMPONENT		8

 /*--
  * Formatting and conversion routines.
diff --git a/src/test/regress/expected/pg_lsn.out b/src/test/regress/expected/pg_lsn.out
index 01d2983..504768c 100644
--- a/src/test/regress/expected/pg_lsn.out
+++ b/src/test/regress/expected/pg_lsn.out
@@ -52,13 +52,13 @@ SELECT '0/16AE7F8' > pg_lsn '0/16AE7F7';
 t
 (1 row)
 
-SELECT '0/16AE7F7'::pg_lsn - '0/16AE7F8'::pg_lsn; -- No negative results
+SELECT '0/16AE7F7'::pg_lsn - '0/16AE7F8'::pg_lsn;
  ?column?
 ----------
        -1
 (1 row)
 
-SELECT '0/16AE7F8'::pg_lsn - '0/16AE7F7'::pg_lsn; -- correct
+SELECT '0/16AE7F8'::pg_lsn - '0/16AE7F7'::pg_lsn;
  ?column?
 ----------
         1
 (1 row)
diff --git a/src/test/regress/sql/pg_lsn.sql b/src/test/regress/sql/pg_lsn.sql
index dddafb3..1634d37 100644
--- a/src/test/regress/sql/pg_lsn.sql
+++ b/src/test/regress/sql/pg_lsn.sql
@@ -21,5 +21,5 @@ SELECT '0/16AE7F8' = '0/16AE7F8'::pg_lsn;
 SELECT '0/16AE7F8'::pg_lsn != '0/16AE7F7';
 SELECT '0/16AE7F7' < '0/16AE7F8'::pg_lsn;
 SELECT '0/16AE7F8' > pg_lsn '0/16AE7F7';
-SELECT '0/16AE7F7'::pg_lsn - '0/16AE7F8'::pg_lsn; -- No negative results
-SELECT '0/16AE7F8'::pg_lsn - '0/16AE7F7'::pg_lsn; -- correct
+SELECT '0/16AE7F7'::pg_lsn - '0/16AE7F8'::pg_lsn;
+SELECT '0/16AE7F8'::pg_lsn - '0/16AE7F7'::pg_lsn;

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding
I wrote: The minimum-refactoring solution to this would be to tweak pg_do_encoding_conversion() so that if the src_encoding is SQL_ASCII but the dest_encoding isn't, it does pg_verify_mbstr() rather than nothing. I'm not sure if this would break anything we need to have work, though. Thoughts? Do we want to back-patch such a change? I looked through all the callers of pg_do_encoding_conversion(), and AFAICS this change is a good idea. There are a whole bunch of places that use pg_do_encoding_conversion() to convert from the database encoding to encoding X (most usually UTF8), and right now if you do that in a SQL_ASCII database you have no assurance whatever that what is produced is actually valid in encoding X. I think we need to close that loophole. The more I looked into mbutils.c, the less happy I got. The attached proposed patch takes care of the missing-verification hole in pg_do_encoding_conversion() and pg_server_to_any(), and also gets rid of what I believe to be obsolete provisions in pg_do_encoding_conversion to work if called outside a transaction --- if you consider it working to completely fail to honor its API contract. That should no longer be necessary now that we perform client-server encoding conversions via perform_default_encoding_conversion rather than here. For testing I inserted an Assert(IsTransactionState()); at the top of pg_do_encoding_conversion(), and couldn't trigger it, so I'm fairly sure this change is OK. Still, back-patching it might not be a good thing. On the other hand, if there is still any code path that can get here outside a transaction, we probably ought to find out rather than allow completely bogus data to be returned. I also made some more-cosmetic improvements, notably removing a bunch of Asserts that are certainly dead code because the relevant variables are never NULL. I've not done anything yet about simplifying unnecessary calls of pg_do_encoding_conversion into pg_server_to_any/pg_any_to_server. 
How much of this is back-patch material, do you think?

			regards, tom lane

diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 08440e9..7f43cae 100644
*** a/src/backend/utils/mb/mbutils.c
--- b/src/backend/utils/mb/mbutils.c
***************
*** 1,10 ****
! /*
!  * This file contains public functions for conversion between
!  * client encoding and server (database) encoding.
!  *
!  * Tatsuo Ishii
!  *
!  * src/backend/utils/mb/mbutils.c
!  */
  #include "postgres.h"
--- 1,36 ----
! /*-------------------------------------------------------------------------
!  *
!  * mbutils.c
!  *	  This file contains functions for encoding conversion.
!  *
!  * The string-conversion functions in this file share some API quirks.
!  * Note the following:
!  *
!  * The functions return a palloc'd, null-terminated string if conversion
!  * is required.  However, if no conversion is performed, the given source
!  * string pointer is returned as-is.
!  *
!  * Although the presence of a length argument means that callers can pass
!  * non-null-terminated strings, care is required because the same string
!  * will be passed back if no conversion occurs.  Such callers *must* check
!  * whether result == src and handle that case differently.
!  *
!  * If the source and destination encodings are the same, the source string
!  * is returned without any verification; it's assumed to be valid data.
!  * If that might not be the case, the caller is responsible for validating
!  * the string using a separate call to pg_verify_mbstr().  Whenever the
!  * source and destination encodings are different, the functions ensure that
!  * the result is validly encoded according to the destination encoding.
!  *
!  *
!  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
!  * Portions Copyright (c) 1994, Regents of the University of California
!  *
!  *
!  * IDENTIFICATION
!  *	  src/backend/utils/mb/mbutils.c
!  *
!  *-------------------------------------------------------------------------
!  */
  #include "postgres.h"
*************** InitializeClientEncoding(void)
*** 290,296 ****
  int
  pg_get_client_encoding(void)
  {
- 	Assert(ClientEncoding);
  	return ClientEncoding->encoding;
  }
--- 316,321 ----
*************** pg_get_client_encoding(void)
*** 300,328 ****
  const char *
  pg_get_client_encoding_name(void)
  {
- 	Assert(ClientEncoding);
  	return ClientEncoding->name;
  }
  
  /*
! * Apply encoding conversion on src and return it. The encoding
! * conversion function is chosen from the pg_conversion system catalog
! * marked as default. If it is not found in the schema search path,
! * it's taken from pg_catalog schema. If it even is not in the schema,
! * warn and return src.
! *
! * If conversion occurs, a palloc'd null-terminated string is returned.
! * In the case of no conversion, src is returned.
! *
! * CAUTION: although the presence of a length argument means that callers
Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?
On Thu, Feb 20, 2014 at 8:55 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-02-19 12:47:40 -0500, Robert Haas wrote: On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier michael.paqu...@gmail.com wrote: Yes, that's a good precedent in multiple ways. Here are updated patches to use pg_lsn instead of pglsn... OK, so I think this stuff is all committed now, with assorted changes. Thanks for your work on this. cool, thanks you two. I wonder if pg_stat_replication shouldn't be updated to use it as well? SELECT * FROM pg_attribute WHERE attname ~ '(location|lsn)'; only shows that as names that are possible candidates for conversion.

I was sure to have forgotten some views or functions in the previous patch... Please find attached a patch making pg_stat_replication use pg_lsn instead of the existing text fields.
Regards,
-- Michael

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index a37e6b6..370857a 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1490,24 +1490,24 @@ postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
     </row>
     <row>
      <entry><structfield>sent_location</></entry>
-     <entry><type>text</></entry>
+     <entry><type>pg_lsn</></entry>
      <entry>Last transaction log position sent on this connection</entry>
     </row>
     <row>
      <entry><structfield>write_location</></entry>
-     <entry><type>text</></entry>
+     <entry><type>pg_lsn</></entry>
      <entry>Last transaction log position written to disk by this standby
      server</entry>
     </row>
     <row>
      <entry><structfield>flush_location</></entry>
-     <entry><type>text</></entry>
+     <entry><type>pg_lsn</></entry>
      <entry>Last transaction log position flushed to disk by this standby
      server</entry>
     </row>
     <row>
      <entry><structfield>replay_location</></entry>
-     <entry><type>text</></entry>
+     <entry><type>pg_lsn</></entry>
      <entry>Last transaction log position replayed into the database on this
      standby server</entry>
     </row>
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 06b22e2..048367a 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -67,6 +67,7 @@
 #include "utils/builtins.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
+#include "utils/pg_lsn.h"
 #include "utils/ps_status.h"
 #include "utils/resowner.h"
 #include "utils/timeout.h"
@@ -2137,7 +2138,6 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
 	{
 		/* use volatile pointer to prevent code rearrangement */
 		volatile WalSnd *walsnd = &WalSndCtl->walsnds[i];
-		char		location[MAXFNAMELEN];
 		XLogRecPtr	sentPtr;
 		XLogRecPtr	write;
 		XLogRecPtr	flush;
@@ -2171,28 +2171,19 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
 		else
 		{
 			values[1] = CStringGetTextDatum(WalSndGetStateString(state));
-
-			snprintf(location, sizeof(location), "%X/%X",
-					 (uint32) (sentPtr >> 32), (uint32) sentPtr);
-			values[2] = CStringGetTextDatum(location);
+			values[2] = LSNGetDatum(sentPtr);
 
 			if (write == 0)
 				nulls[3] = true;
-			snprintf(location, sizeof(location), "%X/%X",
-					 (uint32) (write >> 32), (uint32) write);
-			values[3] = CStringGetTextDatum(location);
+			values[3] = LSNGetDatum(write);
 
 			if (flush == 0)
 				nulls[4] = true;
-			snprintf(location, sizeof(location), "%X/%X",
-					 (uint32) (flush >> 32), (uint32) flush);
-			values[4] = CStringGetTextDatum(location);
+			values[4] = LSNGetDatum(flush);
 
 			if (apply == 0)
 				nulls[5] = true;
-			snprintf(location, sizeof(location), "%X/%X",
-					 (uint32) (apply >> 32), (uint32) apply);
-			values[5] = CStringGetTextDatum(location);
+			values[5] = LSNGetDatum(apply);
 
 			values[6] = Int32GetDatum(sync_priority[i]);
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 11c1e1a..a2cc19f 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2634,7 +2634,7 @@ DATA(insert OID = 1936 (  pg_stat_get_backend_idset		PGNSP PGUID 12 1 100 0 0 f
 DESCR("statistics: currently active backend IDs");
 DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 0 f f f f f t s 1 0 2249 "23" "{23,26,23,26,25,25,25,16,1184,1184,1184,1184,869,25,23}" "{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,pid,usesysid,application_name,state,query,waiting,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
 DESCR("statistics: information about currently active backends");
-DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 0 f f f f f t s 0 0 2249 "" "{23,25,25,25,25,25,23,25}" "{o,o,o,o,o,o,o,o}" "{pid,state,sent_location,write_location,flush_location,replay_location,sync_priority,sync_state}" _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
+DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 0 f f f f f t s 0 0 2249 "" "{23,25,3220,3220,3220,3220,23,25}" "{o,o,o,o,o,o,o,o}"
Re: [HACKERS] narwhal and PGDLLIMPORT
Hiroshi Inoue in...@tpf.co.jp writes: Attached is a patch to remove dllwarp from pgevent/Makefile. Actually, it looks like this patch doesn't work at all: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacanadt=2014-02-20%2001%3A00%3A53 Did I fat-finger the commit somehow? I made a couple of cosmetic changes, or at least I thought they were cosmetic. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] narwhal and PGDLLIMPORT
Tom Lane wrote: Hiroshi Inoue in...@tpf.co.jp writes: Attached is a patch to remove dllwrap from pgevent/Makefile. Actually, it looks like this patch doesn't work at all: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacanadt=2014-02-20%2001%3A00%3A53 Did I fat-finger the commit somehow? I made a couple of cosmetic changes, or at least I thought they were cosmetic. The format of the file is completely unlike that of the other *dll.def files we use elsewhere. AFAICT each line is <tab>symbolname@<tab>some_number -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] narwhal and PGDLLIMPORT
(2014/02/20 10:32), Tom Lane wrote: Hiroshi Inoue in...@tpf.co.jp writes: Attached is a patch to remove dllwarp from pgevent/Makefile. Actually, it looks like this patch doesn't work at all: Strangely enough it works here though I see double EXPORTS lines in libpgeventdll.def. http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacanadt=2014-02-20%2001%3A00%3A53 Did I fat-finger the commit somehow? I made a couple of cosmetic changes, or at least I thought they were cosmetic. Seems EXPORTS line in exports.txt should be removed. regards, Hiroshi Inoue -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] narwhal and PGDLLIMPORT
Hiroshi Inoue in...@tpf.co.jp writes: Seems EXPORTS line in exports.txt should be removed. Done. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.
On Wed, Feb 19, 2014 at 4:49 PM, Michael Paquier michael.paqu...@gmail.com wrote: +1 for back-patching. Back-patching would be interesting for existing applications, but -1 as it is a new feature :) I think that it rises to the level of an omission in 9.3 that now requires correction. Many of our users couldn't run pg_controldata even if they'd heard of it... -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Priority table or Cache table
On Thu, Feb 20, 2014 at 6:24 AM, Haribabu Kommi kommi.harib...@gmail.com wrote: On Thu, Feb 20, 2014 at 11:38 AM, Tom Lane t...@sss.pgh.pa.us wrote: I want to propose a new feature called priority table or cache table. This is same as regular table except the pages of these tables are having high priority than normal tables. These tables are very useful, where a faster query processing on some particular tables is expected. Why exactly does the existing LRU behavior of shared buffers not do what you need? Lets assume a database having 3 tables, which are accessed regularly. The user is expecting a faster query results on one table. Because of LRU behavior which is not happening some times.

I think this will not be a problem for regularly accessed tables (pages); as per the current algorithm they will get more priority before getting flushed out of the shared buffer cache. Have you come across any case where regularly accessed pages get lower priority than non-regularly accessed pages?

However, it might be required for cases where the user wants to control such behaviour and pass such hints through a table-level option or some other way, to indicate that he wants more priority for certain tables irrespective of their usage w.r.t. other tables. Now I think the important thing to find out here is how helpful this is for users, or why they would want to control such behaviour, even when the database already takes care of it based on the access pattern.

With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding
On Wed, Feb 19, 2014 at 08:22:13PM -0500, Tom Lane wrote: The more I looked into mbutils.c, the less happy I got. The attached proposed patch takes care of the missing-verification hole in pg_do_encoding_conversion() and pg_server_to_any(), and also gets rid of what I believe to be obsolete provisions in pg_do_encoding_conversion to work if called outside a transaction --- if you consider it working to completely fail to honor its API contract. That should no longer be necessary now that we perform client-server encoding conversions via perform_default_encoding_conversion rather than here. I like these changes. In particular, coping with the absence of a conversion function by calling ereport(LOG) and returning the source string was wrong for nearly every caller, but you'd need to try an encoding like MULE_INTERNAL to notice the problem. Good riddance. How much of this is back-patch material, do you think? None of it. While many of the failures to validate against a character encoding are clear bugs, applications hum along in spite of such bugs and break when we tighten the checks. I don't see a concern to override that here. Folks who want the tighter checking have some workarounds available. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Priority table or Cache table
On Thu, Feb 20, 2014 at 2:26 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Thu, Feb 20, 2014 at 6:24 AM, Haribabu Kommi kommi.harib...@gmail.com wrote: On Thu, Feb 20, 2014 at 11:38 AM, Tom Lane t...@sss.pgh.pa.us wrote: I want to propose a new feature called priority table or cache table. This is same as regular table except the pages of these tables are having high priority than normal tables. These tables are very useful, where a faster query processing on some particular tables is expected. Why exactly does the existing LRU behavior of shared buffers not do what you need? Lets assume a database having 3 tables, which are accessed regularly. The user is expecting a faster query results on one table. Because of LRU behavior which is not happening some times. I think this will not be a problem for regularly accessed tables(pages), as per current algorithm they will get more priority before getting flushed out of shared buffer cache. Have you come across any such case where regularly accessed pages get lower priority than non-regularly accessed pages?

Because of other regularly accessed tables, the table that is expected to give faster results sometimes gets delayed.

However it might be required for cases where user wants to control such behaviour and pass such hints through table level option or some other way to indicate that he wants more priority for certain tables irrespective of their usage w.r.t other tables. Now I think here important thing to find out is how much helpful it is for users or why do they want to control such behaviour even when Database already takes care of such thing based on access pattern.

Yes, it is useful in cases where the application always expects faster results, whether the table is used regularly or not.

Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] inherit support for foreign tables
Hello,

It is just my personal opinion, but I think it would be convenient for users to alter inheritance trees that contain foreign tables the same way as inheritance trees that don't contain any foreign tables, without making user conscious of the inheritance trees contains foreign tables or not. Imagine we have an inheritance tree that contains only plain tables and then add a foreign table as a child of the inheritance tree. Without the flexiblity, we would need to change the way of altering the structure of the inheritance tree (e.g., ADD CONSTRAINT) to a totally different one, immediately when adding the foreign table. I don't think that would be easy to use.

I personally don't see significant advantages in such flexibility. My concerns here are only two points: unanticipated application of changes, and maintainability. I have given these issues some further consideration. For the first issue, I think it could be enough to give feedback to operators:

=# ALTER TABLE parent ADD CHECK (tempmin < tempmax), ALTER tempmin SET NOT NULL, ALTER tempmin SET DEFAULT 0;
NOTICE: Child foreign table child01 is affected.
NOTICE: Child foreign table child02 is affected.
NOTICE: Child foreign table child03 rejected 'ALTER tempmin SET DEFAULT'.

What do you think about this? It looks a bit too loud to me, though...

As for the second issue, although I don't know enough about how ALTER TABLE works, the complexity would be reduced if the acceptance check for alter actions were done on the foreign server or data wrapper side, not in the core of ALTER TABLE. That would also help in producing error messages like the above.

However, (NO) INHERIT looks a little different... http://www.postgresql.org/docs/9.3/static/sql-alterforeigntable.html says: "Consistency with the foreign server is not checked when a column is added or removed with ADD COLUMN or DROP COLUMN, a NOT NULL constraint is added, or a column type is changed with SET DATA TYPE. It is the user's responsibility to ensure that the table definition matches the remote side." So I believe implicit and automatic application of any constraint to foreign children is considerably dangerous.

We spent a lot of time discussing this issue, and the consensus is that it's users' fault if there are some tuples on the remote side violating a given constraint, as mentioned in the documentation.

What worries me is not that, but changes and possible inconsistencies taking place behind operators' backs. And this looks likely to cause such inconsistencies, to me.

* [NO] INHERIT parent_table Is this usable for foreign inheritance children? NO INHERIT removes all foreign children, but INHERIT is a no-op.

I didn't express clearly. Sorry for that. Let me explain about it. * ALTER FOREIGN TABLE target_table *INHERIT* parent_table: Add the target table as a new child of the parent table. * ALTER FOREIGN TABLE target_table *NO INHERIT* parent_table: Remove the target table from the list of children of the parent table.

I got it, thank you. That alone seems to be no problem, but it also doesn't seem to be a matter of ALTER TABLE. Could you tell me how this is related to ALTER TABLE?

regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] inherit support for foreign tables
Hi, At Wed, 19 Feb 2014 16:17:05 +0900, Shigeru Hanada wrote 2014-02-18 19:29 GMT+09:00 Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp: Could you guess any use cases in which we are happy with ALTER TABLE's inheritance tree walking? IMHO, ALTER FOREIGN TABLE always comes with some changes of the data source so implicit invocation of such commands should be turned off by default. Imagine a case where the foreign data source has attributes (A, B, C, D) but the foreign tables and their parent were defined as (A, B, C). If the user wants to use D as well, ALTER TABLE parent ADD COLUMN D type would be useful (rather necessary?) to keep consistency.

Hmm. It seems to me an issue of misconfiguration at the first step. However, my anxiety is, as in my previous message, about ALTER'ing foreign table definitions without any notice to operators, and about irregular or random logic for checking the applicability of ALTER actions.

Changing data type from compatible one (i.e., int to numeric, varchar(n) to text), adding CHECK/NOT NULL constraint would be also possible.

I see, thank you. Changing data types is surely valuable but also seems difficult to validate :( Anyway, I have given this issue a second thought; please have a look at that.

regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Selecting large tables gets killed
Hi All,

Here is a strange behaviour with the master branch, with head at

commit d3c4c471553265e7517be24bae64b81967f6df40
Author: Peter Eisentraut pete...@gmx.net
Date: Mon Feb 10 21:47:19 2014 -0500

The OS is
[ashutosh@ubuntu repro]uname -a
Linux ubuntu 3.2.0-59-generic #90-Ubuntu SMP Tue Jan 7 22:43:51 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
This is a VM hosted on Mac OS X 10.7.5.

Here's the SQL script:
[ashutosh@ubuntu repro]cat big_select_killed.sql
drop table big_tab;
create table big_tab (val int, val2 int, str varchar);
insert into big_tab select x, x, lpad('string', 100, x::text) from generate_series(1, 10000000) x;
select * from big_tab;

The last select causes the "Killed" message:
[ashutosh@ubuntu repro]psql -d postgres -f big_select_killed.sql
DROP TABLE
CREATE TABLE
INSERT 0 10000000
Killed

There is a message in the server log:
FATAL: connection to client lost
STATEMENT: select * from big_tab;

Any SELECT selecting all the rows gets psql killed, but not SELECT count(*):
[ashutosh@ubuntu repro]psql -d postgres
psql (9.4devel)
Type "help" for help.

postgres=# select count(*) from big_tab;
  count
----------
 10000000
(1 row)

postgres=# select * from big_tab;
Killed

Below are the buffer cache size and the relation size (if anyone cares):
postgres=# show shared_buffers;
 shared_buffers
----------------
 128MB
(1 row)

postgres=# select pg_relation_size('big_tab'::regclass);
 pg_relation_size
------------------
       1412415488
(1 row)

postgres=# select pg_relation_size('big_tab'::regclass)/1024/1024; -- in MB, to be simple
 ?column?
----------
     1346
(1 row)

There are no changes to the default configuration. Using unix sockets:
[ashutosh@ubuntu repro]ls /tmp/.s.PGSQL.5432*
/tmp/.s.PGSQL.5432 /tmp/.s.PGSQL.5432.lock

Looks like a bug in psql to me. Does anybody else see this behaviour?
-- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
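An editorial note on the report above: a bare "Killed" on Linux usually means the kernel OOM killer, and by default psql retrieves the entire result set into client memory before printing anything, so a table of roughly 1.3GB can exhaust a small VM; count(*) returns a single row, which is why it survives. Whether or not this counts as a psql bug, psql's documented FETCH_COUNT variable avoids the client-side buffering by paging through the result with a cursor:

```
-- psql session sketch; FETCH_COUNT is a standard psql variable
\set FETCH_COUNT 10000
select * from big_tab;   -- rows are now fetched and printed in 10000-row batches
```

With FETCH_COUNT set, peak client memory is bounded by one batch of rows rather than by the whole table.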
Re: [HACKERS] PoC: Partial sort
On Thu, Feb 13, 2014 at 1:54 AM, Marti Raudsepp ma...@juffo.org wrote: I think the 1st patch now has a bug in initial_cost_mergejoin; you still pass the presorted_keys argument to cost_sort, making it calculate a partial sort cost, but generated plans never use partial sort. I think 0 should be passed instead. Patch attached, needs to be applied on top of partial-sort-basic-1 and then reverse-applied on partial-sort-merge-1.

It doesn't look that way to me. Merge join doesn't look for partial sort specifically, but if a path with some presorted pathkeys is accidentally selected, then partial sort will be used; see the create_mergejoin_plan function. So I think this cost_sort call is relevant to create_mergejoin_plan. If we don't want partial sort to be used in such rare cases, then we should revert it from both places. However, I doubt that it adds any overhead, so we can leave it as is.

-- With best regards, Alexander Korotkov.