Re: [HACKERS] BUG #12330: ACID is broken for unique constraints
On Fri, Dec 26, 2014 at 12:38 PM, Kevin Grittner wrote: > Tom Lane wrote: > >> Just for starters, a 40XXX error report will fail to provide the >> duplicated key's value. This will be a functional regression, > > Not if, as is normally the case, the transaction is retried from > the beginning on a serialization failure. Either the code will > check for a duplicate (as in the case of the OP on this thread) and > they won't see the error, *or* the transaction which created > the duplicate key will have committed before the start of the retry > and you will get the duplicate key error. I'm not buying that; that argument assumes duplicate key errors are always 'upsert' driven. Although the OP's code may have checked for duplicates, it's perfectly reasonable (and in many cases preferable) to force the transaction to fail and report the error directly back to the application. The application will then switch on the error code and decide what to do: retry for deadlock/serialization or abort for data integrity error. IOW, the error handling semantics are fundamentally different and should not be mixed. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
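The dispatch described here -- retry on serialization/deadlock, report data-integrity errors to the user -- can be sketched in a few lines (the SQLSTATE values are the documented PostgreSQL codes; the function names are made up for illustration, not from any patch):

```python
# Illustrative client-side dispatch on SQLSTATE (names invented for this sketch).
# 40001 = serialization_failure, 40P01 = deadlock_detected: retry the whole
# transaction from the beginning.  Class 23 (integrity constraint violation,
# e.g. 23505 unique_violation) is a data error: report it, don't retry.

RETRYABLE = {"40001", "40P01"}

def should_retry(sqlstate):
    """True if the failed transaction should be retried from the start."""
    return sqlstate in RETRYABLE

def handle_error(sqlstate):
    if should_retry(sqlstate):
        return "retry"
    if sqlstate.startswith("23"):  # integrity constraint violation class
        return "abort: data integrity error"
    return "abort"
```

Merlin's point is that these two branches serve different audiences: the retry branch is invisible plumbing, while the integrity branch carries information the application must surface.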
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Fri, Dec 19, 2014 at 5:32 PM, Peter Geoghegan wrote: >> Most people would list the columns, but if there is a really bizarre >> constraint, with non-default opclasses, or an exclusion constraint, it's >> probably been given a name that you could use. > > What I find curious about the opclass thing is: when do you ever have > an opclass that has a different idea of equality than the default > opclass for the type? In other words, when is B-Tree strategy number 3 > not actually '=' in practice, for *any* B-Tree opclass? Certainly, it > doesn't appear to be the case that it isn't so with any shipped > opclasses - the shipped non-default B-Tree opclasses only serve to > provide alternative notions of sort order, and never "equals". > > I think that with B-Tree (which is particularly relevant for the > UPDATE variant), it ought to be defined to work with the type's > default opclass "equals" operator, just like GROUP BY and DISTINCT. > Non-default opclass unique indexes work just as well in practice, > unless someone somewhere happens to create an oddball one that doesn't > use '=' as its "equals" operator (while also having '=' as the default > opclass "equals" operator). I am not aware that that leaves any > actually shipped opclass out (and I include our external extension > ecosystem here, although I might be wrong about that part). So looking at the way the system deals with its dependence on default operator classes, I have a hard time justifying all this extra overhead for the common case. The optimizer will refuse to use an index with a non-default opclass even when AFAICT there is no *real* semantic dependence on anything other than the "equals" operator, which seems to always match across a type's opclasses anyway. 
e.g., DISTINCT will only use a default opclass B-Tree index, even though in practice the "equals" operator always matches for shipped non-default opclasses; DISTINCT will not work with a text_pattern_ops index, while it will work with a default text B-Tree opclass index, *even though no corresponding "ORDER BY" was given*. Someone recently pointed out in a dedicated thread that the system isn't all that bright about exploiting the fact that group aggregates don't necessarily need to care about facets of sort ordering like collations, which have additional overhead [1]. That might be a useful special case to target (to make underlying sorts faster), but the big picture is that the system doesn't know when it only needs to care about an "equals" operator matching some particular B-Tree-opclass-defined notion of sorting, rather than caring about a variety of operators matching. Sometimes, having a matching "equals" operator of some non-default opclass is good enough to make an index (or sort scheme) of that opclass usable for some purpose that only involves equality, and not sort order (like DISTINCT, with no ORDER BY, executed using a GroupAggregate, for example). I thought we should formalize the idea that a non-default opclass must have the same notion of equality (the same "equals" operator) as its corresponding default opclass, if any. That way, presumably the optimizer has license to be clever about only caring about "DISTINCTness"/equality. That also gives my implementation license to not care about which operator class a unique index uses -- it must not matter. Heikki pointed out that there is one shipped opclass that has an "equals" operator that happens to not be spelt "=" [2] (and furthermore, does not match that of the default opclass). That's the record_image_ops opclass, which unusually has an "equals" operator of "*=".
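Since the whole argument is that DISTINCT needs only a matching "equals" operator and not any particular sort order, the point can be illustrated outside SQL (a Python analogy, not PostgreSQL code): a GroupAggregate-style DISTINCT produces the same distinct set under two different sort orders, as long as equality agrees:

```python
from itertools import groupby

values = ["b", "A", "a", "B", "b", "A"]

def distinct_via(sort_key):
    # A GroupAggregate-style DISTINCT: sort, then collapse adjacent equal values.
    return {k for k, _ in groupby(sorted(values, key=sort_key))}

# Two "sort opclasses" over the same type: the default order, and an
# alternative (pattern-ops-like) order that sorts differently but
# implies the exact same "equals".
default_order = distinct_via(lambda v: v)
alternative_order = distinct_via(lambda v: (v.lower(), v))
# Both yield the same distinct set, because equality is the same either way.
```

This is exactly the property a non-default opclass with a matching "equals" operator would let the optimizer exploit -- and what record_image_ops, with its different "*=" equality, breaks.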
So as Heikki pointed out, it looks like there is some limited precedent for having to worry about B-Tree opclasses that introduce alternative notions of "equals", rather than merely alternative notions of sort order. So, so much for formalizing that all of a type's B-Tree opclass "equals" operators must match... ...having thought about it for a while more, though, I think we should *still* ignore opclass for the purposes of unique index inference. The implementation doesn't care about the fact that you used a non-default opclass. Sure, in theory that could lead to inconsistencies, if there were multiple unique indexes of multiple opclasses that just so happened to have incompatible ideas about equality, but that seems ludicrous... we have only one extremely narrow example of how that could happen. Plus there'd have to be *both* unique indexes defined and available for us to infer as appropriate, before the inference logic could accidentally infer the wrong idea of equality. That seems like an extremely implausible scenario. Even if we allow for the idea that alternative notions of equality are something that will happen in the wild, obviously the user cares about the definition of equality that they actually used for the unique index in question. We can document that unique index inference doesn't care about opclasses (recall that I still only plan on l
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Tom Lane wrote: > The argument that autovac workers need fresher stats than anything else > seems pretty dubious to start with. Why shouldn't we simplify that down > to "they use PGSTAT_STAT_INTERVAL like everybody else"? The point of wanting fresher stats than that, eons ago, was to avoid a worker vacuuming a table that some other worker vacuumed more recently than PGSTAT_STAT_INTERVAL. I realize now that the semantics we really want are something like "stats no older than XYZ", where the given value is the timestamp at which we start checking; if we get anything newer than that it would be okay, but we currently reject it for lack of a more appropriate API. (If it takes more than PGSTAT_STAT_INTERVAL to get the stats back, a regular backend would ask for fresher stats, but to an autovac worker they would be good enough as long as they are newer than its recheck start time.) Nowadays we can probably disregard the whole issue, since starting a new vacuum just after the prior one finished should not cause much stress to the system thanks to the visibility map. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
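The "stats no older than XYZ" semantics can be contrasted with the current fixed-window check in a few lines (an illustrative sketch with made-up millisecond timestamps, not pgstat code):

```python
# Two freshness policies for a stats snapshot, using millisecond timestamps.

def fresh_enough_fixed_window(stats_ts, now, interval_ms=500):
    # Current behavior: reject anything older than now - PGSTAT_STAT_INTERVAL
    # (or now - PGSTAT_RETRY_DELAY for autovac workers today).
    return (now - stats_ts) <= interval_ms

def fresh_enough_since(stats_ts, recheck_start_ts):
    # Wanted behavior: "stats no older than my recheck start time".
    return stats_ts >= recheck_start_ts

# A snapshot written 2s ago fails the fixed window, but would be perfectly
# good for an autovac worker whose recheck started 3s ago.
```

The second predicate is what would let a worker accept any stats file written after it began rechecking, instead of forcing a fresh write.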
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Alvaro Herrera writes: > Tom Lane wrote: >> Yeah, I've been getting more annoyed by that too lately. I keep wondering >> though whether there's an actual bug underneath that behavior that we're >> failing to see. > I think the first thing to do is reconsider usage of PGSTAT_RETRY_DELAY > instead of PGSTAT_STAT_INTERVAL in autovacuum workers. That decreases > the wait time 50-fold, if I recall this correctly, and causes large > amounts of extra I/O traffic. Yeah --- that means that instead of the normal behavior that a stats file newer than 500 msec is good enough, an autovac worker insists on a stats file newer than 10 msec. I did some experimentation on prairiedog, and found that it's not hard at all to see autovac workers demanding multiple stats writes per second:

2014-12-26 16:26:52.958 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:26:53.128 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:26:53.188 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:26:54.903 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:26:55.058 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:00.022 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:00.285 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:00.792 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:01.010 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:01.163 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:01.193 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:03.595 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:03.673 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:03.839 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:03.878 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:05.878 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:06.571 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:07.001 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:07.769 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:07.950 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:10.256 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:11.039 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:11.402 EST 21026 LOG: sending inquiry for database 45116

The argument that autovac workers need fresher stats than anything else seems pretty dubious to start with. Why shouldn't we simplify that down to "they use PGSTAT_STAT_INTERVAL like everybody else"? regards, tom lane
Re: [HACKERS] BUG #12330: ACID is broken for unique constraints
On Fri, Dec 26, 2014 at 7:23 AM, Kevin Grittner wrote: > Are there any objections to generating a write conflict instead of > a duplicate key error if the duplicate key was added by a > concurrent transaction? Only for transactions at isolation level > REPEATABLE READ or higher? This independently occurred to me as perhaps preferable over a year ago. The INSERT...ON CONFLICT IGNORE feature that my patch adds will throw such an error for REPEATABLE READ and higher modes rather than proceeding on the basis of having ignored something from a concurrent transaction. +1 from me for doing this in the master branch, if it isn't too invasive (I think it might be; where is estate available from to work with close to duplicate checking for B-Trees?). It only makes sense to do what you propose if the user expects to check that there is no duplicate ahead of time, and only ever fail with a serialization failure (not a duplicate violation) from concurrent conflicts. That is a bit narrow; the OP can only reasonably expect to not see a duplicate violation on retry because he happens to be checking... most clients won't, and will "waste" a retry. The proposed ON CONFLICT IGNORE feature would do this checking for him correctly, FWIW, but in typical cases there is no real point in retrying the xact -- you'll just get another duplicate violation error. In any case I think exclusion violations should be covered, too. -- Peter Geoghegan
Re: [HACKERS] BUG #12330: ACID is broken for unique constraints
I'll repost my (OP) case, for the references to it to make more sense to the others. Having the following table:

CREATE TABLE "song_artist" (
    "song_id" INT8 NOT NULL,
    "artist_id" INT8 NOT NULL,
    PRIMARY KEY ("song_id", "artist_id")
);

Even trying to protect against this with a SELECT won't avoid the error, because at the beginning of the transaction the key does not exist yet.

BEGIN ISOLATION LEVEL SERIALIZABLE READ WRITE;
    INSERT INTO song_artist (song_id, artist_id)
    SELECT 1, 2
    WHERE NOT EXISTS (SELECT * FROM song_artist WHERE song_id=1 AND artist_id=2);
COMMIT;

2014-12-26 21:38 GMT+03:00 Kevin Grittner : > Tom Lane wrote: > > > Just for starters, a 40XXX error report will fail to provide the > > duplicated key's value. This will be a functional regression, > > Not if, as is normally the case, the transaction is retried from > the beginning on a serialization failure. Either the code will > check for a duplicate (as in the case of the OP on this thread) and > they won't see the error, *or* the transaction which created > the duplicate key will have committed before the start of the retry > and you will get the duplicate key error. > > > I think an appropriate response to these complaints is to fix the > > documentation to point out that duplicate-key violations may also > > be worthy of retries. > > > but I see no mention of the issue in chapter 13.) > > I agree that's the best we can do for stable branches, and worth > doing. > > It would be interesting to hear from others who rely on > serializable transactions in production environments about what > makes sense to them. This is probably the wrong list to find such > people directly; but I seem to recall Josh Berkus has a lot of > clients who do. Josh? Any opinion on this thread? > > -- > Kevin Grittner > EDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
Re: [HACKERS] What exactly is our CRC algorithm?
At 2014-12-26 13:48:40 -0500, br...@momjian.us wrote: > > I assume you are only linking into Heikki's new code and will not > change the places that use the old CRC method on disk --- just > checking. Correct. The legacy CRC computation code used by ltree etc. is completely unaffected by both my sets of changes. -- Abhijit
Re: [HACKERS] What exactly is our CRC algorithm?
On Fri, Dec 26, 2014 at 11:52:41PM +0530, Abhijit Menon-Sen wrote: > At 2014-12-26 13:11:43 -0500, br...@momjian.us wrote: > > > > Is this something that could potentially change the data stored on > > disk? Does pg_upgrade need to check for changes in this area? Is the > > detection exposed by pg_controldata? Could this affect running the > > data directory on a different CPU? > > No to all. > > Subsequent to Heikki's change (already in master) to use the CRC-32C > algorithm (instead of the earlier mistakenly-reflected-but-not-quite > one), both the slice-by-8 software implementation posted earlier and > the SSE4.2 CRC32* instructions will compute exactly the same value. > > (Yes, I have verified this in both cases.) Uh, we didn't fully change all code to use the new algorithm because there were cases where the CRC was already stored on disk, e.g. hstore, ltree. I assume you are only linking into Heikki's new code and will not change the places that use the old CRC method on disk --- just checking. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
[HACKERS] Re: [HACKERS] Patch: add recovery_timeout option to control timeout of restore_command nonzero status code
Thanks for suggestions. Patch updated. Mon, 22 Dec 2014 12:07:06 +0900 from Michael Paquier : >On Tue, Nov 4, 2014 at 6:25 AM, Alexey Vasiliev < leopard...@inbox.ru > wrote: >> Added new patch. >Seems useful to me to be able to tune this interval of time. > >I would simply reword the documentation as follows: >If restore_command returns a nonzero exit status code, retry the >command after the interval of time specified by this parameter. >Default value is 5s. > >Also, I think that it would be a good idea to error out if this >parameter has a value of, let's say, less than 1s instead of doing a >check for a positive value in the waiting latch. On top of that, the >default value of restore_command_retry_interval should be 5000 and not >0 to simplify the code. >-- >Michael -- Alexey Vasiliev diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml index ef78bc0..53ccf13 100644 --- a/doc/src/sgml/recovery-config.sgml +++ b/doc/src/sgml/recovery-config.sgml @@ -145,6 +145,29 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows + + restore_command_retry_interval (integer) + +restore_command_retry_interval recovery parameter + + + + +If restore_command returns a nonzero exit status code, retry the +command after the interval of time specified by this parameter. +Default value is 5s. + +This is useful if WAL segments are restored from external +storage (such as AWS S3), where every request must be paid for and +the standby is expected to lag behind the master anyway. With N +nodes each retrying every 5 seconds, the cost will be higher than +with, for example, a 30 second interval.
+ + + + diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample index b777400..7ed6f3b 100644 --- a/src/backend/access/transam/recovery.conf.sample +++ b/src/backend/access/transam/recovery.conf.sample @@ -58,6 +58,11 @@ # #recovery_end_command = '' # +# specifies an optional retry interval after a nonzero exit code of restore_command. +# This can be useful to increase/decrease the number of restore_command calls. +# +#restore_command_retry_interval = 5s +# #--- # RECOVERY TARGET PARAMETERS #--- diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index e54..02c55a8 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime; static char *recoveryTargetName; static int recovery_min_apply_delay = 0; static TimestampTz recoveryDelayUntilTime; +static int restore_command_retry_interval = 5000; /* options taken from recovery.conf for XLOG streaming */ static bool StandbyModeRequested = false; @@ -4881,6 +4882,28 @@ readRecoveryCommandFile(void) (errmsg_internal("trigger_file = '%s'", TriggerFile))); } + else if (strcmp(item->name, "restore_command_retry_interval") == 0) + { + const char *hintmsg; + + if (!parse_int(item->value, &restore_command_retry_interval, GUC_UNIT_MS, + &hintmsg)) +ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("parameter \"%s\" requires a temporal value", +"restore_command_retry_interval"), + hintmsg ? 
errhint("%s", _(hintmsg)) : 0)); + ereport(DEBUG2, + (errmsg_internal("restore_command_retry_interval = '%s'", item->value))); + + if (restore_command_retry_interval <= 0) + { +ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("\"%s\" must be bigger than zero", + "restore_command_retry_interval"))); + } + } else if (strcmp(item->name, "recovery_min_apply_delay") == 0) { const char *hintmsg; @@ -10658,13 +10681,14 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, } /* - * Wait for more WAL to arrive. Time out after 5 seconds, + * Wait for more WAL to arrive. Time out after + * restore_command_retry_interval (5 seconds by default), * like when polling the archive, to react to a trigger * file promptly. */ WaitLatch(&XLogCtl->recoveryWakeupLatch, WL_LATCH_SET | WL_TIMEOUT, - 5000L); + restore_command_retry_interval); ResetLatch(&XLogCtl->recoveryWakeupLatch); break; }
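The parameter semantics Michael asks for -- a millisecond-unit temporal value with a lower bound of 1s -- amount to something like the following (a Python sketch of the intent; PostgreSQL's actual parse_int/GUC machinery, unit list, and error wording differ):

```python
import re

# Illustrative subset of time units; PostgreSQL's GUC machinery knows more.
UNIT_TO_MS = {"ms": 1, "s": 1000, "min": 60_000}

def parse_retry_interval(value, minimum_ms=1000):
    """Parse '5s'/'5000' style values into milliseconds, enforcing a 1s floor."""
    m = re.fullmatch(r"(\d+)\s*(ms|s|min)?", value.strip())
    if m is None:
        raise ValueError('parameter "restore_command_retry_interval" '
                         'requires a temporal value')
    ms = int(m.group(1)) * UNIT_TO_MS[m.group(2) or "ms"]  # bare number = ms
    if ms < minimum_ms:
        raise ValueError('"restore_command_retry_interval" must be at least 1s')
    return ms
```

The returned millisecond count is what would be handed to the WaitLatch timeout in place of the hard-coded 5000L.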
Re: [HACKERS] BUG #12330: ACID is broken for unique constraints
Tom Lane wrote: > Just for starters, a 40XXX error report will fail to provide the > duplicated key's value. This will be a functional regression, Not if, as is normally the case, the transaction is retried from the beginning on a serialization failure. Either the code will check for a duplicate (as in the case of the OP on this thread) and they won't see the error, *or* the transaction which created the duplicate key will have committed before the start of the retry and you will get the duplicate key error. > I think an appropriate response to these complaints is to fix the > documentation to point out that duplicate-key violations may also > be worthy of retries. > but I see no mention of the issue in chapter 13.) I agree that's the best we can do for stable branches, and worth doing. It would be interesting to hear from others who rely on serializable transactions in production environments about what makes sense to them. This is probably the wrong list to find such people directly; but I seem to recall Josh Berkus has a lot of clients who do. Josh? Any opinion on this thread? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] What exactly is our CRC algorithm?
At 2014-12-26 13:11:43 -0500, br...@momjian.us wrote: > > Is this something that could potentially change the data stored on > disk? Does pg_upgrade need to check for changes in this area? Is the > detection exposed by pg_controldata? Could this affect running the > data directory on a different CPU? No to all. Subsequent to Heikki's change (already in master) to use the CRC-32C algorithm (instead of the earlier mistakenly-reflected-but-not-quite one), both the slice-by-8 software implementation posted earlier and the SSE4.2 CRC32* instructions will compute exactly the same value. (Yes, I have verified this in both cases.) -- Abhijit
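For reference, the CRC-32C function under discussion is easy to cross-check against its published check value. A bit-at-a-time sketch (assuming the reflected Castagnoli polynomial 0x82F63B78, the same polynomial the SSE4.2 CRC32 instructions implement; any conforming slice-by-8 or hardware implementation must agree with it):

```python
def crc32c(data, crc=0):
    """Bit-at-a-time CRC-32C: reflected polynomial 0x82F63B78,
    initial value and final XOR both 0xFFFFFFFF."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

# Standard check value: CRC-32C of b"123456789" is 0xE3069283.
```

This is far too slow for real use -- it exists only to pin down which CRC variant "exactly the same value" refers to.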
Re: [HACKERS] What exactly is our CRC algorithm?
On Fri, Dec 26, 2014 at 04:52:58PM +0100, Andres Freund wrote: > >Uh, what if the system is compiled on a different CPU than the one it > >is > >run on? Seems we would need a run-time CPU test. > > That's the cpuid thing mentioned above. Is this something that could potentially change the data stored on disk? Does pg_upgrade need to check for changes in this area? Is the detection exposed by pg_controldata? Could this affect running the data directory on a different CPU? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Some other odd buildfarm failures
Alvaro Herrera writes: > Tom Lane wrote: >> TBH, I think we could have done without this test altogether; but if we're >> going to have it, a minimum expectation is that it not be hazardous to >> other tests around it. > The number of assertion failures in get_object_address without all the > sanity checks I added in pg_get_object_address was a bit surprising. > That's the whole reason I decided to add the test. I don't want to > blindly assume that all cases will work nicely in the future, > particularly as other object types are added. I'm surprised then that you didn't prefer the other solution (wrap the whole test in a single transaction). But we've probably spent more time on this than it deserves. regards, tom lane
Re: [HACKERS] Some other odd buildfarm failures
Andres Freund wrote: > >This sounds like a huge project -- it's not like event triggers are the > >only objects in the system where this is an issue, is it? I'm sure > >there is value in fixing it, but I have enough other projects. > > Can't we just move the test to run without parallelism? It's quite > quick, so I don't think it'd have noticeable consequences timewise. (I got this a minute too late anyhow.) -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Some other odd buildfarm failures
Tom Lane wrote: > Andres Freund writes: > > Can't we just move the test to run without parallelism? It's quite quick, so > > I don't think it'd have noticeable consequences timewise. > > That just leaves the door open for somebody to add more tests parallel to > it in future. I've long wanted to add declarative dependencies to tests: each test file would declare what other tests it depends on, and we would have a special clause to state "this one must not be run concurrently with anything else". Of course, this is just wishful thinking at this point. > TBH, I think we could have done without this test altogether; but if we're > going to have it, a minimum expectation is that it not be hazardous to > other tests around it. The number of assertion failures in get_object_address without all the sanity checks I added in pg_get_object_address was a bit surprising. That's the whole reason I decided to add the test. I don't want to blindly assume that all cases will work nicely in the future, particularly as other object types are added. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Some other odd buildfarm failures
Andres Freund writes: > Can't we just move the test to run without parallelism? It's quite quick, so > I don't think it'd have noticeable consequences timewise. That just leaves the door open for somebody to add more tests parallel to it in future. TBH, I think we could have done without this test altogether; but if we're going to have it, a minimum expectation is that it not be hazardous to other tests around it. regards, tom lane
Re: [HACKERS] Some other odd buildfarm failures
On December 26, 2014 6:10:51 PM CET, Alvaro Herrera wrote: >Tom Lane wrote: >> Alvaro Herrera writes: >> > Tom Lane wrote: >> >> Alvaro Herrera writes: >> >>> Hm, maybe we can drop the event trigger explicitly first, then >wait a >> >>> little bit, then drop the remaining objects with DROP CASCADE? >> >> >> As I said, that's no fix; it just makes the timing harder to hit. >Another >> >> process could be paused at the critical point for longer than >whatever "a >> >> little bit" is. >> >> > Yeah, I was thinking we could play some games with the currently >running >> > XIDs from a txid_snapshot or some such, with a reasonable upper >limit on >> > the waiting time (for the rare cases with a server doing other >stuff >> > with long-running transactions.) >> >> Whether that's sane or not, the whole problem is so far out-of-scope >for >> a test of pg_get_object_address() that it's not even funny. I think >> we should adopt one of the two fixes I recommended and call it good. > >I think dropping the part involving an event trigger from the test is >reasonable. I will go do that. > >> If you want to work on making DROP EVENT TRIGGER safer in the long >run, >> that can be a separate activity. > >This sounds like a huge project -- it's not like event triggers are the >only objects in the system where this is an issue, is it? I'm sure >there is value in fixing it, but I have enough other projects. Can't we just move the test to run without parallelism? It's quite quick, so I don't think it'd have noticeable consequences timewise. -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] BUG #12330: ACID is broken for unique constraints
Kevin Grittner writes: > Tom Lane wrote: >> Yes. This will deliver a less meaningful error code, > That depends entirely on whether you care more about whether the > problem was created by a concurrent transaction or exactly how that > concurrent transaction created the problem. Just for starters, a 40XXX error report will fail to provide the duplicated key's value. This will be a functional regression, on top of breaking existing code. I think an appropriate response to these complaints is to fix the documentation to point out that duplicate-key violations may also be worthy of retries. (I sort of thought it did already, actually, but I see no mention of the issue in chapter 13.) regards, tom lane
Re: [HACKERS] BUG #12330: ACID is broken for unique constraints
Tom Lane wrote: > Kevin Grittner writes: >> Are there any objections to generating a write conflict instead >> of a duplicate key error if the duplicate key was added by a >> concurrent transaction? > > Yes. This will deliver a less meaningful error code, That depends entirely on whether you care more about whether the problem was created by a concurrent transaction or exactly how that concurrent transaction created the problem. For those using serializable transactions to manage concurrency the former is at least an order of magnitude more important. This is not the first time getting a constraint violation SQLSTATE for the actions of a concurrent serializable transaction has been reported as a bug. Going from memory, I think this might be about the fifth time users have reported it as a bug or potential bug on these lists. People using serializable transactions normally run all queries through common code which will retry the transaction from the start if there is a SQLSTATE starting with '40' (or perhaps picking out the specific codes '40001' and '40P01'). Not doing so for *some* types of errors generated by concurrent transactions reduces the application's level of user-friendliness, and may introduce surprising bugs. In this particular case the OP wants to do one thing if a row with a particular value for a unique index exists, and something different if it doesn't. If we generate the write conflict for the case that it is concurrently added, it can retry the transaction and do one or the other; if we don't pay attention to that, they need weird heuristics for "the third case". That really is not more meaningful or useful. > *and* break existing code that is expecting the current behavior. Possibly, but my experience is more that failure to behave the way I suggest is biting people and causing them a lot of extra work and pain. I would be fine with limiting the new behavior to serializable transactions, since that seems to be where people want this behavior. 
It would bring us closer to "the transaction will run as though it were the only transaction running or roll back with a serialization failure" without having to add caveats and exceptions. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
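The retry wrapper Kevin alludes to ("common code which will retry the transaction from the start if there is a SQLSTATE starting with '40'") might look like the following sketch. The `DBError` class and its `sqlstate` attribute are stand-ins invented for illustration; a real driver exposes the code differently (psycopg2, for instance, as `Error.pgcode`):

```python
# Sketch of the retry wrapper: rerun the whole transaction on serialization
# failure (40001) or deadlock (40P01), and let every other SQLSTATE --
# including 23505 unique_violation -- propagate to the caller.
# DBError is a hypothetical stand-in for a real driver's error type.

RETRYABLE_SQLSTATES = {"40001", "40P01"}

class DBError(Exception):
    def __init__(self, sqlstate):
        super().__init__(sqlstate)
        self.sqlstate = sqlstate

def run_transaction(txn, max_retries=10):
    """Run txn() from the start until it commits or fails non-retryably."""
    last = None
    for _ in range(max_retries):
        try:
            return txn()
        except DBError as err:
            if err.sqlstate not in RETRYABLE_SQLSTATES:
                raise          # e.g. duplicate key: a real application error
            last = err         # serialization failure: retry from scratch
    raise last                 # give up after max_retries attempts
```

The point of the thread is visible in the branch: if the server reports the concurrent-insert case as '40001' instead of '23505', it lands on the retry path with no extra application logic.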
Re: [HACKERS] What exactly is our CRC algorithm?
On December 26, 2014 6:05:34 PM CET, Bruce Momjian wrote: >On Fri, Dec 26, 2014 at 04:52:58PM +0100, Andres Freund wrote: >> On December 26, 2014 4:50:33 PM CET, Bruce Momjian >wrote: >> >On Thu, Dec 25, 2014 at 11:57:29AM +0530, Abhijit Menon-Sen wrote: >> >> Hi. >> >> >> >> Here's a proposed patch to use CPUID at startup to determine if >the >> >> SSE4.2 CRC instructions are available, to use them instead of the >> >> slice-by-8 implementation (posted earlier). >> >> >> >> A few notes: >> >> >> >> 1. GCC has included cpuid.h since 4.3.0, so I figured it was safe >to >> >>use. It can be replaced with some inline assembly otherwise. >> >> >> >> 2. I've also used the crc32b/crc32q instructions directly rather >than >> >>using ".bytes" to encode the instructions; binutils versions >since >> >>2007 or so have supported them. >> >> >> >> 3. I've included the MSVC implementation mostly as an example of >how >> >to >> >>extend this to different compilers/platforms. It's written >> >according >> >>to the documentation for MSVC intrinsics, but I have not tested >> >it. >> >>Suggestions/improvements are welcome. >> > >> >Uh, what happens if the system is compiled on a different CPU than >it >> >is >> >run on? Seems we would need a run-time CPU test. >> >> That's the cpuid thing mentioned above. > >Oh, so cpuid is not a macro exported by the compiler but a C API. >Good. Neither, really. It's an instruction returning CPU capabilities. -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Some other odd buildfarm failures
Tom Lane wrote: > Alvaro Herrera writes: > > Tom Lane wrote: > >> Alvaro Herrera writes: > >>> Hm, maybe we can drop the event trigger explicitly first, then wait a > >>> little bit, then drop the remaining objects with DROP CASCADE? > > >> As I said, that's no fix; it just makes the timing harder to hit. Another > >> process could be paused at the critical point for longer than whatever "a > >> little bit" is. > > > Yeah, I was thinking we could play some games with the currently running > > XIDs from a txid_snapshot or some such, with a reasonable upper limit on > > the waiting time (for the rare cases with a server doing other stuff > > with long-running transactions.) > > Whether that's sane or not, the whole problem is so far out-of-scope for > a test of pg_get_object_address() that it's not even funny. I think > we should adopt one of the two fixes I recommended and call it good. I think dropping the part involving an event trigger from the test is reasonable. I will go do that. > If you want to work on making DROP EVENT TRIGGER safer in the long run, > that can be a separate activity. This sounds like a huge project -- it's not like event triggers are the only objects in the system where this is an issue, is it? I'm sure there is value in fixing it, but I have enough other projects. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] What exactly is our CRC algorithm?
On Fri, Dec 26, 2014 at 04:52:58PM +0100, Andres Freund wrote: > On December 26, 2014 4:50:33 PM CET, Bruce Momjian wrote: > >On Thu, Dec 25, 2014 at 11:57:29AM +0530, Abhijit Menon-Sen wrote: > >> Hi. > >> > >> Here's a proposed patch to use CPUID at startup to determine if the > >> SSE4.2 CRC instructions are available, to use them instead of the > >> slice-by-8 implementation (posted earlier). > >> > >> A few notes: > >> > >> 1. GCC has included cpuid.h since 4.3.0, so I figured it was safe to > >>use. It can be replaced with some inline assembly otherwise. > >> > >> 2. I've also used the crc32b/crc32q instructions directly rather than > >>using ".bytes" to encode the instructions; binutils versions since > >>2007 or so have supported them. > >> > >> 3. I've included the MSVC implementation mostly as an example of how > >to > >>extend this to different compilers/platforms. It's written > >according > >>to the documentation for MSVC intrinsics, but I have not tested > >it. > >>Suggestions/improvements are welcome. > > > >Uh, what happens if the system is compiled on a different CPU than it > >is > >run on? Seems we would need a run-time CPU test. > > That's the cpuid thing mentioned above. Oh, so cpuid is not a macro exported by the compiler but a C API. Good. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Some other odd buildfarm failures
Alvaro Herrera writes: > Tom Lane wrote: >> Alvaro Herrera writes: >>> Hm, maybe we can drop the event trigger explicitly first, then wait a >>> little bit, then drop the remaining objects with DROP CASCADE? >> As I said, that's no fix; it just makes the timing harder to hit. Another >> process could be paused at the critical point for longer than whatever "a >> little bit" is. > Yeah, I was thinking we could play some games with the currently running > XIDs from a txid_snapshot or some such, with a reasonable upper limit on > the waiting time (for the rare cases with a server doing other stuff > with long-running transactions.) Whether that's sane or not, the whole problem is so far out-of-scope for a test of pg_get_object_address() that it's not even funny. I think we should adopt one of the two fixes I recommended and call it good. If you want to work on making DROP EVENT TRIGGER safer in the long run, that can be a separate activity. regards, tom lane
Re: [HACKERS] Some other odd buildfarm failures
Tom Lane wrote: > Alvaro Herrera writes: > > Tom Lane wrote: > >> I've not proven this rigorously, but it seems obvious in hindsight: > >> what's happening is that when the object_address test drops everything > >> with DROP CASCADE, other processes are sometimes just starting to execute > >> the event trigger when the DROP commits. When they go to look up the > >> trigger function, they don't find it, leading to "cache lookup failed for > >> function". > > > Hm, maybe we can drop the event trigger explicitly first, then wait a > > little bit, then drop the remaining objects with DROP CASCADE? > > As I said, that's no fix; it just makes the timing harder to hit. Another > process could be paused at the critical point for longer than whatever "a > little bit" is. Yeah, I was thinking we could play some games with the currently running XIDs from a txid_snapshot or some such, with a reasonable upper limit on the waiting time (for the rare cases with a server doing other stuff with long-running transactions.) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Some other odd buildfarm failures
Alvaro Herrera writes: > Tom Lane wrote: >> I've not proven this rigorously, but it seems obvious in hindsight: >> what's happening is that when the object_address test drops everything >> with DROP CASCADE, other processes are sometimes just starting to execute >> the event trigger when the DROP commits. When they go to look up the >> trigger function, they don't find it, leading to "cache lookup failed for >> function". > Hm, maybe we can drop the event trigger explicitly first, then wait a > little bit, then drop the remaining objects with DROP CASCADE? As I said, that's no fix; it just makes the timing harder to hit. Another process could be paused at the critical point for longer than whatever "a little bit" is. regards, tom lane
Re: [HACKERS] Some other odd buildfarm failures
Tom Lane wrote: > Alvaro Herrera writes: > > Tom Lane wrote: > >> Still, I don't think this is a reasonable test design. We have > >> absolutely no idea what behaviors are being triggered in the other > >> tests, except that they are unrelated to what those tests think they > >> are testing. > > > I can of course move it to a separate parallel test, but I don't think > > that should be really necessary. > > I've not proven this rigorously, but it seems obvious in hindsight: > what's happening is that when the object_address test drops everything > with DROP CASCADE, other processes are sometimes just starting to execute > the event trigger when the DROP commits. When they go to look up the > trigger function, they don't find it, leading to "cache lookup failed for > function". Hm, maybe we can drop the event trigger explicitly first, then wait a little bit, then drop the remaining objects with DROP CASCADE? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] What exactly is our CRC algorithm?
On December 26, 2014 4:50:33 PM CET, Bruce Momjian wrote: >On Thu, Dec 25, 2014 at 11:57:29AM +0530, Abhijit Menon-Sen wrote: >> Hi. >> >> Here's a proposed patch to use CPUID at startup to determine if the >> SSE4.2 CRC instructions are available, to use them instead of the >> slice-by-8 implementation (posted earlier). >> >> A few notes: >> >> 1. GCC has included cpuid.h since 4.3.0, so I figured it was safe to >>use. It can be replaced with some inline assembly otherwise. >> >> 2. I've also used the crc32b/crc32q instructions directly rather than >>using ".bytes" to encode the instructions; binutils versions since >>2007 or so have supported them. >> >> 3. I've included the MSVC implementation mostly as an example of how >to >>extend this to different compilers/platforms. It's written >according >>to the documentation for MSVC intrinsics, but I have not tested >it. >>Suggestions/improvements are welcome. > >Uh, what happens if the system is compiled on a different CPU than it >is >run on? Seems we would need a run-time CPU test. That's the cpuid thing mentioned above. Andres -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] BUG #12330: ACID is broken for unique constraints
Kevin Grittner writes: > Are there any objections to generating a write conflict instead of > a duplicate key error if the duplicate key was added by a > concurrent transaction? Yes. This will deliver a less meaningful error code, *and* break existing code that is expecting the current behavior. regards, tom lane
Re: [HACKERS] What exactly is our CRC algorithm?
On Thu, Dec 25, 2014 at 11:57:29AM +0530, Abhijit Menon-Sen wrote: > Hi. > > Here's a proposed patch to use CPUID at startup to determine if the > SSE4.2 CRC instructions are available, to use them instead of the > slice-by-8 implementation (posted earlier). > > A few notes: > > 1. GCC has included cpuid.h since 4.3.0, so I figured it was safe to >use. It can be replaced with some inline assembly otherwise. > > 2. I've also used the crc32b/crc32q instructions directly rather than >using ".bytes" to encode the instructions; binutils versions since >2007 or so have supported them. > > 3. I've included the MSVC implementation mostly as an example of how to >extend this to different compilers/platforms. It's written according >to the documentation for MSVC intrinsics, but I have not tested it. >Suggestions/improvements are welcome. Uh, what happens if the system is compiled on a different CPU than it is run on? Seems we would need a run-time CPU test. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
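For readers following the thread, the shape of the proposal can be sketched in Python: a run-time capability probe plus a plain software CRC-32C to fall back on. Reading /proc/cpuinfo is a Linux-only stand-in for the CPUID instruction the actual C patch issues, and the bit-at-a-time loop below is far slower than slice-by-8; both are illustrations of the idea, not the patch itself:

```python
# Illustrative sketch only: the real patch executes CPUID at startup; we
# approximate the same run-time test by parsing Linux /proc/cpuinfo flags
# (a platform assumption, not what the C patch does).

def cpu_has_sse42():
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return "sse4_2" in line.split()
    except OSError:
        pass
    return False  # unknown platform: assume no hardware CRC

# Software fallback: bit-at-a-time CRC-32C (Castagnoli), the same polynomial
# the SSE4.2 crc32b/crc32q instructions compute.  0x82F63B78 is the
# reflected form of the 0x1EDC6F41 generator polynomial.

def crc32c(data, crc=0):
    crc ^= 0xFFFFFFFF                  # initial value
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF            # final XOR
```

The standard check value for CRC-32C is crc32c(b"123456789") == 0xE3069283, which is a quick way to confirm that a hardware and a software path agree on the same algorithm.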
Re: [HACKERS] BUG #12330: ACID is broken for unique constraints
On 2014-12-26 17:23, Kevin Grittner wrote: Are there any objections to generating a write conflict instead of a duplicate key error if the duplicate key was added by a concurrent transaction? Only for transactions at isolation level REPEATABLE READ or higher? Is it possible to distinguish between an actual write conflict and a completely unrelated UNIQUE constraint ending up violated for some reason? .marko
Re: [HACKERS] Some other odd buildfarm failures
Alvaro Herrera writes: > Tom Lane wrote: >> Still, I don't think this is a reasonable test design. We have >> absolutely no idea what behaviors are being triggered in the other >> tests, except that they are unrelated to what those tests think they >> are testing. > I can of course move it to a separate parallel test, but I don't think > that should be really necessary. I've not proven this rigorously, but it seems obvious in hindsight: what's happening is that when the object_address test drops everything with DROP CASCADE, other processes are sometimes just starting to execute the event trigger when the DROP commits. When they go to look up the trigger function, they don't find it, leading to "cache lookup failed for function". The fact that the complained-of OID is slightly variable, but always in the range of OIDs that would be getting assigned around this point in a "make check" run, buttresses the theory. I thought about changing the object_address test so that it explicitly drops the event trigger first. But that would not be a fix, it would just make the timing harder to hit (ie, a victim process would need to lose control for longer at the critical point). Since I remain of the opinion that a test called object_address has no damn business causing global side-effects, I think there are two reasonable fixes: 1. Remove the event trigger. This would slightly reduce the test's coverage. 2. Run that whole test as a single transaction, so that the event trigger is created and dropped in one transaction and is never seen as valid by any concurrent test. A long-term idea is to try to fix things so that there's sufficient locking to make dropping an event trigger and immediately dropping its trigger function safe. But I'm not sure that's either possible or a good idea (the lock obtained by DROP would bring the entire database to a standstill ...). 
regards, tom lane
Re: [HACKERS] BUG #12330: ACID is broken for unique constraints
"nikita.y.vol...@mail.ru" wrote: > Executing concurrent transactions inserting the same value of a > unique key fails with the "duplicate key" error under code > "23505" instead of any of transaction conflict errors with a > "40***" code. This is true, and can certainly be inconvenient when using serializable transactions to simplify handling of race conditions, because you can't just test for a SQLSTATE of '40001' or '40P01' to indicate the need to retry the transaction. You have two reasonable ways to avoid duplicate keys if the values are synthetic and automatically generated. One is to use a SEQUENCE object to generate the values. The other (really only recommended if gaps in the sequence are a problem) is to have the serializable transaction update a row to "claim" the number. Otherwise you need to consider errors related to duplicates as possibly being caused by a concurrent transaction. You may want to do one transaction retry in such cases, and fail if an identical error is seen. Keep in mind that these errors do not allow serialization anomalies to appear in the committed data, so are arguably not violations of ACID principles -- more of a wart on the otherwise clean technique of using serializable transactions to simplify application programming under concurrency. Thinking about it just now I think we might be able to generate a write conflict instead of a duplicate key error for this case by checking the visibility information for the duplicate row. It might not even have a significant performance impact, since we need to check visibility information to generate the duplicate key error. That would still leave similar issues (where similar arguments can be made) relating to foreign keys; but those can largely be addressed already by declaring the constraints to be DEFERRED -- and anyway, that would be a separate fix. 
I'm moving this discussion to the -hackers list so that I can ask other developers: Are there any objections to generating a write conflict instead of a duplicate key error if the duplicate key was added by a concurrent transaction? Only for transactions at isolation level REPEATABLE READ or higher? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] explain sortorder
>Heikki Linnakangas writes: >> I would suggest just adding the information to the Sort Key line. As >> long as you don't print the modifiers when they are defaults (ASC and >> NULLS LAST), we could print the information even in non-VERBOSE mode. >+1. I had assumed without looking that that was what it did already, >else I'd have complained too. > regards, tom lane We will change the patch according to Heikki's suggestions. A nice Christmas & all the best in the New Year Arne Scheffer http://www.uni-muenster.de/ZIV/Mitarbeiter/ArneScheffer.shtml
Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates
On 24 December 2014 at 16:04, Andreas Karlsson wrote: On 12/16/2014 11:04 AM, David Rowley wrote:> These are some very promising > performance increases. > >> >> I've done a quick pass of reading the patch. I currently don't have a >> system with a 128bit int type, but I'm working on that. >> > > Sorry for taking some time to get back. I have been busy before Christmas. > A new version of the patch is attached. > > Ok I've had another look at this, and I think the only things that I have to say have been mentioned already: 1. Do we need to keep the 128 byte aggregate state size for machines without 128 bit ints? This has been reduced to 48 bytes in the patch, which is in favour of code being compiled with a compiler which has 128 bit ints. I kind of think that we need to keep the 128 byte estimates for compilers that don't support int128, but I'd like to hear any counter arguments. 2. References to int16 meaning 16 bytes. I'm really in two minds about this, it's quite nice to keep the natural flow, int4, int8, int16, but I can't help thinking that this will confuse someone one day. I think it'd be a long time before it confused anyone if we called it int128 instead, but I'm not that excited about seeing it renamed either. I'd like to hear what others have to say... Is there a chance that some sql standard in the distant future will have HUGEINT and we might regret not getting the internal names nailed down? I also checked the performance of some windowing function calls, since these call the final function for each row, I thought I'd better make sure there was no regression as the final function must perform a conversion from int128 to numeric for each row. 
It seems there's still an increase in performance: Setup: create table bi_win (i bigint primary key); insert into bi_win select x.x from generate_series(1,1) x(x); vacuum analyze; Query: select sum(i) over () from bi_win; ** Master ./pgbench -f ~/int128_window.sql -n -T 60 postgres transaction type: Custom query scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 60 s number of transactions actually processed: 6567 latency average: 9.137 ms tps = 109.445841 (including connections establishing) tps = 109.456941 (excluding connections establishing) ** Patched ./pgbench -f ~/int128_window.sql -n -T 60 postgres transaction type: Custom query scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 60 s number of transactions actually processed: 7841 latency average: 7.652 ms tps = 130.670253 (including connections establishing) tps = 130.675743 (excluding connections establishing) Setup: create table i_win (i int primary key); insert into i_win select x.x from generate_series(1,1) x(x); vacuum analyze; Query: select stddev(i) over () from i_win; ** Master ./pgbench -f ~/int128_window.sql -n -T 60 postgres transaction type: Custom query scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 60 s number of transactions actually processed: 5084 latency average: 11.802 ms tps = 84.730362 (including connections establishing) tps = 84.735693 (excluding connections establishing) ** Patched ./pgbench -f ~/int128_window.sql -n -T 60 postgres transaction type: Custom query scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 60 s number of transactions actually processed: 7557 latency average: 7.940 ms tps = 125.934787 (including connections establishing) tps = 125.943176 (excluding connections establishing) Regards David Rowley
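As a back-of-envelope check (ours, not from the patch) of why a 128-bit accumulator is enough for sum(bigint): the worst-case magnitude of a sum of n int64 values is n * INT64_MAX, which only reaches the signed-int128 limit of 2^127 at roughly 2^64 rows, far beyond any real table.

```python
# Back-of-envelope check (not from the patch): a 128-bit signed accumulator
# summing n int64 values can only overflow once n * INT64_MAX reaches 2^127,
# i.e. at roughly 2^64 rows.  Python's arbitrary-precision ints let us
# verify the bound exactly.

INT64_MAX = 2**63 - 1

def int128_sum_is_safe(n_rows):
    """True if the worst-case |sum| of n_rows bigint values fits in int128."""
    return n_rows * INT64_MAX < 2**127
```

For example, int128_sum_is_safe(2**63) holds, while int128_sum_is_safe(2**65) does not, so overflow only becomes possible at row counts no table can reach.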
Re: [HACKERS] Additional role attributes && superuser review
On Wed, Dec 24, 2014 at 6:48 PM, Adam Brightwell < adam.brightw...@crunchydatasolutions.com> wrote: > All, > > I want to revive this thread and continue to move these new role > attributes forward. > > In summary, the ultimate goal is to include new role attributes for common > operations which currently require superuser privileges. > > Initially proposed were the following attributes: > > * BACKUP - allows role to perform backup operations > * LOGROTATE - allows role to rotate log files > * MONITOR - allows role to view pg_stat_* details > * PROCSIGNAL - allows role to signal backend processes > > It seems that PROCSIGNAL and MONITOR were generally well received and > probably don't warrant much more discussion at this point. > > However, based on previous discussions, there seemed to be some > uncertainty on how to handle BACKUP and LOGROTATE. > > Concerns: > > * LOGROTATE - only associated with one function/operation. > * BACKUP - perceived to be too broad of a permission as it would > provide the ability to run pg_start/stop_backup and the xlog related > functions. The general sentiment is that these should be handled as > separate privileges. > * BACKUP - preferred usage is with pg_dump, to give a user the ability to > run pg_dump on the whole database without being superuser. > > Previous Recommendations: > > * LOGROTATE - Use OPERATOR - concern was expressed that this might be too > general of an attribute for this purpose. Also, concern for privilege > 'upgrades' as it includes more capabilities in later releases. > * LOGROTATE - Use LOG_OPERATOR - generally accepted, but concern was raised > for using extraneous descriptors such as '_OPERATOR' and '_ADMIN', etc. > * BACKUP - Use WAL_CONTROL for pg_start/stop_backup - no major > disagreement, though same concern regarding extraneous descriptors. > * BACKUP - Use XLOG_OPERATOR for xlog operations - no major disagreement, > though same concern regarding extraneous descriptors. 
> * BACKUP - Use BACKUP for granting non-superuser ability to run pg_dump on > whole database. > > Given the above and previous discussions: > > I'd like to propose the following new role attributes: > > BACKUP - allows role to perform pg_dump* backups of whole database. > I'd suggest it's called DUMP if that's what it allows, to keep it separate from the backup parts. > WAL - allows role to execute pg_start_backup/pg_stop_backup functions. > XLOG - allows role to execute xlog operations. Those seem like really bad names, IMHO. Why? Because we use WAL and XLOG throughout documentation and parameters and code to mean *the same thing*. And here they'd suddenly mean different things. If we need them as separate privileges, I think we need much better names. (And a better description - what is "xlog operations" really?) And the one under WAL would be very similar to what REPLICATION does today. Or are you saying that it should specifically *not* allow base backups done through the replication protocol, only the exclusive ones? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
2014-12-26 11:41 GMT+01:00 Pavel Stehule : > > > 2014-12-25 22:23 GMT+01:00 Alex Shulgin : > >> Trent Shipley writes: >> >> > On Friday 2007-12-14 16:22, Tom Lane wrote: >> >> Neil Conway writes: >> >> > By modifying COPY: COPY IGNORE ERRORS or some such would instruct >> COPY >> >> > to drop (and log) rows that contain malformed data. That is, rows >> with >> >> > too many or too few columns, rows that result in constraint >> violations, >> >> > and rows containing columns where the data type's input function >> raises >> >> > an error. The last case is the only thing that would be a bit tricky >> to >> >> > implement, I think: you could use PG_TRY() around the >> InputFunctionCall, >> >> > but I guess you'd need a subtransaction to ensure that you reset your >> >> > state correctly after catching an error. >> >> >> >> Yeah. It's the subtransaction per row that's daunting --- not only the >> >> cycles spent for that, but the ensuing limitation to 4G rows imported >> >> per COPY. >> > >> > You could extend the COPY FROM syntax with a COMMIT EVERY n clause. >> This >> > would help with the 4G subtransaction limit. The cost to the ETL >> process is >> > that a simple rollback would not be guaranteed send the process back to >> it's >> > initial state. There are easy ways to deal with the rollback issue >> though. >> > >> > A {NO} RETRY {USING algorithm} clause might be useful. If the NO RETRY >> > option is selected then the COPY FROM can run without subtransactions >> and in >> > excess of the 4G per transaction limit. NO RETRY should be the default >> since >> > it preserves the legacy behavior of COPY FROM. >> > >> > You could have an EXCEPTIONS TO {filename|STDERR} clause. I would not >> give the >> > option of sending exceptions to a table since they are presumably >> malformed, >> > otherwise they would not be exceptions. (Users should re-process >> exception >> > files if they want an if good then table a else exception to table b >> ...) 
>> > >> > EXCEPTIONS TO and NO RETRY would be mutually exclusive. >> > >> > >> >> If we could somehow only do a subtransaction per failure, things would >> >> be much better, but I don't see how. >> >> Hello, >> >> Attached is a proof of concept patch for this TODO item. There is no >> docs yet, I just wanted to know if approach is sane. >> >> The added syntax is like the following: >> >> COPY [table] FROM [file/program/stdin] EXCEPTIONS TO [file or stdout] >> >> The way it's done it is abusing Copy Both mode and from my limited >> testing, that seems to just work. The error trapping itself is done >> using PG_TRY/PG_CATCH and can only catch formatting or before-insert >> trigger errors, no attempt is made to recover from a failed unique >> constraint, etc. >> >> Example in action: >> >> postgres=# \d test_copy2 >> Table "public.test_copy2" >> Column | Type | Modifiers >> +-+--- >> id | integer | >> val| integer | >> >> postgres=# copy test_copy2 from program 'seq 3' exceptions to stdout; >> 1 >> NOTICE: missing data for column "val" >> CONTEXT: COPY test_copy2, line 1: "1" >> 2 >> NOTICE: missing data for column "val" >> CONTEXT: COPY test_copy2, line 2: "2" >> 3 >> NOTICE: missing data for column "val" >> CONTEXT: COPY test_copy2, line 3: "3" >> NOTICE: total exceptions ignored: 3 >> >> postgres=# \d test_copy1 >> Table "public.test_copy1" >> Column | Type | Modifiers >> +-+--- >> id | integer | not null >> >> postgres=# set client_min_messages to warning; >> SET >> postgres=# copy test_copy1 from program 'ls /proc' exceptions to stdout; >> ... >> vmstat >> zoneinfo >> postgres=# >> >> Limited performance testing shows no significant difference between >> error-catching and plain code path. For example, timing >> >> copy test_copy1 from program 'seq 100' [exceptions to stdout] >> >> shows similar numbers with or without the added "exceptions to" clause. 
>> >> Now that I'm sending this I wonder if the original comment about the >> need for subtransaction around every loaded line still holds. Any >> example of what would be not properly rolled back by just PG_TRY? >> > > this method is unsafe .. exception handlers don't usually free memory - > there is a risk of memory and resource leaks > > you can achieve the same performance with block subtransactions - when you use > a subtransaction per 1000 rows, the impact of subtransactions is minimal > > when a block fails, you can fall back to row-level subtransactions - this works > well when you expect almost correct data. > Two years ago I wrote an extension that did this - but I have not had time to finish it and push it upstream. Regards Pavel > > Regards > > Pavel > > >> >> Happy hacking! >> -- >> Alex >> >> >> >> > ftcopy-04.tgz Description: GNU Zip compressed
Re: [HACKERS] POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
2014-12-25 22:23 GMT+01:00 Alex Shulgin : > Trent Shipley writes: > > > On Friday 2007-12-14 16:22, Tom Lane wrote: > >> Neil Conway writes: > >> > By modifying COPY: COPY IGNORE ERRORS or some such would instruct COPY > >> > to drop (and log) rows that contain malformed data. That is, rows with > >> > too many or too few columns, rows that result in constraint > violations, > >> > and rows containing columns where the data type's input function > raises > >> > an error. The last case is the only thing that would be a bit tricky > to > >> > implement, I think: you could use PG_TRY() around the > InputFunctionCall, > >> > but I guess you'd need a subtransaction to ensure that you reset your > >> > state correctly after catching an error. > >> > >> Yeah. It's the subtransaction per row that's daunting --- not only the > >> cycles spent for that, but the ensuing limitation to 4G rows imported > >> per COPY. > > > > You could extend the COPY FROM syntax with a COMMIT EVERY n clause. This > > would help with the 4G subtransaction limit. The cost to the ETL > process is > > that a simple rollback would not be guaranteed send the process back to > it's > > initial state. There are easy ways to deal with the rollback issue > though. > > > > A {NO} RETRY {USING algorithm} clause might be useful. If the NO RETRY > > option is selected then the COPY FROM can run without subtransactions > and in > > excess of the 4G per transaction limit. NO RETRY should be the default > since > > it preserves the legacy behavior of COPY FROM. > > > > You could have an EXCEPTIONS TO {filename|STDERR} clause. I would not > give the > > option of sending exceptions to a table since they are presumably > malformed, > > otherwise they would not be exceptions. (Users should re-process > exception > > files if they want an if good then table a else exception to table b ...) > > > > EXCEPTIONS TO and NO RETRY would be mutually exclusive. 
> >> If we could somehow only do a subtransaction per failure, things would
> >> be much better, but I don't see how.
>
> Hello,
>
> Attached is a proof of concept patch for this TODO item.  There are no
> docs yet, I just wanted to know if the approach is sane.
>
> The added syntax is like the following:
>
>   COPY [table] FROM [file/program/stdin] EXCEPTIONS TO [file or stdout]
>
> The way it's done is by abusing Copy Both mode, and from my limited
> testing that seems to just work.  The error trapping itself is done
> using PG_TRY/PG_CATCH and can only catch formatting or before-insert
> trigger errors; no attempt is made to recover from a failed unique
> constraint, etc.
>
> Example in action:
>
> postgres=# \d test_copy2
>      Table "public.test_copy2"
>  Column |  Type   | Modifiers
> --------+---------+-----------
>  id     | integer |
>  val    | integer |
>
> postgres=# copy test_copy2 from program 'seq 3' exceptions to stdout;
> 1
> NOTICE:  missing data for column "val"
> CONTEXT:  COPY test_copy2, line 1: "1"
> 2
> NOTICE:  missing data for column "val"
> CONTEXT:  COPY test_copy2, line 2: "2"
> 3
> NOTICE:  missing data for column "val"
> CONTEXT:  COPY test_copy2, line 3: "3"
> NOTICE:  total exceptions ignored: 3
>
> postgres=# \d test_copy1
>      Table "public.test_copy1"
>  Column |  Type   | Modifiers
> --------+---------+-----------
>  id     | integer | not null
>
> postgres=# set client_min_messages to warning;
> SET
> postgres=# copy test_copy1 from program 'ls /proc' exceptions to stdout;
> ...
> vmstat
> zoneinfo
> postgres=#
>
> Limited performance testing shows no significant difference between the
> error-catching and plain code paths.  For example, timing
>
>   copy test_copy1 from program 'seq 100' [exceptions to stdout]
>
> shows similar numbers with or without the added "exceptions to" clause.
>
> Now that I'm sending this I wonder if the original comment about the
> need for a subtransaction around every loaded line still holds.  Any
> example of what would not be properly rolled back by just PG_TRY?

This method is unsafe:
exception handlers usually don't free memory, so there is a risk of
memory and resource leaks.

You can get the same performance with block subtransactions: when you
use one subtransaction per 1000 rows, the overhead of subtransactions
is minimal.  When a block fails, you can fall back to row-level
subtransactions; that works well when you expect mostly correct data.

Regards

Pavel

> Happy hacking!
> --
> Alex
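Pavel's block-then-row fallback can be sketched as follows.  This is a Python model of the control flow only (the real thing would use PostgreSQL subtransactions in C); `insert` stands in for the per-row work and may raise, and all names are illustrative.

```python
def load_with_block_subxacts(rows, insert, block_size=1000):
    """Sketch of the block-subtransaction strategy.

    Fast path: attempt a whole block inside one 'subtransaction'
    (simulated here by staging the block and committing all-or-nothing).
    Slow path: only when a block fails, retry its rows one at a time,
    each in its own 'subtransaction', so the per-row overhead is paid
    only for blocks that actually contain bad data.
    """
    committed, rejected = [], []
    for i in range(0, len(rows), block_size):
        block = rows[i:i + block_size]
        staged = []
        try:
            for row in block:              # fast path: whole block at once
                staged.append(insert(row))
            committed.extend(staged)       # "commit" the block subtransaction
        except Exception:                  # block "rolls back": discard staged
            for row in block:              # slow path: row-level retry
                try:
                    committed.append(insert(row))
                except Exception:
                    rejected.append(row)
    return committed, rejected
```

With mostly-correct input, almost every block takes the fast path, which is why the amortized cost approaches that of no subtransactions at all.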
Re: [HACKERS] btree_gin and ranges
> Teodor's patch could use some more comments.  The
> STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are a good idea, but they probably
> should go into src/include/access/gin.h so that they can be used in all
> compare_partial implementations.

The STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and
the comments are improved.  The patch is split in two: gin and module.

--
Teodor Sigaev                      E-mail: teo...@sigaev.ru
                                   WWW: http://www.sigaev.ru/

gin_macros.patch.gz
Description: GNU Zip compressed data

btree_gin_range-3.patch.gz
Description: GNU Zip compressed data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
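For readers outside the thread: GIN's comparePartial support function returns zero when an index key matches the partial query, a negative value to skip the key but continue the scan, and a positive value to stop the scan because no further keys can match; those three outcomes are presumably what the STOP_SCAN/MATCH_SCAN/CONT_SCAN macros name.  A minimal Python sketch of that convention, using a string prefix scan over sorted keys (the constants' exact values and the function names are illustrative):

```python
# The three comparePartial outcomes (values illustrative; only the sign
# matters in GIN's convention).
MATCH_SCAN = 0    # index key matches the partial query
CONT_SCAN = -1    # no match, but keep scanning
STOP_SCAN = 1     # no further keys can match; end the scan

def compare_partial_prefix(partial, key):
    """A prefix-match comparePartial, sketched for string keys."""
    if key.startswith(partial):
        return MATCH_SCAN
    if key < partial:
        return CONT_SCAN      # still before the matching range
    return STOP_SCAN          # past the range; sorted order means we're done

def scan(partial, sorted_keys):
    """Drive compare_partial_prefix over keys in index (sorted) order."""
    matches = []
    for key in sorted_keys:
        result = compare_partial_prefix(partial, key)
        if result == STOP_SCAN:
            break
        if result == MATCH_SCAN:
            matches.append(key)
    return matches
```

The point of shared macros is exactly that every compare_partial implementation encodes these three cases the same way, instead of hand-rolling the sign convention.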