[HACKERS] SELECT FOR UPDATE regression in 9.5
Hello list

While testing an application with PostgreSQL 9.5, we experienced an issue involving aborted subtransactions and SELECT FOR UPDATE. In certain situations, a locking query doesn't return rows that should be visible and are already locked by the current transaction. I bisected this down to commit 27846f02c176eebe7e08ce51ed4d52140454e196 "Optimize locking a tuple already locked by another subxact".

This issue is also reproducible on the current master branch. In an assertions-enabled build, it trips an assertion in HeapTupleHeaderGetCmax, called by heap_lock_tuple.

The following test case demonstrates the issue:

CREATE TABLE IF NOT EXISTS testcase(
  id int PRIMARY KEY,
  balance numeric
);
INSERT INTO testcase VALUES (1, 0);

BEGIN;
SELECT * FROM testcase WHERE testcase.id = 1 FOR UPDATE;
UPDATE testcase SET balance = balance + 400 WHERE id = 1;
SAVEPOINT subxact;
UPDATE testcase SET balance = balance - 100 WHERE id = 1;
ROLLBACK TO SAVEPOINT subxact;

-- "division by zero" shouldn't occur because I never deleted any rows
SELECT 1/count(*) FROM (
  SELECT * FROM testcase WHERE id = 1 FOR UPDATE
) x;
ROLLBACK;

Regards, Marti Raudsepp
Re: [HACKERS] [PATCH] Skip ALTER x SET SCHEMA if the schema didn't change
Hi Haribabu Kommi Thank you so much for the review and patch update. I should have done that myself, but I've been really busy for the last few weeks. :( Regards, Marti On Mon, Nov 16, 2015 at 4:46 AM, Haribabu Kommi <kommi.harib...@gmail.com> wrote: > On Thu, Nov 5, 2015 at 10:20 PM, Haribabu Kommi > <kommi.harib...@gmail.com> wrote: > > On Tue, Sep 29, 2015 at 12:17 AM, Marti Raudsepp <ma...@juffo.org> > wrote: > >> Hi list > >> > >> The attached patch changes the behavior of multiple ALTER x SET SCHEMA > >> commands, to skip, rather than fail, when the old and new schema is > >> the same. > >> > >> The advantage is that it's now easier to write DDL scripts that are > indempotent. > >> > >> This already matches the behavior of ALTER EXTENSION SET SCHEMA in > >> earlier versions, as well as many other SET-ish commands, e.g. ALTER > >> TABLE SET TABLESPACE, OWNER TO, CLUSTER ON, SET (storage_parameter...) > >> etc. I don't see why SET SCHEMA should be treated any differently. > >> > >> The code is written such that object_access_hook is still called for > >> each object. > >> > >> Regression tests included. I couldn't find any documentation that > >> needs changing. > > > > I went through the patch, following are my observations, > > > > Patch applied with hunks and compiled with out warnings. > > Basic tests are passed. > > > > In AlterTableNamespaceInternal function, if a table or matview called > > for set schema, > > If the object contains any constraints, the constraint gets updated > > with new schema. > > > > In AlterTypeNamespaceInternal function, the InvokeObjectPostAlterHook > function > > doesn't get called if the type is of composite type, domain and array > > types as because > > it just returns from top of the function. > > Most of the community members didn't find any problem in changing the > behavior, so here I attached updated patch with the above two corrections. > > Regards, > Hari Babu > Fujitsu Australia >
Re: [HACKERS] BRIN indexes for MAX, MIN, ORDER BY?
Hi Gavin

Note that Alexander Korotkov already started work in 2013 on a somewhat similar feature called partial sort: http://www.postgresql.org/message-id/CAPpHfdscOX5an71nHd8WSUH6GNOCf=V7wgDaTXdDd9=gon-...@mail.gmail.com

In particular, see the 2nd patch for KNN sort -- it uses known bounding boxes from the GiST index for sorting, which in many ways is similar to min/max BRIN ranges.

> *partial-knn-1.patch*
>
> KNN-GiST provides ability to get ordered results from index, but this order
> is based only on index information. For instance, GiST index contains
> bounding rectangles for polygons, and we can't get exact distance to
> polygon from index (similar situation is in PostGIS). In attached patch,
> GiST distance method can set recheck flag (similar to consistent method).
> This flag means that distance method returned lower bound of distance and
> we should recheck it from heap.

Unfortunately this work has stalled.

Regards, Marti

On Sun, Sep 27, 2015 at 11:58 PM, Gavin Wahl wrote:
> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
> just find the page range with the largest/smallest value, and then only
> scan that one. Would that be hard to implement? I'm interested in working
> on it if someone can give me some pointers.
>
> Somewhat harder but still possible would be using BRIN indexes to
> accelerate ORDER BY. This would require a sorting algorithm that can take
> advantage of mostly-sorted inputs. You would sort the page ranges by their
> minimum or maximum value, then feed the sorting algorithm in that order.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] Skip ALTER x SET SCHEMA if the schema didn't change
Hi list

The attached patch changes the behavior of multiple ALTER x SET SCHEMA commands, to skip, rather than fail, when the old and new schema is the same.

The advantage is that it's now easier to write DDL scripts that are idempotent.

This already matches the behavior of ALTER EXTENSION SET SCHEMA in earlier versions, as well as many other SET-ish commands, e.g. ALTER TABLE SET TABLESPACE, OWNER TO, CLUSTER ON, SET (storage_parameter...) etc. I don't see why SET SCHEMA should be treated any differently.

The code is written such that object_access_hook is still called for each object.

Regression tests included. I couldn't find any documentation that needs changing.

Regards, Marti

0001-Skip-ALTER-x-SET-SCHEMA-if-the-schema-didn-t-change.patch
Description: binary/octet-stream
Re: [HACKERS] Less than ideal error reporting in pg_stat_statements
On Wed, Sep 23, 2015 at 3:01 AM, Peter Geoghegan wrote:
> I think that the real problem here is that garbage collection needs to
> deal with OOM more appropriately.

+1

I've also been seeing lots of log messages saying "LOG: out of memory" on a server that's hosting development databases. I put off debugging this until now because it didn't seem to have any adverse effects on the system.

The file on my system is currently 5.1GB (!). I don't know how it got there -- under normal circumstances we don't have any enormous queries, but perhaps our application bugs during development triggered that. The configuration on this system is pg_stat_statements.max = 1 and pg_stat_statements.track = all.

The comment near gc_qtexts says:

 * This won't be called often in the typical case, since it's likely that
 * there won't be too much churn, and besides, a similar compaction process
 * occurs when serializing to disk at shutdown or as part of resetting.
 * Despite this, it seems prudent to plan for the edge case where the file
 * becomes unreasonably large, with no other method of compaction likely to
 * occur in the foreseeable future.
[...]
 * Load the old texts file. If we fail (out of memory, for instance) just
 * skip the garbage collection.

So, as I understand it: if the system runs low on memory for an extended period, and/or the file grows beyond 1GB (MaxAlloc), garbage collection stops entirely, meaning it starts leaking disk space until a manual intervention. It's very frustrating when debugging aides cause further problems on a system.

If the in-line compaction doesn't materialize (or it's decided not to backport it), I would propose instead to add a test to pgss_store() to avoid growing the file beyond MaxAlloc (or perhaps even a lower limit). Surely dropping some statistics is better than this pathology.

Regards, Marti
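For reference, the only remedy available today on an affected system is a full statistics reset, which discards all entries and rewrites the external query-texts file from scratch (the file path below reflects the 9.4-era implementation; this is my illustration, not part of any patch):

```sql
-- The query texts live in an external file under the data directory:
--   $PGDATA/pg_stat_tmp/pgss_query_texts.stat
-- Once it has ballooned past MaxAlloc and garbage collection gives up,
-- the only way to reclaim the space is to discard all statistics:
SELECT pg_stat_statements_reset();
```

A server restart also compacts the file, since a similar compaction runs when serializing to disk at shutdown.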
Re: [HACKERS] [PATCH] pg_upgrade fails when postgres/template1 isn't in default tablespace
On Wed, Jul 22, 2015 at 5:00 PM, Robert Haas <robertmh...@gmail.com> wrote:
> +1. I would recommend adding it to the CF *immediately* to have it get
> attention. The CF app is basically our patch tracker.

Thanks, I have done so now: https://commitfest.postgresql.org/6/313/

Regards, Marti
Re: [HACKERS] [PATCH] pg_upgrade fails when postgres/template1 isn't in default tablespace
On Mon, Jun 22, 2015 at 9:20 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Jun 19, 2015 at 12:10 PM, Marti Raudsepp <ma...@juffo.org> wrote:
>> One of my databases failed to upgrade successfully and produced this
>> error in the copying phase:
>>
>> error while copying relation pg_catalog.pg_largeobject
>> (/srv/ssd/PG_9.3_201306121/1/12023 to /PG_9.4_201409291/1/12130): No such
>> file or directory
>
> I haven't looked at the patch, but this seems like a good thing to fix.
> Hopefully Bruce will take a look soon.

Ping? This has gone a month without a reply. Should I add this patch to a commitfest? (I was under the impression that bugfixes don't normally go through the commitfest process)

Regards, Marti Raudsepp
[HACKERS] [PATCH] pg_upgrade fails when postgres/template1 isn't in default tablespace
Hi list

One of my databases failed to upgrade successfully and produced this error in the copying phase:

error while copying relation pg_catalog.pg_largeobject (/srv/ssd/PG_9.3_201306121/1/12023 to /PG_9.4_201409291/1/12130): No such file or directory

Turns out this happens when either the postgres or template1 databases have been moved to a non-default tablespace. pg_dumpall does not dump attributes (such as tablespace) for these databases; pg_upgrade queries the new cluster about the tablespace for these relations and builds a broken destination path for the copy/link operation.

The least bad solution seems to be generating ALTER DATABASE SET TABLESPACE commands for these from pg_dumpall. Previously a --globals-only dump didn't generate psql \connect commands, but you can't run SET TABLESPACE from within the same database you're connected to. So to move postgres, it needs to connect to template1 and vice versa. That seems fine for the purpose of pg_upgrade, which can assume a freshly created cluster with both databases intact. I have implemented a proof-of-concept patch for this. Currently I'm only tackling the binary upgrade failure and not general pg_dumpall.

Alternatively, we could allow SET TABLESPACE in the current database, which seems less ugly to me. A code comment says "Obviously can't move the tables of my own database", but it's not obvious to me why. If I'm the only connected backend, it seems that any caches and open files could be invalidated. But I don't know how big of an undertaking that would be.

While working on this, I also noticed other things that pg_dumpall (and thus pg_upgrade) misses:
* Nothing at all is dumped for the template0 database, although ACLs, settings and the tablespace can be changed by the user
* The template1 and postgres databases retain ACLs and settings, but not attributes like TABLESPACE or CONNECTION LIMIT. Other attributes like LC_COLLATE and LC_CTYPE can't be changed without recreating the DB, but those don't matter for the pg_upgrade case anyway.

It seems those would be good material for another patch?

Regards, Marti Raudsepp

0001-Fix-pg_upgrade-when-postgres-template1-aren-t-in-def.patch
Description: binary/octet-stream
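To illustrate the proposed approach (my own sketch -- the tablespace name "ssd_space" is made up and the actual patch output may differ), a --globals-only dump for such a cluster would have to emit something along these lines, connecting to the *other* built-in database before moving each one:

```sql
-- Sketch of proposed pg_dumpall --binary-upgrade output: SET TABLESPACE
-- can't run against the currently connected database, so swap connections.
\connect template1
ALTER DATABASE postgres SET TABLESPACE ssd_space;
\connect postgres
ALTER DATABASE template1 SET TABLESPACE ssd_space;
```

This only works because pg_upgrade can assume a freshly created cluster where both databases exist and accept connections.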
Re: [HACKERS] alter user/role CURRENT_USER
On Fri, Oct 24, 2014 at 11:29 AM, Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> - 0001-ALTER-ROLE-CURRENT_USER_v2.patch - the patch.

+RoleId:		CURRENT_USER	{ $$ = "current_user"; }
+		| USER			{ $$ = "current_user"; }
+		| CURRENT_ROLE		{ $$ = "current_user"; }
+		| SESSION_USER		{ $$ = "session_user"; }

This is kind of ugly, and it means you can't distinguish between a CURRENT_USER keyword and a quoted user name "current_user". It's a legitimate user name, so the behavior of the following now changes:

CREATE ROLE "current_user";
ALTER ROLE "current_user" SET work_mem='10MB';

There ought to be a better way to represent this than using magic string values.

> It accepts CURRENT_USER/USER/CURRENT_ROLE/SESSION_USER.

On a stylistic note, do we really want to support the alternative spellings of CURRENT_USER, like CURRENT_ROLE and USER? The SQL standard is well-hated for inventing more keywords than necessary. There is no precedent for using/allowing these aliases in PostgreSQL DDL commands, and ALTER USER/ROLE aren't SQL standard anyway. So maybe we should stick with just accepting one canonical syntax instead. While the word USER may reasonably arise from an editing mistake, ALTER USER USER does not seem like sane syntax that should be accepted.

From your original email:
> - Modify syntax of ALTER USER so as to be an alias of ALTER ROLE.
> - Added ALL syntax as user name to ALTER USER.

But should ALTER USER ALL and ALTER ROLE ALL really do the same thing? A "user" is a role with the LOGIN option. Every user is a role, but not every role is a user. I suspect that ambiguity was why ALTER USER ALL wasn't originally implemented.

Regards, Marti
Re: [HACKERS] [PATCH] Simplify EXISTS subqueries containing LIMIT
On Wed, Oct 22, 2014 at 1:37 PM, David Rowley <dgrowle...@gmail.com> wrote:
> I've had a bit of a look at this and here's a couple of things:

Thanks. Sorry I didn't get back to you earlier; I almost forgot about the review.

> +	/*
> +	 * LIMIT clause can be removed if it's a positive constant or ALL, to
> +	 * prevent it from being an optimization barrier. It's a common meme to
> +	 * put LIMIT 1 within EXISTS subqueries.
> +	 */
>
> I think this comment may be better explained along the lines of:
>
> "A subquery which has a LIMIT clause with a positive value is effectively a
> no-op in this scenario. With EXISTS we only care about the first row
> anyway, so any positive limit value will have no behavioral change to the
> query, so we'll simply remove the LIMIT clause. If we're unable to prove
> that the LIMIT value is a positive number then we'd better not touch it."

That's a bit redundant. Here's what I wrote instead; does this sound better?

 * A subquery that has a LIMIT clause with a positive value or NULL causes
 * no behavioral change to the query. With EXISTS we only care about the
 * first row anyway, so we'll simply remove the LIMIT clause. If the LIMIT
 * value does not reduce to a constant that's positive or NULL then do not
 * touch it.

> +	/* Checking for negative values is done later; 0 is just silly */
> +	if (!limit->constisnull && DatumGetInt64(limit->constvalue) <= 0)
>
> I'm a bit confused by the comment here. You're saying that we'll check for
> negatives later... but you're checking for <= 0 on the next line... Did I
> miss something or is this a mistake?

I meant that the case where a negative value raises an error is checked later, in the execution pass. But you're right, this code has nothing to do with that, so I've removed the mention of negative values. It now says:

/* If it's not NULL and not positive, keep it as a regular subquery */

> I guess here you're just testing to ensure that you've not broken this and
> that the new code does not accidentally remove the LIMIT clause.
>
> I think it also might be worth adding some tests to ensure that correlated
> EXISTS clauses with a positive LIMIT value get properly changed into a
> SEMI JOIN instead of remaining as a subplan. And also make sure that a
> LIMIT 0 or LIMIT -1 gets left as a subplan. I'd say such a test would
> supersede the above test, and I think it's also the case you were talking
> about optimising anyway?

Sure, this is a planner-only change so it makes sense to test EXPLAIN output. The dilemma I always have is: wouldn't this cause failures due to plan instability, if for example autoanalyze gets around to this table earlier or later and nudges it from a hash join to merge etc? Or is this not a problem?

Patch attached with all the above changes.

Regards, Marti

From 28543dda9febe8d8b5fc91060a4323c08f3c4a8a Mon Sep 17 00:00:00 2001
From: Marti Raudsepp <ma...@juffo.org>
Date: Wed, 1 Oct 2014 02:17:21 +0300
Subject: [PATCH] Simplify EXISTS subqueries containing LIMIT

---
 src/backend/optimizer/plan/subselect.c  | 42 ++++++++++++++++---
 src/test/regress/expected/subselect.out | 51 +++++++++++++++++++++++
 src/test/regress/sql/subselect.sql      | 14 ++++++
 3 files changed, 100 insertions(+), 7 deletions(-)

diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 3e7dc85..66d1b90 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -70,7 +70,7 @@ static Node *convert_testexpr_mutator(Node *node,
 static bool subplan_is_hashable(Plan *plan);
 static bool testexpr_is_hashable(Node *testexpr);
 static bool hash_ok_operator(OpExpr *expr);
-static bool simplify_EXISTS_query(Query *query);
+static bool simplify_EXISTS_query(PlannerInfo *root, Query *query);
 static Query *convert_EXISTS_to_ANY(PlannerInfo *root, Query *subselect,
 				Node **testexpr, List **paramIds);
 static Node *replace_correlation_vars_mutator(Node *node, PlannerInfo *root);
@@ -452,7 +452,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	 * If it's an EXISTS subplan, we might be able to simplify it.
 	 */
 	if (subLinkType == EXISTS_SUBLINK)
-		simple_exists = simplify_EXISTS_query(subquery);
+		simple_exists = simplify_EXISTS_query(root, subquery);

 	/*
 	 * For an EXISTS subplan, tell lower-level planner to expect that only the
@@ -518,7 +518,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 		/* Make a second copy of the original subquery */
 		subquery = (Query *) copyObject(orig_subquery);
 		/* and re-simplify */
-		simple_exists = simplify_EXISTS_query(subquery);
+		simple_exists = simplify_EXISTS_query(root, subquery);
 		Assert(simple_exists);
 		/* See if it can be converted to an ANY query */
 		subquery = convert_EXISTS_to_ANY(root, subquery,
@@ -1359,7 +1359,7 @@ convert_EXISTS_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 * targetlist, we have to fail, because the pullup operation leaves us
Re: [HACKERS] btree_gin and ranges
Hi

On Wed, Oct 22, 2014 at 1:55 PM, Teodor Sigaev <teo...@sigaev.ru> wrote:
> With patch it's possible to rewrite query with ranges
> SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
> and GIN index will support this query with single scan from -1 to 1.

Shouldn't this be implemented in a more generic manner? An ordinary btree index could very well support <@ queries too, but your patch only adds this capability to btree-gin. The documentation describes btree-gin as providing "GIN operator classes that implement B-tree equivalent behavior", but now the behavior diverges.

Regards, Marti
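To make the point concrete (my own illustration, not from the patch): a plain btree index on the same column can already answer an equivalent question when the range is spelled out as a predicate, which is why supporting range containment only in btree_gin seems arbitrary:

```sql
-- Equivalent to "i <@ '[-1, 1]'::int4range"; an ordinary btree index
-- on test_int4(i) can satisfy this with a single range scan from -1 to 1.
SELECT * FROM test_int4 WHERE i BETWEEN -1 AND 1;
```

A generic planner-level transformation from `<@` to range-bound comparisons would benefit both index types.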
Re: [HACKERS] Patch: Add launchd Support
On Tue, Oct 21, 2014 at 3:53 AM, Wim Lewis <w...@omnigroup.com> wrote:
> I think the idea of OnDemand is for launchd items to act a bit like inetd
> does: launchd creates the listening socket (or mach port or file-change
> notification) on the port specified in the plist, and only starts the
> process when someone tries to connect to it. This might not be a terrible
> thing to support, but it would require more changes throughout postgres
> (postmaster would have to check in with launchd at start time to get the
> listen socket; ports and socket paths would no longer be specifiable in
> postgres’ config, etc) and I’m skeptical that it’d be worth the work
> without a more concrete motivation.

systemd socket activation on Linux works pretty much the same way. If this is ever implemented, it's most reasonable to tackle both launchd and systemd together.

Regards, Marti
Re: [HACKERS] [PATCH] Simplify EXISTS subqueries containing LIMIT
Hi

Thanks for taking a look.

On Sun, Oct 19, 2014 at 1:22 PM, David Rowley <dgrowle...@gmail.com> wrote:
> the argument for this would have been much stronger if anti join support
> had just been added last week. It's been quite a few years now and the
> argument for this must be getting weaker with every release.

I see your point, but I would put it another way: we've had this for a few years, but people haven't learned and are *still* using LIMIT 1.

Actually I thought of a cleverer approach to solve this, one that doesn't require calling eval_const_expressions and works with any expression. Given the query:

EXISTS (SELECT ... WHERE foo LIMIT any_expr())

we could turn it into the almost-equivalent form:

EXISTS (SELECT ... WHERE foo AND any_expr() > 0)

The only problem is that we'd no longer be able to throw an error for negative values, and that seems like a deal-breaker.

Regards, Marti
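To make the proposed rewrite concrete (hypothetical tables, my own illustration): the two queries below are equivalent for any positive LIMIT, but only the LIMIT-free form is currently eligible for conversion to a semi-join.

```sql
-- Common anti-pattern: the LIMIT 1 inside EXISTS is redundant, and
-- today it blocks the planner from converting this into a semi-join.
SELECT * FROM orders o
WHERE EXISTS (SELECT 1 FROM order_items i
              WHERE i.order_id = o.id LIMIT 1);

-- Equivalent form that the planner can already turn into a semi-join:
SELECT * FROM orders o
WHERE EXISTS (SELECT 1 FROM order_items i
              WHERE i.order_id = o.id);
```

The patch under discussion would make the planner treat both forms identically by stripping the provably positive LIMIT.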
Re: [HACKERS] Support UPDATE table SET(*)=...
On Oct 17, 2014 6:16 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> A more realistic objection goes like this:
>
> create table foo(f int, g int);
> update foo x set x = (1,2);  -- works
> alter table foo add column x int;
> update foo x set x = (1,2,3);  -- no longer works
>
> It's not a real good thing if a column addition or renaming can so
> fundamentally change the nature of a query.

I think a significant use case for this feature is when you already have a row value and want to persist it in the database, like you can do with INSERT:

insert into foo select * from populate_record_json(null::foo, '{...}');

In this case the opposite is true: requiring explicit column names would break the query if you add columns to the table. The fact that you can't reasonably use populate_record/_json with UPDATE is a significant omission. IMO this really speaks for supporting shorthand whole-row assignment, whatever the syntax.

Regards, Marti
Re: [HACKERS] Support UPDATE table SET(*)=...
Hi

On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma <atri.j...@gmail.com> wrote:
> Please find attached a patch which implements support for UPDATE table1
> SET(*)=...

I presume you haven't read Tom Lane's proposal and discussion about multiple-column assignment in UPDATE: http://www.postgresql.org/message-id/1783.1399054...@sss.pgh.pa.us (Assigning all columns was also discussed there)

And there's a WIP patch: http://www.postgresql.org/message-id/20930.1402931...@sss.pgh.pa.us

Regards, Marti
[HACKERS] [PATCH] Cleanup: unify checks for catalog modification
Hi list,

I happened to notice that there are no less than 14 places in the code that check whether a relation is a system catalog, throwing the error: permission denied: "foo" is a system catalog

The attached patch factors all of those into a single ForbidSystemTableMods() function. Is this considered an improvement? Should I add it to a CommitFest?

Regards, Marti

0001-Cleanup-unify-checks-for-catalog-modification.patch
Description: binary/octet-stream
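For context, the duplicated check in question is what rejects DDL against the system catalogs; for example (my own illustration of the existing behavior):

```sql
-- With allow_system_table_mods off (the default), any of the 14 call
-- sites rejects catalog DDL with the same error:
ALTER TABLE pg_class ADD COLUMN hacked int;
-- ERROR:  permission denied: "pg_class" is a system catalog
```

Factoring the check into one helper keeps the error message and the allow_system_table_mods escape hatch consistent across all call sites.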
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Thu, Oct 9, 2014 at 11:11 AM, Peter Geoghegan <p...@heroku.com> wrote:
> On Thu, Oct 9, 2014 at 12:38 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> Do not use CONFLICTING() which looks like it is a function.
>
> So is ROW(). Or COALESCE().

ROW and COALESCE behave almost like functions: they operate on any expression or value you pass to them.

db=# select coalesce('bar');
 coalesce
----------
 bar

Not so with CONFLICTING(): it only accepts a column name -- not a value -- and has knowledge of the surrounding statement that ordinary function-like constructs don't.

db=# INSERT into evt_type (name) values ('foo') on conflict UPDATE set name=conflicting('bar');
ERROR:  syntax error at or near 'bar'
LINE 1: ...lues ('foo') on conflict UPDATE set name=conflicting('bar');

> If you don't have a word that you think would more clearly indicate the
> intent of the expression, I'm happy to hear suggestions from others.

I also like NEW due to similarity with triggers, but I see your concern about it not actually being new.

Regards, Marti
Re: [HACKERS] Log notice that checkpoint is to be written on shutdown
On Thu, Oct 2, 2014 at 4:21 PM, Michael Banck <michael.ba...@credativ.de> wrote:
> we have seen repeatedly that users can be confused about why PostgreSQL is
> not shutting down even though they requested it. Usually, this is because
> `log_checkpoints' is not enabled and the final checkpoint is being
> written, delaying shutdown.

Are you convinced that that's the *actual* reason? Maybe you're mis-attributing it.

In my experience shutdown delays are often caused by using the default "smart" shutdown mode, which is another way of saying "completely stupid". It waits until all connections to the server are closed -- meaning never in common situations with connection pooling, or when an admin has forgotten their psql shell open. To add insult to injury, you can't open a new connection to invoke pg_terminate_backend() to kill them, either. In the case of a restart, it can cause much longer downtime than a fast/immediate restart would.

Sorry for the rant, but am I the only one hating that default?

Regards, Marti
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Wed, Oct 8, 2014 at 3:47 AM, Peter Geoghegan <p...@heroku.com> wrote:
> It seems like what you're talking about here is just changing the spelling
> of what I already have.

I think there's a subtle difference in expectations too. The current BEFORE INSERT trigger behavior is somewhat defensible with an INSERT-driven syntax (though I don't like it even now [1]). But the MERGE syntax, to me, strongly implies that insertion doesn't begin before determining whether a conflict exists or not.

[1] http://www.postgresql.org/message-id/CABRT9RD6zriK+t6mnqQOqaozZ5z1bUaKh+kNY=o9zqbzfoa...@mail.gmail.com

Regards, Marti
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Tue, Oct 7, 2014 at 2:27 PM, Marti Raudsepp <ma...@juffo.org> wrote:
> but the new approach seems surprising: changes from BEFORE INSERT get
> persisted in the database, but AFTER INSERT is not fired.

I am sorry, I realize now that I misunderstood the currently proposed trigger behavior; I thought what Simon Riggs wrote here already happens: https://groups.google.com/forum/#!msg/django-developers/hdzkoLYVjBY/bnXyBVqx95EJ

But the point still stands: firing INSERT triggers when the UPDATE path is taken is counterintuitive. If we prevent changes to upsert key columns in BEFORE triggers, then we get the benefits, including more straightforward trigger behavior, and avoid problems with serial columns.

Regards, Marti
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Wed, Oct 8, 2014 at 12:28 PM, Peter Geoghegan <p...@heroku.com> wrote:
> On Wed, Oct 8, 2014 at 1:36 AM, Marti Raudsepp <ma...@juffo.org> wrote:
>> I think there's a subtle difference in expectations too. The current
>> BEFORE INSERT trigger behavior is somewhat defensible with an
>> INSERT-driven syntax (though I don't like it even now [1]).
>
> There is no way around it. We need to fire before row triggers to know
> what to insert on the one hand, but on the other hand (in general) we have
> zero ability to nullify the effects (or side-effects) of before triggers,
> since they may execute arbitrary user-defined code.

With my proposal this problem disappears: if we prevent BEFORE triggers from changing key attributes of NEW in the case of upsert, then we can acquire value locks before firing any triggers (before even constructing the whole tuple), and have a guarantee that the value locks are still valid by the time we proceed with the actual insert/update. Other than changing NEW, the side effects of triggers are not relevant.

Now, there may very well be reasons why this is tricky to implement, but I haven't heard any. Can you see any concrete reasons why this won't work? I can take a shot at implementing this, if you're willing to consider it.

> I think there is a good case to be made for severely restricting what
> before row triggers can do, but it's too late for that.

There are no users of the new upsert syntax out there yet, so it's not too late to rehash the semantics of that. This in no way affects users of the old INSERT/UPDATE syntax.

Regards, Marti
Re: [HACKERS] UPSERT wiki page, and SQL MERGE syntax
On Thu, Oct 9, 2014 at 1:51 AM, Peter Geoghegan <p...@heroku.com> wrote:
> On Wed, Oct 8, 2014 at 2:01 PM, Kevin Grittner <kgri...@ymail.com> wrote:
>> Although the last go-around does suggest that there is at least one point
>> of difference on the semantics. You seem to want to fire the BEFORE
>> INSERT triggers before determining whether this will be an INSERT or an
>> UPDATE. That seems like a bad idea to me

Indeed, the current behavior breaks even the canonical "keep track of how many posts are in a thread" trigger example, because INSERT triggers are fired for an effective row UPDATE. I can't see how that's acceptable.

> Well, it isn't that I'm doing it because I think that it is a great idea,
> with everything to recommend it. It's more like I don't see any practical
> alternative.

I proposed an alternative that avoids this surprise and might allow some other benefits. Can you please look into that? Even a "clarify this, this couldn't possibly work" or "you're not a major contributor so your arguments are invalid" response. ;)

I admit I initially misunderstood some details about the current behavior. I can dig into the code and write a clearer and/or more detailed spec if I had even *some* feedback.

Regards, Marti
Re: [HACKERS] UPSERT wiki page, and SQL MERGE syntax
On Thu, Oct 9, 2014 at 2:46 AM, Peter Geoghegan <p...@heroku.com> wrote:
>> Indeed, the current behavior breaks even the canonical "keep track of how
>> many posts are in a thread" trigger example
>
> use an AFTER trigger for this kind of thing, and all of these issues go
> away.

In the latest patches from the CommitFest, the UPDATE case invokes BEFORE and AFTER INSERT triggers, not AFTER UPDATE. Is that going to be changed? If so, I concede that.

>> I proposed an alternative that avoids this surprise and might allow some
>> other benefits. Can you please look into that?
>
> I looked at that.

Thanks!

> You're talking about making the case where before row triggers clearly are
> appropriate (when you just want to change the tuple being inserted) fail

Only in case the trigger changes *key* columns necessary for atomicity (i.e. from the WITHIN index). Other columns are fair game. The restriction seems justifiable to me: it's unreasonable to be atomic with respect to values that change mid-way.

* It can distinguish between BEFORE INSERT and BEFORE UPDATE triggers, which your approach fundamentally cannot.
* It can probably avoid evaluating column defaults in the UPDATE case, like nextval() for unrelated serial columns = reduced overhead
* The semantics match better with MERGE-like syntax that seems to come up again and again.
* And it appears to solve (part of?) the problem you had with naming key *columns* in the WITHIN clause, rather than using an index name.

If you don't see any reasons why it can't be done, these benefits seem clear to me. I think the tradeoffs at least warrant wider discussion.

Regards, Marti
Re: [HACKERS] UPSERT wiki page, and SQL MERGE syntax
On Thu, Oct 9, 2014 at 4:56 AM, Marti Raudsepp <ma...@juffo.org> wrote:
> create trigger ev1 before insert on evt_type execute procedure ins();
> create trigger ev2 before update on evt_type execute procedure upd();
> create trigger ev3 after insert on evt_type execute procedure ins();
> create trigger ev4 after update on evt_type execute procedure upd();

Oops, I missed "for each row" here. Sorry for wasting your time.

Regards,
Marti
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Thu, Sep 4, 2014 at 12:13 AM, Peter Geoghegan p...@heroku.com wrote: On Wed, Sep 3, 2014 at 9:51 AM, Robert Haas robertmh...@gmail.com wrote: INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT WITHIN upsert_pkey UPDATE SET val = 'update'; It seems to me that it would be better to specify a conflicting column set rather than a conflicting index name. I'm open to pursuing that, provided there is a possible implementation that's robust against things like BEFORE triggers that modify constrained attributes. It must also work well with partial unique indexes. So I imagine we'd have to determine a way of looking up the unique index only after BEFORE triggers fire. Unless you're comfortable with punting on some of these cases by throwing an error, then all of this is actually surprisingly ticklish. Speaking of this, I really don't like the proposed behavior of firing BEFORE INSERT triggers even before we've decided whether to insert or update. In the classical upsert pattern, changes by a BEFORE INSERT trigger would get rolled back on conflict, but the new approach seems surprising: changes from BEFORE INSERT get persisted in the database, but AFTER INSERT is not fired. I haven't found any discussion about alternative triggers semantics for upsert. If there has been any, can you point me to it? How about this: use the original VALUES results for acquiring a value lock; if indeed the row didn't conflict, *then* fire BEFORE INSERT triggers, and throw an error if the trigger changed any columns of the (specified?) unique key. Advantages of this approach: 1. Would solve the above conundrum about specifying a unique index via columns. 2. In the UPDATE case we can skip evaluating INSERT triggers and DEFAULT expressions for columns 3. If I'm not missing anything, this approach may also let us get rid of the CONFLICTING() construct 4. Possibly be closer to MySQL's syntax? 
Point (2) is actually a major consideration IMO: if your query is mostly performing UPDATEs, on a table with SERIAL keys, and you're using a different key to perform the updates, then you waste sequence values unnecessarily. I believe this is a very common pattern, for example:

create table evt_type (id serial primary key, name text unique, evt_count int);
prepare upsert(text) as
    INSERT into evt_type (name, evt_count) values ($1, 1)
    on conflict within evt_type_name_key UPDATE set evt_count=evt_count+1;
execute upsert('foo');
execute upsert('foo');
execute upsert('bar');

# table evt_type;
 id | name | evt_count
----+------+-----------
  1 | foo  |         2
  3 | bar  |         1

-- id could very well be 2

Regards,
Marti
Re: [HACKERS] CREATE IF NOT EXISTS INDEX
On Mon, Oct 6, 2014 at 5:12 AM, Marti Raudsepp <ma...@juffo.org> wrote:
> On Mon, Oct 6, 2014 at 4:17 AM, Fabrízio de Royes Mello <fabriziome...@gmail.com> wrote:
>> CREATE INDEX ... [ IF NOT EXISTS [ name ] ] ON ...
>
> I think this one is wrong now.

I see now, I think you meant:

    CREATE INDEX ... [ [ IF NOT EXISTS ] name ] ON ...

If yes, +1 for this; there's no redundancy any more.

Regards,
Marti
Re: [HACKERS] CREATE IF NOT EXISTS INDEX
On Mon, Oct 6, 2014 at 4:27 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: create_index_if_not_exists_v7.patch Looks good to me. Marking ready for committer. If you have any feedback about my reviews, I would gladly hear it. I'm quite new to this. PS: You seem to be submitting many patches, but have you reviewed any recently? See: https://wiki.postgresql.org/wiki/Submitting_a_Patch#Mutual_Review_Offset_Obligations Each patch submitter to a CommitFest is expected to review at least one other patch Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add generate_series(numeric, numeric)
I'm a bit confused about who I should be replying to, but since you were the last one with a patch...

On Mon, Oct 6, 2014 at 11:44 AM, Ali Akbar <the.ap...@gmail.com> wrote:
> Thanks for the review. Attached the formatted patch according to your
> suggestion.

+ select * from generate_series(0.1::numeric, 10.0::numeric, 0.1::numeric);
+  generate_series
+ -----------------
+              0.1
...
+             10.0
+ (100 rows)

Unless there is a good reason, can you please keep individual test output fewer than 100 lines? I think the 41-line result is overkill as well. This just bloats the repository size unnecessarily.

Also, I see that this patch was added to the 2014-10 commitfest and then deleted again (by user "apaan"). Why?

Regards,
Marti
Re: [HACKERS] Add generate_series(numeric, numeric)
On Mon, Oct 6, 2014 at 6:12 PM, Ali Akbar the.ap...@gmail.com wrote: User apaan is me. When i added to the commitfest, the patch is listed there by me (apaan). That's fine I think, it's just for tracking who made the changes in the CommitFest app. What actually matters is what you write in the Author field, which could contain all 3 names separated by commas. the one that tests values just before numeric overflow Actually I don't know if that's too useful. I think you should add a test case that causes an error to be thrown. Also, I noticed that there are a few trailing spaces in the patch that should be removed: +generate_series_numeric(PG_FUNCTION_ARGS) ... + if (NUMERIC_IS_NAN(start_num) || NUMERIC_IS_NAN(finish_num)) ... + if (PG_NARGS() == 3) ... + if (NUMERIC_IS_NAN(step_num)) Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE IF NOT EXISTS INDEX
On Fri, Oct 3, 2014 at 7:25 PM, Fabrízio de Royes Mello <fabriziome...@gmail.com> wrote:
> I agree with your grammar change.

Cool.

> The version 5 (attached) contains all discussed until now.

From documentation:

    CREATE INDEX ... [ IF NOT EXISTS name | name ] ON ...

Maybe I'm just slow, but it took me a few minutes to understand what this means. :) I would add a human-language explanation to the IF NOT EXISTS description: "Index name is required when IF NOT EXISTS is specified".

You have resurrected this bit again, which now conflicts with git master...

- write_msg(NULL, "reading row-security enabled for table \"%s\"",
+ write_msg(NULL, "reading row-security enabled for table \"%s\"\n",

      n->concurrent = $4;
+     n->if_not_exists = false;
      n->idxname = $5;

Minor stylistic thing: now that this is a constant, I would move it to the end together with other constant assignments, and follow the struct's field ordering (in both code paths):

    n->isconstraint = false;
    n->deferrable = false;
    n->initdeferred = false;
    n->if_not_exists = false;

Regards,
Marti
Re: [HACKERS] CREATE IF NOT EXISTS INDEX
On Mon, Oct 6, 2014 at 4:17 AM, Fabrízio de Royes Mello <fabriziome...@gmail.com> wrote:
> On Sun, Oct 5, 2014 at 9:52 AM, Marti Raudsepp <ma...@juffo.org> wrote:
>>> CREATE INDEX ... [ IF NOT EXISTS name | name ] ON ...
>>
>> Maybe I'm just slow, but it took me a few minutes to understand what
>> this means. :)
>
> Well, I try to show that IF NOT EXISTS require the name. Is this wrong?

No, I'm sorry, you misunderstood me. It was totally correct before, it just wasn't easy to understand at first.

> CREATE INDEX ... [ IF NOT EXISTS [ name ] ] ON ...

I think this one is wrong now. It suggests these are valid syntaxes:

    CREATE INDEX ... ON ...
    CREATE INDEX ... IF NOT EXISTS ON ...       -- wrong
    CREATE INDEX ... IF NOT EXISTS name ON ...

Regards,
Marti
Re: [HACKERS] CREATE IF NOT EXISTS INDEX
On Wed, Oct 1, 2014 at 2:42 PM, Fabrízio de Royes Mello <fabriziome...@gmail.com> wrote:
> So, what's the correct/best grammar?
> CREATE [ IF NOT EXISTS ] [ UNIQUE ] INDEX index_name
> or
> CREATE [ UNIQUE ] INDEX [ IF NOT EXISTS ] index_name

I've elected myself as the reviewer for this patch. Here are some preliminary comments...

I agree with José. The 2nd is more consistent given the other syntaxes:

    CREATE { TABLE | SCHEMA | EXTENSION | ... } IF NOT EXISTS name ...

It's also compatible with SQLite's grammar: https://www.sqlite.org/lang_createindex.html

Do we want to enforce an order on the keywords or allow both?

    CREATE INDEX IF NOT EXISTS CONCURRENTLY foo ...
    CREATE INDEX CONCURRENTLY IF NOT EXISTS foo ...

It's probably very rare to use both keywords at the same time, so I'd prefer only the 2nd, unless someone else chimes in.

Documentation: I would prefer if the explanation were consistent with the description for ALTER TABLE/EXTENSION; just copy it and replace "relation" with "index".

+ ereport(NOTICE,
+         (errcode(ERRCODE_DUPLICATE_TABLE),
+          errmsg("relation \"%s\" already exists, skipping",
+                 indexRelationName)));

1. Clearly "relation" should be "index".
2. Use ERRCODE_DUPLICATE_OBJECT, not TABLE.

+ if (n->if_not_exists && n->idxname == NULL)
+     ereport(ERROR,
+             (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+              errmsg("IF NOT EXISTS requires that you name the index."),

I think ERRCODE_SYNTAX_ERROR makes more sense; it's something that we decided we *don't want* to support.

- write_msg(NULL, "reading row-security enabled for table \"%s\"",
+ write_msg(NULL, "reading row-security enabled for table \"%s\"\n",

???

Regards,
Marti
Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements
On Tue, Aug 26, 2014 at 4:20 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 04/14/2014 10:31 PM, Fabrízio de Royes Mello wrote: The attached patch contains CINE for sequences. I just strip this code from the patch rejected before. Committed with minor changes Hmm, the CommitFest app lists Marko Tiikkaja as the reviewer, but I can't find his review anywhere... The documentation claims: CREATE [ IF NOT EXISTS ] SEQUENCE name But grammar implements it the other way around: CREATE SEQUENCE IF NOT EXISTS name; Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE IF NOT EXISTS INDEX
On Fri, Oct 3, 2014 at 2:15 AM, Marti Raudsepp <ma...@juffo.org> wrote:
>> + ereport(NOTICE,
>> +         (errcode(ERRCODE_DUPLICATE_TABLE),
>> +          errmsg("relation \"%s\" already exists, skipping",
>> +                 indexRelationName)));
>
> 1. Clearly "relation" should be "index".
> 2. Use ERRCODE_DUPLICATE_OBJECT, not TABLE.

My bad, this code is OK. The current code already uses "relation" and TABLE elsewhere, because indexes share the same namespace with tables.

+ /*
+  * Throw an exception when IF NOT EXISTS is used without a named
+  * index
+  */

I'd say "without an index name". And the line goes beyond 80 characters wide.

I would also move this check to after all the attributes have been assigned, rather than splitting the assignments in half.

Regards,
Marti
Re: [HACKERS] Final Patch for GROUPING SETS
On Fri, Sep 19, 2014 at 4:45 AM, Andrew Gierth <and...@tao11.riddles.org.uk> wrote:
> GroupAggregate  (cost=1122.39..1197.48 rows=9 width=8)
>   Group Key: two, four
>   Group Key: two
>   Group Key: ()

> Grouping Sets: [ [two, four], [two], [] ]

+1, looks good to me.

> (yaml format)
>   Grouping Sets:
>     - - two
>       - four
>     - - two
>     -

Now this is weird. But is anyone actually using YAML output format, or was it implemented simply because we can?

> Marti> Do you think it would be reasonable to normalize single-set
> Marti> grouping sets into a normal GROUP BY?
>
> It's certainly possible, though it would seem somewhat odd to write
> queries that way.

The reason I bring this up is that queries are frequently dynamically generated by programs. Coders are unlikely to special-case SQL generation when there's just a single grouping set. And that's the power of relational databases: the optimization work is done in the database, pretty much transparently to the coder (when it works, that is).

> would you want the original syntax preserved in views

Doesn't matter IMO.

> Marti> I'd expect GROUP BY () to be fully equivalent to having no
> Marti> GROUP BY clause, but there's a difference in explain
> Marti> output. The former displays "Grouping Sets: ()" which is odd,
> Marti> since none of the grouping set keywords were used.
>
> That's an implementation artifact, in the sense that we preserve the
> fact that GROUP BY () was used by using an empty grouping set. Is it a
> problem, really, that it shows up that way in explain?

No, not really a problem. :)

Regards,
Marti
Re: [HACKERS] Join consolidation / Removing duplicate joins
On Wed, Sep 17, 2014 at 2:00 PM, David Rowley <dgrowle...@gmail.com> wrote:
> Anyway... I've been thinking of writing some code that converts these
> sub plans into left joins where it can be proved that the subquery
> would only at most produce 1 row
>
> Does anyone have any thoughts on this?

+1, I've thought about this part before. There is already precedent for inlining FROM clause subqueries into the main query; it would be nice to do that for correlated subqueries as well. It seems we've been adding features to the planner without fully exploiting opportunities for normalization and consolidation of optimization techniques.

I think it's not even necessary to prove uniqueness of the subquery as you describe. Now that 9.3 added LATERAL, a correlated subquery can be seen as a special case of LATERAL LEFT JOIN with an additional check to raise an error if > 1 rows are returned from the inner side. And you could optionally elide the error check if you can prove uniqueness.

Advantages:
1. Sufficiently simple lateral subqueries are already normalized into ordinary JOINs with hash/merge support, so you would get that for free (probably requires eliding the 1-row check).
2. We get rid of silliness like the explosion of SubPlan nodes for each reference (see examples below).
3. Based on some naive testing, it seems that 9.5devel performs slightly better with NestLoop LATERAL subqueries than SubPlan correlated ones.
4. EXPLAIN output is easier to read, I find.

I suppose EXISTS/IN with correlated subqueries needs some different treatment, as it can currently take advantage of the hashed SubPlan optimization.

Can anyone see any downsides? Perhaps one day we can get rid of SubPlan entirely; would anyone miss it?
Example of SubPlan explosion:

regression=# create view foo1 as
    select *, (select ten as ten2 from tenk2 where tenk1.unique1=tenk2.unique1)
    from tenk1;
regression=# explain analyze select * from foo1 where ten2 between 1 and 3;
 Seq Scan on tenk1  (cost=0.00..175782.08 rows= width=244) (actual time=0.052..49.288 rows=3000 loops=1)
   Filter: (((SubPlan 2) >= 1) AND ((SubPlan 3) <= 3))
   Rows Removed by Filter: 7000
   SubPlan 1
     ->  Index Scan using tenk2_unique1 on tenk2  (cost=0.29..8.30 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=3000)
           Index Cond: (tenk1.unique1 = unique1)
   SubPlan 2
     ->  Index Scan using tenk2_unique1 on tenk2 tenk2_1  (cost=0.29..8.30 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=10000)
           Index Cond: (tenk1.unique1 = unique1)
   SubPlan 3
     ->  Index Scan using tenk2_unique1 on tenk2 tenk2_2  (cost=0.29..8.30 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=9000)
           Index Cond: (tenk1.unique1 = unique1)
 Execution time: 49.508 ms

LATERAL is a win even when using OFFSET 0 to prevent inlining:

regression=# create view foo3 as
    select * from tenk1
    left join lateral (select ten as ten2 from tenk2 where tenk1.unique1=tenk2.unique1 offset 0) x on true;
regression=# explain analyze select * from foo3 where ten2 between 1 and 3;
 Nested Loop  (cost=0.29..83733.00 rows=1 width=248) (actual time=0.043..28.963 rows=3000 loops=1)
   ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=10000 width=244) (actual time=0.008..1.177 rows=10000 loops=1)
   ->  Subquery Scan on x  (cost=0.29..8.32 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=10000)
         Filter: ((x.ten2 >= 1) AND (x.ten2 <= 3))
         Rows Removed by Filter: 1
         ->  Index Scan using tenk2_unique1 on tenk2  (cost=0.29..8.30 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=10000)
               Index Cond: (tenk1.unique1 = unique1)
 Execution time: 29.186 ms

And if you could prove uniqueness of the inner side and inline it, WHERE clauses can also be pushed down trivially:

regression=# create view foo2 as
    select * from tenk1
    left join lateral (select ten as ten2 from
tenk2 where tenk1.unique1=tenk2.unique1) x on true;
regression=# explain analyze select * from foo2 where ten2 between 1 and 3;
 Hash Join  (cost=532.50..1083.00 rows=3000 width=248) (actual time=1.848..4.480 rows=3000 loops=1)
   Hash Cond: (tenk1.unique1 = tenk2.unique1)
   ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=10000 width=244) (actual time=0.002..0.617 rows=10000 loops=1)
   ->  Hash  (cost=495.00..495.00 rows=3000 width=8) (actual time=1.837..1.837 rows=3000 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 118kB
         ->  Seq Scan on tenk2  (cost=0.00..495.00 rows=3000 width=8) (actual time=0.004..1.562 rows=3000 loops=1)
               Filter: ((ten >= 1) AND (ten <= 3))
               Rows Removed by Filter: 7000
 Execution time: 4.591 ms

Regards,
Marti
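[Editor's note: the equivalence claimed in this thread — a correlated scalar subquery is just a lateral nested loop plus a cardinality check, and the check can be elided when the inner side is provably unique — can be modeled outside the planner. The following Python sketch is purely illustrative (toy row representations, nothing from the PostgreSQL source):]

```python
from collections import defaultdict

def correlated_scalar(outer_rows, inner_by_key, key):
    # Correlated scalar subquery semantics: per-outer-row lookup that
    # yields NULL (None) on no match and must error on more than one match.
    out = []
    for o in outer_rows:
        hits = inner_by_key.get(key(o), [])
        if len(hits) > 1:
            raise ValueError("more than one row returned by a subquery")
        out.append((o, hits[0] if hits else None))
    return out

def hash_left_join(outer_rows, inner_by_key, key):
    # Ordinary hash left join: no cardinality check at all.
    out = []
    for o in outer_rows:
        for h in inner_by_key.get(key(o), [None]):
            out.append((o, h))
    return out

outer = list(range(10))
inner_by_key = defaultdict(list)
for row in [(i, i * 10) for i in range(0, 10, 2)]:  # unique join key
    inner_by_key[row[0]].append(row)

# With a provably unique inner side the error check can never fire, so the
# two strategies produce identical results and are interchangeable:
assert correlated_scalar(outer, inner_by_key, key=lambda o: o) == \
       hash_left_join(outer, inner_by_key, key=lambda o: o)
```

On a duplicate-key inner side the two diverge — the subquery form errors while the plain join emits multiple rows — which is exactly why the check (or a uniqueness proof) is needed before normalizing one into the other.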
Re: [HACKERS] Final Patch for GROUPING SETS
On Fri, Sep 12, 2014 at 9:41 PM, Andrew Gierth and...@tao11.riddles.org.uk wrote: gsp1.patch - phase 1 code patch (full syntax, limited functionality) gsp2.patch - phase 2 code patch (adds full functionality using the new chained aggregate mechanism) I gave these a try by converting my current CTE-based queries into CUBEs and it works as expected; query time is cut in half and lines of code is 1/4 of original. Thanks! I only have a few trivial observations; if I'm getting too nitpicky let me know. :) Since you were asking for feedback on the EXPLAIN output on IRC, I'd weigh in and say that having the groups on separate lines would be significantly more readable. It took me a while to understand what's going on in my queries due to longer table and column names and wrapping; The comma separators between groups are hard to distinguish. If that can be made to work with the EXPLAIN printer without too much trouble. So instead of: GroupAggregate Output: four, ten, hundred, count(*) Grouping Sets: (onek.four, onek.ten, onek.hundred), (onek.four, onek.ten), (onek.four), () Perhaps print: Grouping Sets: (onek.four, onek.ten, onek.hundred) (onek.four, onek.ten) (onek.four) () Or maybe: Grouping Set: (onek.four, onek.ten, onek.hundred) Grouping Set: (onek.four, onek.ten) Grouping Set: (onek.four) Grouping Set: () Both seem to work with the explain.depesz.com parser, although the 1st won't be aligned as nicely. Do you think it would be reasonable to normalize single-set grouping sets into a normal GROUP BY? Such queries would be capable of using HashAggregate, but the current code doesn't allow that. For example: set enable_sort=off; explain select two, count(*) from onek group by grouping sets (two); Could be equivalent to: explain select two, count(*) from onek group by two; I'd expect GROUP BY () to be fully equivalent to having no GROUP BY clause, but there's a difference in explain output. 
The former displays "Grouping Sets: ()" which is odd, since none of the grouping set keywords were used.

# explain select count(*) from onek group by ();
 Aggregate  (cost=77.78..77.79 rows=1 width=0)
   Grouping Sets: ()
   ->  Index Only Scan using onek_stringu1 on onek  (cost=0.28..75.28 rows=1000 width=0)

Regards,
Marti
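[Editor's note: the normalization argument above can be made concrete with a toy model of what GROUPING SETS computes — one aggregation per set, concatenated. This Python sketch (illustrative only, unrelated to the actual executor; the `onek`-style data is fabricated) shows that a single-set grouping set is exactly a plain GROUP BY, so routing it through the ordinary hashable GROUP BY path would be semantically safe:]

```python
from collections import Counter

rows = [{"two": i % 2, "four": i % 4} for i in range(1000)]  # stand-in for onek

def group_by(rows, cols):
    # Plain GROUP BY: count rows per distinct combination of cols.
    return Counter(tuple(r[c] for c in cols) for r in rows)

def grouping_sets(rows, sets):
    # GROUPING SETS: the concatenation of one GROUP BY per set.
    return [group_by(rows, s) for s in sets]

# GROUP BY GROUPING SETS (two) is the same as GROUP BY two:
assert grouping_sets(rows, [("two",)]) == [group_by(rows, ("two",))]
# GROUP BY () yields a single group containing every row:
assert group_by(rows, ()) == Counter({(): 1000})
```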
Re: [HACKERS] delta relations in AFTER triggers
On Wed, Sep 3, 2014 at 10:49 PM, Kevin Grittner kgri...@ymail.com wrote: Marti Raudsepp ma...@juffo.org wrote: The concept of lightweight relations that pop into existence when a certain kind of trigger definition is used somewhere in the function stack, without a CREATE TABLE, without being discoverable in information_schema etc., I find needs some more justification than I've seen in this thread. So far I've only heard that it's more convenient to implement in the current PostgreSQL code base. It is required by the SQL standard. I had a cursory read of the SQL 20nn draft and I don't get this impression. The only place I could find discussing the behavior of transition tables is in Foundation 4.39.1 General description of triggers, which says: Special variables make the data in the transition table(s) available to the triggered action. For a statement-level trigger the variable is one whose value is a transition table. There is no information about the scoping of such variables, so I assume it refers to a regular locally scoped variable. Did I miss something? Are you reading a different version of the spec? Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] delta relations in AFTER triggers
On Mon, Sep 1, 2014 at 9:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: OTOH, I agree with Kevin that the things we're talking about are lightweight relations not variables. My worry is that PL/pgSQL and Postgres's SQL dialect is turning into a Frankenstein monster with many ways to do the same thing, each having different semantics that require effort to reason about. Variables and function arguments are non-contriversial, every experienced coder understands their semantics without thinking twice -- even if they're not familiar with Postgres. The concept of lightweight relations that pop into existence when a certain kind of trigger definition is used somewhere in the function stack, without a CREATE TABLE, without being discoverable in information_schema etc., I find needs some more justification than I've seen in this thread. So far I've only heard that it's more convenient to implement in the current PostgreSQL code base. I'm sure more questions would pop up in practice, but as Heikki mentioned: Are such relations also visible to other functions called by the trigger function? * If yes, this introduces non-obvious dependencies between functions. What happens when one trigger with delta relations invokes another trigger, does the previous one get shadowed or overwritten? What are the interactions with search_path? Can an unprivileged function override relation names when calling a SECURITY DEFINER function? * If not, this further inhibits developers from properly modularizing their trigger code (this is already a problem due to the current magic trigger variables). Even if these questions have reasonable answers, it takes mental effort to remember the details. Procedure code debugging, especially triggers, is hard enough due to poor tooling; increasing the cognitive load should not be done lightly. You could argue that CREATE TEMP TABLE already has some of these problems, but it's very rare that people actually need to use that. 
If delta relations get built on this new mechanism, avoiding it won't be an option any more.

Regards,
Marti
Re: [HACKERS] delta relations in AFTER triggers
On Thu, Aug 28, 2014 at 12:03 AM, Kevin Grittner kgri...@ymail.com wrote: In essence, make the relations work like PL/pgSQL variables do. If you squint a little, the new/old relation is a variable from the function's point of view, and a parameter from the planner/executor's point of view. It's just a variable/parameter that holds a set of tuples, instead of a single Datum. will be surprised if someone doesn't latch onto this to create a new declared temporary table that only exists within the scope of a compound statement (i.e., a BEGIN/END block). You would DECLARE them just like you would a scalar variable in a PL, and they would have the same scope. I'll take a look at doing this in the next couple days, and see whether doing it that way is as easy as it seems on the face of it. We already have all this with refcursor variables. Admittedly cursors have some weird semantics (mutable position) and currently they're cumbersome to combine into queries, but would a new relation variable be sufficiently better to warrant a new type? Why not extend refcursors and make them easier to use instead? Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_shmem_allocations view
Hi On Thu, May 8, 2014 at 4:28 PM, Andres Freund and...@2ndquadrant.com wrote: Ok. A new version of the patches implementing that are attached. Including a couple of small fixups and docs. The latter aren't extensive, but that doesn't seem to be warranted anyway. Is it really actually useful to expose the segment off(set) to users? Seems to me like unnecessary internal details leaking out. Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb format is pessimal for toast compression
On Fri, Aug 8, 2014 at 10:50 PM, Hannu Krosing <ha...@2ndquadrant.com> wrote:
> How hard and how expensive would it be to teach pg_lzcompress to apply
> a delta filter on suitable data? So that instead of integers their
> deltas will be fed to the real compressor

Has anyone given this more thought? I know this might not be 9.4 material, but to me it sounds like the most promising approach, if it's workable. This isn't a made-up thing; the 7z and LZMA formats also have an optional delta filter. Of course with JSONB the problem is figuring out which parts to apply the delta filter to, and which parts not.

This would also help with integer arrays, containing for example foreign key values to a serial column. There's bound to be some redundancy, as nearby serial values are likely to end up close together. In one of my past projects we used to store large arrays of integer fkeys, deliberately sorted for duplicate elimination.

For an ideal-case comparison, intar2 could compress to the same size as intar1 when compressed with a 4-byte wide delta filter:

create table intar1 as select array(select 1::int from generate_series(1,1000000)) a;
create table intar2 as select array(select generate_series(1,1000000)::int) a;

In PostgreSQL 9.3 the sizes are:

select pg_column_size(a) from intar1;
45810
select pg_column_size(a) from intar2;
4000020

So a factor of 87 difference.

Regards,
Marti
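[Editor's note: the intuition behind the delta filter can be checked outside Postgres. A delta transform turns a sorted integer sequence into a stream of small, highly repetitive values that a generic compressor handles far better. A rough sketch — Python, with zlib standing in for pg_lzcompress, so the absolute sizes are only indicative:]

```python
import struct
import zlib

def delta_filter(ints):
    # Replace each element by its difference from the previous element;
    # a sorted serial/FK array becomes mostly a stream of identical values.
    return ints[:1] + [b - a for a, b in zip(ints, ints[1:])]

def undelta(deltas):
    # Inverse transform: a running sum restores the original values.
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out

def pack(ints):
    # Serialize like an int4[] payload: 4 little-endian bytes per element.
    return struct.pack("<%di" % len(ints), *ints)

data = list(range(1, 100001))  # intar2-style sorted keys
raw_size = len(zlib.compress(pack(data)))
delta_size = len(zlib.compress(pack(delta_filter(data))))
# delta_size comes out orders of magnitude smaller than raw_size
```

The transform is lossless (undelta reverses it exactly), which is what would let a decompressor reconstruct the original datum.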
Re: [HACKERS] PostrgeSQL vs oracle doing 1 million sqrts am I doing it wrong?
On Mon, Aug 4, 2014 at 11:48 PM, testman1316 <danilo.rami...@hmhco.com> wrote:
> In both we ran code that did 1 million square roots (from 1 to 1 mill).
> Then did the same but within an If..Then statement.
>
> Note: once we started running queries on the exact same data in Oracle
> and PostgreSQL we saw a similar pattern. On basic queries little
> difference, but as they started to get more and more complex Oracle was
> around 3-5x faster.

Looks like from the test cases you posted, you're not actually benchmarking any *queries*; you're comparing the speeds of the procedural languages. And yes, PL/pgSQL is known to be a fairly slow language.

If you want fair benchmark results, you should instead concentrate on what databases are supposed to do: store and retrieve data, finding the most optimal way to execute complicated SQL queries. In most setups, that's where the majority of database processor time is spent, not procedure code.

Regards,
Marti
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Thu, Jul 31, 2014 at 9:41 PM, Robert Haas robertmh...@gmail.com wrote: I certainly like that better than poor-man; but proxy, to me, fails to convey inexactness. Maybe abbreviated, abridged, minified? Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] delta relations in AFTER triggers
On Tue, Jul 29, 2014 at 9:49 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I dislike this proposal - it is strongly inconsistent with current trigger design The real point I was trying to convey (in my previous email) is that these declarations should be part of the trigger *function* not the function-to-table relationship. CREATE TRIGGER shouldn't be in the business of declaring new local variables for the trigger function. Whether we define new syntax for that or re-use the argument list is secondary. But the inconsistency is deliberate, I find the current trigger API horrible. Magic variables... Text-only TG_ARGV for arguments... RETURNS trigger... No way to invoke trigger functions directly for testing. By not imitating past mistakes, maybe we can eventually arrive at a language that makes sense. Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] delta relations in AFTER triggers
On Sat, Jul 5, 2014 at 5:38 PM, Kevin Grittner <kgri...@ymail.com> wrote:
> it seems to me that we need the full tuple to support triggers on
> FDWs, so the TID approach would be an optimization for a subset of the
> cases, and would probably be more appropriate, if we do it at all, in
> a follow-on patch
>
> If you disagree with that assessment, now would be a good time to
> explain your reasoning.

Maybe I just have a limited imagination because I've never found a use for FDWs personally. But recording changes from a trigger on an FDW table doesn't seem that useful, since you can only capture changes done by the local node. I expect that in many situations there are multiple writers accessing the same underlying remote table. Certainly it can't guarantee the consistency of materialized views.

> I took a look at whether I could avoid making OLD and NEW non-reserved
> keywords, but I didn't see how to do that without making FOR at least
> partially reserved. If someone sees a way to do this without creating
> three new unreserved keywords (REFERENCING, OLD, and NEW) I'm all ears.

Sorry, I know I am very late to make this point, so feel free to ignore this. I'm not a fan of the SQL standard syntax for this feature. One nice thing about PostgreSQL's triggers is that you can declare the trigger function once and re-use it on many tables. It would make more sense if the same function declaration could say what variable/relation names it wants to use. They're more like function argument names, not some metadata about a table-function relationship.

Putting these in the CREATE TRIGGER command means you have to repeat them for each table you want to apply the trigger to. It introduces the possibility of making more mistakes without any gain in flexibility.

But then again, I understand that there's value in supporting standard syntax.
Regards, Marti
Re: [HACKERS] delta relations in AFTER triggers
On Mon, Jul 28, 2014 at 6:24 PM, Kevin Grittner kgri...@ymail.com wrote: Do you have some other suggestion? Keep in mind that it must allow the code which will *generate* the transition tables to know whether any of the attached triggers use a given transition table for the specific operation, regardless of the language of the trigger function. You will need to access the pg_proc record of the trigger function anyway, so it's just a matter of coming up with syntax that makes sense, right? What I had in mind was that we could re-use function argument declaration syntax. For instance, use the argmode specifier to declare OLD and NEW. Shouldn't cause grammar conflicts because the current OUT and INOUT aren't reserved keywords. We could also re-use the refcursor type, which already has bindings in some PLs, if that's not too much overhead. That would make the behavior straightforward without introducing new constructs, plus you can pass them around between functions. Though admittedly it's annoying to integrate cursor results into queries. Something like:

CREATE FUNCTION trig(OLD old_rows refcursor, NEW new_rows refcursor)
RETURNS trigger LANGUAGE plpgsql AS '...';

Or maybe if the grammar allows, we could spell out NEW TABLE, OLD TABLE, but that's redundant since you can already deduce that from the refcursor type. It could also be extended for different types, like tid[], and maybe record for the FOR EACH ROW variant (dunno if that can be made to work). Regards, Marti
Re: [HACKERS] pg_xlogdump --stats
On Wed, Jun 4, 2014 at 1:47 PM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: Here's a patch to make pg_xlogdump print summary statistics instead of individual records. Thanks! I had a use for this feature so I backported the (first) patch to PostgreSQL 9.3. It's a rush job so it's ugly and may have bugs, but it works for me. Just posting here, maybe someone else will find it useful too. Regards, Marti

[Attachment: xlogdiff_stats_93.patch.gz]
Re: [HACKERS] pg_xlogdump --stats
On Tue, Jul 1, 2014 at 11:13 PM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: In CF terms, did you form any opinion while porting the patch I posted about whether it's sensible/ready for inclusion in 9.5? I didn't look at the code more than necessary to make the build work. As far as functionality goes, it does exactly what I needed it to do; the output is very clear. I wanted to see how much WAL is being generated from SELECT FOR UPDATE locks. Here are some thoughts: The "FPI" acronym made me wonder at first. Perhaps "Full page image size" would be clearer (exactly 20 characters long, so it fits). But on the other hand, I find that the table is too wide; I always need to stretch my terminal window to fit it. I don't think we need to accommodate 10^20 bytes ~ 87 exabytes of WAL files. :) You might also add units (kB/MB) to the table like pg_size_pretty, although that would make the magnitudes harder to gauge. P.S. In your patch, the documentation changes are duplicated. Oh, right you are, I don't know how that happened. I also missed some record types (Btree/0x80, Btree/0xb0). Regards, Marti
Re: [HACKERS] Quantify small changes to predicate evaluation
On Tue, Jun 17, 2014 at 5:23 PM, Dennis Butterstein soullinu...@web.de wrote: I tried the proposed tweaks and see some differences regarding the measurements. Unfortunately the variance between the runs seems to remain high. Using these techniques I managed to get the standard deviation below 1.5% in my read-only tests (and most below 1%). Not all workloads may be able to achieve that, but you should be able to do better than your current results. Maybe your cooling is not sufficient? It seems your 2nd run is always slower than the first one; maybe the CPU is doing thermal throttling? Lowering the max frequency might be something to try to resolve that, like: cpupower frequency-set --max 2GHz How do you run your benchmark, are you using pgbench? Single threaded? Is the client locked to the same CPU core? Regards, Marti
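For what it's worth, the "standard deviation below 1.5%" criterion above is just the coefficient of variation over repeated pgbench runs. A quick sketch of how to compute it (the tps numbers here are made up for illustration):

```python
import statistics

def variation(tps_runs):
    """Coefficient of variation (sample stddev / mean) as a percentage."""
    return 100 * statistics.stdev(tps_runs) / statistics.mean(tps_runs)

# Hypothetical tps results from three pgbench runs of the same read-only test:
runs = [509.3, 509.8, 510.7]
print(f"{variation(runs):.2f}%")  # prints "0.14%"
```

With numbers this close together the variation is well under the 1.5% target; if your runs produce a much larger figure, the power-management tricks discussed in this thread are worth trying.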
Re: [HACKERS] Quantify small changes to predicate evaluation
On Fri, Jun 13, 2014 at 12:46 PM, Dennis Butterstein soullinu...@web.de wrote: I expect my current changes to be responsible for about 0.2-0.3s for this query but because of the huge time differences I am not able to quantify my changes. Maybe somebody can tell me about a better approach to quantify my changes or give me some input on how to get more stable postgres time measurements. There can be other reasons, but for read-only benchmarks, much of this variability seems to come from the operating system's power management and scheduler. I had some luck in reducing variance on Linux with some tricks.

Disable CPU frequency scaling:
for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo performance > $i; done

Disable the turbo boost feature if your CPU supports it:
echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Always launch PostgreSQL and pgbench on a fixed CPU core, using taskset -c3

And exclude all other processes from this core (locking them to cores 0, 1, 2):
ps -A -o pid h | xargs -n1 taskset -cp -a 0-2 > /dev/null

Transparent hugepage support may also interfere:
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled

I'm sure there are more tricks, but this should get you pretty far. Regards, Marti
Re: [HACKERS] Allowing NOT IN to use ANTI joins
On Wed, Jun 11, 2014 at 11:53 AM, David Rowley dgrowle...@gmail.com wrote: The only way to consistently guarantee nullability is through primary key constraints. Fortunately that addresses most of the use cases of NOT IN(), in my experience. See the comment in check_functional_grouping: I saw that, but I have to say I've not fully got my head around why that's needed just yet. I was wrong, see Tom's reply to my email. It's OK to rely on attnotnull for optimization decisions. The plan will be invalidated automatically when the nullability of a referenced column changes. check_functional_grouping needs special treatment because it decides whether to accept/reject views; and if it has allowed creating a view, it needs to guarantee that the dependent constraint isn't dropped for a longer term. as long as the transaction id is taken before we start planning in session 1 then it should not matter if another session drops the constraint and inserts a NULL value as we won't see that NULL value in our transaction... I'd assume that the transaction has to start before it grabs the table defs that are required for planning. Or have I got something wrong?

1. You're assuming that query plans can only survive for the length of a transaction. That's not true, prepared query plans can span many transactions.
2. Also a FOR UPDATE clause can return values from the future, if another transaction has modified the value and already committed.

But this whole issue is moot anyway, the plan will get invalidated when the nullability changes. Regards, Marti
Re: [HACKERS] Allowing NOT IN to use ANTI joins
On Sun, Jun 8, 2014 at 3:36 PM, David Rowley dgrowle...@gmail.com wrote: Currently pull_up_sublinks_qual_recurse only changes the plan for NOT EXISTS queries and leaves NOT IN alone. The reason for this is because the values returned by a subquery in the IN clause could have NULLs. There's a bug in targetListIsGuaranteedNotToHaveNulls, you have to drill deeper into the query to guarantee the nullability of a result column. If a table is OUTER JOINed, it can return NULLs even if the original column specification has NOT NULL. This test case produces incorrect results with your patch:

create table a (x int not null);
create table b (x int not null, y int not null);
insert into a values(1);
select * from a where x not in (select y from a left join b using (x));

Unpatched version correctly returns 0 rows since y will be NULL. Your patch returns the value 1 from a. Regards, Marti
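To spell out why the outer join breaks the patch's assumption: under SQL's three-valued logic, a single NULL in the subquery output makes x NOT IN (...) evaluate to unknown rather than true, and WHERE only keeps rows whose predicate is strictly true. A small Python model of that logic (None stands in for SQL NULL; the function is just an illustration, not PostgreSQL's implementation):

```python
def sql_not_in(x, values):
    """Three-valued 'x NOT IN (values)': returns True, False, or None (unknown)."""
    saw_null = False
    for v in values:
        if x is None or v is None:
            saw_null = True        # comparison with NULL is unknown
        elif v == x:
            return False           # a definite match: NOT IN is false
    return None if saw_null else True

# WHERE keeps a row only when its predicate is strictly true.
rows = [1]
# In the test case, the left join produced y = NULL, so the subquery output is [None]:
kept = [r for r in rows if sql_not_in(r, [None]) is True]
print(kept)  # prints "[]" -- zero rows, matching the unpatched planner
```

This is why the anti-join transformation is only safe when the inner side is provably non-nullable all the way through, outer joins included.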
Re: [HACKERS] Allowing NOT IN to use ANTI joins
On Sun, Jun 8, 2014 at 3:36 PM, David Rowley dgrowle...@gmail.com wrote: Currently pull_up_sublinks_qual_recurse only changes the plan for NOT EXISTS queries and leaves NOT IN alone. The reason for this is because the values returned by a subquery in the IN clause could have NULLs. I believe the reason why this hasn't been done yet, is that the plan becomes invalid when another backend modifies the nullability of the column. To get it to replan, you'd have to introduce a dependency on the NOT NULL constraint, but it's impossible for now because there's no pg_constraint entry for NOT NULLs. The only way to consistently guarantee nullability is through primary key constraints. Fortunately that addresses most of the use cases of NOT IN(), in my experience. See the comment in check_functional_grouping:

 * Currently we only check to see if the rel has a primary key that is a
 * subset of the grouping_columns. We could also use plain unique constraints
 * if all their columns are known not null, but there's a problem: we need
 * to be able to represent the not-null-ness as part of the constraints added
 * to *constraintDeps. FIXME whenever not-null constraints get represented
 * in pg_constraint.

The behavior you want seems somewhat similar to check_functional_grouping; maybe you could unify it with your targetListIsGuaranteedNotToHaveNulls at some level. (PS: that's one ugly function name :) Regards, Marti
Re: [HACKERS] UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table
On Fri, Apr 25, 2014 at 8:58 PM, Josh Berkus j...@agliodbs.com wrote: Well, I've already had collisions with UUID-OSSP, in production, with only around 20 billion values. So clearly there aren't 122bits of true randomness in OSSP. I can't speak for other implementations because I haven't tried them. Interesting. The statistical chance of this happening should be approximately 4e-17. Are you certain that this was due to uuid-ossp and not an application bug? Can you say what kind of operating system and environment that was? I skimmed the sources of uuid-ossp 1.6.2 and it seems to be doing the right thing, using /dev/urandom or /dev/random on Unixes and CryptGenRandom on Windows. Barring any bugs, of course. However, if these fail for whatever reason (e.g. out of file descriptors), it falls back to libc random(), which is clearly broken. On Fri, Apr 25, 2014 at 6:18 PM, Greg Stark st...@mit.edu wrote: The difficulty lies not really in the PRNG implementation (which is hard but well enough understood that it's not much of an issue these days). The difficulty lies in obtaining enough entropy. There are ways of obtaining enough entropy and they are available. Obtaining enough entropy requires access to hardware devices which means a kernel system call. This is a solved problem in most environments, too. The kernel collects entropy from unpredictable events and then seeds a global CSPRNG with that. This collection happens always regardless of whether you request random numbers or not, so essentially comes for free. Applications can then request output from this CSPRNG. Reason being, this infrastructure is necessary for more critical tasks than generating UUIDs: pretty much all of cryptography requires random numbers. They also deplete the available entropy pool for other sources which may mean they have security consequences. This only affects the Linux /dev/random, which is discouraged these days for that reason. Applications should use urandom instead.
To my knowledge, there are no other operating systems that have this depletion behavior. Regards, Marti
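The 4e-17 estimate above can be sanity-checked with the birthday approximation, P(collision) ≈ 1 - exp(-n²/2^(b+1)) for n random b-bit values. A small sketch (using expm1, since a probability this far below double-precision epsilon would otherwise round to zero):

```python
from math import expm1

def collision_probability(n, bits):
    """Birthday approximation: P(collision) = 1 - exp(-n^2 / 2^(bits+1)).

    -expm1(x) is used instead of 1 - exp(x) because the result here is far
    smaller than machine epsilon and would underflow to 0.0 otherwise.
    """
    return -expm1(-n * n / 2 ** (bits + 1))

# ~20 billion version-4 UUIDs, each with 122 random bits:
p = collision_probability(20e9, 122)
print(p)  # on the order of 4e-17
```

So seeing a genuine collision at 20 billion values would indeed be extraordinarily unlikely if all 122 bits were truly random, which is why a broken fallback RNG (or an application bug) is the more plausible explanation.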
Re: [HACKERS] 9.4 Proposal: Initdb creates a single table
On Wed, Apr 23, 2014 at 11:26 AM, David Rowley dgrowle...@gmail.com wrote: but for a long time I've thought that it would be nice if PostgreSQL came with an example database that had a number of tables, perhaps that mock up some easy to relate to real-world application. These would be very useful to use as examples in the documents instead of inventing them in the ad-hoc way that we currently do. Like here: http://www.postgresql.org/docs/9.3/static/tutorial-window.html I think that's a great idea. I'm not convinced it should be created by default in initdb, but a CREATE EXTENSION sample_data seems easy enough for newbies to use and has a good chance of getting merged into contrib. Regards, Marti
Re: [HACKERS] UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table
On Thu, Apr 24, 2014 at 8:40 PM, Josh Berkus j...@agliodbs.com wrote: A pseudo-random UUID is frankly pretty useless to me because (a) it's not really unique This is FUD. A pseudorandom UUID contains 122 bits of randomness. As long as you can trust the random number generator, the chances of a value occurring twice can be estimated using the birthday paradox: there's roughly a 50% chance of having *one* collision in a set of 2^61 items. Storing that many UUIDs alone requires 32 exabytes of storage. Factor in the tuple and indexing overheads and you'd be needing close to all the hard disk space ever manufactured in the world. If you believe there's a chance of ever seeing a pseudorandom UUID collision in practice, you should be buying lottery tickets. On the contrary: combined with the fact that pseudorandom UUID generation doesn't require any configuration (node ID), doesn't leak any private data (MAC address), and relies on infrastructure that's ubiquitous anyway (a cryptographic PRNG), it's almost always the right answer. (b) it doesn't help me route data at all. That's really out of scope for UUIDs. They're about generating identifiers, not describing what the identifier means. UUIDs also don't happen to cure cancer. Regards, Marti
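The 2^61 figure can be checked with the birthday approximation; this sketch also shows where the exact 50% point lands (slightly above 2^61, so 2^61 is the right ballpark):

```python
from math import expm1, log, sqrt

BITS = 122  # random bits in a version-4 (pseudorandom) UUID

def p_collision(n):
    # Birthday approximation: 1 - exp(-n^2 / 2^(BITS+1)),
    # computed via expm1 for numerical stability.
    return -expm1(-n * n / 2 ** (BITS + 1))

# At exactly 2^61 values the approximation gives ~39% odds of a collision...
print(p_collision(2 ** 61))  # prints roughly 0.3935

# ...and the exact 50% point is n = sqrt(2 ln 2 * 2^122), about 1.18 * 2^61.
n_half = sqrt(2 * log(2) * 2 ** BITS)
print(n_half / 2 ** 61)      # prints roughly 1.177
```

Either way, the break-even set size is in the quintillions of values, which is where the 32-exabyte storage argument above takes over.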
Re: [HACKERS] UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table
On Fri, Apr 25, 2014 at 3:36 AM, Tom Lane t...@sss.pgh.pa.us wrote: Of course, the weak spot in this analysis is the assumption that there are actually 122 independent bits in the value. It's not difficult to imagine that systems with crummy random() implementations might only have something like 32 bits worth of real randomness. Obviously you can't use random(). That's why I talked about cryptographic PRNGs: crypto libraries do proper seeding and generate reliably random numbers all the time. Regards, Marti
Re: Fwd: [HACKERS] Proposal: variant of regclass
On Sun, Mar 23, 2014 at 7:57 AM, Amit Kapila amit.kapil...@gmail.com wrote: Anyone has any objection for this behaviour difference between usage of ::regclass and to_regclass()? No, I think that makes a lot of sense given the behavior -- if the object is not there, to_regclass() just returns NULL. Since it doesn't require the object to be present, it should not create a dependency. This allows you to, for example, drop and recreate sequences while tables are still using them. Regards, Marti
Re: [HACKERS] plpgsql.warn_shadow
On Tue, Mar 18, 2014 at 9:29 PM, Petr Jelinek p...@2ndquadrant.com wrote: Attached V4 uses shadowed-variables instead of shadow. I think that should be shadowed_variables for consistency; GUC values usually have underscores, not dashes. (e.g. intervalstyle=sql_standard, backslash_quote=safe_encoding, synchronous_commit=remote_write etc) Regards, Marti
Re: [HACKERS] Disk usage for intermediate results in join
On Tue, Mar 11, 2014 at 1:24 PM, Parul Lakkad parul.lak...@gmail.com wrote: I am trying to figure out when disk is used to store intermediate results while performing joins in postgres. Joins can also cause a Nested Loop+Materialize plan, which spills to disk if the materialize result set is too large for work_mem. Regards, Marti
Re: [HACKERS] Display oprcode and its volatility in \do+
On Thu, Jan 16, 2014 at 5:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: but adding volatility here seems like probably a waste of valuable terminal width. I think that the vast majority of operators have immutable or at worst stable underlying functions, so this doesn't seem like the first bit of information I'd need about the underlying function. For a data point, just today I wanted to look up the volatility of pg_trgm operators, which made me remember this patch. The \do+ output is narrow enough, I think an extra volatility column wouldn't be too bad. But even just having the function name is a huge improvement; at least that allows looking up volatility using backslash commands, without accessing pg_operator directly. Regards, Marti
Re: [HACKERS] Selecting large tables gets killed
On Thu, Feb 20, 2014 at 12:07 PM, Ashutosh Bapat ashutosh.ba...@enterprisedb.com wrote: That seems a good idea. We will get rid of FETCH_COUNT then, wouldn't we? No, I don't think we want to do that. FETCH_COUNT values greater than 1 are still useful to get reasonably tabulated output without hogging too much memory. For example:

db=# \set FETCH_COUNT 3
db=# select repeat('a', i) a, 'x' x from generate_series(1,9) i;
  a  | x
-----+---
 a   | x
 aa  | x
 aaa | x
    a   | x
--------+---
 aaaa   | x
 aaaaa  | x
 aaaaaa | x
     a     | x
-----------+---
 aaaaaaa   | x
 aaaaaaaa  | x
 aaaaaaaaa | x

Regards, Marti
Re: [HACKERS] Draft release notes up for review
On Sun, Feb 16, 2014 at 10:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: Any comments before I start transposing them into the back branches? Sorry I'm late. "Shore up GRANT ... WITH ADMIN OPTION restrictions (Noah Misch)" I'm not familiar with the phrase "shore up"; I think it should use more precise language: are the privilege checks getting more strict or less strict? Wow, there are quite a lot of items this time. Have you considered grouping the items by their impact, for example security/data corruption/crash/correctness/other? I think that would make it easier for readers to find items they're interested in. Most changes seem pretty straightforward to categorize; there are always outliers, but even if a few items are miscategorized, that's an improvement over what we have now. Of course someone has to be willing to do that work. If this warrants more discussion, I can draft out a proposal in a new topic. Regards, Marti
Re: [HACKERS] PoC: Partial sort
On Wed, Feb 12, 2014 at 11:54 PM, Marti Raudsepp ma...@juffo.org wrote: With partial-sort-basic-1 and this fix on the same test suite, the planner overhead is now a more manageable 0.5% to 1.3%; one test is faster by 0.5%. Ping, Robert or anyone, does this overhead seem bearable or is that still too much? Do these numbers look conclusive enough or should I run more tests? I think the 1st patch now has a bug in initial_cost_mergejoin; you still pass the presorted_keys argument to cost_sort, making it calculate a partial sort cost Ping, Alexander? Regards, Marti
Re: [HACKERS] PoC: Partial sort
On Mon, Feb 10, 2014 at 8:59 PM, Alexander Korotkov aekorot...@gmail.com wrote: Done. Patch is split. Thanks! I think the 1st patch now has a bug in initial_cost_mergejoin; you still pass the presorted_keys argument to cost_sort, making it calculate a partial sort cost, but generated plans never use partial sort. I think 0 should be passed instead. Patch attached, needs to be applied on top of partial-sort-basic-1 and then reverse-applied on partial-sort-merge-1. With partial-sort-basic-1 and this fix on the same test suite, the planner overhead is now a more manageable 0.5% to 1.3%; one test is faster by 0.5%. Built with asserts disabled, ran on Intel i5-3570K. In an effort to reduce variance, I locked the server and pgbench to a single CPU core (taskset -c 3), but there are still noticeable run-to-run differences, so these numbers are a bit fuzzy. The faster result is definitely not a fluke, however; it happens every time. On Mon, Feb 10, 2014 at 2:33 PM, Marti Raudsepp ma...@juffo.org wrote: AFAICT this only happens once per plan and the overhead is O(n) to the number of pathkeys? I was of course wrong about that, it also adds extra overhead when iterating over the paths list.
Test: taskset -c 3 run2.sh from https://github.com/intgr/benchjunk/tree/master/partial_sort

Overhead percentages (between best of each 3 runs):
join5.sql 0.7
star5.sql 0.8
line5.sql 0.5
lim_join5.sql -0.5
lim_star5.sql 1.3
lim_line5.sql 0.5
limord_join5.sql 0.6
limord_star5.sql 0.5
limord_line5.sql 0.7

Raw results:

git 48870dd
join5.sql tps = 509.328070 (excluding connections establishing)
join5.sql tps = 509.772190 (excluding connections establishing)
join5.sql tps = 510.651517 (excluding connections establishing)
star5.sql tps = 499.208698 (excluding connections establishing)
star5.sql tps = 498.200314 (excluding connections establishing)
star5.sql tps = 496.269315 (excluding connections establishing)
line5.sql tps = 797.968831 (excluding connections establishing)
line5.sql tps = 797.011690 (excluding connections establishing)
line5.sql tps = 796.379258 (excluding connections establishing)
lim_join5.sql tps = 394.946024 (excluding connections establishing)
lim_join5.sql tps = 395.417689 (excluding connections establishing)
lim_join5.sql tps = 395.482958 (excluding connections establishing)
lim_star5.sql tps = 423.434393 (excluding connections establishing)
lim_star5.sql tps = 423.774305 (excluding connections establishing)
lim_star5.sql tps = 424.386099 (excluding connections establishing)
lim_line5.sql tps = 733.007330 (excluding connections establishing)
lim_line5.sql tps = 731.794731 (excluding connections establishing)
lim_line5.sql tps = 732.356280 (excluding connections establishing)
limord_join5.sql tps = 385.317921 (excluding connections establishing)
limord_join5.sql tps = 385.915870 (excluding connections establishing)
limord_join5.sql tps = 384.747848 (excluding connections establishing)
limord_star5.sql tps = 417.992615 (excluding connections establishing)
limord_star5.sql tps = 416.944685 (excluding connections establishing)
limord_star5.sql tps = 418.262647 (excluding connections establishing)
limord_line5.sql tps = 708.979203 (excluding connections establishing)
limord_line5.sql tps = 710.926866 (excluding connections establishing)
limord_line5.sql tps = 710.928907 (excluding connections establishing)

48870dd + partial-sort-basic-1.patch.gz + fix-cost_sort.patch
join5.sql tps = 505.488181 (excluding connections establishing)
join5.sql tps = 507.222759 (excluding connections establishing)
join5.sql tps = 506.549654 (excluding connections establishing)
star5.sql tps = 495.432915 (excluding connections establishing)
star5.sql tps = 494.906793 (excluding connections establishing)
star5.sql tps = 492.623808 (excluding connections establishing)
line5.sql tps = 789.315968 (excluding connections establishing)
line5.sql tps = 793.875456 (excluding connections establishing)
line5.sql tps = 790.545990 (excluding connections establishing)
lim_join5.sql tps = 396.956732 (excluding connections establishing)
lim_join5.sql tps = 397.515213 (excluding connections establishing)
lim_join5.sql tps = 397.578669 (excluding connections establishing)
lim_star5.sql tps = 417.459963 (excluding connections establishing)
lim_star5.sql tps = 418.024803 (excluding connections establishing)
lim_star5.sql tps = 418.830234 (excluding connections establishing)
lim_line5.sql tps = 729.186915 (excluding connections establishing)
lim_line5.sql tps = 726.288788 (excluding connections establishing)
lim_line5.sql tps = 728.123296 (excluding connections establishing)
limord_join5.sql tps = 383.484767 (excluding connections establishing)
limord_join5.sql tps = 383.021960 (excluding connections establishing)
limord_join5.sql tps = 383.722051 (excluding connections establishing)
limord_star5.sql tps = 414.138460 (excluding connections establishing)
limord_star5.sql tps = 414.063766 (excluding connections establishing)
limord_star5.sql
Re: [HACKERS] PoC: Partial sort
On Sun, Feb 9, 2014 at 7:37 PM, Alexander Korotkov aekorot...@gmail.com wrote: This is not only place that worry me about planning overhead. See get_cheapest_fractional_path_for_pathkeys. I had to estimate number of groups for each sorting column in order to get right fractional path. AFAICT this only happens once per plan and the overhead is O(n) to the number of pathkeys? I can't get worried about that, but I guess it's better to test anyway. PS: You didn't answer my questions about splitting the patch. I guess I'll have to do that anyway to run the tests. Regards, Marti
Re: [HACKERS] PoC: Partial sort
On Thu, Feb 6, 2014 at 5:31 AM, Robert Haas robertmh...@gmail.com wrote: Hmm, sounds a little steep. Why is it so expensive? I'm probably missing something here, because I would have thought that planner support for partial sorts would consist mostly of considering the same sorts we consider today, but with the costs reduced by the batching. I guess it's because the patch undoes some optimizations in the mergejoin planner wrt caching merge clauses and adds a whole lot of code to find_mergeclauses_for_pathkeys. In other code paths the overhead does seem to be negligible. Notice the removal of:

/* Select the right mergeclauses, if we didn't already */

/*
 * Avoid rebuilding clause list if we already made one;
 * saves memory in big join trees...
 */

Regards, Marti
Re: Fwd: [HACKERS] Proposal: variant of regclass
On Tue, Jan 28, 2014 at 10:38 AM, Yugo Nagata nag...@sraoss.co.jp wrote: I revised the patch. Could you please review this? I didn't test the patch due to the duplicate OID compilation error. But a few things stuck out from the diffs:

* You added some unnecessary spaces at the beginning of the line in OpernameGetCandidates.
* regclass_guts and regtype_guts can be simplified further by moving the ereport() code into regclassin, thereby saving the if (missing_ok)
* I would rephrase the documentation paragraph as: to_regclass, to_regproc, to_regoper and to_regtype are functions similar to the regclass, regproc, regoper and regtype casts, except that they return InvalidOid (0) when the object is not found, instead of raising an error.

On Wed, Jan 22, 2014 at 1:04 PM, Tatsuo Ishii is...@postgresql.org wrote: I thought the consensus was that returning NULL is better than InvalidOid? From an earlier message: Not sure. There's already at least one counter example: pg_my_temp_schema() oid OID of session's temporary schema, or 0 if none And there are pg_relation_filenode, pg_stat_get_backend_dbid, pg_stat_get_backend_userid which return NULL::oid. In general I don't like magic values like 0 in SQL. For example, this query would give unexpected results because InvalidOid has some other meaning:

select * from pg_aggregate where aggfinalfn = to_regproc('typo');

Regards, Marti
Re: [HACKERS] PoC: Partial sort
On Thu, Feb 6, 2014 at 9:15 PM, Robert Haas robertmh...@gmail.com wrote: It may be that having the capability to do a partial sort makes it seem worth spending more CPU looking for merge joins, but I'd vote for making any such change a separate patch. Agreed. Alexander, should I work on splitting up the patch in two, or do you want to do it yourself? Should I merge my coding style and enable_partialsort patches while at it, or do you still have reservations about those? Regards, Marti
Re: [HACKERS] PoC: Partial sort
On Tue, Jan 28, 2014 at 7:51 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Tue, Jan 28, 2014 at 7:41 AM, Marti Raudsepp ma...@juffo.org wrote: But some benchmarks of planning performance are certainly warranted. I didn't test it, but I worry that overhead might be high. If it's true then it could be like constraint_exclusion option which is off by default because of planning overhead. Sorry I didn't get around to this before. I ran some synthetic benchmarks with single-column inner joins between 5 tables, with indexes on both joined columns, using only EXPLAIN (so measuring planning time, not execution) in 9 scenarios to exercise different code paths. According to these measurements, the overhead ranges between 1.0 and 4.5% depending on the scenario. Merge join with partial sort children seems like a fairly obscure use case (though I'm sure it can help a lot in those cases). The default should definitely allow partial sort in normal ORDER BY queries. What's under question here is whether to enable partial sort for mergejoin. So I see 3 possible resolutions:

1. The overhead is deemed acceptable to enable by default, in which case we're done here.
2. Add a three-value runtime setting like: enable_partialsort = [ off | no_mergejoin | on ], defaulting to no_mergejoin (just to get the point across, clearly we need better naming). This is how constraint_exclusion works.
3. Remove the partialsort mergejoin code entirely, keeping the rest of the cases.

What do you think?
All the tests are available here:
https://github.com/intgr/benchjunk/tree/master/partial_sort
(using script run2.sh)

Overhead by test (partial-sort-7.patch.gz):
join5.sql        2.9%  (all joins on the same column)
star5.sql        1.7%  (star schema kind of join)
line5.sql        1.9%  (joins chained to each other)
lim_join5.sql    4.5%  (same as above, with LIMIT 1)
lim_star5.sql    2.8%
lim_line5.sql    1.8%
limord_join5.sql 4.3%  (same as above, with ORDER BY LIMIT 1)
limord_star5.sql 3.9%
limord_line5.sql 1.0%

Full data:

PostgreSQL @ git ac8bc3b
join5.sql tps = 499.490173 (excluding connections establishing)
join5.sql tps = 503.756335 (excluding connections establishing)
join5.sql tps = 504.814072 (excluding connections establishing)
star5.sql tps = 492.799230 (excluding connections establishing)
star5.sql tps = 492.570615 (excluding connections establishing)
star5.sql tps = 491.949985 (excluding connections establishing)
line5.sql tps = 773.945050 (excluding connections establishing)
line5.sql tps = 773.858068 (excluding connections establishing)
line5.sql tps = 774.551240 (excluding connections establishing)
lim_join5.sql tps = 392.539745 (excluding connections establishing)
lim_join5.sql tps = 391.867549 (excluding connections establishing)
lim_join5.sql tps = 393.361655 (excluding connections establishing)
lim_star5.sql tps = 418.431804 (excluding connections establishing)
lim_star5.sql tps = 419.258985 (excluding connections establishing)
lim_star5.sql tps = 419.434697 (excluding connections establishing)
lim_line5.sql tps = 713.852506 (excluding connections establishing)
lim_line5.sql tps = 713.636694 (excluding connections establishing)
lim_line5.sql tps = 712.971719 (excluding connections establishing)
limord_join5.sql tps = 381.068465 (excluding connections establishing)
limord_join5.sql tps = 380.379359 (excluding connections establishing)
limord_join5.sql tps = 381.182385 (excluding connections establishing)
limord_star5.sql tps = 412.997935 (excluding connections establishing)
limord_star5.sql tps = 411.401352 (excluding connections establishing)
limord_star5.sql tps = 413.209784 (excluding connections establishing)
limord_line5.sql tps = 688.906406 (excluding connections establishing)
limord_line5.sql tps = 689.445483 (excluding connections establishing)
limord_line5.sql tps = 688.758042 (excluding connections establishing)

partial-sort-7.patch.gz
join5.sql tps = 479.508034 (excluding connections establishing)
join5.sql tps = 488.263674 (excluding connections establishing)
join5.sql tps = 490.127433 (excluding connections establishing)
star5.sql tps = 482.106063 (excluding connections establishing)
star5.sql tps = 484.179687 (excluding connections establishing)
star5.sql tps = 483.027372 (excluding connections establishing)
line5.sql tps = 758.092993 (excluding connections establishing)
line5.sql tps = 759.697814 (excluding connections establishing)
line5.sql tps = 759.792792 (excluding connections establishing)
lim_join5.sql tps = 375.517211 (excluding connections establishing)
lim_join5.sql tps = 375.539109 (excluding connections establishing)
lim_join5.sql tps = 375.841645 (excluding connections establishing)
lim_star5.sql tps = 407.683110 (excluding connections establishing)
lim_star5.sql tps = 407.414409 (excluding connections establishing)
lim_star5.sql tps = 407.526613 (excluding connections establishing)
lim_line5.sql tps = 699.905101 (excluding connections establishing)
lim_line5.sql tps = 700.349675 (excluding
Re: [HACKERS] PoC: Partial sort
On Tue, Jan 28, 2014 at 7:51 AM, Alexander Korotkov <aekorot...@gmail.com> wrote:
> I didn't test it, but I worry that overhead might be high. If it's true
> then it could be like the constraint_exclusion option, which is off by
> default because of planning overhead.

I see, that makes sense. I will try to find the time to run some benchmarks in the coming few days.

Regards,
Marti
Re: [HACKERS] plpgsql.warn_shadow
On Sun, Jan 26, 2014 at 11:19 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> For 9.4, we should cut down the patch so it has
>
> plpgsql.warnings = none (default) | all | [individual item list]
> plpgsql.warnings_as_errors = off (default) | on

I hope I'm not late for the bikeshedding. :) Why not have 2 similar options instead:

plpgsql.warnings = none (default) | all | [individual item list]
plpgsql.errors = none (default) | all | [individual item list]

That would be cleaner, more flexible, and one less option to set if you just want errors and no warnings.

Regards,
Marti
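For illustration, the two-option scheme proposed above might look like this in postgresql.conf (both setting names and the value syntax are part of the proposal, not any released version):

```
# hypothetical settings from the proposal above
plpgsql.warnings = 'shadowed_variables'  # emit warnings for these checks
plpgsql.errors   = 'none'                # but treat none of them as errors
```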
Re: [HACKERS] PoC: Partial sort
On Mon, Jan 27, 2014 at 9:26 PM, Alexander Korotkov <aekorot...@gmail.com> wrote:
> For now, I have attempt to fix extra columns in mergejoin problem. It
> would be nice if you test it.

Yes, it solves the test cases I was trying with, thanks.

> 1) With enable_partialsort = off all mergejoin logic should behave as
> without partial sort patch.
> 2) With partial sort patch get_cheapest_fractional_path_for_pathkeys
> function is much more expensive to execute. With enable_partialsort = off
> it should be as cheap as without partial sort patch.

When it comes to planning time, I really don't think you should bother. The planner enable_* settings are meant for troubleshooting, debugging and learning about the planner; you should not expect people to disable them in a production setting, so it's not worth complicating the code for that rare case. This is stated in the documentation (http://www.postgresql.org/docs/current/static/runtime-config-query.html) and repeatedly on the mailing lists.

But some benchmarks of planning performance are certainly warranted.

Regards,
Marti
Re: [HACKERS] PoC: Partial sort
On Mon, Jan 20, 2014 at 2:43 PM, Alexander Korotkov aekorot...@gmail.com wrote: Another changes in this version of patch: 1) Applied patch to don't compare skipCols in tuplesort by Marti Raudsepp 2) Adjusting sort bound after processing buckets. Hi, Here's a patch with some whitespace and coding style fixes for partial-sort-6.patch I tried to understand the mergejoin regression, but this code still looks like Chinese to me. Can anyone else have a look at it? Test case: http://www.postgresql.org/message-id/cabrt9rdd-p2rlrdhsmq8rcob46k4a5o+bgz_up2brgeeh4r...@mail.gmail.com Original report: http://www.postgresql.org/message-id/CABRT9RCLLUyJ=bkeB132aVA_mVNx5==lvvvqmvuqdgufztw...@mail.gmail.com Regards, Marti From a3cedb922c5a12e43ee94b9d6f5a2aefba701708 Mon Sep 17 00:00:00 2001 From: Marti Raudsepp ma...@juffo.org Date: Sun, 26 Jan 2014 16:25:45 +0200 Subject: [PATCH 1/2] Whitespace coding style fixes --- src/backend/executor/nodeSort.c | 17 + src/backend/optimizer/path/costsize.c | 8 src/backend/optimizer/path/pathkeys.c | 18 +- src/backend/optimizer/plan/createplan.c | 2 +- src/backend/optimizer/plan/planner.c| 6 +++--- src/backend/utils/sort/tuplesort.c | 2 +- 6 files changed, 27 insertions(+), 26 deletions(-) diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c index f38190d..2e50497 100644 --- a/src/backend/executor/nodeSort.c +++ b/src/backend/executor/nodeSort.c @@ -27,13 +27,14 @@ static bool cmpSortSkipCols(SortState *node, TupleDesc tupDesc, HeapTuple a, TupleTableSlot *b) { - int n = ((Sort *)node-ss.ps.plan)-skipCols, i; + int i, +n = ((Sort *)node-ss.ps.plan)-skipCols; for (i = 0; i n; i++) { - Datum datumA, datumB; - bool isnullA, isnullB; - AttrNumber attno = node-skipKeys[i].ssup_attno; + Datum datumA, datumB; + bool isnullA, isnullB; + AttrNumber attno = node-skipKeys[i].ssup_attno; datumA = heap_getattr(a, attno, tupDesc, isnullA); datumB = slot_getattr(b, attno, isnullB); @@ -147,7 +148,7 @@ ExecSort(SortState *node) 
tuplesort_set_bound(tuplesortstate, node-bound - node-bound_Done); /* - * Put next group of tuples where skipCols sort values are equal to + * Put next group of tuples where skipCols' sort values are equal to * tuplesort. */ for (;;) @@ -177,10 +178,10 @@ ExecSort(SortState *node) } else { -bool cmp; -cmp = cmpSortSkipCols(node, tupDesc, node-prev, slot); +bool equal; +equal = cmpSortSkipCols(node, tupDesc, node-prev, slot); node-prev = ExecCopySlotTuple(slot); -if (!cmp) +if (!equal) break; } } diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index 9e79c6d..3a18632 100644 --- a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -1331,13 +1331,13 @@ cost_sort(Path *path, PlannerInfo *root, */ if (presorted_keys 0) { - List *groupExprs = NIL; - ListCell *l; - int i = 0; + List *groupExprs = NIL; + ListCell *l; + int i = 0; foreach(l, pathkeys) { - PathKey *key = (PathKey *)lfirst(l); + PathKey *key = (PathKey *) lfirst(l); EquivalenceMember *member = (EquivalenceMember *) lfirst(list_head(key-pk_eclass-ec_members)); diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c index 55d8ef4..1e1a09a 100644 --- a/src/backend/optimizer/path/pathkeys.c +++ b/src/backend/optimizer/path/pathkeys.c @@ -319,10 +319,9 @@ compare_pathkeys(List *keys1, List *keys2) int pathkeys_common(List *keys1, List *keys2) { - int n; + int n = 0; ListCell *key1, *key2; - n = 0; forboth(key1, keys1, key2, keys2) { @@ -460,7 +459,7 @@ get_cheapest_fractional_path_for_pathkeys(List *paths, num_groups = (double *)palloc(sizeof(double) * list_length(pathkeys)); foreach(l, pathkeys) { - PathKey *key = (PathKey *)lfirst(l); + PathKey *key = (PathKey *) lfirst(l); EquivalenceMember *member = (EquivalenceMember *) lfirst(list_head(key-pk_eclass-ec_members)); @@ -1085,7 +1084,6 @@ find_mergeclauses_for_pathkeys(PlannerInfo *root, List *mergeclauses = NIL; ListCell *i; bool *used = (bool 
*)palloc0(sizeof(bool) * list_length(restrictinfos)); - int k; List *unusedRestrictinfos = NIL; List *usedPathkeys = NIL; @@ -1103,6 +1101,7 @@ find_mergeclauses_for_pathkeys(PlannerInfo *root, EquivalenceClass *pathkey_ec = pathkey-pk_eclass; List *matched_restrictinfos = NIL; ListCell *j; + int k = 0; /*-- * A mergejoin clause matches a pathkey if it has the same EC. @@ -1140,7 +1139,6 @@ find_mergeclauses_for_pathkeys(PlannerInfo *root, * deal with the case in create_mergejoin_plan(). *-- */ - k = 0; foreach(j, restrictinfos) { RestrictInfo *rinfo = (RestrictInfo *) lfirst(j); @@ -1182,7 +1180,9 @@ find_mergeclauses_for_pathkeys
Re: [HACKERS] Proposal: variant of regclass
On Tue, Jan 14, 2014 at 9:28 AM, Yugo Nagata <nag...@sraoss.co.jp> wrote:
> Here is the patch to implement to_regclass, to_regproc, to_regoper,
> and to_regtype.

> + static Datum regclass_guts(char *class_name_or_oid, bool raiseError);

Minor bikeshedding: a lot of code currently uses an argument named missing_ok for this purpose (with the inverse meaning, of course). Any reason why you chose raiseError instead?

I only had a brief look at the patch, so maybe I'm missing something, but I don't think you should create 3 variants of these functions:
* parseTypeString calls parseTypeString_guts with false
* parseTypeStringMissingOk calls parseTypeString_guts with true
* parseTypeString_guts

And this is just silly:

    if (raiseError)
        parseTypeString(typ_name_or_oid, &result, &typmod);
    else
        parseTypeStringMissingOk(typ_name_or_oid, &result, &typmod);

Just add an argument to parseTypeString and patch all the callers.

> if requested object is not found, returns InvalidOid, rather than
> raises an error.

I thought the consensus was that returning NULL is better than InvalidOid? From an earlier message:

On Thu, Dec 5, 2013 at 5:25 PM, Robert Haas <robertmh...@gmail.com> wrote:
> Another advantage of this approach is that, IIUC, type input functions
> can't return a NULL value. So 'pg_klass'::regclass could return 0, but
> not NULL. On the other hand, toregclass('pg_klass') *could* return NULL,
> which seems conceptually cleaner.

Regards,
Marti
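To make the difference being debated concrete, here is a sketch of the two behaviors side by side (to_regclass as proposed in the patch, assuming the NULL-returning variant favored above):

```sql
SELECT 'pg_klass'::regclass;     -- ERROR: relation "pg_klass" does not exist
SELECT to_regclass('pg_klass');  -- proposed: returns NULL instead of erroring
```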
Re: [HACKERS] Proposal: variant of regclass
On Wed, Jan 22, 2014 at 1:44 PM, Yugo Nagata <nag...@sraoss.co.jp> wrote:
> On Wed, 22 Jan 2014 20:04:12 +0900 (JST)
> Tatsuo Ishii <is...@postgresql.org> wrote:
>
> parseTypeString() is called by some other functions and I avoided
> influences of modifying the definition on them, since this should raise
> errors in most cases. This is same reason for other *MissingOk functions
> in parse_type.c. Is it better to rewrite the definitions of these
> functions and all their callers?

Yes, for parseTypeString certainly. There have been many refactorings like that in the past and all of them use this pattern.

typenameTypeIdAndMod is less clear since the code paths differ so much; maybe keep 2 versions (merging back to 1 function is OK too, but in any case you don't need 3):

typenameTypeIdAndModMissingOk(...)
{
    Type tup = LookupTypeName(pstate, typeName, typmod_p);

    if (tup == NULL || !((Form_pg_type) GETSTRUCT(tup))->typisdefined)
        *typeid_p = InvalidOid;
    else
        *typeid_p = HeapTupleGetOid(tup);

    if (tup)
        ReleaseSysCache(tup);
}

typenameTypeIdAndMod(...)
{
    Type tup = typenameType(pstate, typeName, typmod_p);

    *typeid_p = HeapTupleGetOid(tup);
    ReleaseSysCache(tup);
}

Also, there's no need for the else here:

    if (raiseError)
        ereport(ERROR, ...);
    else
        return InvalidOid;

Regards,
Marti
Re: [HACKERS] improve the help message about psql -F
On Mon, Jan 20, 2014 at 2:04 PM, Jov <am...@amutu.com> wrote:
> reasonable, I removed the "set", patch attached.

Hi Jov,

A new commitfest was just opened, due on 2014-06. Please add your patch here: https://commitfest.postgresql.org/action/commitfest_view?id=22 (You'll need a community account if you don't already have one.)

Sometimes simple fixes like yours are merged outside a CommitFest, but adding it there makes sure it won't get lost.

Regards,
Marti
Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance
On Mon, Jan 20, 2014 at 1:51 AM, Dave Chinner <da...@fromorbit.com> wrote:
>> Postgres is far from being the only application that wants this; many
>> people resort to tmpfs because of this: https://lwn.net/Articles/499410/
>
> Yes, we covered the possibility of using tmpfs much earlier in the
> thread, and came to the conclusion that temp files can be larger than
> memory so tmpfs isn't the solution here. :)

What I meant is: lots of applications want this behavior. If Linux filesystems had support for delaying writeback for temporary files, then there would be no point in mounting tmpfs on /tmp at all and we'd get the best of both worlds. Right now people resort to tmpfs because of this missing feature, and then have their machines end up in swap hell if they overuse it.

Regards,
Marti
Re: [HACKERS] PoC: Partial sort
Hi,

On Tue, Jan 14, 2014 at 5:49 PM, Alexander Korotkov <aekorot...@gmail.com> wrote:
> On Tue, Jan 14, 2014 at 12:54 AM, Marti Raudsepp <ma...@juffo.org> wrote:
>> I've been trying it out in a few situations. I implemented a new
>> enable_partialsort GUC to make it easier to turn on/off, this way it's a
>> lot easier to test. The attached patch applies on top of
>> partial-sort-5.patch
>
> I though about such option. Generally not because of testing convenience,
> but because of overhead of planning. This way you implement it is quite
> naive :)

I don't understand. I had another look at this and cost_sort still seems like the best place to implement this, since that's where the patch decides how many pre-sorted columns to use. Both mergejoin and simple order-by plans call into it. If enable_partialsort=false then I skip all pre-sorted options except full sort, making cost_sort behave pretty much like it did before the patch.

I could change pathkeys_common to return 0, but that seems like a generic function that shouldn't be tied to partialsort. The old code paths called pathkeys_contained_in anyway, which has similar complexity. (Apart from initial_cost_mergejoin, but that doesn't seem special enough to make an exception for.) Or should I use?:

    enable_partialsort ? pathkeys_common(...) : 0

> For instance, merge join rely on partial sort which will be replaced with
> simple sort.

Are you saying that enable_partialsort=off should keep partialsort-based merge joins enabled? Or are you saying that merge joins shouldn't use simple sort at all? But merge join was already able to use full Sort nodes before your patch. What am I missing?

Regards,
Marti
Re: [HACKERS] Add value to error message when size extends
On Sun, Jan 19, 2014 at 8:10 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Complaining about a too-long varchar string in this style seems
> practically guaranteed to violate that.

Agreed. But I think it would be useful to add the length of the string being inserted; we already have it in the len variable.

> One thing that occurs to me just now is that perhaps we could report the
> failure as if it were a syntax error

That would be cool, if it can be made to work.

Regards,
Marti
Re: [HACKERS] improve the help message about psql -F
2014/1/17 Jov <am...@amutu.com>:
> but in the psql --help, -F says:
>
>   set field separator (default: "|")
>
> if the user doesn't read the official docs carefully, he can use:
>
>   psql -F , -c 'select ...'
>
> but can't get what he wants. It is a bad user experience.

+1 from me; the patch applies and is helpful.

After patching, this line in psql --help is 82 characters long; I think it's best to keep help screens below 80 characters wide (although there's already 1 other line violating this rule). I think the word "set" is pretty useless there anyway; maybe remove it, so the message becomes:

  field separator for unaligned output (default: "|")

PS: There isn't an open CommitFest to add this to. Shouldn't we always open a new CF when the last one goes in progress? If there's no date, it may be simply called "next".

Regards,
Marti
Re: [HACKERS] better atomics - v0.2
On Tue, Nov 19, 2013 at 6:38 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> The attached patches compile and make check successfully on linux x86,
> amd64 and msvc x86 and amd64. This time an updated configure is included.
> The majority of operations still rely on CAS emulation.

Note that the atomics-generic-acc.h file has ® characters in the Latin-1 encoding ("Intel ® Itanium ®"). If you have to use weird characters, I think you should make sure they're UTF-8.

Regards,
Marti
Re: [HACKERS] PoC: Partial sort
Hi,

There's another small regression with this patch when used with expensive comparison functions, such as long text fields. If we go through all this trouble in cmpSortSkipCols to prove that the first N sort keys are equal, it would be nice if Tuplesort could skip their comparisons entirely; that's another nice optimization this patch can provide. I've implemented that in the attached patch, which applies on top of your partial-sort-5.patch

Should the "Sort Key" field in EXPLAIN output be changed as well? I'd say no; I think that makes the partial sort steps harder to read.

Generate test data:

create table longtext as
  select (select repeat('a', 1000*100)) a, generate_series(1,1000) i;
create index on longtext(a);

Unpatched (using your original partial-sort-5.patch):

=# explain analyze select * from longtext order by a, i limit 10;
 Limit  (cost=2.34..19.26 rows=10 width=1160) (actual time=13477.739..13477.756 rows=10 loops=1)
   ->  Partial sort  (cost=2.34..1694.15 rows=1000 width=1160) (actual time=13477.737..13477.742 rows=10 loops=1)
         Sort Key: a, i
         Presorted Key: a
         Sort Method: top-N heapsort  Memory: 45kB
         ->  Index Scan using longtext_a_idx on longtext  (cost=0.65..1691.65 rows=1000 width=1160) (actual time=0.015..2.364 rows=1000 loops=1)
 Total runtime: 13478.158 ms
(7 rows)

=# set enable_indexscan=off;
=# explain analyze select * from longtext order by a, i limit 10;
 Limit  (cost=198.61..198.63 rows=10 width=1160) (actual time=6970.439..6970.458 rows=10 loops=1)
   ->  Sort  (cost=198.61..201.11 rows=1000 width=1160) (actual time=6970.438..6970.444 rows=10 loops=1)
         Sort Key: a, i
         Sort Method: top-N heapsort  Memory: 45kB
         ->  Seq Scan on longtext  (cost=0.00..177.00 rows=1000 width=1160) (actual time=0.007..1.763 rows=1000 loops=1)
 Total runtime: 6970.491 ms

Patched:

=# explain analyze select * from longtext order by a, i;
 Partial sort  (cost=2.34..1694.15 rows=1000 width=1160) (actual time=0.024..4.603 rows=1000 loops=1)
   Sort Key: a, i
   Presorted Key: a
   Sort Method: quicksort  Memory: 27kB
   ->  Index Scan using longtext_a_idx on longtext  (cost=0.65..1691.65 rows=1000 width=1160) (actual time=0.013..2.094 rows=1000 loops=1)
 Total runtime: 5.418 ms

Regards,
Marti

From fbc6c31528018bca64dc54f65e1cd292f8de482a Mon Sep 17 00:00:00 2001
From: Marti Raudsepp <ma...@juffo.org>
Date: Sat, 18 Jan 2014 19:16:15 +0200
Subject: [PATCH] Batched sort: skip comparisons for known-equal columns

---
 src/backend/executor/nodeSort.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index cf1f79e..5abda1d 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -125,10 +125,10 @@ ExecSort(SortState *node)
 	{
 		tuplesortstate = tuplesort_begin_heap(tupDesc,
 											  plannode->numCols - skipCols,
-											  &(plannode->sortColIdx)[skipCols],
-											  plannode->sortOperators,
-											  plannode->collations,
-											  plannode->nullsFirst,
+											  &(plannode->sortColIdx[skipCols]),
+											  &(plannode->sortOperators[skipCols]),
+											  &(plannode->collations[skipCols]),
+											  &(plannode->nullsFirst[skipCols]),
 											  work_mem, node->randomAccess);
 		if (node->bounded)
-- 
1.8.5.3
Re: [HACKERS] PoC: Partial sort
Funny, I just wrote a patch to do that some minutes ago (didn't see your email yet):
http://www.postgresql.org/message-id/CABRT9RCK=wmFUYZdqU_+MOFW5PDevLxJmZ5B=etjjnubvya...@mail.gmail.com

Regards,
Marti

On Sat, Jan 18, 2014 at 7:10 PM, Jeremy Harris <j...@wizmail.org> wrote:
> On 13/01/14 18:01, Alexander Korotkov wrote:
>> Thanks. It's included into attached version of patch. As well as
>> estimation improvements, more comments and regression tests fix.
>
> Would it be possible to totally separate the two sets of sort-keys,
> only giving the non-index set to the tuplesort?  At present tuplesort
> will, when it has a group to sort, make wasted compares on the indexed
> set of keys before starting on the non-indexed set.
> --
> Cheers,
>   Jeremy
Re: [HACKERS] PoC: Partial sort
On Sat, Jan 18, 2014 at 7:22 PM, Marti Raudsepp ma...@juffo.org wrote: Total runtime: 5.418 ms Oops, shouldn't have rushed this. Clearly the timings should have tipped me off that it's broken. I didn't notice that cmpSortSkipCols was re-using tuplesort's sortkeys. Here's a patch that actually works; I added a new skipKeys attribute to SortState. I had to extract the SortSupport-creation code from tuplesort_begin_heap to a new function; but that's fine, because it was already duplicated in ExecInitMergeAppend too. I reverted the addition of tuplesort_get_sortkeys, which is not needed now. Now the timings are: Unpatched partial sort: 13478.158 ms Full sort: 6802.063 ms Patched partial sort: 6618.962 ms Regards, Marti From 7d9f34c09e7836504725ff11be7e63a2fc438ae9 Mon Sep 17 00:00:00 2001 From: Marti Raudsepp ma...@juffo.org Date: Mon, 13 Jan 2014 20:38:45 +0200 Subject: [PATCH] Partial sort: skip comparisons for known-equal columns --- src/backend/executor/nodeMergeAppend.c | 18 +- src/backend/executor/nodeSort.c| 26 +- src/backend/utils/sort/sortsupport.c | 29 + src/backend/utils/sort/tuplesort.c | 31 +-- src/include/nodes/execnodes.h | 1 + src/include/utils/sortsupport.h| 3 +++ src/include/utils/tuplesort.h | 2 -- 7 files changed, 60 insertions(+), 50 deletions(-) diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c index 74fa40d..db6ec23 100644 --- a/src/backend/executor/nodeMergeAppend.c +++ b/src/backend/executor/nodeMergeAppend.c @@ -126,19 +126,11 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags) * initialize sort-key information */ mergestate-ms_nkeys = node-numCols; - mergestate-ms_sortkeys = palloc0(sizeof(SortSupportData) * node-numCols); - - for (i = 0; i node-numCols; i++) - { - SortSupport sortKey = mergestate-ms_sortkeys + i; - - sortKey-ssup_cxt = CurrentMemoryContext; - sortKey-ssup_collation = node-collations[i]; - sortKey-ssup_nulls_first = node-nullsFirst[i]; - sortKey-ssup_attno = 
node-sortColIdx[i]; - - PrepareSortSupportFromOrderingOp(node-sortOperators[i], sortKey); - } + mergestate-ms_sortkeys = MakeSortSupportKeys(mergestate-ms_nkeys, + node-sortColIdx, + node-sortOperators, + node-collations, + node-nullsFirst); /* * initialize to show we have not run the subplans yet diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c index 55cdb05..7645645 100644 --- a/src/backend/executor/nodeSort.c +++ b/src/backend/executor/nodeSort.c @@ -28,20 +28,19 @@ static bool cmpSortSkipCols(SortState *node, TupleDesc tupDesc, HeapTuple a, TupleTableSlot *b) { int n = ((Sort *)node-ss.ps.plan)-skipCols, i; - SortSupport sortKeys = tuplesort_get_sortkeys(node-tuplesortstate); for (i = 0; i n; i++) { Datum datumA, datumB; bool isnullA, isnullB; - AttrNumber attno = sortKeys[i].ssup_attno; + AttrNumber attno = node-skipKeys[i].ssup_attno; datumA = heap_getattr(a, attno, tupDesc, isnullA); datumB = slot_getattr(b, attno, isnullB); if (ApplySortComparator(datumA, isnullA, - datumB, isnullB, - sortKeys[i])) +datumB, isnullB, +node-skipKeys[i])) return false; } return true; @@ -123,12 +122,21 @@ ExecSort(SortState *node) tuplesort_reset((Tuplesortstate *) node-tuplesortstate); else { + /* Support structures for cmpSortSkipCols - already sorted columns */ + if (skipCols) + node-skipKeys = MakeSortSupportKeys(skipCols, + plannode-sortColIdx, + plannode-sortOperators, + plannode-collations, + plannode-nullsFirst); + + /* Only pass on remaining columns that are unsorted */ tuplesortstate = tuplesort_begin_heap(tupDesc, - plannode-numCols, - plannode-sortColIdx, - plannode-sortOperators, - plannode-collations, - plannode-nullsFirst, + plannode-numCols - skipCols, + (plannode-sortColIdx[skipCols]), + (plannode-sortOperators[skipCols]), + (plannode-collations[skipCols]), + (plannode-nullsFirst[skipCols]), work_mem, node-randomAccess); if (node-bounded) diff --git a/src/backend/utils/sort/sortsupport.c b/src/backend/utils/sort/sortsupport.c 
index 347f448..df82f5f 100644 --- a/src/backend/utils/sort/sortsupport.c +++ b/src/backend/utils/sort/sortsupport.c @@ -85,6 +85,35 @@ PrepareSortSupportComparisonShim(Oid cmpFunc, SortSupport ssup) } /* + * Build an array of SortSupportData structures from separated arrays. + */ +SortSupport +MakeSortSupportKeys(int nkeys, AttrNumber *attNums, + Oid *sortOperators, Oid *sortCollations, + bool *nullsFirstFlags) +{ + SortSupport
[HACKERS] [patch] Potential relcache leak in get_object_address_attribute
Hi list,

It looks like the get_object_address_attribute() function has a potential relcache leak: when missing_ok=true and the relation is found but the attribute is not, the relation isn't closed, nor is it returned to the caller.

I couldn't figure out any way to trigger this, but it's best to fix it anyway.

Regards,
Marti

diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index f8fd4f8..e22aa66 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -1024,6 +1024,7 @@ get_object_address_attribute(ObjectType objtype, List *objname,
 		address.classId = RelationRelationId;
 		address.objectId = InvalidOid;
 		address.objectSubId = InvalidAttrNumber;
+		relation_close(relation, lockmode);
 		return address;
 	}
Re: [HACKERS] Linux kernel impact on PostgreSQL performance
On Wed, Jan 15, 2014 at 5:34 AM, Jim Nasby <j...@nasby.net> wrote:
> it's very common to create temporary file data that will never, ever,
> ever actually NEED to hit disk. Where I work, being able to tell the
> kernel to avoid flushing those files unless the kernel thinks it's got
> better things to do with that memory would be EXTREMELY valuable

Windows has the FILE_ATTRIBUTE_TEMPORARY flag for this purpose. ISTR that there was discussion about implementing something analogous in Linux when ext4 got delayed allocation support, but I don't think it got anywhere and I can't find the discussion now. I think the proposed interface was to create and then immediately unlink the file, which serves as a hint that the application doesn't care about persistence.

Postgres is far from being the only application that wants this; many people resort to tmpfs because of it: https://lwn.net/Articles/499410/

Regards,
Marti
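The create-then-unlink pattern mentioned above can be sketched like this (illustration only — the pattern itself is standard POSIX, but Linux does not currently treat it as a writeback hint):

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Create a temp file and immediately unlink it.  The fd stays fully
 * usable for read/write until closed, but since the file no longer has
 * a name, its contents can never be observed after the process exits --
 * which is exactly the "persistence not required" signal discussed above.
 */
static int
make_unlinked_temp(void)
{
	char	path[] = "/tmp/scratch-XXXXXX";
	int		fd = mkstemp(path);		/* create a uniquely named file */

	if (fd >= 0)
		unlink(path);				/* drop the name; data dies with the fd */
	return fd;
}
```

A kernel that honored the hint could then deprioritize (or entirely skip) flushing such files' dirty pages unless under memory pressure.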
Re: [HACKERS] plpgsql.consistent_into
On Wed, Jan 15, 2014 at 8:23 AM, Jim Nasby <j...@nasby.net> wrote:
> Do we actually support = right now? We already support
>
>   v_field := field FROM table ... ;
>
> and I think it's a bad idea to have different meaning for = and :=.
> That was already discussed before.

Yes, we support both = and := and they have exactly the same behavior. I agree, we should keep them equivalent.

Regards,
Marti
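Both spellings in action, for reference (a self-contained DO block; the two assignments behave identically):

```sql
DO $$
DECLARE
    v int;
BEGIN
    v := 1;   -- the documented assignment operator
    v = 2;    -- accepted as an exact synonym
    RAISE NOTICE 'v = %', v;
END
$$;
```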
Re: [HACKERS] Capturing EXPLAIN ANALYZE for a query without discarding the normal result set
On Tue, Jan 14, 2014 at 5:13 AM, Marti Raudsepp <ma...@juffo.org> wrote:
> You can use the auto_explain contrib module

I just remembered that there's also the pg_stat_plans extension, which is closer to what you asked: https://github.com/2ndQuadrant/pg_stat_plans
This one you'll have to build yourself (it's not in contrib).

Regards,
Marti
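For completeness, enabling the auto_explain module mentioned above takes roughly this (these are real settings from the contrib module; the threshold value is just an example):

```
# postgresql.conf
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '250ms'  # log plans of statements slower than this
auto_explain.log_analyze = on            # include actual rows/timings (adds overhead)
```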
Re: [HACKERS] plpgsql.consistent_into
I've always hated INTO in procedures since it makes the code harder to follow and has very different behavior on the SQL level, in addition to the multi-row problem you bring up. If we can make assignment syntax more versatile and eventually replace INTO, then that solves multiple problems in the language without breaking backwards compatibility.

On Tue, Jan 14, 2014 at 4:30 AM, Marko Tiikkaja <ma...@joh.to> wrote:
> On 2014-01-14 02:54, Marti Raudsepp wrote:
>> But PL/pgSQL already has an assignment syntax with the behavior you want:
>
> According to the docs, that doesn't set FOUND which would make this a
> pain to deal with..

Right you are. If we can extend the syntax then we could make it such that "= SELECT" sets FOUND and other diagnostics, and a simple assignment doesn't. Which makes sense IMO:

    a = 10;   -- simple assignments really shouldn't affect FOUND

With an explicit SELECT, clearly the intent is to perform a query:

    a = SELECT foo FROM table;

And this could also work:

    a = INSERT INTO table (foo) VALUES (10) RETURNING foo_id;

AFAICT the fact that this works is more of an accident and should be discouraged. We can leave it as is for compatibility's sake:

    a = foo FROM table;

Now, another question is whether it's possible to make the syntax work. Is this an assignment from the result of a subquery, or is it a query by itself?

    a = (SELECT foo FROM table);

Does anyone consider this proposal workable?

Regards,
Marti
Re: [HACKERS] Inheritance and indexes
On Tue, Jan 14, 2014 at 12:07 PM, knizhnik <knizh...@garret.ru> wrote:
> But is it possible to use index for derived table at all?

Yes, the planner will do an index scan when it makes sense.

> Why sequential search is used for derived table in the example below:
>
>     insert into derived_table values (2,2);
>     create index derived_index on derived_table(x);
>     explain select * from base_table where x=0;

With only 1 row in the table, the planner decides there's no point in scanning the index. Try with more realistic data.

Regards,
Marti
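To illustrate the point about realistic data, here is a minimal sketch that continues the quoted example. It assumes `derived_table` has the two-column layout implied by the `values (2,2)` insert above; this is not from the original mail.

```sql
-- Bulk-load enough rows that an index scan beats a sequential scan,
-- then refresh the planner statistics.
INSERT INTO derived_table SELECT g, g FROM generate_series(1, 100000) g;
ANALYZE derived_table;

-- With realistic row counts and fresh statistics, a selective predicate
-- like this should now use derived_index instead of a Seq Scan.
EXPLAIN SELECT * FROM base_table WHERE x = 0;
```

The exact plan depends on costs and statistics, but the general rule holds: for tiny tables the planner correctly prefers a sequential scan, since reading one page is cheaper than descending an index.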
Re: [HACKERS] PoC: Partial sort
On Tue, Jan 14, 2014 at 5:49 PM, Alexander Korotkov <aekorot...@gmail.com> wrote:
>> I implemented a new enable_partialsort GUC to make it easier to turn on/off
>
> I thought about such an option. Generally not because of testing
> convenience, but because of the overhead of planning. The way you
> implemented it is quite naive :) For instance, merge join relies on
> partial sort, which will be replaced with a simple sort.

Oh, this actually highlights a performance regression with the partial sort patch. I assumed the planner would discard the full sort because of higher costs, but it looks like the new code always assumes that a partial sort will be cheaper than a Join Filter, without considering costs. When doing a join USING (unique_indexed_value, something), the new plan is significantly worse.

Unpatched:

marti=# explain analyze select * from release a join release b using (id, name);
 Merge Join  (cost=0.85..179810.75 rows=12 width=158) (actual time=0.011..1279.596 rows=1232610 loops=1)
   Merge Cond: (a.id = b.id)
   Join Filter: ((a.name)::text = (b.name)::text)
   ->  Index Scan using release_id_idx on release a  (cost=0.43..79120.04 rows=1232610 width=92) (actual time=0.005..211.928 rows=1232610 loops=1)
   ->  Index Scan using release_id_idx on release b  (cost=0.43..79120.04 rows=1232610 width=92) (actual time=0.004..371.592 rows=1232610 loops=1)
 Total runtime: 1309.049 ms

Patched:

 Merge Join  (cost=0.98..179810.87 rows=12 width=158) (actual time=0.037..5034.158 rows=1232610 loops=1)
   Merge Cond: ((a.id = b.id) AND ((a.name)::text = (b.name)::text))
   ->  Partial sort  (cost=0.49..82201.56 rows=1232610 width=92) (actual time=0.013..955.938 rows=1232610 loops=1)
         Sort Key: a.id, a.name
         Presorted Key: a.id
         Sort Method: quicksort  Memory: 25kB
         ->  Index Scan using release_id_idx on release a  (cost=0.43..79120.04 rows=1232610 width=92) (actual time=0.007..449.332 rows=1232610 loops=1)
   ->  Materialize  (cost=0.49..85283.09 rows=1232610 width=92) (actual time=0.019..1352.377 rows=1232610 loops=1)
         ->  Partial sort  (cost=0.49..82201.56 rows=1232610 width=92) (actual time=0.018..1223.251 rows=1232610 loops=1)
               Sort Key: b.id, b.name
               Presorted Key: b.id
               Sort Method: quicksort  Memory: 25kB
               ->  Index Scan using release_id_idx on release b  (cost=0.43..79120.04 rows=1232610 width=92) (actual time=0.004..597.258 rows=1232610 loops=1)
 Total runtime: 5166.906 ms

There's another wishlist kind of thing with top-N heapsort bounds: if I do a query with LIMIT 1000, then every sort batch has Tuplesortstate.bound set to 1000, but it could be reduced after each batch. If the first batch is 900 rows, then the 2nd batch only needs the top 100 rows at most.

Also, I find the name "partial sort" a bit confusing; this feature is not actually sorting *partially*, it's finishing the sort of partially-sorted data. Perhaps "batched sort" would explain the feature better, because it does the sort in multiple batches instead of all at once. But maybe that's just me.

Regards,
Marti
Re: [HACKERS] PoC: Partial sort
On Tue, Jan 14, 2014 at 9:28 PM, Alexander Korotkov <aekorot...@gmail.com> wrote:
> On Tue, Jan 14, 2014 at 11:16 PM, Marti Raudsepp <ma...@juffo.org> wrote:
>> Oh, this actually highlights a performance regression with the partial
>> sort patch.
>
> Interesting. Could you share the dataset?

It occurs with many datasets if work_mem is sufficiently low (10MB in my case). Here's a quicker way to reproduce a similar issue:

    create table foo as select i, i as j from generate_series(1,1000) i;
    create index on foo(i);
    explain analyze select * from foo a join foo b using (i, j);

The real data is from the "release" table of the MusicBrainz database dump: https://musicbrainz.org/doc/MusicBrainz_Database/Download . It's nontrivial to set up, though, so if you still need the real data, I can upload a pgdump for you.

Regards,
Marti
Re: [HACKERS] PoC: Partial sort
From 3f05447e7feb99583336b381df60ff013a144bab Mon Sep 17 00:00:00 2001
From: Marti Raudsepp <ma...@juffo.org>
Date: Mon, 13 Jan 2014 22:24:20 +0200
Subject: [PATCH] Add enable_partialsort GUC for disabling partial sorts

---
 doc/src/sgml/config.sgml                      | 13 +
 src/backend/optimizer/path/costsize.c         |  3 ++-
 src/backend/optimizer/path/pathkeys.c         |  1 +
 src/backend/utils/misc/guc.c                  | 10 ++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/optimizer/cost.h                  |  1 +
 6 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0f2f2bf..1995625 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2808,6 +2808,19 @@ include 'filename'
       </listitem>
      </varlistentry>

+     <varlistentry id="guc-enable-partialsort" xreflabel="enable_partialsort">
+      <term><varname>enable_partialsort</varname> (<type>boolean</type>)</term>
+      <indexterm>
+       <primary><varname>enable_partialsort</> configuration parameter</primary>
+      </indexterm>
+      <listitem>
+       <para>
+        Enables or disables the query planner's use of partial sort steps.
+        The default is <literal>on</>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-tidscan" xreflabel="enable_tidscan">
       <term><varname>enable_tidscan</varname> (<type>boolean</type>)</term>
       <indexterm>
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index da64825..cefd480 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -112,6 +112,7 @@ bool enable_indexonlyscan = true;
 bool		enable_bitmapscan = true;
 bool		enable_tidscan = true;
 bool		enable_sort = true;
+bool		enable_partialsort = true;
 bool		enable_hashagg = true;
 bool		enable_nestloop = true;
 bool		enable_material = true;
@@ -1329,7 +1330,7 @@ cost_sort(Path *path, PlannerInfo *root,
 	/*
 	 * Estimate number of groups which dataset is divided by presorted keys.
 	 */
-	if (presorted_keys > 0)
+	if (presorted_keys > 0 && enable_partialsort)
 	{
 		List	   *groupExprs = NIL;
 		ListCell   *l;
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 55d8ef4..d5a1357 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -22,6 +22,7 @@
 #include "nodes/nodeFuncs.h"
 #include "nodes/plannodes.h"
 #include "optimizer/clauses.h"
+#include "optimizer/cost.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/tlist.h"
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1217098..c3f2f29 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1,3 +1,4 @@
+
 /*
  * guc.c
  *
@@ -724,6 +725,15 @@ static struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 	{
+		{"enable_partialsort", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables the planner's use of partial sort steps."),
+			NULL
+		},
+		&enable_partialsort,
+		true,
+		NULL, NULL, NULL
+	},
+	{
 		{"enable_hashagg", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables the planner's use of hashed aggregation plans."),
 			NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 27791cc..20072fb 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -270,6 +270,7 @@
 #enable_nestloop = on
 #enable_seqscan = on
 #enable_sort = on
+#enable_partialsort = on
 #enable_tidscan = on

 # - Planner Cost Constants -
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 47aef12..30203c7 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -56,6 +56,7 @@ extern bool enable_indexonlyscan;
 extern bool enable_bitmapscan;
 extern bool enable_tidscan;
 extern bool enable_sort;
+extern bool enable_partialsort;
 extern bool enable_hashagg;
 extern bool enable_nestloop;
 extern bool enable_material;
--
1.8.5.2
Re: [HACKERS] Where do we stand on 9.3 bugs?
On Mon, Jan 13, 2014 at 5:16 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> What remaining issues are there blocking a 9.3.3 release?

Hardly a blocker, since this has already missed 2 releases, but I'm still hopeful that many PGXS-based extensions can build again without the dreaded "install: will not overwrite just-created ..." error:
http://www.postgresql.org/message-id/52406191.6040...@dunslane.net

Thankfully I'm mostly using Debian, so it's already patched for me.

Regards,
Marti
Re: [HACKERS] plpgsql.consistent_into
On Sun, Jan 12, 2014 at 7:51 AM, Marko Tiikkaja <ma...@joh.to> wrote:
> the behaviour of SELECT .. INTO when the query returns more than one
> row. Some of you might know that no exception is raised in this case

Agreed. But I also agree with the rest of the thread about changing current INTO behavior and introducing new GUC variables.

But PL/pgSQL already has an assignment syntax with the behavior you want:

    DECLARE
        foo int;
    BEGIN
        foo = generate_series(1,1);  -- this is OK
        foo = generate_series(1,2);  -- fails
        foo = 10 WHERE FALSE;        -- sets foo to NULL

        -- And you can actually do:
        foo = some_col FROM some_table WHERE bar=10;
    END;

So if we extend this syntax to support multiple columns, it should satisfy the use cases you care about:

    foo, bar = col1, col2 FROM some_table WHERE bar=10;

It's ugly without the explicit SELECT, though. Perhaps make the SELECT optional:

    foo, bar = SELECT col1, col2 FROM some_table WHERE bar=10;

I think that's more aesthetically pleasing than INTO, and it also looks more familiar coming from other languages. Plus, now you can copy-paste the query straight to an SQL shell without another footgun involving creating new tables in your database.

Regards,
Marti
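For readers following this thread: PL/pgSQL already offers the STRICT option, which turns both the zero-row and multi-row cases into errors rather than silently taking the first row. A minimal sketch (the table and function names here are hypothetical, not from the thread):

```sql
CREATE TABLE accounts(id int PRIMARY KEY, balance numeric);

CREATE FUNCTION get_balance(p_id int) RETURNS numeric AS $$
DECLARE
    v_balance numeric;
BEGIN
    -- Without STRICT, extra rows are discarded silently and zero rows
    -- leave v_balance NULL. With STRICT, zero rows raises NO_DATA_FOUND
    -- and multiple rows raises TOO_MANY_ROWS.
    SELECT balance INTO STRICT v_balance
      FROM accounts WHERE id = p_id;
    RETURN v_balance;
END;
$$ LANGUAGE plpgsql;
```

STRICT addresses the safety concern per statement, but it is opt-in, which is exactly why the thread is discussing defaults and alternative assignment syntax.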
Re: [HACKERS] Capturing EXPLAIN ANALYZE for a query without discarding the normal result set
On Tue, Jan 14, 2014 at 5:06 AM, Dave Cole <davejohnc...@gmail.com> wrote:
> It would be really cool if you could direct the EXPLAIN ANALYZE output
> to a temporary table so that the query being analyzed could execute
> normally.

You can use the auto_explain contrib module to log the query plans of slow(er) queries:
http://www.postgresql.org/docs/current/static/auto-explain.html

If you Really Need To, you can use the csvlog log format and import that to a table, but really it's easier to use less/grep/etc.

Regards,
Marti
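As a concrete starting point, a minimal postgresql.conf fragment for auto_explain might look like this (the threshold value is illustrative, not a recommendation from the mail):

```
# Load the module at server start (requires a restart).
shared_preload_libraries = 'auto_explain'

# Log plans of queries slower than this threshold; -1 disables logging.
auto_explain.log_min_duration = '250ms'

# Include actual row counts and timings (adds EXPLAIN ANALYZE overhead
# to every query, so use with care on busy systems).
auto_explain.log_analyze = on
```

With this in place, qualifying queries run normally and their plans show up in the server log, which is exactly the "execute normally" behavior asked about above.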
Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install
Hi Andrew,

On Mon, Sep 23, 2013 at 6:43 PM, Andrew Dunstan <and...@dunslane.net> wrote:
> I'm working on it. It appears to have a slight problem or two I want to
> fix at the same time, rather than backpatch something broken.

Any progress on this? I notice that the fixes didn't make it into 9.3.1.

Regards,
Marti
Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install
On Fri, Sep 13, 2013 at 8:17 PM, Marti Raudsepp <ma...@juffo.org> wrote:
> On Fri, Sep 13, 2013 at 6:42 PM, Cédric Villemain <ced...@2ndquadrant.com> wrote:
>> Andrew is about to commit (well... I hope) a doc patch about that and
>> also a little fix. Imho this is a bugfix, so I hope it will be applied
>> in older branches.
>
> Oh I see, indeed commit 6697aa2bc25c83b88d6165340348a31328c35de6
> "Improve support for building PGXS modules with VPATH" fixes the
> problem, and I see it's not present in REL9_3_0. Andrew and others,
> does this seem safe enough to backport to 9.3.1?

Ping? Will this be backported to 9.3, or should I tell extension authors to fix their Makefiles? I understand that the 9.3.1 release might still be weeks away; I'd just like a vague confirmation about what the committers think.

Note that this patch is already applied to the Debian/Ubuntu packages (including those on apt.postgresql.org).

Regards,
Marti