[HACKERS] Extending constraint exclusion for implied constraints/conditions
Hi,

Here's a table I have:

postgres=# \d+ tab1
                        Table "public.tab1"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 val    | integer |           | plain   |              |
 val2   | integer |           | plain   |              |
Check constraints:
    "tab1_val2_check" CHECK (val2 > 30)
    "tab1_val_check" CHECK (val < 30)

and I run the following query on it:

postgres=# set constraint_exclusion to on;
SET
postgres=# explain verbose select * from tab1 where val = val2;
                          QUERY PLAN
--------------------------------------------------------------
 Seq Scan on public.tab1  (cost=0.00..36.75 rows=11 width=8)
   Output: val, val2
   Filter: (tab1.val = tab1.val2)

Given the constraints and the condition in the WHERE clause, it can be inferred that the query will not return any rows. However, PostgreSQL scans the table, as seen in the plan.

Right now, the constraint exclusion code looks only at the provided conditions. If we want to avoid the table scan based on constraints in the above example, it will need to look at the implied conditions as well, e.g. val2 > 30 AND val = val2 => val > 30. Constraint exclusion can then see that val > 30 AND val < 30 are contradictory and infer that the result is going to be empty. We will need to add information about the transitive inferences between operators.

Can we do that in PostgreSQL? Will the benefits be worth the code that needs to be added?

I can see some more benefits. We can use it to eliminate joins where the constraints on the joining relations may cause the join to be empty, e.g.

postgres=# \d+ tab2
                        Table "public.tab2"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 val    | integer |           | plain   |              |
Check constraints:
    "tab2_val_check" CHECK (val > 30)

postgres=# \d+ tab1
                        Table "public.tab1"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 val    | integer |           | plain   |              |
Check constraints:
    "tab1_val_check" CHECK (val < 30)

and the query with a join is

select * from tab1, tab2 where tab1.val = tab2.val;
or
select * from tab1, tab2 where tab1.val > tab2.val;
This can be extended further to the case of a join between two parent tables, where we will be able to eliminate joins between some pairs of children. There are limitations in this case, though:

1. The benefits of this optimization will be visible in the case of nested loop joins. Hash and merge joins implicitly eliminate the non-joining children.
2. We need a way to push a join down an append path, which PostgreSQL doesn't do today.

-- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
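The transitive-inference step described above can be sketched as interval reasoning. This is an illustrative C sketch with invented names, not PostgreSQL planner code: each column's CHECK constraints become an open integer interval, an equality condition forces both columns into the intersection of their intervals, and a contradiction shows up as an empty intersection.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Illustrative only -- invented names, not planner code.  An open
 * integer interval (lo, hi) holds the values x with lo < x < hi.
 */
typedef struct { long lo; long hi; } ColRange;

/* CHECK (col > n) supplies lo = n; CHECK (col < n) supplies hi = n. */
static ColRange range_for(long gt, long lt)
{
    return (ColRange){ gt, lt };
}

/*
 * The transitive step: "val = val2" means both columns must satisfy
 * both sets of CHECKs, so intersect the two intervals.
 */
static ColRange range_intersect(ColRange a, ColRange b)
{
    ColRange r;
    r.lo = (a.lo > b.lo) ? a.lo : b.lo;
    r.hi = (a.hi < b.hi) ? a.hi : b.hi;
    return r;
}

/* No integer strictly between lo and hi => provably empty result. */
static bool range_is_empty(ColRange r)
{
    return r.lo + 1 >= r.hi;
}
```

With val < 30 and val2 > 30, the intersection is (30, 30), which contains no integer, so the scan could be skipped entirely. The real planner work would of course be in deriving such intervals from operator families rather than hard-coded integers.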
Re: [HACKERS] 9.5 CF1
Hi. We're into the last full week of this CommitFest. Here's an update on the current state of the queue.

Needs review: 31
Waiting on author: 18
Ready for committer: 13
Committed: 19

(I'll send separate topic-wise updates too.)

Reviewers, please try to post whatever you have in the next few days to give the authors a chance to respond. Even if you haven't completed the review, please consider posting what you have now and updating it later. If for any reason you don't expect to be able to complete a review you are working on, please let me know as soon as possible.

Patch authors, please let me know if you intend to resubmit revisions of patches that are currently marked "waiting on author" during this CF. If it's likely to take more than a few days, it should be moved to the August CF.

Committers… well, have fun looking at the queue, I guess!

Thanks again to everyone for their help in keeping things moving during this review cycle. Starting at 97 patches, this wasn't the largest CF we've ever had, but it's right up there near the top.

-- Abhijit

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 CF1
At 2014-07-02 10:36:23 +0530, a...@2ndquadrant.com wrote:

Server features
---------------

Lag Lead Window Functions Can Ignore Nulls
  Latest patch currently pending review by Jeff and Álvaro. No updates
  so far.
  Jeff has posted a review, and is working on the patch further.

Using Levenshtein distance to HINT a candidate column name
  No real progress in any direction. Everyone wants the feature, but
  nobody is sure how many suggestions would be useful and how many
  would be annoying.
  Peter posted a new patch. Nobody has reviewed it yet.

Buffer capture facility: check WAL replay consistency
  Some discussion, but no reviewers. Status unclear.
  http://archives.postgresql.org/message-id/CAB7nPqTFK4=fcrto=lji4vlbx9ah+fv1z1ke6r98ppxuuxw...@mail.gmail.com
  No updates.

Custom Plan API
  Shigeru Hanada has said he plans to post a design review soon. Any
  updates? Should this be moved to the next CF?

delta relations in AFTER triggers
  No code review yet, but there's a proof-of-concept extension that
  uses the feature, and Kevin is working on PL changes to make the
  feature usable, and hopes to commit in this cycle.
  Ditto.

Minmax indexes
  Álvaro will respond to the design questions Heikki raised.
  Álvaro is still working on this and hopes to post an update soon.

-- Abhijit
Re: [HACKERS] 9.5 CF1
At 2014-07-02 11:38:31 +0530, a...@2ndquadrant.com wrote:

Performance
-----------

scalable LWLocks
Generic atomics implementation
  Both under active discussion.
  Work continues on these.

KNN-GiST with recheck
Partial sort
  Some earlier discussion, but no conclusions, and as far as I know
  nobody is working on these patches at the moment.
  Heikki said he won't be able to look at these, so unless someone else
  plans to review them, they should be moved to the next CF. Alexander
  is working on an updated version of the KNN-GiST patch anyway.

Don't require a NBuffer sized PrivateRefCount array of local buffer pins
  Pending performance tests before commit. Is anyone working on this?
  No updates.

Allow more join types to be removed
  Updated patch pending review from Simon. Updates welcome.
  Tom is looking at this now.

Sort support for text with strxfrm() poor man's keys
  Lots of earlier discussion, but no conclusions as far as I can tell.
  Does anyone want to take a look at this?

Scaling shared buffer eviction
  Pending review by Andres. Any updates?

Use unique index for longer pathkeys
  No reviews, no reviewers. I took a quick look, and it's not clear if
  this is useful in itself, or if it just enables more interesting
  optimisations later. Does anyone want to look at this?

  No updates for any of these.

XLogLockBlockRangeForCleanup
  Amit Khandekar plans to post a review this week.
  Pending performance results from me, but also a final conclusion from
  Amit's review.

SKIP LOCKED
  Two patches available: original by Simon, updated/merged version by
  Thomas Munro.
  Simon said he'll look into both with a view towards committing the
  functionality. Timeline not yet clear.

Allow NOT IN to use anti joins
  Revised patch marked pending review. Updates welcome.

CSN snapshots
  Work in progress, plenty of discussion, no reviews. Probably nothing
  to do here. (Heikki, are you waiting for feedback or anything?)

  No updates for any of these.
-- Abhijit
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
Hi,

On 2014-07-04 22:59:15 +0900, MauMau wrote:
> My customer reported a strange connection hang problem. He and I
> couldn't reproduce it. I haven't been able to understand the cause,
> but I can think of one hypothesis. Could you give me your opinions on
> whether my hypothesis is correct, and a direction on how to fix the
> problem? I'm willing to submit a patch if necessary.
>
> The connection attempt is waiting for a reply from the standby. This
> is strange, because we didn't anticipate that the connection
> establishment (and subsequent SELECT queries) would update something
> and write some WAL. The doc says:
>
> http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION
>
> "When requesting synchronous replication, each commit of a write
> transaction will wait until confirmation is received that the commit
> has been written to the transaction log on disk of both the primary
> and standby server. ... Read only transactions and transaction
> rollbacks need not wait for replies from standby servers.
> Subtransaction commits do not wait for responses from standby servers,
> only top-level commits."
>
> [Hypothesis]
> Why does the connection processing emit WAL? Probably, it did
> page-at-a-time vacuum during access to pg_database and pg_authid for
> client authentication. src/backend/access/heap/README.HOT describes:
>
> [How to fix]
> Of course, adding -o '-c synchronous_commit=local' or -o '-c
> synchronous_standby_names=' to pg_ctl start in the recovery script
> would prevent the problem. But isn't there anything to fix in
> PostgreSQL? I think the doc needs improvement so that users won't
> misunderstand that only write transactions would block at commit.

I think we should rework RecordTransactionCommit() to only wait for the standby if `markXidCommitted' and not if `wrote_xlog'. There really isn't a reason to make a readonly transaction's commit wait just because it did some hot pruning.
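The distinction being proposed can be sketched as follows. This is illustrative C only, not PostgreSQL source; the flag names mirror the locals discussed above, but the function itself is invented.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of the suggested RecordTransactionCommit() change -- invented
 * function, not actual backend code.  Decide whether a committing
 * backend must wait for the synchronous standby.
 */
static bool
must_wait_for_sync_standby(bool markXidCommitted, bool wrote_xlog)
{
    /*
     * Waiting whenever wrote_xlog is set would make a read-only
     * transaction's commit wait just because it happened to write WAL
     * for hot pruning during, say, catalog access at connection time.
     * Waiting only when an XID was actually marked committed avoids
     * that, while still protecting transactions that made durable,
     * visible changes.
     */
    (void) wrote_xlog;          /* deliberately not consulted */
    return markXidCommitted;
}
```

Under this rule the hang described upthread cannot occur: the connection-time pruning sets only wrote_xlog, so the commit returns without touching SyncRepWaitForLSN-style logic.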
Greetings, Andres Freund
Re: [HACKERS] 9.5 CF1
> Custom Plan API
>   Shigeru Hanada has said he plans to post a design review soon. Any
>   updates? Should this be moved to the next CF?

Now I'm working to revise the patch according to his suggestion; it will be completed within a couple of days. A few issues need design-level suggestions from committers.

Thanks,
-- KaiGai Kohei kai...@kaigai.gr.jp
Re: [HACKERS] 9.5 CF1
At 2014-07-02 11:21:53 +0530, a...@2ndquadrant.com wrote:

SQL commands
------------

Event triggers: object creation support
  Enormous patch, no reviews, no reviewers, but it's known to work
  already. Does anyone want to have a look at this? (I thought it was
  being reviewed, and didn't realise otherwise until a couple of days
  ago. Sorry about that.)
  No updates. I'll look at the patch myself, but I don't think I'll be
  able to post any useful review comments until the beginning of next
  week.

change alter user to be a true alias for alter role
  Original patch worked (Vik) but didn't go far enough towards
  preventing future repetitions of the mistake (Tom) but the suggested
  change didn't turn out to be easy (Vik). I don't think anyone is
  working on this at the moment.
  No updates.

RETURNING PRIMARY KEY syntax extension
  Patch works and has been revised as suggested. Some questions about
  how useful it is. Updates welcome.
  Ian will respond to comments soon.

Allow an unlogged table to be changed to logged (GSoC 2014)
  Stephen has looked at the patch a little, but isn't sure how much
  time he'll have to complete the review. More eyes are welcome. I
  recommend this patch to anyone who's looking to get started with
  reviewing something substantial: it looks very nicely done and
  straightforward, but still non-trivial.
  Christoph Berg is looking at this, but doesn't expect to be able to
  post a review for a few days yet. (The recommendation still stands.)

-- Abhijit
Re: [HACKERS] 9.5 CF1
At 2014-07-02 11:14:52 +0530, a...@2ndquadrant.com wrote:

Security
--------

Row-security based on Updatable security barrier views
  Lots of discussion that I haven't dared to look at properly yet. I
  gather there's still plenty of design-level work needed, and this is
  not in any imminent danger of being committed.
  Work continues.

Replication & Recovery
----------------------

pg_receivexlog add synchronous mode
  Patch works, is generally acceptable. Fujii-san proposed a
  refactoring patch to be applied before this one, and plans to commit
  it soon.
  Refactoring patch committed, other patch still pending.

Compression of Full Page Writes
  Currently under development, timeline unclear. Probably needs to be
  marked returned with feedback and moved to August CF.
  Revised patch posted, but lots of work still needed.

WAL format API changes
  I'm not sure what's happening here. Will look more closely, but
  updates are welcome from the people who've been participating in the
  discussion/review.
  Waiting on Heikki.

-- Abhijit
Re: [HACKERS] 9.5 CF1
At 2014-07-02 11:09:19 +0530, a...@2ndquadrant.com wrote:

System administration
---------------------

pg_hibernator
  Nice feature, some discussion, no code reviews. Status not entirely
  clear, but updated patch available.
  Now being reviewed by MauMau.

Monitoring & control
--------------------

Reducing impact of hints/cleanup for SELECTs
  Pending performance review by Emanuel Calvo. (He said he's posted a
  review, but I couldn't find it. He's travelling right now, will send
  me a pointer later.)
  No updates.

issue log message to suggest VACUUM FULL if a table is nearly empty
  Euler has said he plans to post a review this week.

pg_shmem_allocations view
  Euler has said he plans to post a review this week.

  No updates from Euler.

pg_xlogdump --stats
  Pending review by Dilip Kumar (who plans to post a review this week),
  but Osamu Furuya and Marti Raudsepp have said it looks OK generally.
  Latest patch posted, some discussion needed on the calling
  conventions for the new rm_identify method.

-- Abhijit
Re: [HACKERS] 9.5 CF1
At 2014-07-02 10:54:05 +0530, a...@2ndquadrant.com wrote:

Functions
---------

min/max support for inet datatypes
  Pending review by Muhammad Asif Naeem.
  No updates.

Selectivity estimation for inet operators
  Dilip Kumar plans to post a review this week.
  Reviewed, and updated patch from Emre available.

Clients
-------

Gaussian distribution pgbench
  Fabien Coelho plans to post a review of the latest patch. Some
  questions raised about whether we want this feature in this form at
  all. Status unclear.

possibility to set double style for unicode lines
  No reviews, no reviewers. \N{DOUBLE LINED UNHAPPY FACE}[1]
  No updates.

Allow parallel cores to be used by vacuumdb
  Status not entirely clear. Do we want this? Does the patch have too
  much duplicated code? Can it be fixed in this cycle? I'm not sure
  what the status of this patch is.

add line number as prompt option to psql
  Sawada Masahiko plans to post an updated patch soon.
  New patch posted, pending review. Should be ready for committer soon.

-- Abhijit
Re: [HACKERS] 9.5 CF1
At 2014-07-02 10:44:23 +0530, a...@2ndquadrant.com wrote:

Miscellaneous
-------------

contrib/fastbloat - tool for quickly assessing bloat stats for a table
  Pending review by Jaime.

showing index update time on EXPLAIN
  Pending updated patch by Jaime.

idle_in_transaction_session_timeout
  Marked as ready for committer, but as far as I can tell the patch
  needs revisions per Tom's comments. It's unclear if Vik is working on
  these, or Kevin, or nobody. Anyone?

logging of replication commands
  Two patches posted: one generally accepted, the other contentious.
  Pending status update.

Refactor SSL code to support other SSL implementations
  Some discussion, but no code reviews yet. Jeff Janes tried it and it
  hung; Andreas Karlsson fixed it and it worked for him. Everyone does
  seem to agree that it's a good idea.

There is also one other ready for committer patch in this category.

Bug fixes
---------

Correctly place DLLs for ECPG apps in bin folder
  Pending review by Muhammad Asif Naeem.

Per table autovacuum vacuum cost parameters behavior change
  Pending review by Álvaro.

Ignore Ctrl-C/Break on Windows
  Christian Ullrich has said he will post an updated patch this week.

There are also three ready for committer patches in this category.

No updates at all.

-- Abhijit
Re: [HACKERS] wrapping in extended mode doesn't work well with default pager
So what's wrong with the patch? And what should I change in it for 9.5?

2014-07-07 3:12 GMT+04:00 Greg Stark st...@mit.edu:
> On Sun, Jul 6, 2014 at 8:40 AM, Sergey Muraviov
> sergey.k.murav...@gmail.com wrote:
>> Is there anyone who can commit the patch?
>
> So what I'm inclined to do here (sigh) is commit it into 9.5 and
> revert it in 9.4. I think it's an improvement, but I think there's
> enough confusion and surprise about the changes from people who
> weren't expecting them, and enough surprising side effects from
> things like check_postgres, that letting it simmer for a full release
> so people can get used to it might be worthwhile.
>
> -- greg

-- Best regards, Sergey Muraviov
Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
On Sun, 2014-07-06 at 21:11 -0700, Jeff Davis wrote:
> On Wed, 2014-04-16 at 12:50 +0100, Nicholas White wrote:
>> Thanks for the detailed feedback, I'm sorry it took so long to
>> incorporate it. I've attached the latest version of the patch,
>> fixing in particular:

Looking a little more:

* No tests exercise non-const offsets.
* No tests for default clauses with IGNORE NULLS.
* The use of bitmapsets is quite ugly. It would be nice if the API
  would grow the BMS within the memory context in which it was
  allocated, but I don't even see that the BMS is necessary. Why not
  just allocate a fixed-size array of bits, and forget the BMS?
* Is there a reason you're leaving out first_value/last_value/
  nth_value? I think they could be supported without a lot of extra
  work.

Regards, Jeff Davis
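The fixed-size array of bits suggested in the review above could look something like this. This is a hedged sketch with invented helper names, not code from the patch: the caller allocates ceil(nbits/64) words once, in whatever memory context it likes, and no reallocation ever happens afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch of a fixed-size bit array as a Bitmapset replacement --
 * invented names, for illustration only.  The word count is fixed at
 * allocation time from the maximum number of rows the window frame can
 * hold, so the structure never grows.
 */
#define BITWORDS(nbits) (((nbits) + 63) / 64)

/* Mark bit i as set (e.g. "this row's argument was NULL"). */
static void bits_set(uint64_t *w, int i)
{
    w[i / 64] |= UINT64_C(1) << (i % 64);
}

/* Test bit i. */
static bool bits_test(const uint64_t *w, int i)
{
    return (w[i / 64] >> (i % 64)) & 1;
}
```

Because the storage is a plain array, it lives entirely in the context it was palloc'd in, which sidesteps the cross-context growth problem the review mentions with the BMS API.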
Re: [HACKERS] [PATCH] introduce XLogLockBlockRangeForCleanup()
On 4 July 2014 19:11, Abhijit Menon-Sen a...@2ndquadrant.com wrote:
> Updated patch attached, thanks. Amit: what's your conclusion from the
> review?

Other than some minor comments as mentioned below, I don't have any more issues; it looks all good.

* XLogLockBlockRangeForCleanup()'s function header comment has the
  function name spelled XLogBlockRangeForCleanup.
* In GetBufferWithoutRelcache(), we can call
  BufferDescriptorGetBuffer(buf) rather than
  BufferDescriptorGetBuffer(&BufferDescriptors[buf_id]).

-- Abhijit
Re: [HACKERS] [PATCH] introduce XLogLockBlockRangeForCleanup()
At 2014-07-07 14:02:15 +0530, amit.khande...@enterprisedb.com wrote:
> Other than some minor comments as mentioned below, I don't have any
> more issues, it looks all good.

Thank you. (Updated patch attached.)

-- Abhijit

diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index 5f9fc49..dc90c02 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -501,33 +501,9 @@ btree_xlog_vacuum(XLogRecPtr lsn, XLogRecord *record)
 	 * here during crash recovery.
 	 */
 	if (HotStandbyActiveInReplay())
-	{
-		BlockNumber blkno;
-
-		for (blkno = xlrec->lastBlockVacuumed + 1; blkno < xlrec->block; blkno++)
-		{
-			/*
-			 * We use RBM_NORMAL_NO_LOG mode because it's not an error
-			 * condition to see all-zero pages.  The original btvacuumpage
-			 * scan would have skipped over all-zero pages, noting them in FSM
-			 * but not bothering to initialize them just yet; so we mustn't
-			 * throw an error here.  (We could skip acquiring the cleanup lock
-			 * if PageIsNew, but it's probably not worth the cycles to test.)
-			 *
-			 * XXX we don't actually need to read the block, we just need to
-			 * confirm it is unpinned.  If we had a special call into the
-			 * buffer manager we could optimise this so that if the block is
-			 * not in shared_buffers we confirm it as unpinned.
-			 */
-			buffer = XLogReadBufferExtended(xlrec->node, MAIN_FORKNUM, blkno,
-											RBM_NORMAL_NO_LOG);
-			if (BufferIsValid(buffer))
-			{
-				LockBufferForCleanup(buffer);
-				UnlockReleaseBuffer(buffer);
-			}
-		}
-	}
+		XLogLockBlockRangeForCleanup(xlrec->node, MAIN_FORKNUM,
+									 xlrec->lastBlockVacuumed + 1,
+									 xlrec->block);

 	/*
 	 * If we have a full-page image, restore it (using a cleanup lock) and
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index b7829ff..a9aea31 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -287,10 +287,6 @@ XLogReadBuffer(RelFileNode rnode, BlockNumber blkno, bool init)
  *
  * In RBM_ZERO and RBM_ZERO_ON_ERROR modes, if the page doesn't exist, the
  * relation is extended with all-zeroes pages up to the given block number.
- *
- * In RBM_NORMAL_NO_LOG mode, we return InvalidBuffer if the page doesn't
- * exist, and we don't check for all-zeroes.  Thus, no log entry is made
- * to imply that the page should be dropped or truncated later.
  */
 Buffer
 XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
@@ -331,8 +327,6 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
 			log_invalid_page(rnode, forknum, blkno, false);
 			return InvalidBuffer;
 		}
-		if (mode == RBM_NORMAL_NO_LOG)
-			return InvalidBuffer;
 		/* OK to extend the file */
 		/* we do this in recovery only - no rel-extension lock needed */
 		Assert(InRecovery);
@@ -375,6 +369,62 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
 	return buffer;
 }

+/*
+ * XLogLockBlockRangeForCleanup is used in Hot Standby mode to emulate
+ * the locking effects imposed by VACUUM for nbtrees.
+ */
+void
+XLogLockBlockRangeForCleanup(RelFileNode rnode, ForkNumber forkNum,
+							 BlockNumber startBlkno,
+							 BlockNumber uptoBlkno)
+{
+	BlockNumber	blkno;
+	BlockNumber	lastblock;
+	BlockNumber	endingBlkno;
+	Buffer		buffer;
+	SMgrRelation smgr;
+
+	Assert(startBlkno != P_NEW);
+	Assert(uptoBlkno != P_NEW);
+
+	/* Open the relation at smgr level */
+	smgr = smgropen(rnode, InvalidBackendId);
+
+	/*
+	 * Create the target file if it doesn't already exist.  This lets us cope
+	 * if the replay sequence contains writes to a relation that is later
+	 * deleted.  (The original coding of this routine would instead suppress
+	 * the writes, but that seems like it risks losing valuable data if the
+	 * filesystem loses an inode during a crash.  Better to write the data
+	 * until we are actually told to delete the file.)
+	 */
+	smgrcreate(smgr, forkNum, true);
+
+	lastblock = smgrnblocks(smgr, forkNum);
+
+	endingBlkno = uptoBlkno;
+	if (lastblock < endingBlkno)
+		endingBlkno = lastblock;
+
+	for (blkno = startBlkno; blkno < endingBlkno; blkno++)
+	{
+		/*
+		 * All we need to do here is prove that we can lock each block
+		 * with a cleanup lock.  It's not an error to see all-zero pages
+		 * here because the original btvacuumpage would not have thrown
+		 * an error either.
+		 *
+		 * We don't actually need to read the block, we just need to
+		 * confirm it is unpinned.
+		 */
+		buffer = GetBufferWithoutRelcache(rnode, forkNum, blkno);
+		if (BufferIsValid(buffer))
+		{
+			LockBufferForCleanup(buffer);
+			UnlockReleaseBuffer(buffer);
+		}
+	}
+}

 /*
  * Struct actually returned by XLogFakeRelcacheEntry, though the declared
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 07ea665..f7976a0 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@
Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max
On Mon, Jun 30, 2014 at 4:45 PM, Abhijit Menon-Sen a...@2ndquadrant.com wrote:
> At 2014-06-30 16:35:45 +0500, anaeem...@gmail.com wrote:
>> pc1dotnetpk:postgresql asif$ patch -p0 < ~/core/min_max_support_for_inet_datatypes/inet_agg_v4.patch
>> can't find file to patch at input line 3
>> Perhaps you used the wrong -p or --strip option?
>> The text leading up to this was:
>> --------------------------
>> |*** a/src/backend/utils/adt/network.c
>> |--- b/src/backend/utils/adt/network.c
>> --------------------------
>
> You need to use patch -p1, to strip off the a/b prefix.

Thank you Abhijit, it worked.

-- Abhijit
Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max
Hi Haribabu,

Thank you for sharing the patch. I have spent some time reviewing the changes. Overall the patch looks good to me; make check and manual testing run fine with it. There seem to be no related doc/sgml changes? The patch adds network_smaller() and network_greater() functions, but in the PG source code the general practice seems to be to use "smaller" and "larger" as the related function name postfix, e.g. timestamp_smaller()/timestamp_larger(), interval_smaller()/interval_larger(), cashsmaller()/cashlarger() etc. Thanks.

Regards, Muhammad Asif Naeem

On Mon, Jul 7, 2014 at 1:56 PM, Asif Naeem anaeem...@gmail.com wrote:
> On Mon, Jun 30, 2014 at 4:45 PM, Abhijit Menon-Sen
> a...@2ndquadrant.com wrote:
>> You need to use patch -p1, to strip off the a/b prefix.
>
> Thank you Abhijit, it worked.
Re: [HACKERS] pg_xlogdump --stats
On 2014-07-04 18:59:07 +0530, Abhijit Menon-Sen wrote: At 2014-07-04 11:34:21 +0200, and...@2ndquadrant.com wrote: I think we're going to have to redefine the PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT callsite in configure.in to define INT64_MODIFIER='ll/l/I64D' I've attached a patch to do this, and also add INT64_MODIFIER to pg_config.h.in: 0001-modifier.diff I reran autoconf, and just for convenience I've attached the resulting changes to configure: 0002-configure.diff Then there are the rm_identify changes: 0003-rmid.diff Finally, the xlogdump patch using INT64_MODIFIER: 0004-xlogdump.diff I can confirm that this series applies in-order to master, and that the result builds cleanly (including after each patch) on my machine, and that the resulting pg_xlogdump works as expected. Two of the extra calls to psprintf in pg_xlogdump happen at most RM_MAX_ID*16 (i.e. O(record-types) not O(records)) times, and the other two happen just before exit. It would be easy to use a static buffer and snprintf, but I don't think it's worth doing in this case. Agreed. diff --git a/config/c-library.m4 b/config/c-library.m4 index 8f45010..4821a61 100644 --- a/config/c-library.m4 +++ b/config/c-library.m4 @@ -221,22 +221,22 @@ HAVE_POSIX_SIGNALS=$pgac_cv_func_posix_signals AC_SUBST(HAVE_POSIX_SIGNALS)])# PGAC_FUNC_POSIX_SIGNALS -# PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT +# PGAC_FUNC_SNPRINTF_LONG_LONG_INT_MODIFIER # --- -# Determine which format snprintf uses for long long int. We handle -# %lld, %qd, %I64d. The result is in shell variable -# LONG_LONG_INT_FORMAT. +# Determine which length modifier snprintf uses for long long int. We +# handle ll, q, and I64. The result is in shell variable +# LONG_LONG_INT_MODIFIER. # # MinGW uses '%I64d', though gcc throws an warning with -Wall, # while '%lld' doesn't generate a warning, but doesn't work. 
#
-AC_DEFUN([PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT],
-[AC_MSG_CHECKING([snprintf format for long long int])
-AC_CACHE_VAL(pgac_cv_snprintf_long_long_int_format,
-[for pgac_format in '%lld' '%qd' '%I64d'; do
+AC_DEFUN([PGAC_FUNC_SNPRINTF_LONG_LONG_INT_MODIFIER],
+[AC_MSG_CHECKING([snprintf length modifier for long long int])
+AC_CACHE_VAL(pgac_cv_snprintf_long_long_int_modifier,
+[for pgac_modifier in 'll' 'q' 'I64'; do
 AC_TRY_RUN([#include <stdio.h>
 typedef long long int ac_int64;
-#define INT64_FORMAT $pgac_format
+#define INT64_FORMAT %${pgac_modifier}d

I'd rather not define INT64_FORMAT here.

+INT64_FORMAT=\"%${LONG_LONG_INT_MODIFIER}d\"
+UINT64_FORMAT=\"%${LONG_LONG_INT_MODIFIER}u\"
+INT64_MODIFIER=\"$LONG_LONG_INT_MODIFIER\"
+
 AC_DEFINE_UNQUOTED(INT64_FORMAT, $INT64_FORMAT,
                    [Define to the appropriate snprintf format for 64-bit ints.])
 AC_DEFINE_UNQUOTED(UINT64_FORMAT, $UINT64_FORMAT,
                    [Define to the appropriate snprintf format for unsigned 64-bit ints.])
+AC_DEFINE_UNQUOTED(INT64_MODIFIER, $INT64_MODIFIER,
+                   [Define to the appropriate snprintf length modifier for 64-bit ints.])
+

I'd suggest only defining INT64_MODIFIER here and building INT64_FORMAT, UINT64_FORMAT on top, in c.h.

> NOTE: I do not know what to do about pg_config.h.win32. If someone
> tells me what to do, I can submit another patch.

Which would also take care of pg_config.h.win32 - just define INT64_MODIFIER in there instead of INT64_FORMAT/UINT64_FORMAT and you should be good.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
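The "build the formats on top of the modifier in c.h" suggestion amounts to ordinary string-literal pasting. A minimal sketch, with "ll" hard-coded where configure would substitute "q" or "I64" on platforms that need them:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Sketch of the proposal: configure defines only INT64_MODIFIER, and a
 * header builds the full format strings from it by adjacent-string
 * concatenation.  "ll" is an assumption here for illustration; the
 * configure probe would pick whichever modifier actually works.
 */
#define INT64_MODIFIER "ll"

/* Built on top of INT64_MODIFIER, as proposed for c.h. */
#define INT64_FORMAT  "%" INT64_MODIFIER "d"   /* expands to "%lld" */
#define UINT64_FORMAT "%" INT64_MODIFIER "u"   /* expands to "%llu" */
```

This keeps a single point of platform variation (the modifier), so code that needs to build composite formats like `"%08" INT64_MODIFIER "d"` can do so without configure having to export every variant.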
Re: [HACKERS] pg_receivexlog add synchronous mode
On Fri, Jul 4, 2014 at 7:45 PM, furu...@pm.nttdata.co.jp wrote:
> Thanks for reviewing the patch!
>
>> I think that this refactoring patch is useful for improving source
>> code readability and making the future patches simpler, whether we
>> adopt your patch or not. So, barring any objections, I'm thinking to
>> commit this refactoring patch.

Committed!

> It is a patch that adds the --fsync-interval option. The interface of
> --fsync-interval follows pg_recvlogical.c. It does not decide whether
> to flush on a per-message basis; it decides when receipt of messages
> stops. If you specify an --fsync-interval of zero, it flushes at the
> same timing as the walreceiver. If you do not specify
> --fsync-interval, it will flush only when switching WAL files, as in
> the past.

Thanks for revising the patch! Could you update the status of this patch from "Waiting on Author" to "Needs Review" when you post the revised version of the patch?

+How often should <application>pg_receivexlog</application> issue sync
+commands to ensure the received WAL file is safely
+flushed to disk without being asked by the server to do so.

"without being asked by the server to do so" implies that the server asks pg_receivexlog to flush WAL files periodically?

+Specifying an interval of <literal>0</literal> together the consecutive data.

This text looks corrupted. What does this mean?

+Not specifying an interval disables issuing fsyncs altogether,
+while still reporting progress the server.

No. Even when the option is not specified, WAL files should be flushed at WAL file switch, like current pg_receivexlog does. If you want to disable the flush completely, you can change the option so that it accepts -1 which means to disable the flush, for example.

+printf(_("  -F  --fsync-interval=SECS\n"
+         "                         frequency of syncs to the output file (default: file close only)\n"));

It's better to use "transaction log files" rather than "output file" here.
"SECS" should be "INTERVAL" for the sake of consistency with --stat-interval's help message?

+ * fsync_interval controls how often we flush to the received
+ * WAL file, in seconds.

"seconds" should be "milliseconds"?

The patch changes pg_receivexlog so that it keeps processing the continuous messages without calling stream_stop(). But as I commented before, stream_stop should be called every time a message is received? Otherwise pg_basebackup's background WAL receiver might not be able to stop streaming at the proper point.

The flush interval is checked and WAL files are flushed only when CopyStreamReceive() returns 0, i.e., there is no message available and the timeout occurs. Why did you do that? I'm afraid that CopyStreamReceive() always waits for at least one second before WAL files are flushed even when --fsync-interval is set to 0.

Why don't you update output_last_status when the WAL file is flushed and closed?

Regards,

-- Fujii Masao
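The interval semantics debated in the review above can be made concrete with a small sketch. This is illustrative C with an invented name, not patch code: the decision the receive loop would make on each pass, with times in milliseconds.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch only -- invented function, not from the patch.  Per the review:
 * a zero interval means flush as eagerly as walreceiver (after every
 * message), a positive interval means flush once that much time has
 * passed since the last fsync, and a negative interval (option not
 * given) means flush only at WAL file switch.
 */
static bool
time_to_fsync(int fsync_interval_ms, long now_ms, long last_fsync_ms)
{
    if (fsync_interval_ms < 0)
        return false;           /* only at file switch */
    if (fsync_interval_ms == 0)
        return true;            /* after every message */
    return now_ms - last_fsync_ms >= fsync_interval_ms;
}
```

Checking this condition after every received message, rather than only when CopyStreamReceive() times out, is what avoids the "always waits at least one second" problem the review points out for the --fsync-interval=0 case.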
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
On Mon, Jul 7, 2014 at 4:14 PM, Andres Freund and...@2ndquadrant.com wrote: Hi, On 2014-07-04 22:59:15 +0900, MauMau wrote: My customer reported a strange connection hang problem. He and I couldn't reproduce it. I haven't been able to understand the cause, but I can think of one hypothesis. Could you give me your opinions on whether my hypothesis is correct, and a direction on how to fix the problem? I'm willing to submit a patch if necessary. The connection attempt is waiting for a reply from the standby. This is strange, because we didn't anticipate that the connection establishment (and subsequent SELECT queries) would update something and write some WAL. The doc says: http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the transaction log on disk of both the primary and standby server. ... Read only transactions and transaction rollbacks need not wait for replies from standby servers. Subtransaction commits do not wait for responses from standby servers, only top-level commits. [Hypothesis] Why does the connection processing emit WAL? Probably, it did page-at-a-time vacuum during access to pg_database and pg_authid for client authentication. src/backend/access/heap/README.HOT describes: [How to fix] Of course, adding -o '-c synchronous_commit=local' or -o '-c synchronous_standby_names=' to pg_ctl start in the recovery script would prevent the problem. But isn't there anything to fix in PostgreSQL? I think the doc needs improvement so that users won't misunderstand that only write transactions would block at commit. I think we should rework RecordTransactionCommit() to only wait for the standby if `markXidCommitted' and not if `wrote_xlog'. There really isn't a reason to make a readonly transaction's commit wait just because it did some hot pruning. Sounds like a good direction.
One question is: in such a read-only transaction case, can RecordTransactionCommit() safely skip waiting not only for replication but also for the local WAL flush? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
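The proposed change to RecordTransactionCommit() can be summarized as a one-line predicate. This is a Python sketch of the decision logic only, with assumed names; the real code lives in C in src/backend/access/transam/xact.c:

```python
# Illustrative sketch of when a committing backend must wait for the
# synchronous standby. "mark_xid_committed" means the transaction has a
# real XID whose commit record was written; "wrote_xlog" is true even for
# a read-only transaction that merely emitted WAL via HOT pruning.
def must_wait_for_sync_standby(mark_xid_committed, wrote_xlog,
                               current_behaviour=False):
    if current_behaviour:
        # today: writing any WAL at all (e.g. hot pruning during login)
        # is enough to make the commit wait on the standby
        return mark_xid_committed or wrote_xlog
    # proposal: only transactions that actually committed an XID wait
    return mark_xid_committed
```

Under the proposal, the hanging-at-login scenario (WAL written by page-at-a-time vacuum of pg_authid, no XID assigned) would no longer block.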
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Thank you for review comments. There are still numerous formatting changes required, e.g. spaces around = and correct formatting of comments. And git diff --check still has a few whitespace problems. I won't point these out one by one, but maybe you should run pgindent. I will do this. Could you look into his suggestions of other places to do the allocation, please? I will get back to you on this. Wouldn't it be better to set bkpb->block_compression = compress_backup_block; once earlier instead of setting it that way once and setting it to BACKUP_BLOCK_COMPRESSION_OFF in two other places? Yes. If you're using VARATT_IS_COMPRESSED() to detect compression, don't you need SET_VARSIZE_COMPRESSED() in CompressBackupBlock? pglz_compress() does it for you, but the other two algorithms don't. Yes, we need SET_VARSIZE_COMPRESSED. It is present in the wrappers around snappy and LZ4, namely pg_snappy_compress and pg_LZ4_compress. But now that you've added bkpb.block_compression, you should be able to avoid VARATT_IS_COMPRESSED() altogether, unless I'm missing something. What do you think? You are right. It can be removed. Thank you, On Fri, Jul 4, 2014 at 9:35 PM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: At 2014-07-04 21:02:33 +0530, a...@2ndquadrant.com wrote: +/* + */ +static const struct config_enum_entry backup_block_compression_options[] = { Oh, I forgot to mention that the configuration setting changes are also pending. I think we had a working consensus to use full_page_compression as the name of the GUC. As I understand it, that'll accept an algorithm name as an argument while we're still experimenting, but eventually once we select an algorithm, it'll become just a boolean (and then we don't need to put algorithm information into BkpBlock any more either). -- Abhijit
Re: [HACKERS] add line number as prompt option to psql
Hi, Found two (A and B) issues with latest patch: A: -- Set prompts postgres=# \set PROMPT1 '%/[%l]%R%# ' postgres[1]=# \set PROMPT2 '%/[%l]%R%# ' postgres[1]=# postgres[1]=# select postgres[2]-# * postgres[3]-# from postgres[4]-# abc; ERROR: relation "abc" does not exist LINE 4: abc; ^ Now I used \e to edit the source. Deleted last line i.e. abc; and returned to prompt, landed at line 4, typed abc; postgres[1]=# \e postgres[4]-# abc; ERROR: relation "abc" does not exist LINE 5: abc; ^ In above steps, error message says LINE 5, whereas on the prompt abc is at line 4. postgres[1]=# select * from abc; ERROR: relation "abc" does not exist LINE 4: abc; ^ Here I again see error at line 4. Something fishy. Please investigate. Looks like a bug in the LINE numbering in the error message, not sure though. But with the prompt line number feature, the two should be in sync with each other. B: However, I see that you have removed the code changes related to INT_MAX. Why? I have set cur_line to INT_MAX - 2 and then observed that after 2 lines I start getting negative numbers. Thanks On Sun, Jul 6, 2014 at 10:48 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Fri, Jun 20, 2014 at 7:17 PM, Jeevan Chalke jeevan.cha...@enterprisedb.com wrote: Hi Sawada Masahiko, I liked this feature. So I have reviewed it. Changes are straightforward and look perfect. No issues found with make/make install/initdb/regression. However I would suggest removing the unnecessary braces at the if, as we have only one statement in it. if (++cur_line >= INT_MAX) { cur_line = 1; } Also the following looks wrong: postgres[1]=# select postgres[2]-# * postgres[3]-# from postgres[4]-# tab; a --- (0 rows) postgres[1]=# select * from tab postgres[2]-# where t > 10; ERROR: column "t" does not exist LINE 5: where t > 10; ^ Line number in ERROR is 5 which is correct. But line number in psql prompt is wrong. To get the first 4 lines I have simply used up arrow followed by an enter, for which I was expecting 5 in the psql prompt.
But NO, it was 2, which is certainly wrong. Need to handle above carefully. Thanks On Thu, Jun 12, 2014 at 10:46 PM, Sawada Masahiko sawada.m...@gmail.com wrote: Hi all, The attached WIP patch is one prompt option for psql, which shows the current line number. If the user made a syntax error with too long SQL then psql outputs a message as follows. ERROR: syntax error at or near "a" LINE 250: hoge ^ psql tells me where the syntax error occurred, but it is not kind when the SQL is too long. We can write SQL with the \e (editor) command (e.g., emacs), and we can know the line number, but it would make the terminal log dirty. It will make analyzing of the log difficult after an error has occurred. (I think that we usually use copy paste) After attaching this patch, we will be able to use the %l option as a prompting option. e.g., $ cat ~/.psqlrc \set PROMPT2 '%/[%l]%R%# ' \set PROMPT1 '%/[%l]%R%# ' $ psql -d postgres postgres[1]=# select postgres[2]-# * postgres[3]-# from postgres[4]-# hoge; The past discussion is following. http://www.postgresql.org/message-id/cafj8prc1rupk6+cha1jpxph3us_zipabdovmceex4wpbp2k...@mail.gmail.com Please give me feedback. Regards, --- Sawada Masahiko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Jeevan B Chalke Principal Software Engineer, Product Development EnterpriseDB Corporation The Enterprise PostgreSQL Company Phone: +91 20 30589500 Website: www.enterprisedb.com EnterpriseDB Blog: http://blogs.enterprisedb.com/ Follow us on Twitter: http://www.twitter.com/enterprisedb This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law.
If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message. Thank you for reviewing the patch, and sorry for the late response. I have updated this patch, and attached it. postgres[1]=# select * from tab postgres[2]-# where t > 10; ERROR: column "t" does not exist LINE 5: where t > 10; The attached patch can handle this case. Please give me feedback. Regards, --- Sawada Masahiko
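The INT_MAX concern raised in the review (point B above) comes down to an unchecked increment: once `cur_line` passes INT_MAX, a signed C int wraps to negative values, which is exactly what the reviewer observed in the prompt. A minimal Python illustration of the guard under discussion (the real code is C inside psql; names here mirror the quoted snippet):

```python
INT_MAX = 2**31 - 1  # C's INT_MAX on typical 32-bit-int platforms

# Illustrative sketch of the wrap-around guard from the quoted patch code:
# without the check, incrementing past INT_MAX overflows and %l in the
# prompt starts showing negative numbers.
def next_line_number(cur_line):
    cur_line += 1
    if cur_line >= INT_MAX:  # wrap back to 1 instead of overflowing
        cur_line = 1
    return cur_line
```

Starting from INT_MAX - 2, two increments reach the guard and reset the counter to 1 rather than going negative.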
Re: [HACKERS] [bug fix or improvement?] Correctly place DLLs for ECPG apps in bin folder
Hi MauMau, Other than my previous comments, the patch looks good to me. Thanks. Regards, Muhammad Asif Naeem On Wed, Feb 26, 2014 at 2:14 AM, Peter Eisentraut pete...@gmx.net wrote: You should be able to do this without specifically referencing the names libpq or ecpg. Look into the Makefile; if it defines SO_MAJOR_VERSION, then it's a library you are concerned with, and then do the copying or moving.
Re: [HACKERS] tab completion for setting search_path
On Tue, Jun 24, 2014 at 11:57 AM, Jeff Janes jeff.ja...@gmail.com wrote: On Mon, Jun 23, 2014 at 6:25 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Mon, Jun 23, 2014 at 3:53 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-23 13:10:34 -0400, Robert Haas wrote: On Mon, Jun 23, 2014 at 9:10 AM, Kevin Grittner kgri...@ymail.com wrote: I would be for excluding the pg_toast, pg_toast_temp_n, and pg_temp_n schemas, and including public and pg_catalog. +1. Jeff, are you willing to update the patch that way? Seems we have something several people can live with ;) I was hoping not to add a third set of filters (separate from the ones already implicit in either \dn or \dnS), but that's OK; as long as I get a doghouse I won't worry much about the color. I've included information_schema as well, in analogy to the inclusion of pg_catalog. Is there any blocker on this patch? The patch looks good to me. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing join removals for more join types
On Mon, Jul 7, 2014 at 4:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: David Rowley dgrow...@gmail.com writes: On 6 July 2014 03:20, Tom Lane t...@sss.pgh.pa.us wrote: Just to note that I've started looking at this, and I've detected a rather significant omission: there's no check that the join operator has anything to do with the subquery's grouping operator. hmm, good point. If I understand this correctly we can just ensure that the same operator is used for both the grouping and the join condition. Well, that's sort of the zero-order solution, but it doesn't work if the join operators are cross-type. I poked around to see if we didn't have some code already for that, and soon found that not only do we have such code (equality_ops_are_compatible) but actually almost this entire patch duplicates logic that already exists in optimizer/util/pathnode.c, to wit create_unique_path's subroutines query_is_distinct_for et al. So I'm thinking what this needs to turn into is an exercise in refactoring to allow that logic to be used for both purposes. Well, it seems that might just reduce the patch size a little! I currently have this half hacked up to use query_is_distinct_for, but I see there's no code that allows Consts to exist in the join condition. I had allowed for this in groupinglist_is_unique_on_restrictinfo() and I tested it worked in a regression test (which now fails). I think to fix this, all it would take would be to modify query_is_distinct_for to take a list of Nodes rather than a list of column numbers, then just add some logic that skips if it's a Const and checks it as it does now if it's a Var. Would you see a change of this kind as a valid refactor for this patch? I notice that create_unique_path is not paying attention to the question of whether the subselect's tlist contains SRFs or volatile functions. It's possible that that's a pre-existing bug. *shrug*, perhaps the logic for that is best moved into query_is_distinct_for then?
It might save a bug in the future too that way. Regards David Rowley
Re: [HACKERS] IMPORT FOREIGN SCHEMA statement
On Mon, May 26, 2014 at 12:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: Albe Laurenz laurenz.a...@wien.gv.at writes: In addition to data type mapping questions (which David already raised) I have one problem when I think of the Oracle FDW: Oracle follows the SQL standard in folding table names to upper case. So this would normally result in a lot of PostgreSQL foreign tables with upper case names, which would be odd and unpleasant. I cannot see a way out of that, but I thought I should mention it. It seems like it would often be desirable for the Oracle FDW to smash all-upper-case names to all-lower-case while importing, so that no quoting is needed on either side. I doubt though that this is so desirable that it should happen unconditionally. Between this and the type-mapping questions, it seems likely that we're going to need a way for IMPORT FOREIGN SCHEMA to accept user-supplied control options, which would in general be specific to the FDW being used. (Another thing the SQL committee failed to think about.) Is this part of the SQL standard? What is it defined to do about non-table objects? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IMPORT FOREIGN SCHEMA statement
On Monday, July 7, 2014 07:58:33, Robert Haas wrote: On Mon, May 26, 2014 at 12:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: Albe Laurenz laurenz.a...@wien.gv.at writes: In addition to data type mapping questions (which David already raised) I have one problem when I think of the Oracle FDW: Oracle follows the SQL standard in folding table names to upper case. So this would normally result in a lot of PostgreSQL foreign tables with upper case names, which would be odd and unpleasant. I cannot see a way out of that, but I thought I should mention it. It seems like it would often be desirable for the Oracle FDW to smash all-upper-case names to all-lower-case while importing, so that no quoting is needed on either side. I doubt though that this is so desirable that it should happen unconditionally. Between this and the type-mapping questions, it seems likely that we're going to need a way for IMPORT FOREIGN SCHEMA to accept user-supplied control options, which would in general be specific to the FDW being used. (Another thing the SQL committee failed to think about.) Is this part of the SQL standard? What is it defined to do about non-table objects? The OPTIONS clause is not part of the SQL standard. Regarding non-table objects, the standard only talks about tables, and nothing else. -- Ronan Dunklau http://dalibo.com - http://dalibo.org
Re: [HACKERS] RETURNING PRIMARY KEY syntax extension
On Fri, Jul 4, 2014 at 12:36 AM, Rushabh Lathia rushabh.lat...@gmail.com wrote: On Fri, Jul 4, 2014 at 7:37 AM, Tom Lane t...@sss.pgh.pa.us wrote: Tom Dunstan pg...@tomd.cc writes: On 4 July 2014 00:07, Tom Lane t...@sss.pgh.pa.us wrote: It sounds to me like designing this for JDBC's getGeneratedKeys method is a mistake. There isn't going to be any way that the driver can support that without having looked at the table's metadata for itself, and if it's going to do that then it doesn't need a crutch that only partially solves the problem anyhow. Sure it can - it can append RETURNING PRIMARY KEY and hand back a ResultSet from whatever was returned. It's CURRENTLY doing that, but it's appending RETURNING * and leaving it up to the caller of getGeneratedKeys() to work out which columns the caller is interested in. I might be mistaken, but as I read the spec for getGeneratedKeys, whether or not a column is in a primary key has got nothing to do with whether that column should be returned. It looks to me like what you're supposed to return is columns with computed default values, eg, serial columns. (Whether we should define it as being *exactly* columns created by the SERIAL macro is an interesting question, but for sure those ought to be included.) Now, the pkey might be a serial column ... but it equally well might not be; it might not have any default expression at all. And there certainly could be serial columns that weren't in the pkey. 100% agree with Tom. Well, we could have a RETURNING GENERATED COLUMNS construct, or something like that. I can certainly see the usefulness of such a thing. As a general note not necessarily related to this specific issue, I think we should be careful not to lose sight of the point of this feature, which is to allow pgsql-jdbc to more closely adhere to the JDBC specification. While it's great if this feature has general utility, if it doesn't help pgsql-jdbc, then it fails to achieve the author's original intention.
We need to make sure we're not throwing up too many obstacles in the way of better driver compliance if we want to have good drivers. As a code-level comment, I am not too fond of adding yet another value for IndexAttrBitmapKind; every value we compute in RelationGetIndexAttrBitmap adds overhead for everyone, even people not using the feature. Admittedly, it's only paid once per relcache load, but does map_primary_key_to_list() really need this information cached, or can it just look through the index list for the primary key, and then work directly from that? Also, map_primary_key_to_list() tacitly assumes that an auto-updatable view has only 1 from-list item. That seems like a good thing to check with an Assert(), just in case we should support some other case in the future. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Use unique index for longer pathkeys.
Hi. I took a quick look at this patch, more or less because nobody else did. During the last CF, I proposed to match long pathkeys against index columns while creating index paths. This worked fine, but it is difficult to make sure that all side-effects are eliminated. Finally Tom Lane suggested truncating pathkeys during generation of the pathkeys itself. So this patch comes. I found your older patch quite straightforward to understand, but the new one much more difficult to follow (but that's not saying much, I am not very familiar with the planner code in general). Do you have any references to the discussion about the side-effects that needed to be eliminated with the earlier patch? -- Abhijit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On Fri, Jul 4, 2014 at 1:15 AM, Dilip kumar dilip.ku...@huawei.com wrote: In attached patch, I have moved pgpipe, piperead functions to src/port/pipe.c If we want to consider proceeding with this approach, you should probably separate this into a refactoring patch that doesn't do anything but move code around and a feature patch that applies on top of it. (As to whether this is the right approach, I'm not sure.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster name in ps output
On Fri, Jul 4, 2014 at 10:33 AM, Tom Lane t...@sss.pgh.pa.us wrote: Vik Fearing vik.fear...@dalibo.com writes: You mean that if synchronous_standby_names is empty it automatically should be set to cluster_name? No, I mean that synchronous_standby_names should look at cluster_name first, and if it's not set then unfortunately look at application_name for backward compatibility. I think you're failing to explain yourself very well. What you really mean is that we should use cluster_name not application_name to determine the name that a standby server reports to the master. There's something to that, perhaps, but I think that the mechanism we use for doing the reporting involves the application_name parameter passed through libpq. It might be a bit painful to change that, and I'm not entirely sure it'd be less confusing. I actually prefer it the way it is now, I think. Arguably, application_name is not quite the right thing for identifying a synchronous standby, because it's intended to answer the question What *class* of connection is this? rather than Which *precise* connection is this?. But I don't think there's anything especially wrong with having a class that has only 1 member when synchronous replication is in use; indeed, in some ways it seems quite natural. That connection is after all unique. OTOH, AIUI, cluster_name is all about what shows up in the ps output, and is intended to distinguish multiple clusters running on the same server. If you're not doing that, you likely want cluster_name to be empty, but you might still want to use synchronous replication. If you are doing it, you may well want to set it to a value that distinguishes the server only from others running on the same box, like maybe the version number or port -- whereas for matching against synchronous_standby_names, we need a value that's meaningful across all related servers, but not necessarily distinguishing from other stuff on the same server.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
Andres Freund and...@2ndquadrant.com writes: I think we should rework RecordTransactionCommit() to only wait for the standby if `markXidCommitted' and not if `wrote_xlog'. There really isn't a reason to make a readonly transaction's commit wait just because it did some hot pruning. Well, see the comment that explains why the logic is like this now: * If we didn't create XLOG entries, we're done here; otherwise we * should flush those entries the same as a commit record. (An * example of a possible record that wouldn't cause an XID to be * assigned is a sequence advance record due to nextval() --- we want * to flush that to disk before reporting commit.) I agree that HOT pruning isn't a reason to make a commit wait, but nextval() is. We could perhaps add more flags that would keep track of which sorts of xlog entries justify a wait at commit, but TBH I'm skeptical of the entire proposition. Having synchronous replication on with no live slave *will* result in arbitrary hangs, and the argument that this particular case should be exempt seems a bit thin to me. The sooner the user realizes he's got a problem, the better. If read-only transactions don't show a problem, the user might not realize he's got one until he starts to wonder why autovac/autoanalyze aren't working. I think a more useful line of thought would be to see if we can't complain more loudly when we have no synchronous standby. Perhaps a WARNING: waiting forever for lack of a synchronous standby could be emitted when a transaction starts to wait. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extending constraint exclusion for implied constraints/conditions
Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes: Right now, constraint exclusion code looks only at the provided conditions. If we want to avoid the table scan based on constraints in the above example, it will need to look at the implied conditions as well. E.g. val2 > 30 AND val = val2 => val > 30. Then the constraint exclusion can see that val < 30 AND val > 30 are contradictory and infer that the result is going to be empty. We will need to add information about the transitive inferences between operators. Can we do that in PostgreSQL? Will the benefits be worth the code that needs to be added? I doubt it. The extra code isn't the problem so much, it's the extra planning cycles spent trying to make proofs that will usually fail. What you propose will create a combinatorial explosion in the number of proof paths to be considered. I can see some more benefits. We can use it to eliminate joins where the constraints on joining relations may cause the join to be empty e.g. ... and applying constraint exclusion to join relations would make that problem even worse. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
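The transitive inference being proposed can be shown with a toy model. The sketch below is not planner code; it assumes simple integer range constraints of the form (column, operator, constant) with operators limited to '<' and '>', and only illustrates why "val = val2" plus "val < 30" and "val2 > 30" lets one prove the result set is empty:

```python
# Toy model of constraint exclusion with transitive inference over
# equality join conditions. All names and the representation are
# illustrative assumptions, not PostgreSQL's internal structures.
def implied_constraints(constraints, equalities):
    # For every equality a = b, a range constraint on one side is also
    # implied for the other side.
    implied = list(constraints)
    for a, b in equalities:
        for col, op, k in constraints:
            if col == b:
                implied.append((a, op, k))
            if col == a:
                implied.append((b, op, k))
    return implied

def provably_empty(constraints, equalities):
    implied = implied_constraints(constraints, equalities)
    # Contradiction over integers: col < k and col > k2 is unsatisfiable
    # when k2 >= k - 1 (no integer strictly between k2 and k).
    for col, op, k in implied:
        for col2, op2, k2 in implied:
            if col == col2 and op == '<' and op2 == '>' and k2 >= k - 1:
                return True
    return False
```

With CHECK (val < 30), CHECK (val2 > 30), and the condition val = val2, the derived pair val < 30 AND val > 30 is contradictory, so the scan could be skipped. Tom's objection is about the cost of searching for such proofs in the common case where none exists.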
Re: [HACKERS] Pg_upgrade and toast tables bug discovered
On Fri, Jul 4, 2014 at 11:12 PM, Bruce Momjian br...@momjian.us wrote: On Fri, Jul 4, 2014 at 12:01:37AM -0400, Bruce Momjian wrote: The most robust, but not trivial, approach seems to be to prevent toast table creation if there wasn't a set_next_toast_pg_class_oid(). Then, after all relations are created, iterate over all pg_class entries that possibly need toast tables and recheck if they now might need one. Wow, that is going to be kind of odd in that there really isn't a good way to create toast tables except perhaps to add a dummy TEXT column and remove it. There also isn't an easy way to not create a toast table, but also find out that one was needed. I suppose we would have to insert some dummy value in the toast pg_class column and come back later to clean it up. I am wondering what the probability is of having a table that didn't need a toast table in the old cluster, and needed one in the new cluster, and there being an oid collision. I think the big question is whether we want to go down that route. Here is an updated patch. It was a little tricky because if the mismatched non-toast table is after the last old relation, you need to test differently and emit a different error message as there is no old relation to complain about. As far as the reusing of oids, we don't set the oid counter until after the restore, so any new unmatched toast table would be given a very low oid. Since we restore in oid order, for an oid to be assigned that was used in the old cluster, it would have to be a very early relation, so I think that reduces the odds considerably. I am not inclined to do anything more to avoid this until I see an actual error report --- trying to address it might be invasive and introduce new errors. That sounds pretty shaky from where I sit. I wonder if the pg_upgrade_support module should have a function create_toast_table_if_needed(regclass) or something like that. Or maybe just create_any_missing_toast_tables(), iterating over all relations.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
On 2014-07-07 09:57:20 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: I think we should rework RecordTransactionCommit() to only wait for the standby if `markXidCommitted' and not if `wrote_xlog'. There really isn't a reason to make a readonly transaction's commit wait just because it did some hot pruning. Well, see the comment that explains why the logic is like this now: * If we didn't create XLOG entries, we're done here; otherwise we * should flush those entries the same as a commit record. (An * example of a possible record that wouldn't cause an XID to be * assigned is a sequence advance record due to nextval() --- we want * to flush that to disk before reporting commit.) I think we should 'simply' make sequences assign a toplevel xid - then we can get rid of that special case in RecordTransactionCommit(). And I think the performance benefit of not having to wait on XLogFlush() for readonly xacts due to hot prunes far outweighs the decrease due to the xid assignment/commit record. I don't think that nextval()s are called overly much without a later xid assigning statement. I agree that HOT pruning isn't a reason to make a commit wait, but nextval() is. Agreed. We could perhaps add more flags that would keep track of which sorts of xlog entries justify a wait at commit, but TBH I'm skeptical of the entire proposition. Having synchronous replication on with no live slave *will* result in arbitrary hangs, and the argument that this particular case should be exempt seems a bit thin to me. The sooner the user realizes he's got a problem, the better. If read-only transactions don't show a problem, the user might not realize he's got one until he starts to wonder why autovac/autoanalyze aren't working. Well, the user might just want to log in to diagnose the problem. If he can't even login to see pg_stat_replication it's a pretty screwed up situation. 
I think a more useful line of thought would be to see if we can't complain more loudly when we have no synchronous standby. Perhaps a WARNING: waiting forever for lack of a synchronous standby could be emitted when a transaction starts to wait. In the OP's case the session wasn't even started - so proper feedback isn't that easy... We could special case that by forcing s_c=off until the session started properly. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
Andres Freund and...@2ndquadrant.com writes: On 2014-07-07 09:57:20 -0400, Tom Lane wrote: Well, see the comment that explains why the logic is like this now: I think we should 'simply' make sequences assign a toplevel xid - then we can get rid of that special case in RecordTransactionCommit(). And I think the performance benefit of not having to wait on XLogFlush() for readonly xacts due to hot prunes far outweighs the decrease due to the xid assignment/commit record. I don't think that nextval()s are called overly much without a later xid assigning statement. Yeah, that could well be true. I'm not sure if there are any other cases where we have non-xid-assigning operations that are considered part of what has to be flushed before reporting commit; if there are not, I'd be okay with changing nextval() this way. I think a more useful line of thought would be to see if we can't complain more loudly when we have no synchronous standby. Perhaps a WARNING: waiting forever for lack of a synchronous standby could be emitted when a transaction starts to wait. In the OP's case the session wasn't even started - so proper feedback isn't that easy... Perhaps I'm wrong, but I think a WARNING emitted here would be seen in psql even though we're still in InitPostgres. If it isn't, we have a problem there anyhow, IMO. We could special case that by forcing s_c=off until the session started properly. Ugh. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
On Tue, Jul 8, 2014 at 1:06 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-07-07 09:57:20 -0400, Tom Lane wrote: Well, see the comment that explains why the logic is like this now: I think we should 'simply' make sequences assign a toplevel xid - then we can get rid of that special case in RecordTransactionCommit(). And I think the performance benefit of not having to wait on XLogFlush() for readonly xacts due to hot prunes far outweighs the decrease due to the xid assignment/commit record. I don't think that nextval()s are called overly much without a later xid assigning statement. Yeah, that could well be true. I'm not sure if there are any other cases where we have non-xid-assigning operations that are considered part of what has to be flushed before reporting commit; Maybe pg_switch_xlog(). if there are not, I'd be okay with changing nextval() this way. +1 Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing join removals for more join types
David Rowley dgrowle...@gmail.com writes: On Mon, Jul 7, 2014 at 4:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: I poked around to see if we didn't have some code already for that, and soon found that not only do we have such code (equality_ops_are_compatible) but actually almost this entire patch duplicates logic that already exists in optimizer/util/pathnode.c, to wit create_unique_path's subroutines query_is_distinct_for et al. So I'm thinking what this needs to turn into is an exercise in refactoring to allow that logic to be used for both purposes. Well, it seems that might just reduce the patch size a little! I currently have this half hacked up to use query_is_distinct_for, but I see there's no code that allows Const's to exist in the join condition. I had allowed for this in groupinglist_is_unique_on_restrictinfo() and I tested it worked in a regression test (which now fails). I think to fix this, all it would take would be to modify query_is_distinct_for to take a list of Node's rather than a list of column numbers then just add some logic that skips if it's a Const and checks it as it does now if it's a Var Would you see a change of this kind a valid refactor for this patch? I'm a bit skeptical as to whether testing for that case is actually worth any extra complexity. Do you have a compelling use-case? But anyway, if we do want to allow it, why does it take any more than adding a check for Consts to the loops in query_is_distinct_for? It's the targetlist entries where we'd want to allow Consts, not the join conditions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
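To make the Const case concrete, here is a sketch of my own (not an example from the patch) of the sort of query David's check would cover: the inner side is only provably unique over (x, y), but the join clause pins y to a constant, so a join on x alone still matches at most one inner row.

```sql
-- Hypothetical illustration of the Const-in-join-condition case.
CREATE TABLE a (x int);
CREATE TABLE b (x int, y int, UNIQUE (x, y));

-- The constant qual b.y = 1 effectively reduces the (x, y) uniqueness
-- to uniqueness on x for this join, so the LEFT JOIN could be removed
-- if no columns of b are referenced above the join.
SELECT a.x
FROM a LEFT JOIN b ON a.x = b.x AND b.y = 1;
```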
Re: [HACKERS] Cluster name in ps output
Tom Lane wrote: Also, -1 for adding another log_line_prefix escape. If you're routing multiple clusters' logging to the same place (which is already a bit unlikely IMO), you can put distinguishing strings in log_line_prefix already. And it's not like we've got an infinite supply of letters for those escapes. FWIW if we find ourselves short on log_line_prefix letters, we can always invent more verbose notation -- for instance we could have an escape for each GUC var such as %{config:port} for the port number, and so on. Probably this will be mostly useless for most GUC params, but it might come in handy for a few of them. And of course we could have more prefixes apart from config:. Not that this affects the current patch in any way. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
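Álvaro's idea, written out as a hypothetical postgresql.conf fragment (the %{...} notation is only a proposal in this thread, not syntax the server accepts; %p, %q, %u and %d are the existing escapes):

```
# Hypothetical syntax -- not implemented:
log_line_prefix = '%{config:port} [%p] %q%u@%d '
```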
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
On 2014-07-07 12:06:14 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-07-07 09:57:20 -0400, Tom Lane wrote: Well, see the comment that explains why the logic is like this now: I think we should 'simply' make sequences assign a toplevel xid - then we can get rid of that special case in RecordTransactionCommit(). And I think the performance benefit of not having to wait on XLogFlush() for readonly xacts due to hot prunes far outweighs the decrease due to the xid assignment/commit record. I don't think that nextval()s are called overly much without a later xid assigning statement. Yeah, that could well be true. I'm not sure if there are any other cases where we have non-xid-assigning operations that are considered part of what has to be flushed before reporting commit; if there are not, I'd be okay with changing nextval() this way. I'm not aware of any adhoc, but I think to actually change it someone would have to iterate over all wal record types to make sure. I think a more useful line of thought would be to see if we can't complain more loudly when we have no synchronous standby. Perhaps a WARNING: waiting forever for lack of a synchronous standby could be emitted when a transaction starts to wait. In the OP's case the session wasn't even started - so proper feedback isn't that easy... Perhaps I'm wrong, but I think a WARNING emitted here would be seen in psql even though we're still in InitPostgres. Yes, it is visible. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pg_upgrade and toast tables bug discovered
On Fri, Jul 4, 2014 at 11:12:58PM -0400, Bruce Momjian wrote: On Fri, Jul 4, 2014 at 12:01:37AM -0400, Bruce Momjian wrote: The most robust, but not trivial, approach seems to be to prevent toast table creation if there wasn't a set_next_toast_pg_class_oid(). Then, after all relations are created, iterate over all pg_class entries that possibly need toast tables and recheck if they now might need one. Wow, that is going to be kind of odd in that there really isn't a good way to create toast tables except perhaps add a dummy TEXT column and remove it. There also isn't an easy way to not create a toast table, but also find out that one was needed. I suppose we would have to insert some dummy value in the toast pg_class column and come back later to clean it up. I am wondering what the probability is of having a table that didn't need a toast table in the old cluster, and needed one in the new cluster, and there being an oid collision. I think the big question is whether we want to go down that route. Here is an updated patch. It was a little tricky because if the mismatched non-toast table is after the last old relation, you need to test differently and emit a different error message as there is no old relation to complain about. As far as the reusing of oids, we don't set the oid counter until after the restore, so any new unmatched toast table would be given a very low oid. Since we restore in oid order, for an oid to be assigned that was used in the old cluster, it would have to be a very early relation, so I think that reduces the odds considerably. I am not inclined to do anything more to avoid this until I see an actual error report --- trying to address it might be invasive and introduce new errors. Patch applied back through 9.2. 9.1 and earlier had code that was different enough that I thought it would cause too many problems, and I doubt many users are doing major upgrades to 9.1, and the bug is rare.
-- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pg_upgrade and toast tables bug discovered
On Mon, Jul 7, 2014 at 11:24:51AM -0400, Robert Haas wrote: As far as the reusing of oids, we don't set the oid counter until after the restore, so any new unmatched toast table would given a very low oid. Since we restore in oid order, for an oid to be assigned that was used in the old cluster, it would have to be a very early relation, so I think that reduces the odds considerably. I am not inclined to do anything more to avoid this until I see an actual error report --- trying to address it might be invasive and introduce new errors. That sounds pretty shaky from where I sit. I wonder if the pg_upgrade_support module should have a function create_toast_table_if_needed(regclass) or something like that. Or maybe just create_any_missing_toast_tables(), iterating over all relations. Uh, I guess we could write some code that iterates over all tables and finds the tables that should have TOAST tables, but don't (because binary-upgrade backend mode suppressed their creation), and adds them. However, that would be a lot of code and might be risky to backpatch. The error is so rare I am not sure it is worth it. I tried to create the failure case and it was very difficult, requiring me to create the problem table first, then some dummy stuff to get the offsets right so the oid would collide. Based on the rareness of the bug, I am not sure it is worth it, but if it is, it would be something only for 9.4 (maybe) and 9.5, not something to backpatch. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
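For reference, a rough way to spot the situation being discussed -- tables that contain a varlena (potentially toastable) column but have no TOAST table -- is a catalog query along these lines. This is a simplification of my own: the server's real toast-needed test also considers estimated tuple width and per-column storage settings, not just the presence of a variable-length column.

```sql
SELECT c.oid::regclass AS tbl
FROM pg_class c
WHERE c.relkind = 'r'
  AND c.reltoastrelid = 0            -- no TOAST table assigned
  AND EXISTS (SELECT 1
              FROM pg_attribute a
              WHERE a.attrelid = c.oid
                AND a.attnum > 0
                AND NOT a.attisdropped
                AND a.attlen = -1);  -- variable-length column present
```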
Re: [HACKERS] RLS Design
On Thu, Jul 3, 2014 at 1:14 AM, Stephen Frost sfr...@snowman.net wrote: Alright, apologies for it being a bit later than intended, but here's what I've come up with thus far. -- policies defined at a table scope -- allows using the same policy name for different tables -- with quals appropriate for each table ALTER TABLE t1 ADD POLICY p1 USING p1_quals; ALTER TABLE t1 ADD POLICY p2 USING p2_quals; -- used to drop a policy definition from a table ALTER TABLE t1 DROP POLICY p1; -- cascade required when references exist for the policy -- from roles ALTER TABLE t1 DROP POLICY p1 CASCADE; ALTER TABLE t1 ALTER POLICY p1 USING new_quals; -- Controls if any RLS is applied to this table or not -- If enabled, all users must access through some policy ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY; -- Associates roles to policies ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING p1; ALTER TABLE table_name REVOKE ROW ACCESS FROM role_name USING p1; If you're going to have predicates be table-level and access grants be table-level, then what's the value in having policies? You could just do: ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING quals; As I see it, the only value in having policies as separate objects is that you can then, by granting access to the policy, give a particular user a bundle of rights rather than having to grant each right individually. But with this design, you've got to create the policy, then add the quals to it for each table, and then you still have to give access individually for every row, table combination, so what value is the policy object itself providing? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
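Robert's simplification, sketched as a single hypothetical statement (none of this syntax exists; it is the design alternative being debated, with an invented role name and qual):

```sql
-- Quals attached directly to the (table, role) grant, with no
-- intermediate policy object at all:
ALTER TABLE t1 GRANT ROW ACCESS TO alice USING (owner = current_user);
```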
Re: [HACKERS] Extending constraint exclusion for implied constraints/conditions
On Mon, Jul 7, 2014 at 3:07 PM, Tom Lane t...@sss.pgh.pa.us wrote: I doubt it. The extra code isn't the problem so much, it's the extra planning cycles spent trying to make proofs that will usually fail. What you propose will create a combinatorial explosion in the number of proof paths to be considered. Well, not necessarily. You only need to consider constraints on matching columns and only when there's a join column on those columns. So you could imagine, for example, sorting all the constraints by the eclass for the non-const side of their expression, then going through them linearly to see where you have multiple constraints on the same eclass. For what it's worth I think there is a case where this is a common optimization. When you have multiple tables partitioned the same way. Then you ideally want to be able to turn that from an join of multiple appends into an append of multiple joins. This is even more important when it comes to a parallelizing executor since it lets you do the joins in parallel. However to get from here to there I guess you would need to turn the join of the appends into NxM joins of every pair of subscans and then figure out which ones to exclude. That would be pretty nuts. To do it reasonably we probably need the partitioning infrastructure we've been talking about where Postgres would know what the partitioning key is and the order and range of the partitions so it can directly generate the matching subjoins in less than n^2 time. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
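A minimal sketch (my example) of the identically-partitioned case Greg describes, using the inheritance-style partitioning available at the time. Ideally the planner would rewrite the join of the two appends as an append of the two matching child joins, (a1 JOIN b1) UNION ALL (a2 JOIN b2), rather than considering all four child pairs:

```sql
CREATE TABLE a (k int);
CREATE TABLE a1 (CHECK (k <  100)) INHERITS (a);
CREATE TABLE a2 (CHECK (k >= 100)) INHERITS (a);

CREATE TABLE b (k int);
CREATE TABLE b1 (CHECK (k <  100)) INHERITS (b);
CREATE TABLE b2 (CHECK (k >= 100)) INHERITS (b);

-- a1 JOIN b2 and a2 JOIN b1 are provably empty from the CHECK
-- constraints plus a.k = b.k -- exactly the transitive inference
-- discussed upthread.
SELECT * FROM a JOIN b ON a.k = b.k;
```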
Re: [HACKERS] pg_xlogdump --stats
At 2014-07-07 09:48:44 +0200, and...@2ndquadrant.com wrote: I'd suggest only defining INT64_MODIFIER here and build INT64_FORMAT, UINT64_FORMAT ontop, in c.h. Oh, I see. I'm sorry, I misread your earlier suggestion. Regenerated patches attached. Is this what you had in mind? -- Abhijit diff --git a/src/include/c.h b/src/include/c.h index a48a57a..999f297 100644 --- a/src/include/c.h +++ b/src/include/c.h @@ -864,6 +864,9 @@ typedef NameData *Name; * */ +#define INT64_FORMAT % INT64_MODIFIER d +#define UINT64_FORMAT % INT64_MODIFIER u + /* msb for char */ #define HIGHBIT (0x80) #define IS_HIGHBIT_SET(ch) ((unsigned char)(ch) HIGHBIT) diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in index 2a40d61..c2e01fd 100644 --- a/src/include/pg_config.h.in +++ b/src/include/pg_config.h.in @@ -650,8 +650,8 @@ /* Define to 1 if your compiler understands __VA_ARGS__ in macros. */ #undef HAVE__VA_ARGS -/* Define to the appropriate snprintf format for 64-bit ints. */ -#undef INT64_FORMAT +/* Define to the appropriate snprintf length modifier for 64-bit ints. */ +#undef INT64_MODIFIER /* Define to 1 if `locale_t' requires xlocale.h. */ #undef LOCALE_T_IN_XLOCALE @@ -741,9 +741,6 @@ /* Define to 1 if your sys/time.h declares `struct tm'. */ #undef TM_IN_SYS_TIME -/* Define to the appropriate snprintf format for unsigned 64-bit ints. */ -#undef UINT64_FORMAT - /* Define to 1 to build with assertion checks. (--enable-cassert) */ #undef USE_ASSERT_CHECKING diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.win32 index 1c9cd82..a238a72 100644 --- a/src/include/pg_config.h.win32 +++ b/src/include/pg_config.h.win32 @@ -529,8 +529,8 @@ /* Define to 1 if your compiler understands __VA_ARGS__ in macros. */ #define HAVE__VA_ARGS 1 -/* Define to the appropriate snprintf format for 64-bit ints, if any. */ -#define INT64_FORMAT %lld +/* Define to the appropriate snprintf length modifier for 64-bit ints. 
*/ +#undef INT64_MODIFIER /* Define to 1 if `locale_t' requires xlocale.h. */ /* #undef LOCALE_T_IN_XLOCALE */ @@ -601,10 +601,6 @@ /* Define to 1 if your sys/time.h declares `struct tm'. */ /* #undef TM_IN_SYS_TIME */ -/* Define to the appropriate snprintf format for unsigned 64-bit ints, if any. - */ -#define UINT64_FORMAT %llu - /* Define to 1 to build with assertion checks. (--enable-cassert) */ /* #undef USE_ASSERT_CHECKING */ diff --git a/configure.in b/configure.in index c938a5d..7750c30 100644 --- a/configure.in +++ b/configure.in @@ -1636,34 +1636,29 @@ fi # If we found long int is 64 bits, assume snprintf handles it. If # we found we need to use long long int, better check. We cope with # snprintfs that use %lld, %qd, or %I64d as the format. If none of these -# work, fall back to our own snprintf emulation (which we know uses %lld). +# works, fall back to our own snprintf emulation (which we know uses %lld). if test $HAVE_LONG_LONG_INT_64 = yes ; then if test $pgac_need_repl_snprintf = no; then -PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT -if test $LONG_LONG_INT_FORMAT = ; then +PGAC_FUNC_SNPRINTF_LONG_LONG_INT_MODIFIER +if test $LONG_LONG_INT_MODIFIER = ; then # Force usage of our own snprintf, since system snprintf is broken pgac_need_repl_snprintf=yes - LONG_LONG_INT_FORMAT='%lld' + LONG_LONG_INT_MODIFIER='ll' fi else # Here if we previously decided we needed to use our own snprintf -LONG_LONG_INT_FORMAT='%lld' +LONG_LONG_INT_MODIFIER='ll' fi - LONG_LONG_UINT_FORMAT=`echo $LONG_LONG_INT_FORMAT | sed 's/d$/u/'` - INT64_FORMAT=\$LONG_LONG_INT_FORMAT\ - UINT64_FORMAT=\$LONG_LONG_UINT_FORMAT\ else # Here if we are not using 'long long int' at all - INT64_FORMAT='%ld' - UINT64_FORMAT='%lu' + LONG_LONG_INT_MODIFIER='l' fi -AC_DEFINE_UNQUOTED(INT64_FORMAT, $INT64_FORMAT, - [Define to the appropriate snprintf format for 64-bit ints.]) +INT64_MODIFIER=\$LONG_LONG_INT_MODIFIER\ -AC_DEFINE_UNQUOTED(UINT64_FORMAT, $UINT64_FORMAT, - [Define to the appropriate snprintf 
format for unsigned 64-bit ints.]) +AC_DEFINE_UNQUOTED(INT64_MODIFIER, $INT64_MODIFIER, + [Define to the appropriate snprintf length modifier for 64-bit ints.]) # Also force use of our snprintf if the system's doesn't support the %z flag. if test $pgac_need_repl_snprintf = no; then diff --git a/config/c-library.m4 b/config/c-library.m4 index 8f45010..4821a61 100644 --- a/config/c-library.m4 +++ b/config/c-library.m4 @@ -221,22 +221,22 @@ HAVE_POSIX_SIGNALS=$pgac_cv_func_posix_signals AC_SUBST(HAVE_POSIX_SIGNALS)])# PGAC_FUNC_POSIX_SIGNALS -# PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT +# PGAC_FUNC_SNPRINTF_LONG_LONG_INT_MODIFIER # --- -# Determine which format snprintf uses for long long int. We handle -# %lld, %qd, %I64d. The result is
[HACKERS] things I learned from working on memory allocation
I wrote previously about my efforts to create a new memory allocator (then called sb_alloc, now renamed to BlockAllocatorContext in my git branch for consistency with existing identifiers) for PostgreSQL. To make a long story short, many of the basic concepts behind that patch still seem sound to me, but significant problems remain, which account for it not having been submitted to the current CommitFest nor attached to this email. Working in this area has led me to a few conclusions - comments welcome. 1. I tried to write a single allocator which would be better than aset.c in two ways: first, by allowing allocation from dynamic shared memory; and second, by being more memory-efficient than aset.c. Heikki told me at PGCon that he thought that was a bad idea, and I now think he was right. Allocating from dynamic shared memory requires the use of relative rather than absolute pointers (because the DSM could be mapped at different addresses in different processes) and, if concurrent allocating and freeing is to be supported, locking. Although neither of these things requires very many extra instructions, the common path for aset.c is extremely short, and we haven't got those instructions to spare. Code like if (lock != NULL) LWLockAcquire(lock, LW_EXCLUSIVE) is quite expensive even if the branch is never taken, apparently because being prepared to call a function at that point requires storing more stuff in our stack frame, and extra instructions to save and restore registers are painfully expensive on allocation-heavy workloads. On PPC64, to match aset.c's performance, a palloc/pfree cycle must run in ~120 instructions, ~80 cycles, which doesn't leave much room for fluff. 2. After ripping the relative-pointer and locking stuff out of my allocator, I came up with something which performs about like aset.c on allocation, but with significantly better memory efficiency on small allocations. 
REINDEX pgbench_accounts_pkey saves about 22%, because the SortTuple array saves nothing but the individual tuples take only half as much space, by avoiding the StandardChunkHeader overhead. This seems like a savings worth pursuing, if we can figure out how to get it. Unfortunately, there's some incremental overhead in pfree, amounting to a single test of a global variable, even when the new allocator is never used, which is measurable on a microbenchmark; I don't remember the exact numbers off-hand. When the new allocator *is* used, pfree gets a lot slower - mostly because one of the data structures used by the new allocator suffers occasional cache misses during free operations. I think there might be an argument that some hit on pfree would be worth taking in exchange for better space-efficiency, because we mostly free contexts by resetting them anyway; but I haven't yet managed to make that hit small enough to feel especially good about it. 3. The current MemoryContext abstraction layer is inadequate for almost everything one might want to do. The idea of a context that allows only allocation, and ignores requests to free memory, has been discussed more than once. Such an allocator could skip the power-of-two rounding that aset.c does, but it couldn't omit the chunk header, which means that in practice it would save no memory at all for cases like the one mentioned above (REINDEX pgbench_accounts_pkey). While there might be some benefit in other cases, without getting rid of or at least reducing the size of the chunk-header, I expect that the benefits will be too narrow to make this approach worth pursuing. The chunk-header requirement is also a killer for DSM, again because segments can be mapped at different addresses in different backends. I'm inclined to think that if we can't find a way to get rid of the chunk-header requirement, we ought to consider ripping out the whole abstraction layer as a performance-wasting exercise in futility. 4. 
The MemoryContext layer embeds assumptions not only about the layout of memory, but the performance of various operations. For example, GetMemoryContextSpace() is assumed to be cheap, so there's no interface like void *MemoryContextAlloc(Size request_size, Size *chunk_size) or Size MemoryContextFree(Size chunk_size), but those would be significantly more efficient for my new allocator. Possibly better still would be to have the context itself track utilization and soft-fail when we run out of space; even for aset.c, it makes little sense to refuse to accept another tuple when the AllocSet has a suitable object on the freelist. That's probably not a critical flaw because, since tuples being sorted are likely all about the same size, the waste may be little or nothing. Other allocators might have different space-management strategies, though, where this matters more. 5. It's tempting to look at other ways of solving the parallel sort problem that don't need an allocator - perhaps by simply packing all the tuples into a DSM one after the next. But this is not
Re: [HACKERS] Pg_upgrade and toast tables bug discovered
On Mon, Jul 7, 2014 at 01:44:59PM -0400, Bruce Momjian wrote: On Mon, Jul 7, 2014 at 11:24:51AM -0400, Robert Haas wrote: As far as the reusing of oids, we don't set the oid counter until after the restore, so any new unmatched toast table would given a very low oid. Since we restore in oid order, for an oid to be assigned that was used in the old cluster, it would have to be a very early relation, so I think that reduces the odds considerably. I am not inclined to do anything more to avoid this until I see an actual error report --- trying to address it might be invasive and introduce new errors. That sounds pretty shaky from where I sit. I wonder if the pg_upgrade_support module should have a function create_toast_table_if_needed(regclass) or something like that. Or maybe just create_any_missing_toast_tables(), iterating over all relations. Uh, I guess we could write some code that iterates over all tables and finds the tables that should have TOAST tables, but don't (because binary-upgrade backend mode suppressed their creation), and adds them. However, that would be a lot of code and might be risky to backpatch. The error is so rare I am not sure it is worth it. I tried to create the failure case and it was very difficult, requiring me to create the problem table first, then some dummy stuff to get the offsets right so the oid would collide. Based on the rareness of the bug, I am not sure it is worth it, but if it is, it would be something only for 9.4 (maybe) and 9.5, not something to backpatch. Another idea would be to renumber empty toast tables that conflict during new toast table creation, when in binary upgrade mode. Since the files are always empty in binary upgrade mode, you could just create a new toast table, repoint the old table to use (because it didn't ask for a specific toast table oid because it didn't need one), and then use that one for the table that actually requested the oid. 
However, if one is a heap (zero size), and one is an index (8k size), it wouldn't work and you would have to recreate the file system file too. That seems a lot more localized than the other approaches. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] things I learned from working on memory allocation
On Mon, Jul 7, 2014 at 12:57 PM, Robert Haas robertmh...@gmail.com wrote: 5. It's tempting to look at other ways of solving the parallel sort problem that don't need an allocator - perhaps by simply packing all the tuples into a DSM one after the next. But this is not easy to do, or at least it's not easy to do efficiently. Tuples enter tuplesort.c through one of the tuplesort_put*() functions, most of which end up calling one of the copytup_*() functions. But you can't easily just change those functions to stick the tuple someplace else, because they don't necessarily get the address of the space to be used from palloc directly. In particular, copytup_heap() calls ExecCopySlotMinimalTuple(), and copytup_cluster() calls heap_copytuple(). heap_copytuple() is simple enough that you might be able to finagle a new API that would make it work, but I'm not sure what we could really do about ExecCopySlotMinimalTuple() except call it and then copy the result. Perhaps that'd be OK for a first version. Perhaps. If my experience with sort support for text is anything to go on, the cost of copying over the tuples isn't really that high relative to the cost of performing the sort, particularly when there are many tuples to sort. OTOH, I'd be concerned about the cost of main memory accesses for pass-by-reference types during parallel tuple sorts in particular. The same concern applies to cases where the tuple proper must be accessed, although presumably that occurs less frequently. The LLVM project's new libc++ library passes remark on a similar issue in their rationale for yet another C++ standard library: For example, it is generally accepted that building std::string using the short string optimization instead of using Copy On Write (COW) is a superior approach for multicore machines It seems likely that they're mostly referencing the apparent trend of memory bandwidth per core stagnation [1], [2]. 
To get the real benefit of a parallel sort, we must do anything we can to avoid having cores/workers remain idle during the sort while waiting on memory access. I believe that that's the bigger problem. [1] http://www.admin-magazine.com/HPC/Articles/Finding-Memory-Bottlenecks-with-Stream [2] https://software.intel.com/en-us/articles/detecting-memory-bandwidth-saturation-in-threaded-applications -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump reporting version of server & pg_dump as comments in the output
Jeevan Chalke jeevan.cha...@enterprisedb.com writes: Can't we remove this else all-together and print extra line unconditionally? Also can we remove curly braces in if and else near these code changes ? (Attached patch along those lines) I committed this with slight further rearrangement to create what seems like better vertical whitespace choices in verbose vs non-verbose cases, to wit -- -- PostgreSQL database dump -- -- Dumped from database version 9.5devel -- Dumped by pg_dump version 9.5devel SET statement_timeout = 0; ... vs -- -- PostgreSQL database dump -- -- Dumped from database version 9.5devel -- Dumped by pg_dump version 9.5devel -- Started on 2014-07-07 19:05:54 EDT SET statement_timeout = 0; ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions
Abhijit Menon-Sen a...@2ndquadrant.com writes: I can confirm that it works fine. I have attached here a very slightly tweaked version of the patch (removed trailing whitespace, and changed some comment text). I'm marking this ready for committer. I don't know enough about XML/XPATH to know if this is a good idea or not, but I did go read the documentation of xmlCopyNode(), and I notice it says the second argument is extended: if 1 do a recursive copy (properties, namespaces and children when applicable) if 2 copy properties and namespaces (when applicable) Do we really want it to copy children? Perhaps the value should be 2? (I have no idea, I'm just raising the question.) I also notice that it says Returns: a new #xmlNodePtr, or NULL in case of error. and the patch is failing to cover the error case. Most likely, that would represent out-of-memory, so it definitely seems to be something we should account for. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
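For context, the kind of query the patch affects (my example, using xpath()'s real three-argument form where the third argument maps namespace prefixes to URIs): the matched node inherits a namespace declared on an ancestor, and the point of the fix is that the copied node in the result carries that declaration with it.

```sql
SELECT xpath('/n:root/n:child',
             '<root xmlns="http://example.com/ns"><child/></root>'::xml,
             ARRAY[ARRAY['n', 'http://example.com/ns']]);
```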
Re: [HACKERS] [PATCH] Improve bytea error messages
On Sun, Jul 6, 2014 at 4:17 AM, Craig Ringer cr...@2ndquadrant.com wrote: After someone reported somewhat less than informative errors in bytea decoding (http://stackoverflow.com/q/24588866/398670) I thought I'd put together a quick patch to improve the errors here. The first two changes seem fine from here, but I think the use of parentheses in the third one likely violates our message style guidelines. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Built-in binning functions
Petr Jelinek p...@2ndquadrant.com writes: here is a patch implementing varwidth_bucket (naming is up for discussion) function which does binning with variable bucket width. I didn't see any discussion of the naming question in this thread. I'd like to propose that it should be just width_bucket(); we can easily determine which function is meant, considering that the SQL-spec variants don't take arrays and don't even have the same number of actual arguments. Also, the set of functions provided still needs more thought, because the resolution of overloaded functions doesn't really work as nicely as all that. I illustrate: regression=# create function myf(int8) returns int as 'select 0' language sql; CREATE FUNCTION regression=# create function myf(float8) returns int as 'select 1' language sql; CREATE FUNCTION regression=# create function myf(anyelement) returns int as 'select 2' language sql; CREATE FUNCTION regression=# select myf(1); myf - 1 (1 row) So given plain integer arguments, we'll invoke the float8 version not the int8 version, which may be undesirable. The same for int2 arguments. We could fix that by adding bespoke int4 and maybe int2 variants, but TBH, I'm not sure that the specific-type functions are worth the trouble. Maybe we should just have one generic function, and take the trouble to optimize its array-subscripting calculations for either fixed-length or variable-length array elements? Have you got performance measurements demonstrating that multiple implementations really buy enough to justify the extra code? Also, I'm not convinced by this business of throwing an error for a NULL array element. Per spec, null arguments to width_bucket() produce a null result, not an error --- shouldn't this flavor act similarly? In any case I think the test needs to use array_contains_nulls() not just ARR_HASNULL. Lastly, the spec defines behaviors for width_bucket that allow either ascending or descending buckets. 
We could possibly do something similar in these functions by initially comparing the two endpoint elements to see which one is larger, and then reversing the sense of all the comparisons if the first one is larger. I'm not sure if that's worth the trouble or not, but if the SQL committee thought descending bucket numbering was worthwhile, maybe it's worthwhile here too. regards, tom lane
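As a rough sketch of the generic variable-width binning semantics under discussion — one function taking an array of bucket bounds, with the comparison sense reversed for a descending bounds list — the following Python is illustrative only (it is not the proposed C implementation, and the exact descending-bucket numbering here is an assumption, not something the thread settled):

```python
import bisect

def width_bucket(operand, thresholds):
    """Return a 1-based bucket number for operand given a sorted list of
    bucket lower bounds: values below the first bound fall in bucket 0,
    values at or above the last bound in bucket len(thresholds).
    If the list is descending, the comparison sense is reversed, per the
    suggestion of mirroring the SQL-spec descending behavior."""
    if len(thresholds) >= 2 and thresholds[0] > thresholds[-1]:
        # descending bounds: binary-search the reversed list and
        # count buckets from the other end
        asc = list(reversed(thresholds))
        return len(thresholds) - bisect.bisect_right(asc, operand)
    # ascending bounds: bucket i+1 covers thresholds[i] <= x < thresholds[i+1]
    return bisect.bisect_right(thresholds, operand)
```

A single generic function like this, with the subscripting cost paid once per call, corresponds to the "one generic function" alternative Tom raises above.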
Re: [HACKERS] tab completion for setting search_path
On Mon, Jul 7, 2014 at 8:51 PM, Fujii Masao masao.fu...@gmail.com wrote: On Tue, Jun 24, 2014 at 11:57 AM, Jeff Janes jeff.ja...@gmail.com wrote: On Mon, Jun 23, 2014 at 6:25 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Mon, Jun 23, 2014 at 3:53 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-23 13:10:34 -0400, Robert Haas wrote: On Mon, Jun 23, 2014 at 9:10 AM, Kevin Grittner kgri...@ymail.com wrote: I would be for excluding the pg_toast, pg_toast_temp_n, and pg_temp_n schemas, and including public and pg_catalog. +1. Jeff, are you willing to update the patch that way? Seems we have something several people can live with ;) I was hoping not to add a third set of filters (separate from the ones already implicit in either \dn or \dnS), but that's OK; as long as I get a doghouse I won't worry much about the color. I've included information_schema as well, in analogy to the inclusion of pg_catalog. Is there any blocker on this patch? The patch looks good to me. The patch makes the tab-completion on search_path display the existing schemas; additionally, what about displaying $user too? I think that some users would want to include the $user variable in search_path. Regards, -- Fujii Masao
Re: [HACKERS] things I learned from working on memory allocation
On Mon, Jul 7, 2014 at 5:37 PM, Peter Geoghegan p...@heroku.com wrote: On Mon, Jul 7, 2014 at 12:57 PM, Robert Haas robertmh...@gmail.com wrote: 5. It's tempting to look at other ways of solving the parallel sort problem that don't need an allocator - perhaps by simply packing all the tuples into a DSM one after the next. But this is not easy to do, or at least it's not easy to do efficiently. Tuples enter tuplesort.c through one of the tuplesort_put*() functions, most of which end up calling one of the copytup_*() functions. But you can't easily just change those functions to stick the tuple someplace else, because they don't necessarily get the address of the space to be used from palloc directly. In particular, copytup_heap() calls ExecCopySlotMinimalTuple(), and copytup_cluster() calls heap_copytuple(). heap_copytuple() is simple enough that you might be able to finagle a new API that would make it work, but I'm not sure what we could really do about ExecCopySlotMinimalTuple() except call it and then copy the result. Perhaps that'd be OK for a first version. Perhaps. If my experience with sort support for text is anything to go on, the cost of copying over the tuples isn't really that high relative to the cost of performing the sort, particularly when there are many tuples to sort. The testing I did showed about a 5% overhead on REINDEX INDEX pgbench_accounts_pkey from one extra tuple copy (cf. 9f03ca915196dfc871804a1f8aad26207f601fd6). Of course that could vary by circumstance for a variety of reasons. OTOH, I'd be concerned about the cost of main memory accesses for pass-by-reference types during parallel tuple sorts in particular. The same concern applies to cases where the tuple proper must be accessed, although presumably that occurs less frequently. I do think that's a problem with our sort implementation, but it's not clear to me whether it's *more* of a problem for parallel sort than it is for single-backend sort. 
The LLVM project's new libc++ library passes remark on a similar issue in their rationale for yet another C++ standard library: For example, it is generally accepted that building std::string using the short string optimization instead of using Copy On Write (COW) is a superior approach for multicore machines It seems likely that they're mostly referencing the apparent trend of memory bandwidth per core stagnation [1], [2]. To get the real benefit of a parallel sort, we must do anything we can to avoid having cores/workers remain idle during the sort while waiting on memory access. I believe that that's the bigger problem. Yes, I agree that's a problem. There are algorithms for parallel quicksort in the literature which seem to offer promising solutions, but unfortunately these infrastructure issues have prevented me from reaching them. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Interesting behavior change of psql
From 9.4 psql has slightly changed its behavior.

Pre 9.4:

psql --version
psql (PostgreSQL) 9.3.4
psql -U '' test
FATAL:  no PostgreSQL user name specified in startup packet
psql: FATAL:  no PostgreSQL user name specified in startup packet

9.4:

psql -U '' test
psql (9.4beta1, server 9.3.4)
Type "help" for help.
test=#

Please note that this behavior change was actually brought by libpq: "Have libpq's PQconnectdbParams() and PQpingParams() functions process zero-length strings as defaults (Adrian Vondendriesch). Previously, these functions treated zero-length string values as defaults only in some cases." It's not very clear that username is affected by this change here. I just want to remind hackers before you get inquiries on this from PostgreSQL users. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: [HACKERS] things I learned from working on memory allocation
On Mon, Jul 7, 2014 at 7:04 PM, Robert Haas robertmh...@gmail.com wrote: The testing I did showed about a 5% overhead on REINDEX INDEX pgbench_accounts_pkey from one extra tuple copy (cf. 9f03ca915196dfc871804a1f8aad26207f601fd6). Of course that could vary by circumstance for a variety of reasons. Be careful of the check for pre-sorted input within qsort() giving an overly optimistic picture of things, thereby exaggerating the proportion of time spent copying if taken as generally representative. OTOH, I'd be concerned about the cost of main memory accesses for pass-by-reference types during parallel tuple sorts in particular. The same concern applies to cases where the tuple proper must be accessed, although presumably that occurs less frequently. I do think that's a problem with our sort implementation, but it's not clear to me whether it's *more* of a problem for parallel sort than it is for single-backend sort. If you'll forgive me for going on about my patch on this thread, I think the pgbench -c 4 and -c 1 cases that I tested suggest it is a particular problem for parallel sorts, as there is a much bigger both absolute and proportional difference in transaction throughput between those two with the patch applied. It seems reasonable to suppose the difference would be larger still if we were considering a single parallel sort, as opposed to multiple independent sorts (of the same data) that happen to occur in parallel. -- Peter Geoghegan
Re: [HACKERS] idle_in_transaction_timeout
On Tue, Jun 24, 2014 at 10:17:49AM -0700, Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: On 06/23/2014 03:52 PM, Andres Freund wrote: True. Which makes me wonder whether we shouldn't default this to something non-zero -- even if it is 5 or 10 days. I'd go for even shorter: 48 hours. I'd suggest 24 hours, but that would trip up some users who just need really long pg_dumps. FWIW, I do not think we should have a nonzero default for this. We could not safely set it to any value that would be small enough to be really useful in the field. BTW, has anyone thought about the interaction of this feature with prepared transactions? I wonder whether there shouldn't be a similar but separately-settable maximum time for a transaction to stay in the prepared state. If we could set a nonzero default on that, perhaps on the order of a few minutes, we could solve the ancient bugaboo that prepared transactions are too dangerous to enable by default. Added as a TODO item. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] RETURNING PRIMARY KEY syntax extension
On Mon, Jul 7, 2014 at 5:50 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Jul 4, 2014 at 12:36 AM, Rushabh Lathia rushabh.lat...@gmail.com wrote: On Fri, Jul 4, 2014 at 7:37 AM, Tom Lane t...@sss.pgh.pa.us wrote: Tom Dunstan pg...@tomd.cc writes: On 4 July 2014 00:07, Tom Lane t...@sss.pgh.pa.us wrote: It sounds to me like designing this for JDBC's getGeneratedKeys method is a mistake. There isn't going to be any way that the driver can support that without having looked at the table's metadata for itself, and if it's going to do that then it doesn't need a crutch that only partially solves the problem anyhow. Sure it can - it can append RETURNING PRIMARY KEY and hand back a ResultSet from whatever was returned. It's CURRENTLY doing that, but it's appending RETURNING * and leaving it up to the caller of getGeneratedKeys() to work out which columns the caller is interested in. I might be mistaken, but as I read the spec for getGeneratedKeys, whether or not a column is in a primary key has got nothing to do with whether that column should be returned. It looks to me like what you're supposed to return is columns with computed default values, eg, serial columns. (Whether we should define it as being *exactly* columns created by the SERIAL macro is an interesting question, but for sure those ought to be included.) Now, the pkey might be a serial column ... but it equally well might not be; it might not have any default expression at all. And there certainly could be serial columns that weren't in the pkey. 100% agree with Tom. Well, we could have a RETURNING GENERATED COLUMNS construct, or something like that. I can certainly see the usefulness of such a thing. As a general note not necessarily related to this specific issue, I think we should be careful not to lose sight of the point of this feature, which is to allow pgsql-jdbc to more closely adhere to the JDBC specification.
While it's great if this feature has general utility, if it doesn't help pgsql-jdbc, then it fails to achieve the author's original intention. We need to make sure we're not throwing up too many obstacles in the way of better driver compliance if we want to have good drivers.

As a code-level comment, I am not too fond of adding yet another value for IndexAttrBitmapKind; every value we compute in RelationGetIndexAttrBitmap adds overhead for everyone, even people not using the feature. Admittedly, it's only paid once per relcache load, but does map_primary_key_to_list() really need this information cached, or can it just look through the index list for the primary key, and then work directly from that? Also, map_primary_key_to_list() tacitly assumes that an auto-updatable view has only 1 from-list item. That seems like a good thing to check with an Assert(), just in case we should support some other case in the future.

Another code-level comment is, why do we need RowExclusiveLock before calling map_primary_key_to_list()? I think we should have an explanation for that.

+/* Handle RETURNING PRIMARY KEY */
+if (query->returningPK)
+{
+    RangeTblEntry *rte = (RangeTblEntry *) list_nth(query->rtable,
+                                                    query->resultRelation - 1);
+    Relation rel = heap_open(rte->relid, RowExclusiveLock);
+
+    query->returningList = map_primary_key_to_list(rel, query);
+    heap_close(rel, NoLock);
+}

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Rushabh Lathia
Re: [HACKERS] Selectivity estimation for inet operators
On 06 July 2014 20:33, Emre Hasegeli wrote: Apart from that there is one spelling fix you can make -- in the inet_his_inclusion_selec comments, "histogram boundies" should be "histogram boundaries" :)

I fixed it. New version attached. The debug log statements are also removed.

I am done with the review; the patch seems fine to me. I have one last comment; after clarifying this I can move it to ready for committer.

1. In networkjoinsel, to avoid the case of huge statistics, only some of the values from the MCV lists and histograms are used (the count calculated using SQRT). But in my opinion, if histograms and MCVs both exist then that's fine, but if only MCVs are there, in that case we can match the complete MCV lists, which will give better accuracy. Other functions like eqjoinsel also match the complete MCV lists.

Thanks Regards, Dilip Kumar
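To make the trade-off concrete, here is a hedged Python sketch of eqjoinsel-style MCV-by-MCV matching — matched frequencies multiply and sum — with an option to compare only the first sqrt(n) entries of each list, as the inet join estimator under review does. The function name and list shapes are illustrative assumptions, not PostgreSQL source:

```python
import math

def mcv_join_selectivity(mcv1, mcv2, limit=False):
    """Estimate equi-join selectivity from two MCV lists, each a list of
    (value, frequency) pairs. With limit=True only the first sqrt(n)
    entries of each list are matched, bounding the O(n*m) comparison
    cost at the price of accuracy, as discussed for networkjoinsel."""
    if limit:
        mcv1 = mcv1[:int(math.sqrt(len(mcv1)))]
        mcv2 = mcv2[:int(math.sqrt(len(mcv2)))]
    # index one side, then sum products of frequencies of matching values
    freq2 = {v: f for v, f in mcv2}
    return sum(f1 * freq2[v1] for v1, f1 in mcv1 if v1 in freq2)
```

The truncated variant can miss matches entirely (returning 0 where the full lists would match), which is exactly the accuracy concern raised above for the MCV-only case.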
Re: [HACKERS] [PATCH] Improve bytea error messages
On 07/08/2014 07:44 AM, Robert Haas wrote: On Sun, Jul 6, 2014 at 4:17 AM, Craig Ringer cr...@2ndquadrant.com wrote: After someone reported somewhat less than informative errors in bytea decoding (http://stackoverflow.com/q/24588866/398670) I thought I'd put together a quick patch to improve the errors here. The first two changes seem fine from here, but I think the use of parentheses in the third one likely violates our message style guidelines.

Good point. Better? Putting it in ERRHINT is more appropriate.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

From 285720d1b028da4fd3a1fc4b5deb006ab7b71959 Mon Sep 17 00:00:00 2001
From: Craig Ringer cr...@2ndquadrant.com
Date: Sun, 6 Jul 2014 16:15:21 +0800
Subject: [PATCH] Improve bytea decoding error messages

---
 src/backend/utils/adt/encode.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/encode.c b/src/backend/utils/adt/encode.c
index 46993ba..f09e506 100644
--- a/src/backend/utils/adt/encode.c
+++ b/src/backend/utils/adt/encode.c
@@ -292,7 +292,7 @@ b64_decode(const char *src, unsigned len, char *dst)
 			else
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-						 errmsg("unexpected \"=\"")));
+						 errmsg("unexpected \"=\" while decoding base64 sequence")));
 		}
 		b = 0;
 	}
@@ -304,7 +304,7 @@ b64_decode(const char *src, unsigned len, char *dst)
 			if (b < 0)
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-						 errmsg("invalid symbol")));
+						 errmsg("invalid symbol '%c' while decoding base64 sequence", (int) c)));
 		}
 		/* add it to buffer */
 		buf = (buf << 6) + b;
@@ -324,7 +324,8 @@ b64_decode(const char *src, unsigned len, char *dst)
 	if (pos != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("invalid end sequence")));
+				 errmsg("invalid base64 end sequence"),
+				 errhint("input data is missing padding, truncated or otherwise corrupted")));
 	return p - dst;
--
1.9.0
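The three error classes the patch distinguishes can be sketched outside the server as well. The following Python decoder is an illustration of the error classification only (it mirrors the patched messages, under the assumption that b64_decode treats whitespace as ignorable and requires padded input; it is not a port of encode.c):

```python
import string

# standard base64 alphabet, value = position
B64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
B64_LOOKUP = {c: i for i, c in enumerate(B64_ALPHABET)}

def b64_decode(src):
    """Decode padded base64 text, raising the three distinct errors the
    patch introduces: misplaced '=', invalid symbol, truncated end."""
    buf, pos, pad = 0, 0, 0
    out = bytearray()
    for c in src:
        if c in " \t\r\n":
            continue
        if c == "=":
            # padding is only legal in positions 2 and 3 of a 4-char group
            if pos not in (2, 3):
                raise ValueError('unexpected "=" while decoding base64 sequence')
            pad += 1
            b = 0
        else:
            b = B64_LOOKUP.get(c, -1)
            if b < 0:
                raise ValueError(f"invalid symbol '{c}' while decoding base64 sequence")
        buf = (buf << 6) + b          # add it to buffer
        pos += 1
        if pos == 4:                  # emit up to 3 bytes per full group
            out.append((buf >> 16) & 0xFF)
            if pad < 2:
                out.append((buf >> 8) & 0xFF)
            if pad < 1:
                out.append(buf & 0xFF)
            buf, pos = 0, 0
    if pos != 0:
        # mirrors the errmsg + errhint pair in the patch
        raise ValueError("invalid base64 end sequence: input data is missing "
                         "padding, truncated or otherwise corrupted")
    return bytes(out)
```

The point of the patch is visible here: each failure mode names what was wrong with the input rather than emitting one generic message.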
Re: [HACKERS] Extending constraint exclusion for implied constraints/conditions
On Mon, Jul 7, 2014 at 7:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes: Right now, constraint exclusion code looks only at the provided conditions. If we want to avoid the table scan based on constraints in the above example, it will need to look at the implied conditions as well. E.g. val2 > 30 AND val = val2 => val > 30. Then the constraint exclusion can see that val < 30 AND val > 30 are contradictory and infer that the result is going to be empty. We will need to add information about the transitive inferences between operators. Can we do that in PostgreSQL? Will the benefits be worth the code that needs to be added?

I doubt it. The extra code isn't the problem so much, it's the extra planning cycles spent trying to make proofs that will usually fail. What you propose will create a combinatorial explosion in the number of proof paths to be considered.

I can understand that it will create a combinatorial explosion in the number of predicates that need to be examined by the constraint exclusion. I do not understand where the paths get involved here. The constraint exclusion kicks in before paths are created

 220 static void
 221 set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 222              Index rti, RangeTblEntry *rte)
 223 {
 224     if (rel->reloptkind == RELOPT_BASEREL &&
 225         relation_excluded_by_constraints(root, rel, rte))
 226     {
 227         /*
 228          * We proved we don't need to scan the rel via constraint exclusion,
 229          * so set up a single dummy path for it.  Here we only check this for
 230          * regular baserels; if it's an otherrel, CE was already checked in
 231          * set_append_rel_pathlist().
 232          *
 233          * In this case, we go ahead and set up the relation's path right away
 234          * instead of leaving it for set_rel_pathlist to do.  This is because
 235          * we don't have a convention for marking a rel as dummy except by
 236          * assigning a dummy path to it.
 237          */
 238         set_dummy_rel_pathlist(rel);
 239     }

and does not create paths for relations excluded by constraints

 295 /*
 296  * set_rel_pathlist
 297  *    Build access paths for a base relation
 298  */
 299 static void
 300 set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 301                  Index rti, RangeTblEntry *rte)
 302 {
 303     if (IS_DUMMY_REL(rel))
 304     {
 305         /* We already proved the relation empty, so nothing more to do */
 306     }

Same in the case of join in mark_join_rel()

 663      * JOIN_INNER.
 664      */
 665     switch (sjinfo->jointype)
 666     {
 667         case JOIN_INNER:
 668             if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
 669                 restriction_is_constant_false(restrictlist, false))
 670             {
 671                 mark_dummy_rel(joinrel);
 672                 break;
 673             }
 674             add_paths_to_joinrel(root, joinrel, rel1, rel2,
 675                                  JOIN_INNER, sjinfo,
 676                                  restrictlist);
 677             add_paths_to_joinrel(root, joinrel, rel2, rel1,
 678                                  JOIN_INNER, sjinfo,
 679                                  restrictlist);
 680             break;

BTW, on a side note, I noticed we use mark_dummy_rel() for joinrels and set_dummy_rel_pathlist() for baserels. Shouldn't we be using the same function everywhere for the same functionality (e.g. adding an append path with no child-paths)?

For larger relations, the time spent in constraint exclusion might be lesser than the time taken by the actual table/index scan, and that too when such a scan is not going to produce any rows. So, we might want to apply the technique only when the estimated number of rows/pages is above a certain threshold and maybe when the GUC is ON (like we do it today).

I can see some more benefits. We can use it to eliminate joins where the constraints on joining relations may cause the join to be empty e.g. ...

... and applying constraint exclusion to join relations would make that problem even worse.

The same case goes with joins, where potentially, the crossproduct of two tables is huge.

regards, tom lane -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
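The transitive inference being proposed can be sketched abstractly: group columns known equal by the WHERE clause, intersect the range bounds their CHECK constraints imply, and report a contradiction when an equality class ends up with an empty interval. The following Python is a toy model of that idea (names and data shapes are invented for illustration; this is not planner code, and it only handles single-column range constraints):

```python
def contradicts(equalities, bounds):
    """Return True if equality conditions plus range bounds are
    unsatisfiable, as in: val = val2 with CHECK (val < 30) and
    CHECK (val2 > 30). `equalities` is a list of column-name pairs;
    `bounds` maps column -> (low, high) exclusive bounds, None = open."""
    # union-find style grouping of columns the query makes equal
    parent = {}
    def find(c):
        parent.setdefault(c, c)
        while parent[c] != c:
            c = parent[c]
        return c
    for a, b in equalities:
        parent[find(a)] = find(b)
    # intersect bounds within each equality class
    merged = {}
    for col, (lo, hi) in bounds.items():
        root = find(col)
        mlo, mhi = merged.get(root, (None, None))
        if lo is not None:
            mlo = lo if mlo is None else max(mlo, lo)
        if hi is not None:
            mhi = hi if mhi is None else min(mhi, hi)
        merged[root] = (mlo, mhi)
    # an empty interval proves the relation (or join) returns no rows
    return any(lo is not None and hi is not None and lo >= hi
               for lo, hi in merged.values())
```

The combinatorial-explosion concern is also visible in this model: with many columns and operators beyond plain ranges, the number of derived predicates to check grows quickly, which is the planning-cycle cost Tom points out.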
[HACKERS] Modifying update_attstats of analyze.c for C Strings
Hi, I am trying to implement a functionality that is similar to ANALYZE, but needs to have different values (the values will be valid and are stored in inp->str[][]) for MCV/Histogram Bounds in case the column under consideration is varchar (C Strings). I have written a function dummy_update_attstats with the following changes. Other things remain the same as in update_attstats of ~/src/backend/commands/analyze.c

{
    ArrayType  *arry;

    if (strcmp(col_type, "varchar") == 0)
        arry = construct_array(stats->stavalues[k],
                               stats->numvalues[k],
                               CSTRINGOID,
                               -2,
                               false,
                               'c');
    else
        arry = construct_array(stats->stavalues[k],
                               stats->numvalues[k],
                               stats->statypid[k],
                               stats->statyplen[k],
                               stats->statypbyval[k],
                               stats->statypalign[k]);
    values[i++] = PointerGetDatum(arry);    /* stavaluesN */
}

and I update the hist_values in the appropriate function as:

if (strcmp(col_type, "varchar") == 0)
    hist_values[i] = datumCopy(CStringGetDatum(inp->str[i][j]),
                               false,
                               -2);

I tried this based on the following reference: http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff

My issue is: When I use my way for strings, the MCV/histogram_bounds in pg_stats don't have double quotes ("") surrounding the strings. That is, if the normal update_attstats is used, histogram_bounds for TPCH nation(n_name) are quoted: "ALGERIA  ","ARGENTINA",... If I use dummy_update_attstats as above, histogram_bounds for TPCH nation(n_name) are unquoted: ALGERIA,ARGENTINA,... This becomes an issue if the string has ',' (commas), like for example in the n_comment column of the nation table. Could someone point out the problem and suggest a solution? Thank you. -- Regards, Ashoke
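The quoting at issue comes from how PostgreSQL renders array values as text: elements containing commas, spaces, quotes, or braces are wrapped in double quotes with internal quotes and backslashes escaped, so a comma inside an element stays distinguishable from a comma between elements. A hedged Python sketch of that output rule (illustrative only, not the server's array_out(), and omitting details like NULL handling) shows why unquoted stavalues break for columns like n_comment:

```python
def format_pg_array(elems):
    """Render a list of strings roughly the way PostgreSQL prints a
    text array: elements containing delimiter-significant characters
    (comma, braces, quotes, backslash, space) or empty elements are
    double-quoted, with embedded quotes and backslashes escaped."""
    out = []
    for e in elems:
        if e == "" or any(ch in e for ch in ',{}"\\ '):
            out.append('"' + e.replace('\\', '\\\\').replace('"', '\\"') + '"')
        else:
            out.append(e)
    return "{" + ",".join(out) + "}"
```

Without that quoting, an element such as "Everton, FC" would be read back as two elements, which matches the symptom described for comma-containing comment strings.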