[HACKERS] Extending constraint exclusion for implied constraints/conditions

2014-07-07 Thread Ashutosh Bapat
Hi,
Here's a table I have
postgres=# \d+ tab1
 Table "public.tab1"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 val    | integer |           | plain   |              |
 val2   | integer |           | plain   |              |
Check constraints:
    "tab1_val2_check" CHECK (val2 > 30)
    "tab1_val_check" CHECK (val < 30)

and I run the following query on it:
postgres=# set constraint_exclusion to on;
SET
postgres=# explain verbose select * from tab1 where val = val2;
                          QUERY PLAN
--------------------------------------------------------------
 Seq Scan on public.tab1  (cost=0.00..36.75 rows=11 width=8)
   Output: val, val2
   Filter: (tab1.val = tab1.val2)

Given the constraints and the condition in the WHERE clause, it can be
inferred that the query will not return any row. However, PostgreSQL scans
the table anyway, as seen in the plan.

Right now, constraint exclusion code looks only at the provided conditions.
If we want to avoid the table scan based on constraints in the above
example, it will need to look at the implied conditions as well, e.g.
val2 > 30 AND val = val2 => val > 30. Then constraint exclusion can see
that val > 30 AND val < 30 are contradictory and infer that the result is
going to be empty. We will need to add information about the transitive
inferences between operators. Can we do that in PostgreSQL? Will the
benefits be worth the code that needs to be added?
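For illustration, the inference can be modeled as simple interval
arithmetic. Below is a stand-alone C sketch (illustrative only, not
planner code); it assumes the two CHECK constraints above plus the
equality val = val2:

#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Open-interval model of a strict range predicate: lo < x < hi. */
typedef struct { int lo; int hi; } Range;

/* An open integer interval (lo, hi) contains no integer iff
 * hi - lo <= 1, so a contradiction is an empty intersection. */
static bool
ranges_contradict(Range a, Range b)
{
    int lo = a.lo > b.lo ? a.lo : b.lo;
    int hi = a.hi < b.hi ? a.hi : b.hi;

    return lo >= hi - 1;
}

int
main(void)
{
    Range val  = {INT_MIN, 30};     /* CHECK (val < 30)  */
    Range val2 = {30, INT_MAX};     /* CHECK (val2 > 30) */

    /* "val = val2" lets us treat both constraints as ranges on the
     * same variable, so the proof is just an intersection test. */
    printf("provably empty: %s\n",
           ranges_contradict(val, val2) ? "yes" : "no");
    return 0;
}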

I can see some more benefits. We can use it to eliminate joins where the
constraints on the joining relations may cause the join to be empty, e.g.
postgres=# \d+ tab2
 Table "public.tab2"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 val    | integer |           | plain   |              |
Check constraints:
    "tab2_val_check" CHECK (val > 30)

postgres=# \d+ tab1
 Table "public.tab1"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 val    | integer |           | plain   |              |
Check constraints:
    "tab1_val_check" CHECK (val < 30)

and the query with a join is: select * from tab1, tab2 where tab1.val =
tab2.val, or select * from tab1, tab2 where tab1.val > tab2.val.

This can be extended further to the case of a join between two parent
tables, where we will be able to eliminate joins between some pairs of
children. There are limitations in this case, though:
1. The benefits of this optimization will be visible in the case of
nested-loop joins. Hash and merge joins implicitly eliminate the
non-joining children.
2. We need a way to push a join down an Append path, which PostgreSQL
doesn't do today.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] 9.5 CF1

2014-07-07 Thread Abhijit Menon-Sen
Hi.

We're into the last full week of this CommitFest. Here's an update on
the current state of the queue.

Needs review:        31
Waiting on author:   18
Ready for committer: 13
Committed:           19

(I'll send separate topic-wise updates too.)

Reviewers, please try to post whatever you have in the next few days to
give the authors a chance to respond. Even if you haven't completed the
review, please consider posting what you have now and updating it later.
If for any reason you don't expect to be able to complete a review you
are working on, please let me know as soon as possible.

Patch authors, please let me know if you intend to resubmit revisions of
patches that are currently marked waiting on author during this CF. If
it's likely to take more than a few days, it should be moved to the
August CF.

Committers… well, have fun looking at the queue, I guess!

Thanks again to everyone for their help in keeping things moving during
this review cycle. Starting at 97 patches, this wasn't the largest CF
we've ever had, but it's right up there near the top.

-- Abhijit




Re: [HACKERS] 9.5 CF1

2014-07-07 Thread Abhijit Menon-Sen
At 2014-07-02 10:36:23 +0530, a...@2ndquadrant.com wrote:

 Server features
 ---------------
 
 Lag & Lead Window Functions Can Ignore Nulls
 Latest patch currently pending review by Jeff and Álvaro. No
 updates so far.

Jeff has posted a review, and is working on the patch further.

 Using Levenshtein distance to HINT a candidate column name
 No real progress in any direction. Everyone wants the feature, but
 nobody is sure how many suggestions would be useful and how many
 would be annoying.

Peter posted a new patch. Nobody has reviewed it yet.

 Buffer capture facility: check WAL replay consistency
 Some discussion, but no reviewers. Status unclear.
 
 
 http://archives.postgresql.org/message-id/CAB7nPqTFK4=fcrto=lji4vlbx9ah+fv1z1ke6r98ppxuuxw...@mail.gmail.com

No updates.

 Custom Plan API
 Shigeru Hanada has said he plans to post a design review soon.

Any updates? Should this be moved to the next CF?

 delta relations in AFTER triggers
 No code review yet, but there's a proof-of-concept extension that
 uses the feature, and Kevin is working on PL changes to make the
 feature usable, and hopes to commit in this cycle.

Ditto.

 Minmax indexes
 Álvaro will respond to the design questions Heikki raised.

Álvaro is still working on this and hopes to post an update soon.

-- Abhijit




Re: [HACKERS] 9.5 CF1

2014-07-07 Thread Abhijit Menon-Sen
At 2014-07-02 11:38:31 +0530, a...@2ndquadrant.com wrote:

 Performance
 -----------
 
 scalable LWLocks
 Generic atomics implementation
 Both under active discussion.

Work continues on these.

 KNN-GiST with recheck
 Partial sort
 Some earlier discussion, but no conclusions, and as far as I know
 nobody is working on these patches at the moment.

Heikki said he won't be able to look at these, so unless someone else
plans to review them, they should be moved to the next CF. Alexander is
working on an updated version of the KNN-GiST patch anyway.

 Don't require a NBuffer sized PrivateRefCount array of local buffer pins
 Pending performance tests before commit. Is anyone working on this?

No updates.

 Allow more join types to be removed
 Updated patch pending review from Simon. Updates welcome.

Tom is looking at this now.

 Sort support for text with strxfrm() poor man's keys
 Lots of earlier discussion, but no conclusions as far as I can tell.
 Does anyone want to take a look at this?

 Scaling shared buffer eviction
 Pending review by Andres. Any updates?

 Use unique index for longer pathkeys
 No reviews, no reviewers. I took a quick look, and it's not clear if
 this is useful in itself, or if it just enables more interesting
 optimisations later. Does anyone want to look at this?

No updates for any of these.

 XLogLockBlockRangeForCleanup
 Amit Khandekar plans to post a review this week.

Pending performance results from me, but also a final conclusion from
Amit's review.

 SKIP LOCKED
 Two patches available: original by Simon, updated/merged version by
 Thomas Munro. Simon said he'll look into both with a view towards
 committing the functionality. Timeline not yet clear.

 Allow NOT IN to use anti joins
 Revised patch marked pending review. Updates welcome.

 CSN snapshots
 Work in progress, plenty of discussion, no reviews. Probably nothing
 to do here. (Heikki, are you waiting for feedback or anything?)

No updates for any of these.

-- Abhijit




Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

2014-07-07 Thread Andres Freund
Hi,

On 2014-07-04 22:59:15 +0900, MauMau wrote:
 My customer reported a strange connection hang problem.  He and I couldn't
 reproduce it.  I haven't been able to understand the cause, but I can think
 of one hypothesis.  Could you give me your opinions on whether my hypothesis
 is correct, and a direction on how to fix the problem?  I'm willing to
 submit a patch if necessary.

 The connection attempt is waiting for a reply from the standby.  This is
 strange, because we didn't anticipate that the connection establishment (and
 subsequent SELECT queries) would update something and write some WAL.  The
 doc says:
 
 http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION
 
 When requesting synchronous replication, each commit of a write transaction
 will wait until confirmation is received that the commit has been written to
 the transaction log on disk of both the primary and standby server.
 ...
 Read only transactions and transaction rollbacks need not wait for replies
 from standby servers. Subtransaction commits do not wait for responses from
 standby servers, only top-level commits.
 
 
 [Hypothesis]
 Why does the connection processing emit WAL?
 
 Probably, it did page-at-a-time vacuum during access to pg_database and
 pg_authid for client authentication.  src/backend/access/heap/README.HOT
 describes:

 [How to fix]
 Of course, adding -o '-c synchronous_commit=local' or -o '-c
 synchronous_standby_names=' to pg_ctl start in the recovery script would
 prevent the problem.

 But isn't there anything to fix in PostgreSQL?  I think the doc needs
 improvement so that users won't misunderstand that only write transactions
 would block at commit.

I think we should rework RecordTransactionCommit() to only wait for the
standby if `markXidCommitted' and not if `wrote_xlog'. There really
isn't a reason to make a readonly transaction's commit wait just because
it did some hot pruning.
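A minimal sketch of that decision rule (a toy model, not the actual
RecordTransactionCommit() code; only the two flag names above are real):

#include <stdbool.h>
#include <stdio.h>

/* Toy model of the proposed gating: wait for the synchronous standby
 * only when the transaction actually committed an xid, not merely
 * because it wrote some WAL (e.g. from hot pruning). */
static bool
must_wait_for_standby(bool markXidCommitted, bool wrote_xlog)
{
    (void) wrote_xlog;          /* old rule, roughly: return wrote_xlog */
    return markXidCommitted;    /* proposed rule */
}

int
main(void)
{
    /* read-only transaction that only did hot pruning */
    printf("%d\n", must_wait_for_standby(false, true));   /* 0: no wait */
    /* ordinary write transaction */
    printf("%d\n", must_wait_for_standby(true, true));    /* 1: wait */
    return 0;
}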

Greetings,

Andres Freund




Re: [HACKERS] 9.5 CF1

2014-07-07 Thread Kohei KaiGai
 Custom Plan API
 Shigeru Hanada has said he plans to post a design review soon.

 Any updates? Should this be moved to the next CF?

Now I'm working to revise the patch according to his suggestion;
it will be completed within a couple of days.
A few issues need design-level suggestions from committers.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp




Re: [HACKERS] 9.5 CF1

2014-07-07 Thread Abhijit Menon-Sen
At 2014-07-02 11:21:53 +0530, a...@2ndquadrant.com wrote:

 SQL commands
 ------------
 
 Event triggers: object creation support
 Enormous patch, no reviews, no reviewers, but it's known to work
 already. Does anyone want to have a look at this? (I thought it
 was being reviewed, and didn't realise otherwise until a couple
 of days ago. Sorry about that.)

No updates. I'll look at the patch myself, but I don't think I'll be
able to post any useful review comments until the beginning of next
week.

 change alter user to be a true alias for alter role
 Original patch worked (Vik) but didn't go far enough towards
 preventing future repetitions of the mistake (Tom) but the suggested
 change didn't turn out to be easy (Vik). I don't think anyone is
 working on this at the moment.

No updates.

 RETURNING PRIMARY KEY syntax extension
 Patch works and has been revised as suggested. Some questions about
 how useful it is. Updates welcome.

Ian will respond to comments soon.

 Allow an unlogged table to be changed to logged GSoC 2014
 Stephen has looked at the patch a little, but isn't sure how much
 time he'll have to complete the review. More eyes are welcome. I
 recommend this patch to anyone who's looking to get started with
 reviewing something substantial: it looks very nicely-done and
 straightforward, but still non-trivial.

Christoph Berg is looking at this, but doesn't expect to be able to post
a review for a few days yet. (The recommendation still stands.)

-- Abhijit




Re: [HACKERS] 9.5 CF1

2014-07-07 Thread Abhijit Menon-Sen
At 2014-07-02 11:14:52 +0530, a...@2ndquadrant.com wrote:

 Security
 --------
 
 Row-security based on Updatable security barrier views
 Lots of discussion that I haven't dared to look at properly yet. I
 gather there's still plenty of design-level work needed, and this
 is not in any imminent danger of being committed.

Work continues.

 Replication & Recovery
 ----------------------
 
 pg_receivexlog add synchronous mode
 Patch works, is generally acceptable. Fujii-san proposed a
 refactoring patch to be applied before this one, and plans
 to commit it soon.

Refactoring patch committed, other patch still pending.

 Compression of Full Page Writes
 Currently under development, timeline unclear. Probably needs to be
 marked returned with feedback and moved to August CF.

Revised patch posted, but lots of work still needed.

 WAL format & API changes
 I'm not sure what's happening here. Will look more closely, but
 updates are welcome from the people who've been participating in
 the discussion/review.

Waiting on Heikki.

-- Abhijit




Re: [HACKERS] 9.5 CF1

2014-07-07 Thread Abhijit Menon-Sen
At 2014-07-02 11:09:19 +0530, a...@2ndquadrant.com wrote:

 System administration
 ---------------------
 
 pg_hibernator
 Nice feature, some discussion, no code reviews. Status not entirely
 clear, but updated patch available.

Now being reviewed by MauMau.

 Monitoring & control
 --------------------
 
 Reducing impact of hints/cleanup for SELECTs
 Pending performance review by Emanuel Calvo. (He said he's posted a
 review, but I couldn't find it. He's travelling right now, will send
 me a pointer later.)

No updates.

 issue log message to suggest VACUUM FULL if a table is nearly empty
 Euler has said he plans to post a review this week.
 
 pg_shmem_allocations view
 Euler has said he plans to post a review this week.

No updates from Euler.

 pg_xlogdump --stats
 Pending review by Dilip Kumar (who plans to post a review this
 week), but Osamu Furuya and Marti Raudsepp have said it looks
 OK generally.

Latest patch posted, some discussion needed on the calling conventions
for the new rm_identify method.

-- Abhijit




Re: [HACKERS] 9.5 CF1

2014-07-07 Thread Abhijit Menon-Sen
At 2014-07-02 10:54:05 +0530, a...@2ndquadrant.com wrote:

 Functions
 ---------
 
 min/max support for inet datatypes
 Pending review by Muhammad Asif Naeem.

No updates.

 Selectivity estimation for inet operators
 Dilip Kumar plans to post a review this week.

Reviewed, and updated patch from Emre available.

 Clients
 -------
 
 Gaussian distribution pgbench
 Fabien Coelho plans to post a review of the latest patch.

Some questions raised about whether we want this feature in this form at
all. Status unclear.

 possibility to set double style for unicode lines
 No reviews, no reviewers. \N{DOUBLE LINED UNHAPPY FACE}[1]

No updates.

 Allow parallel cores to be used by vacuumdb
 Status not entirely clear. Do we want this? Does the patch have
 too much duplicated code? Can it be fixed in this cycle?

I'm not sure what the status of this patch is.

 add line number as prompt option to psql
 Sawada Masahiko plans to post an updated patch soon.

New patch posted, pending review. Should be ready for committer soon.

-- Abhijit




Re: [HACKERS] 9.5 CF1

2014-07-07 Thread Abhijit Menon-Sen
At 2014-07-02 10:44:23 +0530, a...@2ndquadrant.com wrote:

 Miscellaneous
 -------------
 
 contrib/fastbloat - tool for quickly assessing bloat stats for a table
 Pending review by Jaime.
 
 showing index update time on EXPLAIN
 Pending updated patch by Jaime.
 
 idle_in_transaction_session_timeout
 Marked as ready for committer, but as far as I can tell the patch
 needs revisions per Tom's comments. It's unclear if Vik is working
 on these, or Kevin, or nobody. Anyone?
 
 logging of replication commands
 Two patches posted: one generally accepted, the other contentious.
 Pending status update.
 
 Refactor SSL code to support other SSL implementations
 Some discussion, but no code reviews yet. Jeff Janes tried it and it
 hung; Andreas Karlsson fixed it and it worked for him. Everyone does
 seem to agree that it's a good idea.
 
 There is also one other ready for committer patch in this category.
 
 Bug fixes
 ---------
 
 Correctly place DLLs for ECPG apps in bin folder
 Pending review by Muhammad Asif Naeem.
 
 Per table autovacuum vacuum cost parameters behavior change
 Pending review by Álvaro.
 
 Ignore Ctrl-C/Break on Windows
 Christian Ullrich has said he will post an updated patch this week.
 
 There are also three ready for committer patches in this category.

No updates at all.

-- Abhijit




Re: [HACKERS] wrapping in extended mode doesn't work well with default pager

2014-07-07 Thread Sergey Muraviov
So what's wrong with the patch?
And what should I change in it for 9.5?


2014-07-07 3:12 GMT+04:00 Greg Stark st...@mit.edu:

 On Sun, Jul 6, 2014 at 8:40 AM, Sergey Muraviov
 sergey.k.murav...@gmail.com wrote:
  Is there anyone who can commit the patch?

 So what I'm inclined to do here (sigh) is commit it into 9.5 and
  revert it in 9.4. I think it's an improvement, but there's enough
  confusion and surprise about the changes from people who weren't
 expecting them and enough surprising side effects from things like
 check_postgres that letting it simmer for a full release so people can
 get used to it might be worthwhile.

 --
 greg




-- 
Best regards,
Sergey Muraviov


Re: [HACKERS] Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls

2014-07-07 Thread Jeff Davis
On Sun, 2014-07-06 at 21:11 -0700, Jeff Davis wrote:
 On Wed, 2014-04-16 at 12:50 +0100, Nicholas White wrote:
  Thanks for the detailed feedback, I'm sorry it took so long to
  incorporate it. I've attached the latest version of the patch, fixing
  in particular:

Looking a little more:

* No tests exercise non-const offsets

* No tests for default clauses with IGNORE NULLS

* The use of bitmapsets is quite ugly. It would be nice if the API would
grow the BMS within the memory context in which it was allocated, but I
don't even see that the BMS is necessary. Why not just allocate a
fixed-size array of bits and forget the BMS? (See the sketch below.)

* Is there a reason you're leaving out first_value/last_value/nth_value?
I think they could be supported without a lot of extra work.
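For illustration, a fixed-size bit array along those lines might look like
this (a sketch with assumed names, not code from the patch; a backend
version would presumably use palloc0() in the right memory context rather
than calloc()):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Fixed-size bit array, allocated once at a known maximum size, so it
 * never needs to grow (and never reallocates across contexts). */
typedef struct FixedBitArray
{
    uint8_t *bits;
    size_t   nbits;
} FixedBitArray;

static FixedBitArray
fba_create(size_t nbits)
{
    FixedBitArray a = {calloc((nbits + 7) / 8, 1), nbits};

    return a;
}

static void
fba_set(FixedBitArray *a, size_t i)
{
    a->bits[i / 8] |= (uint8_t) (1u << (i % 8));
}

static int
fba_get(const FixedBitArray *a, size_t i)
{
    return (a->bits[i / 8] >> (i % 8)) & 1;
}

int
main(void)
{
    FixedBitArray seen = fba_create(1000);  /* e.g. rows in the frame */

    fba_set(&seen, 42);
    printf("%d %d\n", fba_get(&seen, 42), fba_get(&seen, 7));   /* 1 0 */
    free(seen.bits);
    return 0;
}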

Regards,
Jeff Davis






Re: [HACKERS] [PATCH] introduce XLogLockBlockRangeForCleanup()

2014-07-07 Thread Amit Khandekar
On 4 July 2014 19:11, Abhijit Menon-Sen a...@2ndquadrant.com wrote:

 Updated patch attached, thanks.

 Amit: what's your conclusion from the review?


Other than some minor comments as mentioned below, I don't have any more
issues, it looks all good.

The XLogLockBlockRangeForCleanup() function's header comment has the
function name misspelled as XLogBlockRangeForCleanup.

In GetBufferWithoutRelcache(), we can call BufferDescriptorGetBuffer(buf)
rather than BufferDescriptorGetBuffer(BufferDescriptors[buf_id]).



 -- Abhijit



Re: [HACKERS] [PATCH] introduce XLogLockBlockRangeForCleanup()

2014-07-07 Thread Abhijit Menon-Sen
At 2014-07-07 14:02:15 +0530, amit.khande...@enterprisedb.com wrote:

 Other than some minor comments as mentioned below, I don't have any
 more issues, it looks all good.

Thank you. (Updated patch attached.)

-- Abhijit
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index 5f9fc49..dc90c02 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -501,33 +501,9 @@ btree_xlog_vacuum(XLogRecPtr lsn, XLogRecord *record)
 	 * here during crash recovery.
 	 */
 	if (HotStandbyActiveInReplay())
-	{
-		BlockNumber blkno;
-
-		for (blkno = xlrec->lastBlockVacuumed + 1; blkno < xlrec->block; blkno++)
-		{
-			/*
-			 * We use RBM_NORMAL_NO_LOG mode because it's not an error
-			 * condition to see all-zero pages.  The original btvacuumpage
-			 * scan would have skipped over all-zero pages, noting them in FSM
-			 * but not bothering to initialize them just yet; so we mustn't
-			 * throw an error here.  (We could skip acquiring the cleanup lock
-			 * if PageIsNew, but it's probably not worth the cycles to test.)
-			 *
-			 * XXX we don't actually need to read the block, we just need to
-			 * confirm it is unpinned. If we had a special call into the
-			 * buffer manager we could optimise this so that if the block is
-			 * not in shared_buffers we confirm it as unpinned.
-			 */
-			buffer = XLogReadBufferExtended(xlrec->node, MAIN_FORKNUM, blkno,
-			RBM_NORMAL_NO_LOG);
-			if (BufferIsValid(buffer))
-			{
-				LockBufferForCleanup(buffer);
-				UnlockReleaseBuffer(buffer);
-			}
-		}
-	}
+		XLogLockBlockRangeForCleanup(xlrec->node, MAIN_FORKNUM,
+									 xlrec->lastBlockVacuumed + 1,
+									 xlrec->block);
 
 	/*
 	 * If we have a full-page image, restore it (using a cleanup lock) and
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index b7829ff..a9aea31 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -287,10 +287,6 @@ XLogReadBuffer(RelFileNode rnode, BlockNumber blkno, bool init)
  *
  * In RBM_ZERO and RBM_ZERO_ON_ERROR modes, if the page doesn't exist, the
  * relation is extended with all-zeroes pages up to the given block number.
- *
- * In RBM_NORMAL_NO_LOG mode, we return InvalidBuffer if the page doesn't
- * exist, and we don't check for all-zeroes.  Thus, no log entry is made
- * to imply that the page should be dropped or truncated later.
  */
 Buffer
 XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
@@ -331,8 +327,6 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
 			log_invalid_page(rnode, forknum, blkno, false);
 			return InvalidBuffer;
 		}
-		if (mode == RBM_NORMAL_NO_LOG)
-			return InvalidBuffer;
 		/* OK to extend the file */
 		/* we do this in recovery only - no rel-extension lock needed */
 		Assert(InRecovery);
@@ -375,6 +369,62 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
 	return buffer;
 }
 
+/*
+ * XLogLockBlockRangeForCleanup is used in Hot Standby mode to emulate
+ * the locking effects imposed by VACUUM for nbtrees.
+ */
+void
+XLogLockBlockRangeForCleanup(RelFileNode rnode, ForkNumber forkNum,
+			 BlockNumber startBlkno,
+			 BlockNumber uptoBlkno)
+{
+	BlockNumber blkno;
+	BlockNumber lastblock;
+	BlockNumber	endingBlkno;
+	Buffer		buffer;
+	SMgrRelation smgr;
+
+	Assert(startBlkno != P_NEW);
+	Assert(uptoBlkno != P_NEW);
+
+	/* Open the relation at smgr level */
+	smgr = smgropen(rnode, InvalidBackendId);
+
+	/*
+	 * Create the target file if it doesn't already exist.  This lets us cope
+	 * if the replay sequence contains writes to a relation that is later
+	 * deleted.  (The original coding of this routine would instead suppress
+	 * the writes, but that seems like it risks losing valuable data if the
+	 * filesystem loses an inode during a crash.  Better to write the data
+	 * until we are actually told to delete the file.)
+	 */
+	smgrcreate(smgr, forkNum, true);
+
+	lastblock = smgrnblocks(smgr, forkNum);
+
+	endingBlkno = uptoBlkno;
+	if (lastblock < endingBlkno)
+		endingBlkno = lastblock;
+
+	for (blkno = startBlkno; blkno < endingBlkno; blkno++)
+	{
+		/*
+		 * All we need to do here is prove that we can lock each block
+		 * with a cleanup lock. It's not an error to see all-zero pages
+		 * here because the original btvacuumpage would not have thrown
+		 * an error either.
+		 *
+		 * We don't actually need to read the block, we just need to
+		 * confirm it is unpinned.
+		 */
+		buffer = GetBufferWithoutRelcache(rnode, forkNum, blkno);
+		if (BufferIsValid(buffer))
+		{
+			LockBufferForCleanup(buffer);
+			UnlockReleaseBuffer(buffer);
+		}
+	}
+}
 
 /*
  * Struct actually returned by XLogFakeRelcacheEntry, though the declared
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 07ea665..f7976a0 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ 

Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-07-07 Thread Asif Naeem
On Mon, Jun 30, 2014 at 4:45 PM, Abhijit Menon-Sen a...@2ndquadrant.com
wrote:

 At 2014-06-30 16:35:45 +0500, anaeem...@gmail.com wrote:
 
 pc1dotnetpk:postgresql asif$ patch -p0 <
   ~/core/min_max_support_for_inet_datatypes/inet_agg_v4.patch
   can't find file to patch at input line 3
   Perhaps you used the wrong -p or --strip option?
   The text leading up to this was:
   --
   |*** a/src/backend/utils/adt/network.c
   |--- b/src/backend/utils/adt/network.c
   --

 You need to use patch -p1, to strip off the a/b prefix.


Thank you Abhijit, It worked.



 -- Abhijit



Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-07-07 Thread Asif Naeem
Hi Haribabu,

Thank you for sharing the patch. I have spent some time reviewing the
changes. Overall the patch looks good to me; make check and manual testing
run fine with it. There seem to be no related doc/sgml changes? The patch
adds network_smaller() and network_greater() functions, but in the PG
source code the general practice seems to be to use "smaller" and "larger"
as the related function name postfix, e.g. timestamp_smaller()/timestamp_larger(),
interval_smaller()/interval_larger(), cashsmaller()/cashlarger() etc. Thanks.

Regards,
Muhammad Asif Naeem


On Mon, Jul 7, 2014 at 1:56 PM, Asif Naeem anaeem...@gmail.com wrote:

 On Mon, Jun 30, 2014 at 4:45 PM, Abhijit Menon-Sen a...@2ndquadrant.com
 wrote:

 At 2014-06-30 16:35:45 +0500, anaeem...@gmail.com wrote:
 
  pc1dotnetpk:postgresql asif$ patch -p0 <
    ~/core/min_max_support_for_inet_datatypes/inet_agg_v4.patch
   can't find file to patch at input line 3
   Perhaps you used the wrong -p or --strip option?
   The text leading up to this was:
   --
   |*** a/src/backend/utils/adt/network.c
   |--- b/src/backend/utils/adt/network.c
   --

 You need to use patch -p1, to strip off the a/b prefix.


 Thank you Abhijit, It worked.



 -- Abhijit





Re: [HACKERS] pg_xlogdump --stats

2014-07-07 Thread Andres Freund
On 2014-07-04 18:59:07 +0530, Abhijit Menon-Sen wrote:
 At 2014-07-04 11:34:21 +0200, and...@2ndquadrant.com wrote:
 
  I think we're going to have to redefine the
  PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT callsite in configure.in to
  define INT64_MODIFIER='ll/l/I64D'
 
 I've attached a patch to do this, and also add INT64_MODIFIER to
 pg_config.h.in: 0001-modifier.diff
 
 I reran autoconf, and just for convenience I've attached the resulting
 changes to configure: 0002-configure.diff
 
 Then there are the rm_identify changes: 0003-rmid.diff
 
 Finally, the xlogdump patch using INT64_MODIFIER: 0004-xlogdump.diff
 
 I can confirm that this series applies in-order to master, and that the
 result builds cleanly (including after each patch) on my machine, and
 that the resulting pg_xlogdump works as expected.

 Two of the extra calls to psprintf in pg_xlogdump happen at most
 RM_MAX_ID*16 (i.e. O(record-types) not O(records)) times, and the other
 two happen just before exit. It would be easy to use a static buffer and
 snprintf, but I don't think it's worth doing in this case.

Agreed.

 diff --git a/config/c-library.m4 b/config/c-library.m4
 index 8f45010..4821a61 100644
 --- a/config/c-library.m4
 +++ b/config/c-library.m4
 @@ -221,22 +221,22 @@ HAVE_POSIX_SIGNALS=$pgac_cv_func_posix_signals
  AC_SUBST(HAVE_POSIX_SIGNALS)])# PGAC_FUNC_POSIX_SIGNALS
  
  
 -# PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT
 +# PGAC_FUNC_SNPRINTF_LONG_LONG_INT_MODIFIER
  # ---
 -# Determine which format snprintf uses for long long int.  We handle
 -# %lld, %qd, %I64d.  The result is in shell variable
 -# LONG_LONG_INT_FORMAT.
 +# Determine which length modifier snprintf uses for long long int.  We
 +# handle ll, q, and I64.  The result is in shell variable
 +# LONG_LONG_INT_MODIFIER.
  #
  # MinGW uses '%I64d', though gcc throws an warning with -Wall,
  # while '%lld' doesn't generate a warning, but doesn't work.
  #
 -AC_DEFUN([PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT],
 -[AC_MSG_CHECKING([snprintf format for long long int])
 -AC_CACHE_VAL(pgac_cv_snprintf_long_long_int_format,
 -[for pgac_format in '%lld' '%qd' '%I64d'; do
 +AC_DEFUN([PGAC_FUNC_SNPRINTF_LONG_LONG_INT_MODIFIER],
 +[AC_MSG_CHECKING([snprintf length modifier for long long int])
 +AC_CACHE_VAL(pgac_cv_snprintf_long_long_int_modifier,
 +[for pgac_modifier in 'll' 'q' 'I64'; do
  AC_TRY_RUN([#include <stdio.h>
  typedef long long int ac_int64;
 -#define INT64_FORMAT $pgac_format
 +#define INT64_FORMAT "%${pgac_modifier}d"

I'd rather not define INT64_FORMAT here.

 +INT64_FORMAT=\"%${LONG_LONG_INT_MODIFIER}d\"
 +UINT64_FORMAT=\"%${LONG_LONG_INT_MODIFIER}u\"
 +INT64_MODIFIER=\"$LONG_LONG_INT_MODIFIER\"
 +

  AC_DEFINE_UNQUOTED(INT64_FORMAT, $INT64_FORMAT,
 [Define to the appropriate snprintf format for 64-bit ints.])
  
  AC_DEFINE_UNQUOTED(UINT64_FORMAT, $UINT64_FORMAT,
 [Define to the appropriate snprintf format for unsigned 64-bit ints.])
  
 +AC_DEFINE_UNQUOTED(INT64_MODIFIER, $INT64_MODIFIER,
 +   [Define to the appropriate snprintf length modifier for 64-bit ints.])
 +

I'd suggest only defining INT64_MODIFIER here and building INT64_FORMAT
and UINT64_FORMAT on top of it, in c.h.
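Concretely, that suggestion amounts to something like the following
(a sketch; the hard-coded INT64_MODIFIER stands in for the value
configure would put in pg_config.h, so the example is self-contained):

#include <stdio.h>

#define INT64_MODIFIER "ll"     /* assumed: normally from pg_config.h */

/* Build the full format strings from the modifier alone, in c.h. */
#define INT64_FORMAT  "%" INT64_MODIFIER "d"
#define UINT64_FORMAT "%" INT64_MODIFIER "u"

int
main(void)
{
    long long v = 1234567890123LL;

    printf(INT64_FORMAT "\n", v);   /* prints 1234567890123 */
    return 0;
}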

 NOTE: I do not know what to do about pg_config.h.win32. If someone tells
 me what to do, I can submit another patch.

Which would also take care of pg_config.h.win32 - just define
INT64_MODIFIER in there instead of INT64_FORMAT/UINT64_FORMAT and you
should be good.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_receivexlog add synchronous mode

2014-07-07 Thread Fujii Masao
On Fri, Jul 4, 2014 at 7:45 PM,  furu...@pm.nttdata.co.jp wrote:
  Thanks for reviewing the patch!
 
  I think that this refactoring patch is useful for improving source
  code readability and making the future patches simpler, whether we
  adopt your patch or not. So, barring any objections, I'm thinking to
  commit this refactoring patch.

 Committed!
 This patch adds the --fsync-interval option.
 The interface of the --fsync-interval option follows pg_recvlogical.c.

 It does not decide whether to flush on a per-message basis;
 it decides at the time message receipt pauses.

 If you specify a --fsync-interval of zero, it flushes at the same timing
 as the walreceiver.
 If you do not specify --fsync-interval, it flushes only when switching
 WAL files, as in the past.

Thanks for revising the patch!

Could you update the status of this patch from Waiting on Author to
Needs Review when you post the revised version of the patch?

+How often should <application>pg_receivexlog</application> issue sync
+commands to ensure the received WAL file is safely
+flushed to disk without being asked by the server to do so.

"without being asked by the server to do so" implies that the server asks
pg_receivexlog to flush WAL files periodically?

 Specifying
+an interval of <literal>0</literal> together the consecutive data.

This text looks corrupted. What does this mean?

+Not specifying an interval disables issuing fsyncs altogether,
+while still reporting progress to the server.

No. Even when the option is not specified, WAL files should be flushed at
WAL file switch, like current pg_receivexlog does. If you want to disable
the flush completely, you can change the option so that it accepts -1 which
means to disable the flush, for example.

+printf(_("  -F  --fsync-interval=SECS\n"
+         "                         frequency of syncs to the output file (default: file close only)\n"));

It's better to use "transaction log files" rather than "output file" here.

"SECS" should be "INTERVAL" for the sake of consistency with --stat-interval's
help message?

+ * fsync_interval controls how often we flush to the received
+ * WAL file, in seconds.

"seconds" should be "milliseconds"?

The patch changes pg_receivexlog so that it keeps processing
continuous messages without calling stream_stop(). But
as I commented before, shouldn't stream_stop be called every time
a message is received? Otherwise the pg_basebackup background
WAL receiver might not be able to stop streaming at the proper point.

The flush interval is checked and WAL files are flushed only when
CopyStreamReceive() returns 0, i.e., there is no message available
and the timeout occurs. Why did you do that? I'm afraid that
CopyStreamReceive() always waits for at least one second before
WAL files are flushed even when --fsync-interval is set to 0.

Why don't you update output_last_status when WAL file is flushed
and closed?

Regards,

-- 
Fujii Masao




Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

2014-07-07 Thread Fujii Masao
On Mon, Jul 7, 2014 at 4:14 PM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 On 2014-07-04 22:59:15 +0900, MauMau wrote:
 My customer reported a strange connection hang problem.  He and I couldn't
 reproduce it.  I haven't been able to understand the cause, but I can think
 of one hypothesis.  Could you give me your opinions on whether my hypothesis
 is correct, and a direction on how to fix the problem?  I'm willing to
 submit a patch if necessary.

 The connection attempt is waiting for a reply from the standby.  This is
 strange, because we didn't anticipate that the connection establishment (and
 subsequent SELECT queries) would update something and write some WAL.  The
 doc says:

 http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION

 When requesting synchronous replication, each commit of a write transaction
 will wait until confirmation is received that the commit has been written to
 the transaction log on disk of both the primary and standby server.
 ...
 Read only transactions and transaction rollbacks need not wait for replies
 from standby servers. Subtransaction commits do not wait for responses from
 standby servers, only top-level commits.


 [Hypothesis]
 Why does the connection processing emit WAL?

 Probably, it did page-at-a-time vacuum during access to pg_database and
 pg_authid for client authentication.  src/backend/access/heap/README.HOT
 describes:

 [How to fix]
 Of course, adding -o '-c synchronous_commit=local' or -o '-c
 synchronous_standby_names=' to pg_ctl start in the recovery script would
 prevent the problem.

 But isn't there anything to fix in PostgreSQL?  I think the doc needs
 improvement so that users won't misunderstand that only write transactions
 would block at commit.

 I think we should rework RecordTransactionCommit() to only wait for the
 standby if `markXidCommitted' and not if `wrote_xlog'. There really
 isn't a reason to make a readonly transaction's commit wait just because
 it did some hot pruning.

Sounds like a good direction. One question: can RecordTransactionCommit()
safely avoid waiting not only for replication but also for the local WAL
flush in such read-only transaction cases?

Regards,

-- 
Fujii Masao




Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-07-07 Thread Rahila Syed
Thank you for review comments.

There are still numerous formatting changes required, e.g. spaces around
= and correct formatting of comments. And git diff --check still has
a few whitespace problems. I won't point these out one by one, but maybe
you should run pgindent

I will do this.

Could you look into his suggestions of other places to do the
allocation, please?

I will get back to you on this


Wouldn't it be better to set

bkpb->block_compression = compress_backup_block;

once earlier instead of setting it that way once and setting it to
BACKUP_BLOCK_COMPRESSION_OFF in two other places?
Yes.

If you're using VARATT_IS_COMPRESSED() to detect compression, don't you
need SET_VARSIZE_COMPRESSED() in CompressBackupBlock? pglz_compress()
does it for you, but the other two algorithms don't.
Yes, we need SET_VARSIZE_COMPRESSED. It is present in the wrappers around
snappy and LZ4, namely pg_snappy_compress and pg_LZ4_compress.

But now that you've added bkpb.block_compression, you should be able to
avoid VARATT_IS_COMPRESSED() altogether, unless I'm missing something.
What do you think?
You are right. It can be removed.


Thank you,



On Fri, Jul 4, 2014 at 9:35 PM, Abhijit Menon-Sen a...@2ndquadrant.com
wrote:

 At 2014-07-04 21:02:33 +0530, a...@2ndquadrant.com wrote:
 
   +/*
   + */
   +static const struct config_enum_entry
 backup_block_compression_options[] = {

 Oh, I forgot to mention that the configuration setting changes are also
 pending. I think we had a working consensus to use full_page_compression
 as the name of the GUC. As I understand it, that'll accept an algorithm
 name as an argument while we're still experimenting, but eventually once
 we select an algorithm, it'll become just a boolean (and then we don't
 need to put algorithm information into BkpBlock any more either).

 -- Abhijit



Re: [HACKERS] add line number as prompt option to psql

2014-07-07 Thread Jeevan Chalke
Hi,

Found two (A and B) issues with the latest patch:

A:
-- Set prompts
postgres=# \set PROMPT1 '%/[%l]%R%# '
postgres[1]=# \set PROMPT2 '%/[%l]%R%# '

postgres[1]=#
postgres[1]=# select
postgres[2]-# *
postgres[3]-# from
postgres[4]-# abc;
ERROR:  relation abc does not exist
LINE 4: abc;
^

Now I used \e to edit the source, deleted the last line (i.e. abc;),
returned to the prompt, landed at line 4, and typed abc;

postgres[1]=# \e
postgres[4]-# abc;
ERROR:  relation abc does not exist
LINE 5: abc;
^

In the above steps, the error message says LINE 5, whereas on the prompt
abc is at line 4.

postgres[1]=# select
*
from
abc;
ERROR:  relation abc does not exist
LINE 4: abc;
^

Here I again see the error at line 4. Something is fishy; please
investigate. It looks like a bug in the LINE numbering of the error
message, though I'm not sure. But it should be in sync with the prompt
line number feature.


B:
However, I see that you have removed the code changes related to INT_MAX.
Why?
I have set cur_line to INT_MAX - 2 and then observed that after 2 lines I
start getting negative numbers.
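For reference, the removed guard is what prevents the counter from running
past INT_MAX; incrementing a signed int past INT_MAX is undefined behavior
and in practice wraps negative, matching the symptom above. A minimal
sketch of such a guard (illustrative, not the patch's code):

#include <limits.h>
#include <stdio.h>

/* Wrap back to 1 just before the counter could overflow. */
static int
next_line_number(int cur_line)
{
    return (cur_line >= INT_MAX - 1) ? 1 : cur_line + 1;
}

int
main(void)
{
    printf("%d\n", next_line_number(5));            /* 6 */
    printf("%d\n", next_line_number(INT_MAX - 1));  /* wraps to 1 */
    return 0;
}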

Thanks


On Sun, Jul 6, 2014 at 10:48 PM, Sawada Masahiko sawada.m...@gmail.com
wrote:

 On Fri, Jun 20, 2014 at 7:17 PM, Jeevan Chalke
 jeevan.cha...@enterprisedb.com wrote:
  Hi Sawada Masahiko,
 
  I liked this feature. So I have reviewed it.
 
  Changes are straight forward and looks perfect.
  No issues found with make/make install/initdb/regression.
 
  However I would suggest removing the unnecessary braces at the if, as we
  have only one statement in it.
 
  if (++cur_line >= INT_MAX)
  {
  cur_line = 1;
  }
 
  Also following looks wrong:
 
  postgres[1]=# select
  postgres[2]-# *
  postgres[3]-# from
  postgres[4]-# tab;
   a
  ---
  (0 rows)
 
  postgres[1]=# select
  *
  from
  tab
  postgres[2]-# where t > 10;
  ERROR:  column t does not exist
  LINE 5: where t > 10;
^
 
  Line number in ERROR is 5 which is correct.
  But line number in psql prompt is wrong.
 
  To get first 4 lines I have simply used up arrow followed by an enter for
  which I was expecting 5 in psql prompt.
  But NO it was 2 which is certainly wrong.
 
  Need to handle above carefully.
 
  Thanks
 
 
  On Thu, Jun 12, 2014 at 10:46 PM, Sawada Masahiko sawada.m...@gmail.com
 
  wrote:
 
  Hi all,
 
  The attached IWP patch is one prompt option for psql, which shows
  current line number.
  If the user made syntax error with too long SQL then psql outputs
  message as following.
 
  ERROR:  syntax error at or near "a"
  LINE 250: hoge
  ^
  psql tells me where the syntax error occurred, but that is not helpful
  when the SQL is too long.
  We can write the SQL with the \e (editor) command (e.g., emacs), and
  then we can see line numbers,
  but it would make the terminal log dirty. It will make analyzing the
  log difficult after an error has occurred.
  (I think that we usually use copy & paste.)
 
  After applying this patch, we will be able to use the %l option as a
  prompt option.
 
  e.g.,
  $ cat ~/.psqlrc
  \set PROMPT2 '%/[%l]%R%# '
  \set PROMPT1 '%/[%l]%R%# '
  $ psql -d postgres
  postgres[1]=# select
  postgres[2]-# *
  postgres[3]-# from
  postgres[4]-# hoge;
 
  The past discussion is following.
 
  
 http://www.postgresql.org/message-id/cafj8prc1rupk6+cha1jpxph3us_zipabdovmceex4wpbp2k...@mail.gmail.com
 
 
  Please give me feedback.
 
  Regards,
 
  ---
  Sawada Masahiko
 
 
 
 
 
 
  --
  Jeevan B Chalke
  Principal Software Engineer, Product Development
  EnterpriseDB Corporation
  The Enterprise PostgreSQL Company
 
  Phone: +91 20 30589500
 
  Website: www.enterprisedb.com
  EnterpriseDB Blog: http://blogs.enterprisedb.com/
  Follow us on Twitter: http://www.twitter.com/enterprisedb
 

 Thank you for reviewing the patch, and sorry for the late response.

 I have updated this patch, and attached it.

  postgres[1]=# select
  *
  from
  tab
  postgres[2]-# where t > 10;
  ERROR:  column t does not exist
  LINE 5: where t > 10;
 Attached patch can handle this case.

 Please give me feedback.

 Regards,

 ---
 Sawada Masahiko




-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: 

Re: [HACKERS] [bug fix or improvement?] Correctly place DLLs for ECPG apps in bin folder

2014-07-07 Thread Asif Naeem
Hi MauMau,

Other than my previous comments, the patch looks good to me. Thanks.

Regards,
Muhammad Asif Naeem


On Wed, Feb 26, 2014 at 2:14 AM, Peter Eisentraut pete...@gmx.net wrote:

 You should be able to do this without specifically referencing the names
 libpq or ecpg.  Look into the Makefile, if it defines
 SO_MAJOR_VERSION, then it's a library you are concerned with, and then
 do the copying or moving.



Re: [HACKERS] tab completion for setting search_path

2014-07-07 Thread Fujii Masao
On Tue, Jun 24, 2014 at 11:57 AM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Mon, Jun 23, 2014 at 6:25 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Mon, Jun 23, 2014 at 3:53 PM, Andres Freund and...@2ndquadrant.com
 wrote:
 On 2014-06-23 13:10:34 -0400, Robert Haas wrote:
 On Mon, Jun 23, 2014 at 9:10 AM, Kevin Grittner kgri...@ymail.com
 wrote:
  I would be for excluding the pg_toast, pg_toast_temp_n, and
  pg_temp_n schemas, and including public and pg_catalog.

 +1.

 Jeff, are you willing to update the patch that way? Seems we have
 something several people can live with ;)

 I was hoping not to add a third set of filters (separate from the ones
 already implicit in either \dn or \dnS), but that's OK; as long as I get a
 doghouse I won't worry much about the color.

 I've included information_schema as well, in analogy to the inclusion of
 pg_catalog.

Is there any blocker on this patch? The patch looks good to me.

Regards,

-- 
Fujii Masao




Re: [HACKERS] Allowing join removals for more join types

2014-07-07 Thread David Rowley
On Mon, Jul 7, 2014 at 4:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 David Rowley dgrow...@gmail.com writes:
  On 6 July 2014 03:20, Tom Lane t...@sss.pgh.pa.us wrote:
  Just to note that I've started looking at this, and I've detected a
  rather significant omission: there's no check that the join operator
  has anything to do with the subquery's grouping operator.

  hmm, good point. If I understand this correctly we can just ensure that
  the same operator is used for both the grouping and the join condition.

 Well, that's sort of the zero-order solution, but it doesn't work if the
 join operators are cross-type.

 I poked around to see if we didn't have some code already for that, and
 soon found that not only do we have such code (equality_ops_are_compatible)
 but actually almost this entire patch duplicates logic that already exists
 in optimizer/util/pathnode.c, to wit create_unique_path's subroutines
 query_is_distinct_for et al.  So I'm thinking what this needs to turn into
 is an exercise in refactoring to allow that logic to be used for both
 purposes.


Well, it seems that might just reduce the patch size a little!
I currently have this half hacked up to use query_is_distinct_for, but I
see there's no code that allows Consts to exist in the join condition. I
had allowed for this in groupinglist_is_unique_on_restrictinfo() and I
tested that it worked in a regression test (which now fails). I think to
fix this, all it would take would be to modify query_is_distinct_for to
take a list of Nodes rather than a list of column numbers, then just add
some logic that skips an entry if it's a Const and checks it as it does
now if it's a Var. Would you see a change of this kind as a valid
refactor for this patch?

 I notice that create_unique_path is not paying attention to the question
 of whether the subselect's tlist contains SRFs or volatile functions.
 It's possible that that's a pre-existing bug.


*shrug*, perhaps the logic for that is best moved into
query_is_distinct_for then? It might save a bug in the future too that way.

Regards

David Rowley


Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-07-07 Thread Robert Haas
On Mon, May 26, 2014 at 12:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Albe Laurenz laurenz.a...@wien.gv.at writes:
 In addition to data type mapping questions (which David already raised)
 I have one problem when I think of the Oracle FDW:

 Oracle follows the SQL standard in folding table names to upper case.
 So this would normally result in a lot of PostgreSQL foreign tables
 with upper case names, which would be odd and unpleasant.

 I cannot see a way out of that, but I thought I should mention it.

 It seems like it would often be desirable for the Oracle FDW to smash
 all-upper-case names to all-lower-case while importing, so that no quoting
 is needed on either side.  I doubt though that this is so desirable that
 it should happen unconditionally.

 Between this and the type-mapping questions, it seems likely that
 we're going to need a way for IMPORT FOREIGN SCHEMA to accept
 user-supplied control options, which would in general be specific
 to the FDW being used.  (Another thing the SQL committee failed to
 think about.)

Is this part of the SQL standard?  What is it defined to do about
non-table objects?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-07-07 Thread Ronan Dunklau
On Monday, 7 July 2014 at 07:58:33, Robert Haas wrote:
 On Mon, May 26, 2014 at 12:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Albe Laurenz laurenz.a...@wien.gv.at writes:
  In addition to data type mapping questions (which David already raised)
  I have one problem when I think of the Oracle FDW:
  
  Oracle follows the SQL standard in folding table names to upper case.
  So this would normally result in a lot of PostgreSQL foreign tables
  with upper case names, which would be odd and unpleasant.
  
  I cannot see a way out of that, but I thought I should mention it.
  
  It seems like it would often be desirable for the Oracle FDW to smash
  all-upper-case names to all-lower-case while importing, so that no quoting
  is needed on either side.  I doubt though that this is so desirable that
  it should happen unconditionally.
  
  Between this and the type-mapping questions, it seems likely that
  we're going to need a way for IMPORT FOREIGN SCHEMA to accept
  user-supplied control options, which would in general be specific
  to the FDW being used.  (Another thing the SQL committee failed to
  think about.)
 
 Is this part of the SQL standard?  What is it defined to do about
 non-table objects?

The OPTIONS clause is not part of the SQL Standard.

Regarding non-table objects, the standard only talks about tables, and nothing 
else. 

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org



Re: [HACKERS] RETURNING PRIMARY KEY syntax extension

2014-07-07 Thread Robert Haas
On Fri, Jul 4, 2014 at 12:36 AM, Rushabh Lathia
rushabh.lat...@gmail.com wrote:
 On Fri, Jul 4, 2014 at 7:37 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Tom Dunstan pg...@tomd.cc writes:
  On 4 July 2014 00:07, Tom Lane t...@sss.pgh.pa.us wrote:
  It sounds to me like designing this for JDBC's getGeneratedKeys method
  is a mistake.  There isn't going to be any way that the driver can
  support that without having looked at the table's metadata for itself,
  and if it's going to do that then it doesn't need a crutch that only
  partially solves the problem anyhow.

  Sure it can - it can append RETURNING PRIMARY KEY and hand back a
  ResultSet from whatever was returned. It's CURRENTLY doing that, but
  it's appending RETURNING * and leaving it up to the caller of
  getGeneratedKeys() to work out which columns the caller is interested in.

 I might be mistaken, but as I read the spec for getGeneratedKeys, whether
 or not a column is in a primary key has got nothing to do with whether
 that column should be returned.  It looks to me like what you're supposed
 to return is columns with computed default values, eg, serial columns.
 (Whether we should define it as being *exactly* columns created by the
 SERIAL macro is an interesting question, but for sure those ought to be
 included.)  Now, the pkey might be a serial column ... but it equally
 well might not be; it might not have any default expression at all.
 And there certainly could be serial columns that weren't in the pkey.

 100% agree with Tom.

Well, we could have a RETURNING GENERATED COLUMNS construct, or
something like that.  I can certainly see the usefulness of such a
thing.

As a general note not necessarily related to this specific issue, I
think we should be careful not to lose sight of the point of this
feature, which is to allow pgsql-jdbc to more closely adhere to the
JDBC specification.  While it's great if this feature has general
utility, if it doesn't help pgsql-jdbc, then it fails to achieve the
author's original intention.  We need to make sure we're not throwing
up too many obstacles in the way of better driver compliance if we
want to have good drivers.

As a code-level comment, I am not too fond of adding yet another value
for IndexAttrBitmapKind; every value we compute in
RelationGetIndexAttrBitmap adds overhead for everyone, even people not
using the feature.  Admittedly, it's only paid once per relcache load,
but does map_primary_key_to_list() really need this information cached,
or can it just look through the index list for the primary key, and then
work directly from that?

Also, map_primary_key_to_list() tacitly assumes that an auto-updatable
view has only 1 from-list item.  That seems like a good thing to check
with an Assert(), just in case we should support some other case in
the future.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Use unique index for longer pathkeys.

2014-07-07 Thread Abhijit Menon-Sen
Hi.

I took a quick look at this patch, more or less because nobody else did.

 During the last CF, I proposed matching long pathkeys against index
 columns during creation of index paths. This worked fine, but it was
 difficult to make sure that all side-effects were eliminated. Finally,
 Tom Lane suggested truncating the pathkeys during generation of the
 pathkeys themselves. Hence this patch.

I found your older patch quite straightforward to understand, but the
new one much more difficult to follow (but that's not saying much, I
am not very familiar with the planner code in general).

Do you have any references to the discussion about the side-effects that
needed to be eliminated with the earlier patch?

-- Abhijit




Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-07-07 Thread Robert Haas
On Fri, Jul 4, 2014 at 1:15 AM, Dilip kumar dilip.ku...@huawei.com wrote:
 In attached patch, I have moved pgpipe, piperead functions to src/port/pipe.c

If we want to consider proceeding with this approach, you should
probably separate this into a refactoring patch that doesn't do
anything but move code around and a feature patch that applies on top
of it.

(As to whether this is the right approach, I'm not sure.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Cluster name in ps output

2014-07-07 Thread Robert Haas
On Fri, Jul 4, 2014 at 10:33 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Vik Fearing vik.fear...@dalibo.com writes:
 You mean that if synchronous_standby_names is empty it automatically
 should be set to cluster_name?

 No, I mean that synchronous_standby_names should look at cluster_name
 first, and if it's not set then unfortunately look at application_name
 for backward compatibility.

 I think you're failing to explain yourself very well.  What you really
 mean is that we should use cluster_name not application_name to determine
 the name that a standby server reports to the master.

 There's something to that, perhaps, but I think that the mechanism we use
 for doing the reporting involves the application_name parameter passed
 through libpq.  It might be a bit painful to change that, and I'm not
 entirely sure it'd be less confusing.

I actually prefer it the way it is now, I think.  Arguably,
application_name is not quite the right thing for identifying a
synchronous standby, because it's intended to answer the question
"What *class* of connection is this?" rather than "Which *precise*
connection is this?".  But I don't think there's anything especially
wrong with having a class that has only 1 member when synchronous
replication is in use; indeed, in some ways it seems quite natural.
That connection is after all unique.

OTOH, AIUI, cluster_name is all about what shows up in the ps output,
and is intended to distinguish multiple clusters running on the same
server.  If you're not doing that, you likely want cluster_name to be
empty, but you might still want to use synchronous replication.  If
you are doing it, you may well want to set it to a value that
distinguishes the server only from others running on the same box,
like maybe the version number or port -- whereas for matching against
synchronous_standby_names, we need a value that's meaningful across
all related servers, but not necessarily distinguishing from other
stuff on the same server.
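
To make that concrete, a minimal sketch of how the name matching works
today (host and names invented for illustration): the standby advertises
a name via its connection string, and the master lists that name:

    # standby: recovery.conf
    primary_conninfo = 'host=master port=5432 application_name=standby1'

    # master: postgresql.conf
    synchronous_standby_names = 'standby1'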

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

2014-07-07 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I think we should rework RecordTransactionCommit() to only wait for the
 standby if `markXidCommitted' and not if `wrote_xlog'. There really
 isn't a reason to make a readonly transaction's commit wait just because
 it did some hot pruning.

Well, see the comment that explains why the logic is like this now:

 * If we didn't create XLOG entries, we're done here; otherwise we
 * should flush those entries the same as a commit record.  (An
 * example of a possible record that wouldn't cause an XID to be
 * assigned is a sequence advance record due to nextval() --- we want
 * to flush that to disk before reporting commit.)

I agree that HOT pruning isn't a reason to make a commit wait, but
nextval() is.
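
To illustrate the nextval() case with a minimal sketch (sequence name
invented; assumes synchronous_commit = on and no live synchronous standby):

    BEGIN;
    SELECT nextval('some_seq');  -- writes WAL but assigns no XID
    COMMIT;                      -- can block waiting for the absent standby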

We could perhaps add more flags that would keep track of which sorts of
xlog entries justify a wait at commit, but TBH I'm skeptical of the entire
proposition.  Having synchronous replication on with no live slave *will*
result in arbitrary hangs, and the argument that this particular case
should be exempt seems a bit thin to me.  The sooner the user realizes
he's got a problem, the better.  If read-only transactions don't show a
problem, the user might not realize he's got one until he starts to wonder
why autovac/autoanalyze aren't working.

I think a more useful line of thought would be to see if we can't complain
more loudly when we have no synchronous standby.  Perhaps a "WARNING:
waiting forever for lack of a synchronous standby" could be emitted when
a transaction starts to wait.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extending constraint exclusion for implied constraints/conditions

2014-07-07 Thread Tom Lane
Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes:
 Right now, constraint exclusion code looks only at the provided conditions.
 If we want to avoid the table scan based on constraints in the above example,
 it will need to look at the implied conditions as well. E.g. val2 < 30 AND
 val = val2 => val < 30. Then the constraint exclusion can see that val < 30
 AND val > 30 are contradictory and infer that the result is going to be
 empty. We will need to add information about the transitive inferences
 between operators. Can we do that in PostgreSQL? Will the benefits be worth
 the code that needs to be added?

I doubt it.  The extra code isn't the problem so much, it's the extra
planning cycles spent trying to make proofs that will usually fail.
What you propose will create a combinatorial explosion in the number
of proof paths to be considered.

 I can see some more benefits. We can use it to eliminate joins where the
 constraints on joining relations may cause the join to be empty e.g.

... and applying constraint exclusion to join relations would make that
problem even worse.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pg_upgrade and toast tables bug discovered

2014-07-07 Thread Robert Haas
On Fri, Jul 4, 2014 at 11:12 PM, Bruce Momjian br...@momjian.us wrote:
 On Fri, Jul  4, 2014 at 12:01:37AM -0400, Bruce Momjian wrote:
  The most robust, but not trivial, approach seems to be to prevent toast
  table creation if there wasn't a set_next_toast_pg_class_oid(). Then,
  after all relations are created, iterate over all pg_class entries that
  possibly need toast tables and recheck if they now might need one.

 Wow, that is going to be kind of odd in that there really isn't a good
 way to create toast tables except perhaps add a dummy TEXT column and
 remove it.  There also isn't an easy way to not create a toast table,
 but also find out that one was needed.  I suppose we would have to
 insert some dummy value in the toast pg_class column and come back later
 to clean it up.

 I am wondering what the probability is of having a table that didn't need a
 toast table in the old cluster, but needed one in the new cluster, and
 there being an oid collision.

 I think the big question is whether we want to go down that route.

 Here is an updated patch.  It was a little tricky because if the
 mismatched non-toast table is after the last old relation, you need to
 test differently and emit a different error message as there is no old
 relation to complain about.

 As far as the reusing of oids, we don't set the oid counter until after
 the restore, so any new unmatched toast table would be given a very low
 oid.  Since we restore in oid order, for an oid to be assigned that was
 used in the old cluster, it would have to be a very early relation, so I
 think that reduces the odds considerably.  I am not inclined to do
 anything more to avoid this until I see an actual error report ---
 trying to address it might be invasive and introduce new errors.

That sounds pretty shaky from where I sit.  I wonder if the
pg_upgrade_support module should have a function
create_toast_table_if_needed(regclass) or something like that.  Or
maybe just create_any_missing_toast_tables(), iterating over all
relations.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

2014-07-07 Thread Andres Freund
On 2014-07-07 09:57:20 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I think we should rework RecordTransactionCommit() to only wait for the
  standby if `markXidCommitted' and not if `wrote_xlog'. There really
  isn't a reason to make a readonly transaction's commit wait just because
  it did some hot pruning.
 
 Well, see the comment that explains why the logic is like this now:
 
  * If we didn't create XLOG entries, we're done here; otherwise we
  * should flush those entries the same as a commit record.  (An
  * example of a possible record that wouldn't cause an XID to be
  * assigned is a sequence advance record due to nextval() --- we want
  * to flush that to disk before reporting commit.)

I think we should 'simply' make sequences assign a toplevel xid - then
we can get rid of that special case in RecordTransactionCommit(). And I
think the performance benefit of not having to wait on XLogFlush() for
readonly xacts due to hot prunes far outweighs the decrease due to the
xid assignment/commit record.  I don't think that nextval()s are called
overly much without a later xid assigning statement.

 I agree that HOT pruning isn't a reason to make a commit wait, but
 nextval() is.

Agreed.

 We could perhaps add more flags that would keep track of which sorts of
 xlog entries justify a wait at commit, but TBH I'm skeptical of the entire
 proposition.  Having synchronous replication on with no live slave *will*
 result in arbitrary hangs, and the argument that this particular case
 should be exempt seems a bit thin to me.  The sooner the user realizes
 he's got a problem, the better.  If read-only transactions don't show a
 problem, the user might not realize he's got one until he starts to wonder
 why autovac/autoanalyze aren't working.

Well, the user might just want to log in to diagnose the problem. If he
can't even login to see pg_stat_replication it's a pretty screwed up
situation.

 I think a more useful line of thought would be to see if we can't complain
 more loudly when we have no synchronous standby.  Perhaps a "WARNING:
 waiting forever for lack of a synchronous standby" could be emitted when
 a transaction starts to wait.

In the OP's case the session wasn't even started - so proper feedback
isn't that easy...
We could special-case that by forcing synchronous_commit=off until the
session has started properly.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

2014-07-07 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-07-07 09:57:20 -0400, Tom Lane wrote:
 Well, see the comment that explains why the logic is like this now:

 I think we should 'simply' make sequences assign a toplevel xid - then
 we can get rid of that special case in RecordTransactionCommit(). And I
 think the performance benefit of not having to wait on XLogFlush() for
 readonly xacts due to hot prunes far outweighs the decrease due to the
 xid assignment/commit record.  I don't think that nextval()s are called
 overly much without a later xid assigning statement.

Yeah, that could well be true.  I'm not sure if there are any other cases
where we have non-xid-assigning operations that are considered part of
what has to be flushed before reporting commit; if there are not, I'd
be okay with changing nextval() this way.

 I think a more useful line of thought would be to see if we can't complain
 more loudly when we have no synchronous standby.  Perhaps a "WARNING:
 waiting forever for lack of a synchronous standby" could be emitted when
 a transaction starts to wait.

 In the OP's case the session wasn't even started - so proper feedback
 isn't that easy...

Perhaps I'm wrong, but I think a WARNING emitted here would be seen in
psql even though we're still in InitPostgres.  If it isn't, we have a
problem there anyhow, IMO.

 We could special-case that by forcing synchronous_commit=off until the
 session has started properly.

Ugh.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

2014-07-07 Thread Fujii Masao
On Tue, Jul 8, 2014 at 1:06 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andres Freund and...@2ndquadrant.com writes:
 On 2014-07-07 09:57:20 -0400, Tom Lane wrote:
 Well, see the comment that explains why the logic is like this now:

 I think we should 'simply' make sequences assign a toplevel xid - then
 we can get rid of that special case in RecordTransactionCommit(). And I
 think the performance benefit of not having to wait on XLogFlush() for
 readonly xacts due to hot prunes far outweighs the decrease due to the
 xid assignment/commit record.  I don't think that nextval()s are called
 overly much without a later xid assigning statement.

 Yeah, that could well be true.  I'm not sure if there are any other cases
 where we have non-xid-assigning operations that are considered part of
 what has to be flushed before reporting commit;

Maybe pg_switch_xlog().

 if there are not, I'd
 be okay with changing nextval() this way.

+1

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allowing join removals for more join types

2014-07-07 Thread Tom Lane
David Rowley dgrowle...@gmail.com writes:
 On Mon, Jul 7, 2014 at 4:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I poked around to see if we didn't have some code already for that, and
 soon found that not only do we have such code (equality_ops_are_compatible)
 but actually almost this entire patch duplicates logic that already exists
 in optimizer/util/pathnode.c, to wit create_unique_path's subroutines
 query_is_distinct_for et al.  So I'm thinking what this needs to turn into
 is an exercise in refactoring to allow that logic to be used for both
 purposes.

 Well, it seems that might just reduce the patch size a little!
 I currently have this half hacked up to use query_is_distinct_for, but I
 see there's no code that allows Consts to exist in the join condition. I
 had allowed for this in groupinglist_is_unique_on_restrictinfo() and I
 tested that it worked in a regression test (which now fails). I think to
 fix this, all it would take would be to modify query_is_distinct_for to
 take a list of Nodes rather than a list of column numbers, then just add
 some logic that skips if it's a Const and checks it as it does now if it's
 a Var. Would you see a change of this kind as a valid refactor for this patch?

I'm a bit skeptical as to whether testing for that case is actually worth
any extra complexity.  Do you have a compelling use-case?  But anyway,
if we do want to allow it, why does it take any more than adding a check
for Consts to the loops in query_is_distinct_for?  It's the targetlist
entries where we'd want to allow Consts, not the join conditions.
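
To sketch the kind of case in question (schema invented): here the subquery
is provably distinct on (x), and c is a Const in its targetlist, so
distinctness over (x, c) follows trivially:

    SELECT t1.*
      FROM t1
      LEFT JOIN (SELECT x, 42 AS c FROM t2 GROUP BY x) s
        ON t1.x = s.x AND t1.y = s.c;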

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cluster name in ps output

2014-07-07 Thread Alvaro Herrera
Tom Lane wrote:

 Also, -1 for adding another log_line_prefix escape.  If you're routing
 multiple clusters logging to the same place (which is already a bit
 unlikely IMO), you can put distinguishing strings in log_line_prefix
 already.  And it's not like we've got an infinite supply of letters
 for those escapes.

FWIW if we find ourselves short on log_line_prefix letters, we can
always invent a more verbose notation -- for instance we could have an
escape for each GUC var, such as %{config:port} for the port number, and so on.
Probably this will be mostly useless for most GUC params, but it might
come in handy for a few of them.  And of course we could have more
prefixes apart from config:.

Not that this affects the current patch in any way.
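
Purely for illustration, such a hypothetical escape might be written as:

    log_line_prefix = '%m [%p] %{config:port} '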

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

2014-07-07 Thread Andres Freund
On 2014-07-07 12:06:14 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-07-07 09:57:20 -0400, Tom Lane wrote:
  Well, see the comment that explains why the logic is like this now:
 
  I think we should 'simply' make sequences assign a toplevel xid - then
  we can get rid of that special case in RecordTransactionCommit(). And I
  think the performance benefit of not having to wait on XLogFlush() for
  readonly xacts due to hot prunes far outweighs the decrease due to the
  xid assignment/commit record.  I don't think that nextval()s are called
  overly much without a later xid assigning statement.
 
 Yeah, that could well be true.  I'm not sure if there are any other cases
 where we have non-xid-assigning operations that are considered part of
 what has to be flushed before reporting commit; if there are not, I'd
 be okay with changing nextval() this way.

I'm not aware of any offhand, but I think to actually change it someone
would have to iterate over all WAL record types to make sure.

  I think a more useful line of thought would be to see if we can't complain
  more loudly when we have no synchronous standby.  Perhaps a "WARNING:
  waiting forever for lack of a synchronous standby" could be emitted when
  a transaction starts to wait.
 
  In the OP's case the session wasn't even started - so proper feedback
  isn't that easy...
 
 Perhaps I'm wrong, but I think a WARNING emitted here would be seen in
 psql even though we're still in InitPostgres.

Yes, it is visible.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pg_upgrade and toast tables bug discovered

2014-07-07 Thread Bruce Momjian
On Fri, Jul  4, 2014 at 11:12:58PM -0400, Bruce Momjian wrote:
 On Fri, Jul  4, 2014 at 12:01:37AM -0400, Bruce Momjian wrote:
   The most robust, but not trivial, approach seems to be to prevent toast
   table creation if there wasn't a set_next_toast_pg_class_oid(). Then,
   after all relations are created, iterate over all pg_class entries that
   possibly need toast tables and recheck if they now might need one.
  
  Wow, that is going to be kind of odd in that there really isn't a good
  way to create toast tables except perhaps add a dummy TEXT column and
  remove it.  There also isn't an easy way to not create a toast table,
  but also find out that one was needed.  I suppose we would have to
  insert some dummy value in the toast pg_class column and come back later
  to clean it up.
  
  I am wondering what the probability is of having a table that didn't need a
  toast table in the old cluster, but needed one in the new cluster, and
  there being an oid collision.
  
  I think the big question is whether we want to go down that route.
 
 Here is an updated patch.  It was a little tricky because if the
 mismatched non-toast table is after the last old relation, you need to
 test differently and emit a different error message as there is no old
 relation to complain about.
 
 As far as the reusing of oids, we don't set the oid counter until after
 the restore, so any new unmatched toast table would be given a very low
 oid.  Since we restore in oid order, for an oid to be assigned that was
 used in the old cluster, it would have to be a very early relation, so I
 think that reduces the odds considerably.  I am not inclined to do
 anything more to avoid this until I see an actual error report ---
 trying to address it might be invasive and introduce new errors.

Patch applied back through 9.2.  9.1 and earlier had code that was
different enough that I thought it would cause too many problems, and I
doubt many users are doing major upgrades to 9.1, and the bug is rare.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pg_upgrade and toast tables bug discovered

2014-07-07 Thread Bruce Momjian
On Mon, Jul  7, 2014 at 11:24:51AM -0400, Robert Haas wrote:
  As far as the reusing of oids, we don't set the oid counter until after
  the restore, so any new unmatched toast table would be given a very low
  oid.  Since we restore in oid order, for an oid to be assigned that was
  used in the old cluster, it would have to be a very early relation, so I
  think that reduces the odds considerably.  I am not inclined to do
  anything more to avoid this until I see an actual error report ---
  trying to address it might be invasive and introduce new errors.
 
 That sounds pretty shaky from where I sit.  I wonder if the
 pg_upgrade_support module should have a function
 create_toast_table_if_needed(regclass) or something like that.  Or
 maybe just create_any_missing_toast_tables(), iterating over all
 relations.

Uh, I guess we could write some code that iterates over all tables and
finds the tables that should have TOAST tables, but don't (because
binary-upgrade backend mode suppressed their creation), and adds them.
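
For reference, a minimal illustration of the rule such code would have to
reproduce (a varlena column is what makes a toast table necessary):

    CREATE TABLE narrow (a int, b int);  -- no toast table created
    CREATE TABLE wide (a int, b text);   -- gets a pg_toast.pg_toast_<oid>

    SELECT relname, reltoastrelid FROM pg_class
     WHERE relname IN ('narrow', 'wide');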

However, that would be a lot of code and might be risky to backpatch. 
The error is so rare I am not sure it is worth it.  I tried to create
the failure case and it was very difficult, requiring me to create the
problem table first, then some dummy stuff to get the offsets right so
the oid would collide.

Based on the rareness of the bug, I am not sure it is worth it, but if
it is, it would be something only for 9.4 (maybe) and 9.5, not something
to backpatch.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RLS Design

2014-07-07 Thread Robert Haas
On Thu, Jul 3, 2014 at 1:14 AM, Stephen Frost sfr...@snowman.net wrote:
 Alright, apologies for it being a bit later than intended, but here's
 what I've come up with thus far.

 -- policies defined at a table scope
 -- allows using the same policy name for different tables
 -- with quals appropriate for each table
 ALTER TABLE t1 ADD POLICY p1 USING p1_quals;
 ALTER TABLE t1 ADD POLICY p2 USING p2_quals;

 -- used to drop a policy definition from a table
 ALTER TABLE t1 DROP POLICY p1;

 -- cascade required when references exist for the policy
 -- from roles
 ALTER TABLE t1 DROP POLICY p1 CASCADE;

 ALTER TABLE t1 ALTER POLICY p1 USING new_quals;

 -- Controls if any RLS is applied to this table or not
 -- If enabled, all users must access through some policy
 ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;

 -- Associates roles to policies
 ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING p1;
 ALTER TABLE table_name REVOKE ROW ACCESS FROM role_name USING p1;

If you're going to have predicates be table-level and access grants be
table-level, then what's the value in having policies?  You could just
do:

ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING quals;

As I see it, the only value in having policies as separate objects is
that you can then, by granting access to the policy, give a particular
user a bundle of rights rather than having to grant each right
individually.  But with this design, you've got to create the policy,
then add the quals to it for each table, and then you still have to
give access individually for every (role, table) combination, so what
value is the policy object itself providing?
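
To make the comparison concrete, under the proposed syntax (quals invented
for illustration) the two approaches would look roughly like:

    -- with a policy object
    ALTER TABLE t1 ADD POLICY p1 USING (owner = current_user);
    ALTER TABLE t1 GRANT ROW ACCESS TO alice USING p1;

    -- collapsing the policy away entirely
    ALTER TABLE t1 GRANT ROW ACCESS TO alice USING (owner = current_user);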

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extending constraint exclusion for implied constraints/conditions

2014-07-07 Thread Greg Stark
On Mon, Jul 7, 2014 at 3:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I doubt it.  The extra code isn't the problem so much, it's the extra
 planning cycles spent trying to make proofs that will usually fail.
 What you propose will create a combinatorial explosion in the number
 of proof paths to be considered.

Well, not necessarily. You only need to consider constraints on
matching columns and only when there's a join column on those columns.
So you could imagine, for example, sorting all the constraints by the
eclass for the non-const side of their expression, then going through
them linearly to see where you have multiple constraints on the same
eclass.

For what it's worth, I think there is a case where this is a common
optimization: when you have multiple tables partitioned the same way,
you ideally want to be able to turn a join of multiple appends into an
append of multiple joins. This is even more important when it comes to
a parallelizing executor, since it lets you do the joins in parallel.

However to get from here to there I guess you would need to turn the
join of the appends into NxM joins of every pair of subscans and then
figure out which ones to exclude. That would be pretty nuts. To do it
reasonably we probably need the partitioning infrastructure we've been
talking about where Postgres would know what the partitioning key is
and the order and range of the partitions so it can directly generate
the matching subjoins in less than n^2 time.
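
As a rough sketch of the target transformation (partition names invented),
given t1 and t2 partitioned identically on val:

    SELECT * FROM t1 JOIN t2 ON t1.val = t2.val;

    -- today:    Join(Append(t1_p1, t1_p2), Append(t2_p1, t2_p2))
    -- desired:  Append(Join(t1_p1, t2_p1), Join(t1_p2, t2_p2))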

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlogdump --stats

2014-07-07 Thread Abhijit Menon-Sen
At 2014-07-07 09:48:44 +0200, and...@2ndquadrant.com wrote:

 I'd suggest only defining INT64_MODIFIER here and building INT64_FORMAT,
 UINT64_FORMAT on top, in c.h.

Oh, I see. I'm sorry, I misread your earlier suggestion. Regenerated
patches attached. Is this what you had in mind?

-- Abhijit
diff --git a/src/include/c.h b/src/include/c.h
index a48a57a..999f297 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -864,6 +864,9 @@ typedef NameData *Name;
  * 
  */
 
+#define INT64_FORMAT "%" INT64_MODIFIER "d"
+#define UINT64_FORMAT "%" INT64_MODIFIER "u"
+
 /* msb for char */
 #define HIGHBIT	(0x80)
 #define IS_HIGHBIT_SET(ch)		((unsigned char)(ch) & HIGHBIT)

diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 2a40d61..c2e01fd 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -650,8 +650,8 @@
 /* Define to 1 if your compiler understands __VA_ARGS__ in macros. */
 #undef HAVE__VA_ARGS
 
-/* Define to the appropriate snprintf format for 64-bit ints. */
-#undef INT64_FORMAT
+/* Define to the appropriate snprintf length modifier for 64-bit ints. */
+#undef INT64_MODIFIER
 
 /* Define to 1 if `locale_t' requires xlocale.h. */
 #undef LOCALE_T_IN_XLOCALE
@@ -741,9 +741,6 @@
 /* Define to 1 if your sys/time.h declares `struct tm'. */
 #undef TM_IN_SYS_TIME
 
-/* Define to the appropriate snprintf format for unsigned 64-bit ints. */
-#undef UINT64_FORMAT
-
 /* Define to 1 to build with assertion checks. (--enable-cassert) */
 #undef USE_ASSERT_CHECKING
 
diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.win32
index 1c9cd82..a238a72 100644
--- a/src/include/pg_config.h.win32
+++ b/src/include/pg_config.h.win32
@@ -529,8 +529,8 @@
 /* Define to 1 if your compiler understands __VA_ARGS__ in macros. */
 #define HAVE__VA_ARGS 1
 
-/* Define to the appropriate snprintf format for 64-bit ints, if any. */
-#define INT64_FORMAT "%lld"
+/* Define to the appropriate snprintf length modifier for 64-bit ints. */
+#undef INT64_MODIFIER
 
 /* Define to 1 if `locale_t' requires xlocale.h. */
 /* #undef LOCALE_T_IN_XLOCALE */
@@ -601,10 +601,6 @@
 /* Define to 1 if your sys/time.h declares `struct tm'. */
 /* #undef TM_IN_SYS_TIME */
 
-/* Define to the appropriate snprintf format for unsigned 64-bit ints, if any.
-   */
-#define UINT64_FORMAT "%llu"
-
 /* Define to 1 to build with assertion checks. (--enable-cassert) */
 /* #undef USE_ASSERT_CHECKING */
 
diff --git a/configure.in b/configure.in
index c938a5d..7750c30 100644
--- a/configure.in
+++ b/configure.in
@@ -1636,34 +1636,29 @@ fi
 # If we found long int is 64 bits, assume snprintf handles it.  If
 # we found we need to use long long int, better check.  We cope with
 # snprintfs that use %lld, %qd, or %I64d as the format.  If none of these
-# work, fall back to our own snprintf emulation (which we know uses %lld).
+# works, fall back to our own snprintf emulation (which we know uses %lld).
 
 if test "$HAVE_LONG_LONG_INT_64" = yes ; then
   if test $pgac_need_repl_snprintf = no; then
-    PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT
-    if test "$LONG_LONG_INT_FORMAT" = ""; then
+    PGAC_FUNC_SNPRINTF_LONG_LONG_INT_MODIFIER
+    if test "$LONG_LONG_INT_MODIFIER" = ""; then
       # Force usage of our own snprintf, since system snprintf is broken
       pgac_need_repl_snprintf=yes
-      LONG_LONG_INT_FORMAT='%lld'
+      LONG_LONG_INT_MODIFIER='ll'
     fi
   else
     # Here if we previously decided we needed to use our own snprintf
-    LONG_LONG_INT_FORMAT='%lld'
+    LONG_LONG_INT_MODIFIER='ll'
   fi
-  LONG_LONG_UINT_FORMAT=`echo "$LONG_LONG_INT_FORMAT" | sed 's/d$/u/'`
-  INT64_FORMAT="\"$LONG_LONG_INT_FORMAT\""
-  UINT64_FORMAT="\"$LONG_LONG_UINT_FORMAT\""
 else
   # Here if we are not using 'long long int' at all
-  INT64_FORMAT='%ld'
-  UINT64_FORMAT='%lu'
+  LONG_LONG_INT_MODIFIER='l'
 fi
 
-AC_DEFINE_UNQUOTED(INT64_FORMAT, $INT64_FORMAT,
-   [Define to the appropriate snprintf format for 64-bit ints.])
+INT64_MODIFIER="\"$LONG_LONG_INT_MODIFIER\""
 
-AC_DEFINE_UNQUOTED(UINT64_FORMAT, $UINT64_FORMAT,
-   [Define to the appropriate snprintf format for unsigned 64-bit ints.])
+AC_DEFINE_UNQUOTED(INT64_MODIFIER, $INT64_MODIFIER,
+   [Define to the appropriate snprintf length modifier for 64-bit ints.])
 
 # Also force use of our snprintf if the system's doesn't support the %z flag.
 if test $pgac_need_repl_snprintf = no; then

diff --git a/config/c-library.m4 b/config/c-library.m4
index 8f45010..4821a61 100644
--- a/config/c-library.m4
+++ b/config/c-library.m4
@@ -221,22 +221,22 @@ HAVE_POSIX_SIGNALS=$pgac_cv_func_posix_signals
 AC_SUBST(HAVE_POSIX_SIGNALS)])# PGAC_FUNC_POSIX_SIGNALS
 
 
-# PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT
+# PGAC_FUNC_SNPRINTF_LONG_LONG_INT_MODIFIER
 # ---
-# Determine which format snprintf uses for long long int.  We handle
-# %lld, %qd, %I64d.  The result is 

[HACKERS] things I learned from working on memory allocation

2014-07-07 Thread Robert Haas
I wrote previously about my efforts to create a new memory allocator
(then called sb_alloc, now renamed to BlockAllocatorContext in my git
branch for consistency with existing identifiers) for PostgreSQL.  To
make a long story short, many of the basic concepts behind that patch
still seem sound to me, but significant problems remain, which account
for it not having been submitted to the current CommitFest nor
attached to this email.  Working in this area has led me to a few
conclusions - comments welcome.

1. I tried to write a single allocator which would be better than
aset.c in two ways: first, by allowing allocation from dynamic shared
memory; and second, by being more memory-efficient than aset.c.
Heikki told me at PGCon that he thought that was a bad idea, and I now
think he was right.  Allocating from dynamic shared memory requires
the use of relative rather than absolute pointers (because the DSM
could be mapped at different addresses in different processes) and, if
concurrent allocating and freeing is to be supported, locking.
Although neither of these things requires very many extra
instructions, the common path for aset.c is extremely short, and we
haven't got those instructions to spare.  Code like "if (lock != NULL)
LWLockAcquire(lock, LW_EXCLUSIVE)" is quite expensive even if the
branch is never taken, apparently because being prepared to call a
function at that point requires storing more stuff in our stack frame,
and the extra instructions to save and restore registers are painfully
expensive on allocation-heavy workloads.  On PPC64, to match aset.c's
performance, a palloc/pfree cycle must run in ~120 instructions, ~80
cycles, which doesn't leave much room for fluff.

2. After ripping the relative-pointer and locking stuff out of my
allocator, I came up with something which performs about like aset.c
on allocation, but with significantly better memory efficiency on
small allocations.  REINDEX pgbench_accounts_pkey saves about 22%,
because the SortTuple array saves nothing but the individual tuples
take only half as much space, by avoiding the StandardChunkHeader
overhead.  This seems like a savings worth pursuing, if we can figure
out how to get it.  Unfortunately, there's some incremental overhead
in pfree, amounting to a single test of global variable, even when the
new allocator is never used, which is measurable on a microbenchmark;
I don't remember the exact numbers off-hand.  When the new allocator
*is* used, pfree gets a lot slower - mostly because one of the data
structures used by the new allocator suffer occasional cache misses
during free operations.  I think there might be an argument that some
hit on pfree would be worth taking in exchange for better
space-efficiency, because we mostly free contexts by resetting them
anyway; but I haven't yet managed to make that hit small enough to
feel especially good about it.

3. The current MemoryContext abstraction layer is inadequate for
almost everything one might want to do.  The idea of a context that
allows only allocation, and ignores requests to free memory, has been
discussed more than once.  Such an allocator could skip the
power-of-two rounding that aset.c does, but it couldn't omit the chunk
header, which means that in practice it would save no memory at all
for cases like the one mentioned above (REINDEX
pgbench_accounts_pkey).  While there might be some benefit in other
cases, without getting rid of or at least reducing the size of the
chunk-header, I expect that the benefits will be too narrow to make
this approach worth pursuing.  The chunk-header requirement is also a
killer for DSM, again because segments can be mapped at different
addresses in different backends.  I'm inclined to think that if we
can't find a way to get rid of the chunk-header requirement, we ought
to consider ripping out the whole abstraction layer as a
performance-wasting exercise in futility.

4. The MemoryContext layer embeds assumptions not only about the
layout of memory, but the performance of various operations.  For
example, GetMemoryContextSpace() is assumed to be cheap, so there's no
interface like "void *MemoryContextAlloc(Size request_size, Size
*chunk_size)" or "Size MemoryContextFree(Size chunk_size)", but those
would be significantly more efficient for my new allocator.  Possibly
better still would be to have the context itself track utilization and
soft-fail when we run out of space; even for aset.c, it makes little
sense to refuse to accept another tuple when the AllocSet has a
suitable object on the freelist.  That's probably not a critical flaw
because, since tuples being sorted are likely all about the same size,
the waste may be little or nothing.  Other allocators might have
different space-management strategies, though, where this matters
more.

5. It's tempting to look at other ways of solving the parallel sort
problem that don't need an allocator - perhaps by simply packing all
the tuples into a DSM one after the next.  But this is not easy to do,
or at least it's not easy to do efficiently.  Tuples enter tuplesort.c
through one of the tuplesort_put*() functions, most of which end up
calling one of the copytup_*() functions.  But you can't easily just
change those functions to stick the tuple someplace else, because they
don't necessarily get the address of the space to be used from palloc
directly.  In particular, copytup_heap() calls
ExecCopySlotMinimalTuple(), and copytup_cluster() calls
heap_copytuple().  heap_copytuple() is simple enough that you might be
able to finagle a new API that would make it work, but I'm not sure
what we could really do about ExecCopySlotMinimalTuple() except call
it and then copy the result.  Perhaps that'd be OK for a first
version.

Re: [HACKERS] Pg_upgrade and toast tables bug discovered

2014-07-07 Thread Bruce Momjian
On Mon, Jul  7, 2014 at 01:44:59PM -0400, Bruce Momjian wrote:
 On Mon, Jul  7, 2014 at 11:24:51AM -0400, Robert Haas wrote:
   As far as the reusing of oids, we don't set the oid counter until after
   the restore, so any new unmatched toast table would be given a very low
   oid.  Since we restore in oid order, for an oid to be assigned that was
   used in the old cluster, it would have to be a very early relation, so I
   think that reduces the odds considerably.  I am not inclined to do
   anything more to avoid this until I see an actual error report ---
   trying to address it might be invasive and introduce new errors.
  
  That sounds pretty shaky from where I sit.  I wonder if the
  pg_upgrade_support module should have a function
  create_toast_table_if_needed(regclass) or something like that.  Or
  maybe just create_any_missing_toast_tables(), iterating over all
  relations.
 
 Uh, I guess we could write some code that iterates over all tables and
 finds the tables that should have TOAST tables, but don't (because
 binary-upgrade backend mode suppressed their creation), and adds them.
 
 However, that would be a lot of code and might be risky to backpatch. 
 The error is so rare I am not sure it is worth it.  I tried to create
 the failure case and it was very difficult, requiring me to create the
 problem table first, then some dummy stuff to get the offsets right so
 the oid would collide.
 
 Based on the rareness of the bug, I am not sure it is worth it, but if
 it is, it would be something only for 9.4 (maybe) and 9.5, not something
 to backpatch.

Another idea would be to renumber empty toast tables that conflict
during new toast table creation, when in binary upgrade mode.  Since the
files are always empty in binary upgrade mode, you could just create a
new toast table, repoint the old table to use it (it didn't ask for
a specific toast table oid because it didn't need one), and then use
the freed oid for the table that actually requested it.  However, if one
is a heap (zero size), and one is an index (8k size), it wouldn't work
and you would have to recreate the file system file too.

That seems a lot more localized than the other approaches.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] things I learned from working on memory allocation

2014-07-07 Thread Peter Geoghegan
On Mon, Jul 7, 2014 at 12:57 PM, Robert Haas robertmh...@gmail.com wrote:
 5. It's tempting to look at other ways of solving the parallel sort
 problem that don't need an allocator - perhaps by simply packing all
 the tuples into a DSM one after the next.  But this is not easy to do,
 or at least it's not easy to do efficiently.  Tuples enter tuplesort.c
 through one of the tuplesort_put*() functions, most of which end up
 calling one of the copytup_*() functions.  But you can't easily just
 change those functions to stick the tuple someplace else, because they
 don't necessarily get the address of the space to be used from palloc
 directly.  In particular, copytup_heap() calls
 ExecCopySlotMinimalTuple(), and copytup_cluster() calls
 heap_copytuple().  heap_copytuple() is simple enough that you might be
 able to finagle a new API that would make it work, but I'm not sure
 what we could really do about ExecCopySlotMinimalTuple() except call
 it and then copy the result.  Perhaps that'd be OK for a first
 version.

Perhaps. If my experience with sort support for text is anything to go
on, the cost of copying over the tuples isn't really that high
relative to the cost of performing the sort, particularly when there
are many tuples to sort. OTOH, I'd be concerned about the cost of main
memory accesses for pass-by-reference types during parallel tuple
sorts in particular. The same concern applies to cases where the tuple
proper must be accessed, although presumably that occurs less
frequently.

The LLVM project's new libc++ library passes remark on a similar issue
in their rationale for yet another C++ standard library: "For example,
it is generally accepted that building std::string using the short
string optimization instead of using Copy On Write (COW) is a
superior approach for multicore machines."  It seems likely that
they're mostly referencing the apparent trend of memory bandwidth per
core stagnation [1], [2]. To get the real benefit of a parallel sort,
we must do anything we can to avoid having cores/workers remain idle
during the sort while waiting on memory access. I believe that that's
the bigger problem.

[1] 
http://www.admin-magazine.com/HPC/Articles/Finding-Memory-Bottlenecks-with-Stream
[2] 
https://software.intel.com/en-us/articles/detecting-memory-bandwidth-saturation-in-threaded-applications

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump reporing version of server pg_dump as comments in the output

2014-07-07 Thread Tom Lane
Jeevan Chalke jeevan.cha...@enterprisedb.com writes:
 Can't we remove this else all-together and print extra line unconditionally?
 Also can we remove curly braces in if and else near these code changes ?
 (Attached patch along those lines)

I committed this with slight further rearrangement to create what seems
like better vertical whitespace choices in verbose vs non-verbose cases,
to wit

--
-- PostgreSQL database dump
--

-- Dumped from database version 9.5devel
-- Dumped by pg_dump version 9.5devel

SET statement_timeout = 0;
...

vs

--
-- PostgreSQL database dump
--

-- Dumped from database version 9.5devel
-- Dumped by pg_dump version 9.5devel

-- Started on 2014-07-07 19:05:54 EDT

SET statement_timeout = 0;
...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-07-07 Thread Tom Lane
Abhijit Menon-Sen a...@2ndquadrant.com writes:
 I can confirm that it works fine. I have attached here a very slightly
 tweaked version of the patch (removed trailing whitespace, and changed
 some comment text).

 I'm marking this ready for committer.

I don't know enough about XML/XPATH to know if this is a good idea or not,
but I did go read the documentation of xmlCopyNode(), and I notice it says
the second argument is

extended: if 1 do a recursive copy (properties, namespaces and children
when applicable) if 2 copy properties and namespaces (when applicable)

Do we really want it to copy children?  Perhaps the value should be 2?
(I have no idea, I'm just raising the question.)

I also notice that it says

Returns: a new #xmlNodePtr, or NULL in case of error.

and the patch is failing to cover the error case.  Most likely, that
would represent out-of-memory, so it definitely seems to be something
we should account for.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Improve bytea error messages

2014-07-07 Thread Robert Haas
On Sun, Jul 6, 2014 at 4:17 AM, Craig Ringer cr...@2ndquadrant.com wrote:
 After someone reported somewhat less than informative errors in bytea
 decoding (http://stackoverflow.com/q/24588866/398670) I thought I'd put
 together a quick patch to improve the errors here.

The first two changes seem fine from here, but I think the use of
parentheses in the third one likely violates our message style
guidelines.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Built-in binning functions

2014-07-07 Thread Tom Lane
Petr Jelinek p...@2ndquadrant.com writes:
 here is a patch implementing a varwidth_bucket function (naming is up for
 discussion) which does binning with variable bucket widths.

I didn't see any discussion of the naming question in this thread.
I'd like to propose that it should be just width_bucket(); we can
easily determine which function is meant, considering that the
SQL-spec variants don't take arrays and don't even have the same
number of actual arguments.

Also, the set of functions provided still needs more thought, because the
resolution of overloaded functions doesn't really work as nicely as all
that.  I illustrate:

regression=# create function myf(int8) returns int as 'select 0' language sql;
CREATE FUNCTION
regression=# create function myf(float8) returns int as 'select 1' language 
sql; 
CREATE FUNCTION
regression=# create function myf(anyelement) returns int as 'select 2' language 
sql;
CREATE FUNCTION
regression=# select myf(1);
 myf 
-----
   1
(1 row)

So given plain integer arguments, we'll invoke the float8 version not the
int8 version, which may be undesirable.  The same for int2 arguments.
We could fix that by adding bespoke int4 and maybe int2 variants, but
TBH, I'm not sure that the specific-type functions are worth the trouble.
Maybe we should just have one generic function, and take the trouble to
optimize its array-subscripting calculations for either fixed-length or
variable-length array elements?  Have you got performance measurements
demonstrating that multiple implementations really buy enough to justify
the extra code?

Also, I'm not convinced by this business of throwing an error for a
NULL array element.  Per spec, null arguments to width_bucket()
produce a null result, not an error --- shouldn't this flavor act
similarly?  In any case I think the test needs to use
array_contains_nulls() not just ARR_HASNULL.

Lastly, the spec defines behaviors for width_bucket that allow either
ascending or descending buckets.  We could possibly do something similar
in these functions by initially comparing the two endpoint elements to see
which one is larger, and then reversing the sense of all the comparisons
if the first one is larger.  I'm not sure if that's worth the trouble or
not, but if the SQL committee thought descending bucket numbering was
worthwhile, maybe it's worthwhile here too.
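
For reference, a hypothetical call to the array variant, whatever name it
ends up with (with ascending boundaries 1, 3, 4, 6, the value 5.35 would
fall in the third bucket):

    SELECT width_bucket(5.35, ARRAY[1, 3, 4, 6.0]);
    -- hypothetically returns 3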

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tab completion for setting search_path

2014-07-07 Thread Fujii Masao
On Mon, Jul 7, 2014 at 8:51 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jun 24, 2014 at 11:57 AM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Mon, Jun 23, 2014 at 6:25 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Mon, Jun 23, 2014 at 3:53 PM, Andres Freund and...@2ndquadrant.com
 wrote:
 On 2014-06-23 13:10:34 -0400, Robert Haas wrote:
 On Mon, Jun 23, 2014 at 9:10 AM, Kevin Grittner kgri...@ymail.com
 wrote:
  I would be for excluding the pg_toast, pg_toast_temp_n, and
  pg_temp_n schemas, and including public and pg_catalog.

 +1.

 Jeff, are you willing to update the patch that way? Seems we have
 something several people can live with ;)

 I was hoping not to add a third set of filters (separate from the ones
 already implicit in either \dn or \dnS), but that's OK; as long as I get a
 doghouse I won't worry much about the color.

 I've included information_schema as well, in analogy to the inclusion of
 pg_catalog.

 Is there any blocker on this patch? The patch looks good to me.

The patch makes the tab-completion on search_path display the existing
schemas; additionally, what about displaying $user, too? I think that
some users would want to include the $user variable in search_path.
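
For example, a user with a per-user schema would typically want to write:

    SET search_path TO "$user", public;

so completing $user alongside real schema names seems natural.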

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] things I learned from working on memory allocation

2014-07-07 Thread Robert Haas
On Mon, Jul 7, 2014 at 5:37 PM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Jul 7, 2014 at 12:57 PM, Robert Haas robertmh...@gmail.com wrote:
 5. It's tempting to look at other ways of solving the parallel sort
 problem that don't need an allocator - perhaps by simply packing all
 the tuples into a DSM one after the next.  But this is not easy to do,
 or at least it's not easy to do efficiently.  Tuples enter tuplesort.c
 through one of the tuplesort_put*() functions, most of which end up
 calling one of the copytup_*() functions.  But you can't easily just
 change those functions to stick the tuple someplace else, because they
 don't necessarily get the address of the space to be used from palloc
 directly.  In particular, copytup_heap() calls
 ExecCopySlotMinimalTuple(), and copytup_cluster() calls
 heap_copytuple().  heap_copytuple() is simple enough that you might be
 able to finagle a new API that would make it work, but I'm not sure
 what we could really do about ExecCopySlotMinimalTuple() except call
 it and then copy the result.  Perhaps that'd be OK for a first
 version.

 Perhaps. If my experience with sort support for text is anything to go
 on, the cost of copying over the tuples isn't really that high
 relative to the cost of performing the sort, particularly when there
 are many tuples to sort.

The testing I did showed about a 5% overhead on REINDEX INDEX
pgbench_accounts_pkey from one extra tuple copy (cf.
9f03ca915196dfc871804a1f8aad26207f601fd6).  Of course that could vary
by circumstance for a variety of reasons.

 OTOH, I'd be concerned about the cost of main
 memory accesses for pass-by-reference types during parallel tuple
 sorts in particular. The same concern applies to cases where the tuple
 proper must be accessed, although presumably that occurs less
 frequently.

I do think that's a problem with our sort implementation, but it's not
clear to me whether it's *more* of a problem for parallel sort than it
is for single-backend sort.

 The LLVM project's new libc++ library passes remark on a similar issue
 in their rationale for yet another C++ standard library: "For example,
 it is generally accepted that building std::string using the short
 string optimization instead of using Copy On Write (COW) is a
 superior approach for multicore machines."  It seems likely that
 they're mostly referencing the apparent trend of memory bandwidth per
 core stagnation [1], [2]. To get the real benefit of a parallel sort,
 we must do anything we can to avoid having cores/workers remain idle
 during the sort while waiting on memory access. I believe that that's
 the bigger problem.

Yes, I agree that's a problem.  There are algorithms for parallel
quicksort in the literature which seem to offer promising solutions,
but unfortunately these infrastructure issues have prevented me from
reaching them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Interesting behavior change of psql

2014-07-07 Thread Tatsuo Ishii
From 9.4, psql has slightly changed its behavior.

Pre 9.4:
psql --version
psql (PostgreSQL) 9.3.4
psql -U '' test
FATAL:  no PostgreSQL user name specified in startup packet
psql: FATAL:  no PostgreSQL user name specified in startup packet

9.4:
psql -U '' test
psql (9.4beta1, server 9.3.4)
Type "help" for help.

test=# 

Please note that this behavior change was actually brought by libpq:

Have libpq's PQconnectdbParams() and PQpingParams() functions
 process zero-length strings as defaults (Adrian
 Vondendriesch)

 Previously, these functions treated zero-length string values
 as defaults only in some cases.

It's not very clear here that the username is affected by this change. I
just want to remind hackers before you get inquiries on this from
PostgreSQL users.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] things I learned from working on memory allocation

2014-07-07 Thread Peter Geoghegan
On Mon, Jul 7, 2014 at 7:04 PM, Robert Haas robertmh...@gmail.com wrote:
 The testing I did showed about a 5% overhead on REINDEX INDEX
 pgbench_accounts_pkey from one extra tuple copy (cf.
 9f03ca915196dfc871804a1f8aad26207f601fd6).  Of course that could vary
 by circumstance for a variety of reasons.

Be careful of the check for pre-sorted input within qsort() giving an
overly optimistic picture of things, thereby exaggerating the
proportion of time spent copying if taken as generally representative.

 OTOH, I'd be concerned about the cost of main
 memory accesses for pass-by-reference types during parallel tuple
 sorts in particular. The same concern applies to cases where the tuple
 proper must be accessed, although presumably that occurs less
 frequently.

 I do think that's a problem with our sort implementation, but it's not
 clear to me whether it's *more* of a problem for parallel sort than it
 is for single-backend sort.

If you'll forgive me for going on about my patch on this thread, I
think the "pgbench -c 4" and "-c 1" cases that I tested suggest it is
a particular problem for parallel sorts, as there is a much bigger
difference, both absolute and proportional, in transaction throughput
between those two with the patch applied. It seems reasonable to
suppose the difference would be larger still if we were considering a
single parallel sort, as opposed to multiple independent sorts (of the
same data) that happen to occur in parallel.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-07-07 Thread Bruce Momjian
On Tue, Jun 24, 2014 at 10:17:49AM -0700, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
  On 06/23/2014 03:52 PM, Andres Freund wrote:
  True.  Which makes me wonder whether we shouldn't default this to
  something non-zero -- even if it is 5 or 10 days.
 
  I'd go for even shorter: 48 hours.  I'd suggest 24 hours, but that would
  trip up some users who just need really long pg_dumps.
 
 FWIW, I do not think we should have a nonzero default for this.
 We could not safely set it to any value that would be small enough
 to be really useful in the field.
 
 BTW, has anyone thought about the interaction of this feature with
 prepared transactions?  I wonder whether there shouldn't be a similar but
 separately-settable maximum time for a transaction to stay in the prepared
 state.  If we could set a nonzero default on that, perhaps on the order of
 a few minutes, we could solve the ancient bugaboo that prepared
 transactions are too dangerous to enable by default.

Added as a TODO item.
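
For context, a sketch of the hazard a prepared-transaction timeout would
address (names invented for illustration):

    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    PREPARE TRANSACTION 'tx1';
    -- 'tx1' now holds its locks and xmin until someone runs
    -- COMMIT PREPARED 'tx1' or ROLLBACK PREPARED 'tx1';
    -- if forgotten, it blocks vacuum cleanup indefinitely.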

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RETURNING PRIMARY KEY syntax extension

2014-07-07 Thread Rushabh Lathia
On Mon, Jul 7, 2014 at 5:50 PM, Robert Haas robertmh...@gmail.com wrote:

 On Fri, Jul 4, 2014 at 12:36 AM, Rushabh Lathia
 rushabh.lat...@gmail.com wrote:
  On Fri, Jul 4, 2014 at 7:37 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Tom Dunstan pg...@tomd.cc writes:
   On 4 July 2014 00:07, Tom Lane t...@sss.pgh.pa.us wrote:
   It sounds to me like designing this for JDBC's getGeneratedKeys
 method
   is a mistake.  There isn't going to be any way that the driver can
   support
   that without having looked at the table's metadata for itself, and if
   it's going to do that then it doesn't need a crutch that only
 partially
   solves the problem anyhow.
 
   Sure it can - it append RETURNING PRIMARY KEY and hand back a
 ResultSet
   from whatever was returned. It's CURRENTLY doing that, but it's
   appending
   RETURNING * and leaving it up to the caller of getGeneratedKeys() to
   work
   out which columns the caller is interested in.
 
  I might be mistaken, but as I read the spec for getGeneratedKeys,
 whether
  or not a column is in a primary key has got nothing to do with whether
  that column should be returned.  It looks to me like what you're
 supposed
  to return is columns with computed default values, eg, serial columns.
  (Whether we should define it as being *exactly* columns created by the
  SERIAL macro is an interesting question, but for sure those ought to be
  included.)  Now, the pkey might be a serial column ... but it equally
  well might not be; it might not have any default expression at all.
  And there certainly could be serial columns that weren't in the pkey.
 
  100% agree with Tom.

 Well, we could have a RETURNING GENERATED COLUMNS construct, or
 something like that.  I can certainly see the usefulness of such a
 thing.

 As a general note not necessarily related to this specific issue, I
 think we should be careful not to lose sight of the point of this
 feature, which is to allow pgsql-jdbc to more closely adhere to the
 JDBC specification.  While it's great if this feature has general
 utility, if it doesn't help pgsql-jdbc, then it fails to achieve the
 author's original intention.  We need to make sure we're not throwing
 up too many obstacles in the way of better driver compliance if we
 want to have good drivers.

 As a code-level comment, I am not too fond of adding yet another value
 for IndexAttrBitmapKind; every value we compute in
 RelationGetIndexAttrBitmap adds overhead for everyone, even people not
 using the feature.  Admittedly, it's only paid once per relcache load,
 but does map_primary_key_to_list() really need this information cached,
 or can it just look through the index list for the primary key, and
 then work directly from that?

 Also, map_primary_key_to_list() tacitly assumes that an auto-updatable
 view has only 1 from-list item.  That seems like a good thing to check
 with an Assert(), just in case we should support some other case in
 the future.


Another code-level comment: why do we need RowExclusiveLock before calling
map_primary_key_to_list()? I think we should have an explanation for that.

+    /* Handle RETURNING PRIMARY KEY */
+    if (query->returningPK)
+    {
+        RangeTblEntry *rte = (RangeTblEntry *) list_nth(query->rtable,
+                                                        query->resultRelation - 1);
+        Relation       rel = heap_open(rte->relid, RowExclusiveLock);
+
+        query->returningList = map_primary_key_to_list(rel, query);
+        heap_close(rel, NoLock);
+    }


--
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




-- 
Rushabh Lathia


Re: [HACKERS] Selectivity estimation for inet operators

2014-07-07 Thread Dilip kumar
On 06 July 2014 20:33, Emre Hasegeli wrote:
 
  Apart from that there is one spell check you can correct
  -- in inet_his_inclusion_selec comments: histogram boundies ->
  histogram boundaries :)
 
 I fixed it.  New version attached.  The debug log statements are also
 removed.

I am done with the review; the patch seems fine to me.

I have one last comment; after clarifying this I can move the patch to ready
for committer.
1. In networkjoinsel, to avoid the case of huge statistics, only some of the
values from the MCV lists and histograms are used (the number is calculated
using a square root).
-- But in my opinion, if both the histogram and the MCV list exist, that is
fine; if only MCVs are there, however, we can match the complete MCV lists,
which will give better accuracy.
   Other functions like eqjoinsel also match the complete MCV lists.
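
For context, the statistics being matched are visible in pg_stats, and their
size is capped by the column's statistics target; a quick way to see how large
the MCV list and histogram actually are (table and column names here are
hypothetical, the catalog view is real):

  ALTER TABLE hosts ALTER COLUMN addr SET STATISTICS 100;  -- caps MCV/histogram entries
  ANALYZE hosts;

  SELECT most_common_vals, most_common_freqs, histogram_bounds
  FROM pg_stats
  WHERE tablename = 'hosts' AND attname = 'addr';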



Thanks & Regards,
Dilip Kumar



Re: [HACKERS] [PATCH] Improve bytea error messages

2014-07-07 Thread Craig Ringer
On 07/08/2014 07:44 AM, Robert Haas wrote:
 On Sun, Jul 6, 2014 at 4:17 AM, Craig Ringer cr...@2ndquadrant.com wrote:
 After someone reported somewhat less than informative errors in bytea
 decoding (http://stackoverflow.com/q/24588866/398670) I thought I'd put
 together a quick patch to improve the errors here.
 
 The first two changes seem fine from here, but I think the use of
 parentheses in the third one likely violates our message style
 guidelines.

Good point.

Better?

Putting it in ERRHINT is more appropriate.
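
For reference, all three reworded messages are reachable from plain SQL via
decode(); a quick sanity check, with inputs chosen to trip each error site:

  SELECT decode('=', 'base64');      -- unexpected "=" while decoding base64 sequence
  SELECT decode('Zm9v!', 'base64');  -- invalid symbol '!' while decoding base64 sequence
  SELECT decode('Zm9vY', 'base64');  -- invalid base64 end sequence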

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 285720d1b028da4fd3a1fc4b5deb006ab7b71959 Mon Sep 17 00:00:00 2001
From: Craig Ringer cr...@2ndquadrant.com
Date: Sun, 6 Jul 2014 16:15:21 +0800
Subject: [PATCH] Improve bytea decoding error messages

---
 src/backend/utils/adt/encode.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/encode.c b/src/backend/utils/adt/encode.c
index 46993ba..f09e506 100644
--- a/src/backend/utils/adt/encode.c
+++ b/src/backend/utils/adt/encode.c
@@ -292,7 +292,7 @@ b64_decode(const char *src, unsigned len, char *dst)
 			else
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-						 errmsg("unexpected \"=\"")));
+						 errmsg("unexpected \"=\" while decoding base64 sequence")));
 			}
 			b = 0;
 		}
@@ -304,7 +304,7 @@ b64_decode(const char *src, unsigned len, char *dst)
 			if (b < 0)
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-						 errmsg("invalid symbol")));
+						 errmsg("invalid symbol '%c' while decoding base64 sequence", (int) c)));
 		}
 		/* add it to buffer */
 		buf = (buf << 6) + b;
@@ -324,7 +324,8 @@ b64_decode(const char *src, unsigned len, char *dst)
 	if (pos != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("invalid end sequence")));
+				 errmsg("invalid base64 end sequence"),
+				 errhint("input data is missing padding, truncated or otherwise corrupted")));
 
 	return p - dst;
 }
-- 
1.9.0




Re: [HACKERS] Extending constraint exclusion for implied constraints/conditions

2014-07-07 Thread Ashutosh Bapat
On Mon, Jul 7, 2014 at 7:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes:
  Right now, constraint exclusion code looks only at the provided
  conditions. If we want to avoid a table scan based on constraints in
  the above example, it will need to look at the implied conditions as
  well. E.g. val2 < 30 AND val = val2 => val < 30. Then the constraint
  exclusion can see that val < 30 AND val > 30 are contradictory and
  infer that the result is going to be empty. We will need to add
  information about the transitive inferences between operators. Can we
  do that in PostgreSQL? Will the benefits be worth the code that needs
  to be added?

 I doubt it.  The extra code isn't the problem so much, it's the extra
 planning cycles spent trying to make proofs that will usually fail.
 What you propose will create a combinatorial explosion in the number
 of proof paths to be considered.


I can understand that it will create a combinatorial explosion in the number
of predicates that need to be examined by constraint exclusion. I do not
understand where paths get involved here, though. Constraint exclusion kicks
in before paths are created:

 static void
 set_rel_size(PlannerInfo *root, RelOptInfo *rel,
              Index rti, RangeTblEntry *rte)
 {
     if (rel->reloptkind == RELOPT_BASEREL &&
         relation_excluded_by_constraints(root, rel, rte))
     {
         /*
          * We proved we don't need to scan the rel via constraint exclusion,
          * so set up a single dummy path for it.  Here we only check this for
          * regular baserels; if it's an otherrel, CE was already checked in
          * set_append_rel_pathlist().
          *
          * In this case, we go ahead and set up the relation's path right away
          * instead of leaving it for set_rel_pathlist to do.  This is because
          * we don't have a convention for marking a rel as dummy except by
          * assigning a dummy path to it.
          */
         set_dummy_rel_pathlist(rel);
     }

and does not create paths for relations excluded by constraints:

 /*
  * set_rel_pathlist
  *	  Build access paths for a base relation
  */
 static void
 set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                  Index rti, RangeTblEntry *rte)
 {
     if (IS_DUMMY_REL(rel))
     {
         /* We already proved the relation empty, so nothing more to do */
     }

The same happens in the case of a join, in make_join_rel():

      * JOIN_INNER.
      */
     switch (sjinfo->jointype)
     {
         case JOIN_INNER:
             if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
                 restriction_is_constant_false(restrictlist, false))
             {
                 mark_dummy_rel(joinrel);
                 break;
             }
             add_paths_to_joinrel(root, joinrel, rel1, rel2,
                                  JOIN_INNER, sjinfo,
                                  restrictlist);
             add_paths_to_joinrel(root, joinrel, rel2, rel1,
                                  JOIN_INNER, sjinfo,
                                  restrictlist);
             break;

BTW, on a side note, I noticed that we use mark_dummy_rel() for joinrels and
set_dummy_rel_pathlist() for baserels. Shouldn't we be using the same
function everywhere for the same functionality (i.e. adding an append path
with no child paths)?

For larger relations, the time spent in constraint exclusion might be less
than the time taken by an actual table/index scan, especially when such a
scan is not going to produce any rows. So we might want to apply the
technique only when the estimated number of rows/pages is above a certain
threshold, and maybe only when the GUC is ON (like we do it today).
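
For reference, the GUC in question already exists and gates how aggressively
the exclusion proofs are attempted; a size threshold would in effect add a
fourth mode:

  SET constraint_exclusion = off;        -- never attempt exclusion proofs
  SET constraint_exclusion = partition;  -- default: only inheritance/UNION ALL children
  SET constraint_exclusion = on;         -- attempt proofs for every relation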


  I can see some more benefits. We can use it to eliminate joins where the
  constraints on joining relations may cause the join to be empty e.g.

 ... and applying constraint exclusion to join relations would make that
 problem even worse.


The same reasoning applies to joins, where the cross product of the two
tables is potentially huge.


 regards, tom lane




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


[HACKERS] Modifying update_attstats of analyze.c for C Strings

2014-07-07 Thread Ashoke
Hi,

I am trying to implement functionality that is similar to ANALYZE, but that
needs to have different values (the values will be valid and are stored in
inp->str[][]) for the MCV/histogram bounds in case the column under
consideration is varchar (C strings). I have written a function
dummy_update_attstats with the following changes. Other things remain the
same as in update_attstats of ~/src/backend/commands/analyze.c


---
{
    ArrayType  *arry;

    if (strcmp(col_type, "varchar") == 0)
        arry = construct_array(stats->stavalues[k],
                               stats->numvalues[k],
                               CSTRINGOID,
                               -2,
                               false,
                               'c');
    else
        arry = construct_array(stats->stavalues[k],
                               stats->numvalues[k],
                               stats->statypid[k],
                               stats->statyplen[k],
                               stats->statypbyval[k],
                               stats->statypalign[k]);
    values[i++] = PointerGetDatum(arry);	/* stavaluesN */
}
---

and I update the hist_values in the appropriate function as:
---
if (strcmp(col_type, "varchar") == 0)
    hist_values[i] = datumCopy(CStringGetDatum(inp->str[i][j]),
                               false,
                               -2);
---

I tried this based on the following reference:
http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff

My issue is: when I use my way for strings, the MCV/histogram_bounds in
pg_stats don't have double quotes ("") surrounding the strings. That is,

If the normal update_attstats is used, histogram_bounds for TPCH
nation(n_name) are: "ALGERIA   ","ARGENTINA",...
If I use dummy_update_attstats as above, histogram_bounds for TPCH
nation(n_name) are: ALGERIA,ARGENTINA,...

This becomes an issue if the string contains ',' (commas), as for example
in the n_comment column of the nation table.
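
The quotes matter precisely because of those commas: array output
double-quotes only the elements that need it, and without the quotes an
embedded comma reads back as an element separator. A minimal illustration
(the sample strings are made up):

  SELECT '{ALGERIA,ARGENTINA}'::text[];             -- two elements
  SELECT '{"carefully, final",ARGENTINA}'::text[];  -- still two elements; quoting protects the comma
  SELECT array_length('{carefully, final}'::text[], 1);  -- 2: the unquoted comma split the value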

Could someone point out the problem and suggest a solution?

Thank you.

-- 
Regards,
Ashoke